Martin Shkreli

	A	B	C	E
1	Main
2		Resources
3		https://course.fast.ai/Resources/book.html
4		https://github.com/fastai/fastbook		https://www.youtube.com/playlist?list=PL_iWQOsE6TfVmKkQHucjPAoRtIJYt8a5A
5		Glossary
6		addition	addition is defined for matrices with the same shape (dimension size)
7		associativity	matrix multiplication is associative
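8		broadcasting	deep learning convention of adding a vector to a matrix by implicitly repeating the vector across every row, e.g. C_i,j = A_i,j + b_j
9		commutativity	matrix multiplication is not commutative: AB = BA does not hold in general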
10		determinant
11		distributive	matrix multiplication is distributive over addition: A(B + C) = AB + AC
12		dot product	the matrix product x^T*y of two vectors x and y of the same dimension, which yields a scalar
13		element-wise product	also known as the Hadamard product, simple multiplication of individual elements of matrices
14		identity matrix	the identity matrix is a matrix that does not change any vector when we multiply that vector by that matrix. All main diagonal entries in an identity matrix are 1, and all other entries are zero
15		linear combination
16		matrix	a 2D array of numbers
17		matrix inverse	the matrix inverse of A, A^-1, is defined as A(A^-1) = I_n
18		matrix product	C = AB is defined as C_i,j = sum of A_i,k * B_k,j over all k
19		multiplication	multiplication is defined for a matrix and a scalar. multiplying two matrices A (shape i x j) and B (shape j x k) is only defined when the number of columns of A equals the number of rows of B, giving a product C of shape i x k. see matrix product.
20		scalar	a single number, often a real or an integer
21		span
22		stochastic differential equation (SDE)
23		tensor	a multi-dimensional array of numbers
24		transpose	mirror image of a matrix across its main diagonal, defined as (A^T)_i,j = A_j,i. the transpose of a product is (AB)^T = B^T*A^T
25		vector	a one-dimensional array of numbers
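The glossary identities above can be checked numerically. The following is a minimal sketch using NumPy; the matrix shapes, the seed, and the variable names are arbitrary choices for illustration and do not come from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# Random matrices with compatible shapes (shapes chosen for illustration)
A = rng.normal(size=(2, 3))
B = rng.normal(size=(3, 4))
C = rng.normal(size=(4, 2))
D = rng.normal(size=(3, 4))

# associativity: (AB)C == A(BC)
assoc_ok = np.allclose((A @ B) @ C, A @ (B @ C))
# distributive: A(B + D) == AB + AD
dist_ok = np.allclose(A @ (B + D), A @ B + A @ D)
# transpose of a product: (AB)^T == B^T A^T
transpose_ok = np.allclose((A @ B).T, B.T @ A.T)

# Small fixed matrices for the remaining entries
P = np.array([[1.0, 2.0], [3.0, 4.0]])
Q = np.array([[0.0, 1.0], [1.0, 0.0]])

# commutativity fails in general: PQ != QP for this pair
commutes = np.allclose(P @ Q, Q @ P)
# matrix inverse: P (P^-1) == I_2
inverse_ok = np.allclose(P @ np.linalg.inv(P), np.eye(2))
# element-wise (Hadamard) product multiplies matching entries
hadamard = P * Q
# broadcasting: the vector is added to every row of P
broadcast = P + np.array([10.0, 20.0])
# dot product of two equal-dimension vectors gives a scalar
dot = np.dot(np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0]))
```

Note that `@` is the matrix product while `*` is the element-wise product, so the two should not be confused when translating the glossary entries into code.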
26
27
28
29
30
31		Forward diffusion process, where x_t is sampled conditioned on x_t-1 from a Gaussian distribution with mean ((1 - B_t)^0.5)*x_t-1 and variance B_t
32		B_t is usually predefined and fixed; T is the total number of diffusion steps.
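The forward step described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not a reference implementation: the function names `forward_step` and `forward_process`, the linear schedule from 1e-4 to 0.02, and T = 1000 are all choices made here, not values given in these notes.

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """Sample x_t given x_t-1 from a Gaussian with mean sqrt(1 - beta_t) * x_t-1
    and variance beta_t."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

def forward_process(x0, betas, rng):
    """Apply the forward step once per entry of the fixed schedule `betas`."""
    x = x0
    for beta_t in betas:
        x = forward_step(x, beta_t, rng)
    return x

rng = np.random.default_rng(0)      # arbitrary seed
T = 1000                            # assumed total number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)  # assumed predefined, fixed schedule
x0 = np.full(10_000, 3.0)           # toy "data": many copies of the value 3.0
xT = forward_process(x0, betas, rng)
```

With this schedule, `xT.mean()` and `xT.std()` should land close to 0 and 1 respectively, which is the sense in which the forward process turns a sample into noise.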
33
34
35
36		Reverse generative process (denoising) has denoising distribution p_theta(x_t-1 | x_t), a Gaussian whose mean is
37		defined by a trainable neural network mu_theta(x_t, t), with a preset variance
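A single reverse step can be sketched in the same style. Everything named here is hypothetical: `mu_theta` below is a hand-written stand-in for the trainable network (it only shrinks its input), and `sigma_t` is an assumed preset standard deviation, so the preset variance is `sigma_t ** 2`.

```python
import numpy as np

def mu_theta(x_t, t):
    """Hypothetical stand-in for the trained network that predicts the mean.
    A real model would be a neural network taking x_t and the step index t."""
    return 0.9 * x_t

def reverse_step(x_t, t, sigma_t, rng, mean_fn=mu_theta):
    """Sample x_t-1 from a Gaussian with mean mean_fn(x_t, t)
    and preset standard deviation sigma_t."""
    mean = mean_fn(x_t, t)
    noise = rng.standard_normal(x_t.shape)
    return mean + sigma_t * noise

rng = np.random.default_rng(0)  # arbitrary seed
x_t = rng.standard_normal(5)    # toy noisy sample
x_prev = reverse_step(x_t, t=10, sigma_t=0.1, rng=rng)
```

Generation would repeat this step from t = T down to t = 1, starting from pure Gaussian noise; only the mean function is learned, while the variance stays fixed.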
38
39
40
41
42		example of fixed forward SDE, transforms sample into noise