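There are six kinds of derivatives that can be expressed as matrices: the derivative of a scalar, a vector, or a matrix with respect to a scalar; the derivative of a scalar or a vector with respect to a vector; and the derivative of a scalar with respect to a matrix.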
The partials with respect to the numerator are laid out according to the shape of Y, while the partials with respect to the denominator are laid out according to the transpose of X. For example, dy/dx with vector y and scalar x is a column vector, while dy/dx with scalar y and vector x is a row vector (assuming x and y are column vectors; otherwise it is flipped). Each of these derivatives can be tediously computed via partials, but this section shows how they can instead be computed with matrix manipulations.
Here x and y are assumed to be column vectors, and A, X, and Y are matrices.
This is where the operators and identities developed in the following sections are useful. For example, since the derivative of Y with respect to X cannot be represented by a matrix, it is customary to use dvec(Y)/dvec(X) instead (vec is defined below). If the purpose of differentiation is to equate the derivative to zero, then this transformation doesn't affect the result.
Quoted from "Old and New Matrix Algebra Useful for Statistics", Thomas P. Minka, December 28, 2000
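The dvec(Y)/dvec(X) idea can be checked numerically. The sketch below is only an illustration, not code from the quoted source: it assumes vec stacks the columns of a matrix, uses numerator layout, and estimates the derivative by finite differences. The helpers `vec`, `unvec`, `matmul`, and `dvec_dvec` are hypothetical names introduced here. For Y = AX with A of size 3×2 and X of size 2×2, the result is a 6×4 matrix whose diagonal blocks are copies of A, which under these assumptions is the Kronecker product of a 2×2 identity with A.

```python
def vec(M):
    """Stack the columns of a matrix (given as a list of rows) into a flat list."""
    rows, cols = len(M), len(M[0])
    return [M[i][j] for j in range(cols) for i in range(rows)]


def unvec(v, rows, cols):
    """Inverse of vec: rebuild a rows x cols matrix from its stacked columns."""
    return [[v[j * rows + i] for j in range(cols)] for i in range(rows)]


def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def dvec_dvec(f, X, eps=1e-6):
    """Finite-difference estimate of dvec(f(X))/dvec(X) in numerator layout.

    Entry [i][j] approximates d vec(Y)[i] / d vec(X)[j].
    """
    rows, cols = len(X), len(X[0])
    x0 = vec(X)
    y0 = vec(f(X))
    J = [[0.0] * len(x0) for _ in y0]
    for j in range(len(x0)):
        xp = list(x0)
        xp[j] += eps  # perturb one entry of vec(X)
        yp = vec(f(unvec(xp, rows, cols)))
        for i in range(len(y0)):
            J[i][j] = (yp[i] - y0[i]) / eps
    return J


A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 x 2
X = [[1.0, 0.0], [0.0, 1.0]]              # 2 x 2
J = dvec_dvec(lambda M: matmul(A, M), X)
print(len(J), len(J[0]))  # 6 4
```

Because the map X → AX is linear, the finite-difference estimate matches the exact derivative up to rounding error.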
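| | Scalar y: notation | Scalar y: type | Vector y (size m): notation | Vector y (size m): type | Matrix Y (size m×n): notation | Matrix Y (size m×n): type |
|---|---|---|---|---|---|---|
| Scalar x | $\partial y / \partial x$ | scalar | $\partial \mathbf{y} / \partial x$ | (numerator layout) size-m column vector; (denominator layout) size-m row vector | $\partial \mathbf{Y} / \partial x$ | (numerator layout) m×n matrix |
| Vector x (size n) | $\partial y / \partial \mathbf{x}$ | (numerator layout) size-n row vector; (denominator layout) size-n column vector | $\partial \mathbf{y} / \partial \mathbf{x}$ | (numerator layout) m×n matrix; (denominator layout) n×m matrix | $\partial \mathbf{Y} / \partial \mathbf{x}$ | ? |
| Matrix X (size p×q) | $\partial y / \partial \mathbf{X}$ | (numerator layout) q×p matrix; (denominator layout) p×q matrix | $\partial \mathbf{y} / \partial \mathbf{X}$ | ? | $\partial \mathbf{Y} / \partial \mathbf{X}$ | ? |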
Quoted from https://en.wikipedia.org/wiki/Matrix_calculus#Layout_conventions
In general, we use the numerator layout.
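The vector-by-vector entry of the table can be illustrated with a small numerical sketch. It is an illustration under stated assumptions rather than a reference implementation: `jacobian` and the test function `f` are hypothetical names introduced here, and the derivative is estimated by forward finite differences. With y of size m = 2 and x of size n = 3, numerator layout gives a 2×3 matrix and denominator layout gives its 3×2 transpose.

```python
def jacobian(f, x, eps=1e-6, layout="numerator"):
    """Finite-difference derivative of a vector-valued f with respect to x.

    f maps a list of n floats to a list of m floats.
    Numerator layout returns an m x n matrix with entry [i][j] = dy_i/dx_j;
    denominator layout returns its n x m transpose.
    """
    y0 = f(x)
    m, n = len(y0), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x)
        xp[j] += eps  # perturb one component of x
        yp = f(xp)
        for i in range(m):
            J[i][j] = (yp[i] - y0[i]) / eps
    if layout == "denominator":
        return [list(col) for col in zip(*J)]
    return J


def f(x):
    # Example map from R^3 to R^2, chosen only for illustration.
    return [x[0] * x[1], x[1] + x[2] ** 2]


x = [1.0, 2.0, 3.0]
Jn = jacobian(f, x)                        # numerator layout
Jd = jacobian(f, x, layout="denominator")  # denominator layout
print(len(Jn), len(Jn[0]))  # 2 3
print(len(Jd), len(Jd[0]))  # 3 2
```

The same function with m = 1 yields a 1×n row in numerator layout and an n×1 column in denominator layout, matching the scalar-by-vector entry of the table.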