Covariance and Correlation
Demystifying the terms
Covariance indicates the direction of the linear relationship between two variables.
Correlation, on the other hand, measures both the strength and the direction of the linear relationship between two variables.
Correlation is a function of the covariance. What sets them apart is that correlation values are standardized, whereas covariance values are not.
Defining the terms mathematically
Covariance
$$
\begin{aligned}
cov(x,y) &= E[(x - \mu_x)(y - \mu_y)] \\
&= E[xy] - E[x]E[y]
\end{aligned}
$$
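The second equality follows from linearity of expectation. As a short worked derivation, assuming $\mu_x = E[x]$ and $\mu_y = E[y]$ are finite constants:

$$
\begin{aligned}
E[(x - \mu_x)(y - \mu_y)] &= E[xy - \mu_y x - \mu_x y + \mu_x \mu_y] \\
&= E[xy] - \mu_y E[x] - \mu_x E[y] + \mu_x \mu_y \\
&= E[xy] - E[x]E[y] - E[x]E[y] + E[x]E[y] \\
&= E[xy] - E[x]E[y]
\end{aligned}
$$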
If we have only a single variable $x$, then

$$
\begin{aligned}
cov(x, x) &= E[(x - \mu_x)(x - \mu_x)] \\
&= E[(x - \mu_x)^2] \\
&= var(x) = \sigma^2(x) = \sigma^2_x \\
\text{Let } var(x) &:= s^2 \hspace{1cm} \text{(sample variance)}
\end{aligned}
$$
Writing these out as sample estimates over $n$ observations, we get

$$
\begin{aligned}
s^2 = cov(x, x) &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \\
cov(x,y) &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}
\end{aligned}
$$

The numerator of the first equation is called the sum of squared deviations, and the numerator of the second is called the sum of cross products.
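The two sample formulas can be sketched in plain Python as follows. The helper names `mean` and `sample_cov` and the toy data are made up for illustration; this is only a minimal sketch of the $n - 1$ formulas above, not a replacement for a statistics library.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def sample_cov(x, y):
    """Sample covariance: sum of cross products divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross products of the deviations from each mean.
    cross_products = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return cross_products / (len(x) - 1)


# Toy data, chosen only so the arithmetic is easy to check by hand.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

var_x = sample_cov(x, x)   # variance is the covariance of x with itself
cov_xy = sample_cov(x, y)

print(var_x)   # 2.5  (sum of squared deviations 10, divided by 4)
print(cov_xy)  # 1.5  (sum of cross products 6, divided by 4)
```

Calling `sample_cov(x, x)` reuses the covariance code for the variance, mirroring the identity $cov(x, x) = s^2$.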
Correlation
$$
\begin{aligned}
corr(x,y) = \frac{cov(x,y)}{s_x s_y} &= \frac{E[(x - \mu_x)(y - \mu_y)]}{s_x s_y} \\
&= \frac{E[(x - \mu_x)(y - \mu_y)]}{\sigma_x \sigma_y}
\end{aligned}
$$
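So the correlation coefficient takes values in $[-1, 1]$. Its sign gives the direction of the relationship: a positive value means that when one variable increases, the other tends to increase as well, while a negative value means the other tends to decrease. Its magnitude gives the strength of the linear relationship.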
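As a rough illustration, the correlation coefficient can be computed by dividing the sample covariance by the two sample standard deviations. The function names and data below are hypothetical, and the sketch assumes neither variable is constant (otherwise the denominator is zero).

```python
import math


def sample_cov(x, y):
    """Sample covariance with the n - 1 denominator."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def sample_corr(x, y):
    """Correlation: covariance divided by the product of standard deviations."""
    s_x = math.sqrt(sample_cov(x, x))
    s_y = math.sqrt(sample_cov(y, y))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sample_cov(x, y) / (s_x * s_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r_xy = sample_corr(x, y)                         # positive: y tends to rise with x
r_neg = sample_corr(x, [-2.0 * xi for xi in x])  # exact decreasing line

print(round(r_xy, 4))   # 0.7746
print(round(r_neg, 4))  # -1.0
```

The second call shows the boundary case: an exact decreasing linear relationship gives a correlation of $-1$, up to floating-point rounding.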
Data-matrix representation of covariance and correlation
$$
X = \begin{bmatrix}
x_{11} & \cdots & x_{1n} \\
\vdots & \ddots & \vdots \\
x_{m1} & \cdots & x_{mn}
\end{bmatrix}
= \begin{bmatrix} \mathbf{x}_1 & \cdots & \mathbf{x}_n \end{bmatrix}
$$

order of $X = m \times n$
We call each row an item (or subject) and each column a variable.
Now we can calculate the sample mean of the $j$th variable:

$$
\bar{x}_j = \frac{1}{m}\sum_{i=1}^{m} x_{ij}
$$

Similarly, the row mean is

$$
\bar{x}_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij}
$$
We then can define the covariance matrix:
$$
\begin{aligned}
S = \frac{1}{m}
\begin{bmatrix}
(\mathbf{x}_1 - \bar{x}_1)^\top \\
\vdots \\
(\mathbf{x}_n - \bar{x}_n)^\top
\end{bmatrix}
\begin{bmatrix}
\mathbf{x}_1 - \bar{x}_1 & \cdots & \mathbf{x}_n - \bar{x}_n
\end{bmatrix}
&= \begin{bmatrix}
s_{1}^2 & \cdots & s_{1n} \\
\vdots & \ddots & \vdots \\
s_{n1} & \cdots & s_{n}^2
\end{bmatrix} \\
\text{where } s_j^2 &= \frac{1}{m}\sum_{i=1}^{m}(x_{ij} - \bar{x}_j)^2 \hspace{1cm} \text{variance of the } j\text{th variable} \\
s_{jk} &= \frac{1}{m}\sum_{i=1}^{m}(x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k) \hspace{1cm} \text{covariance between the } j\text{th and } k\text{th variables} \\
\bar{x}_j &= \frac{1}{m}\sum_{i=1}^{m}x_{ij} \hspace{1cm} \text{mean of the } j\text{th variable}
\end{aligned}
$$

Here $\mathbf{x}_j - \bar{x}_j$ denotes the $j$th column with its mean subtracted from every entry. Note that this definition normalizes by $m$, whereas the sample formulas earlier divide by $n - 1$.
We can see that the covariance matrix is an $n \times n$ symmetric matrix, since $s_{jk} = s_{kj}$.
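The construction can be sketched in plain Python for a small data matrix stored as a list of rows. The function names and the $4 \times 3$ example are invented for illustration, and the sketch follows the $1/m$ normalization used in the definition of $S$.

```python
def column_means(X):
    """Mean of each column (variable) of an m x n data matrix."""
    m = len(X)
    return [sum(row[j] for row in X) / m for j in range(len(X[0]))]


def covariance_matrix(X):
    """n x n covariance matrix of the columns of X, normalized by m."""
    m, n = len(X), len(X[0])
    means = column_means(X)
    # Center every column by subtracting its mean from each entry.
    centered = [[row[j] - means[j] for j in range(n)] for row in X]
    # Entry (j, k) is the averaged cross product of centered columns j and k.
    return [
        [sum(centered[i][j] * centered[i][k] for i in range(m)) / m
         for k in range(n)]
        for j in range(n)
    ]


# 4 items (rows) measured on 3 variables (columns); toy numbers only.
X = [
    [2.0, 1.0, 5.0],
    [4.0, 3.0, 3.0],
    [6.0, 5.0, 1.0],
    [8.0, 7.0, 7.0],
]

S = covariance_matrix(X)
for row in S:
    print(row)
# [5.0, 5.0, 1.0]
# [5.0, 5.0, 1.0]
# [1.0, 1.0, 5.0]
```

The diagonal holds the variances $s_j^2$ and the off-diagonal entries hold the covariances $s_{jk}$; the printed matrix equals its own transpose.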
Then we can define the correlation matrix:

$$
\begin{aligned}
R &= \frac{1}{m}
\begin{bmatrix}
(\mathbf{x}_1 - \bar{x}_1)^\top / s_1 \\
\vdots \\
(\mathbf{x}_n - \bar{x}_n)^\top / s_n
\end{bmatrix}
\begin{bmatrix}
(\mathbf{x}_1 - \bar{x}_1) / s_1 & \cdots & (\mathbf{x}_n - \bar{x}_n) / s_n
\end{bmatrix} \\
&= \begin{bmatrix}
1 & r_{12} & \cdots & r_{1n} \\
\vdots & \vdots & \ddots & \vdots \\
r_{n1} & \cdots & \cdots & 1
\end{bmatrix},
\qquad r_{jk} = \frac{s_{jk}}{s_j s_k}
\end{aligned}
$$
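A minimal sketch of the same idea in plain Python: standardize each column, then average the cross products of the standardized columns. The function name and data are hypothetical, the standard deviations use the $1/m$ normalization, and every column is assumed to be non-constant.

```python
import math


def correlation_matrix(X):
    """n x n correlation matrix of the columns of an m x n data matrix."""
    m, n = len(X), len(X[0])
    means = [sum(row[j] for row in X) / m for j in range(n)]
    # Standard deviation of each column, using the 1/m normalization.
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / m)
        for j in range(n)
    ]
    if any(s == 0 for s in stds):
        raise ValueError("a constant column has no defined correlation")
    # Standardized data: each column centered and divided by its std.
    Z = [[(row[j] - means[j]) / stds[j] for j in range(n)] for row in X]
    return [
        [sum(Z[i][j] * Z[i][k] for i in range(m)) / m for k in range(n)]
        for j in range(n)
    ]


# Toy data: column 2 is column 1 shifted by a constant, so r_12 = 1.
X = [
    [2.0, 1.0, 5.0],
    [4.0, 3.0, 3.0],
    [6.0, 5.0, 1.0],
    [8.0, 7.0, 7.0],
]

R = correlation_matrix(X)
for row in R:
    print([round(v, 4) for v in row])
```

Up to floating-point rounding, the diagonal entries are 1 and each off-diagonal entry equals $s_{jk} / (s_j s_k)$.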
Covariance versus Correlation
- Covariance has units equal to the product of the units of the two variables, whereas correlation is dimensionless.
- Covariance can take any value in $(-\infty, +\infty)$, whereas correlation lies in $[-1, 1]$.
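The unit dependence can be checked numerically. In the sketch below, a made-up variable is converted from meters to centimeters: the covariance is multiplied by 100, while the correlation stays the same. The helper names and data are illustrative assumptions.

```python
import math


def sample_cov(x, y):
    """Sample covariance with the n - 1 denominator."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def sample_corr(x, y):
    """Sample correlation coefficient."""
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))


x_m = [1.0, 2.0, 3.0, 4.0, 5.0]        # hypothetical lengths in meters
y = [2.0, 4.0, 5.0, 4.0, 5.0]          # a second variable, units unchanged
x_cm = [100.0 * v for v in x_m]        # the same lengths in centimeters

cov_m, cov_cm = sample_cov(x_m, y), sample_cov(x_cm, y)
corr_m, corr_cm = sample_corr(x_m, y), sample_corr(x_cm, y)

print(cov_m, cov_cm)                        # covariance scales with the unit change
print(round(corr_m, 4), round(corr_cm, 4))  # correlation does not
```

Because rescaling $x$ multiplies both the covariance and $s_x$ by the same factor, the factor cancels in the ratio that defines the correlation.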