Covariance

Given sets of variates denoted $\{x_1\}$, ..., $\{x_n\}$, a quantity called the Covariance Matrix is defined by

$\displaystyle V_{ij}$	$\textstyle =$	$\displaystyle \mathop{\rm cov}\nolimits (x_i,x_j)$	(1)
	$\textstyle \equiv$	$\displaystyle \left\langle{(x_i-\mu_i)(x_j-\mu_j)}\right\rangle{}$	(2)
	$\textstyle =$	$\displaystyle \left\langle{x_ix_j}\right\rangle{}-\left\langle{x_i}\right\rangle{}\left\langle{x_j}\right\rangle{},$	(3)

where $\mu_i=\left\langle{x_i}\right\rangle{}$ and $\mu_j=\left\langle{x_j}\right\rangle{}$ are the Means of $x_i$ and $x_j$, respectively. An individual element $V_{ij}$ of the Covariance Matrix is called the covariance of the two variates $x_i$ and $x_j$, and provides a measure of how strongly correlated these variables are. In fact, the derived quantity

$\begin{displaymath} \mathop{\rm cor}\nolimits (x_i,x_j)\equiv {\mathop{\rm cov}\nolimits (x_i,x_j)\over\sigma_i\sigma_j}, \end{displaymath}$

(4)

where $\sigma_i$, $\sigma_j$ are the Standard Deviations, is called the Correlation of $x_i$ and $x_j$. Note that if $x_i$ and $x_j$ are taken from the same set of variates (say, $x$), then

$\begin{displaymath} \mathop{\rm cov}\nolimits (x,x) = \left\langle{x^2}\right\rangle{}-\left\langle{x}\right\rangle{}^2=\mathop{\rm var}\nolimits (x), \end{displaymath}$

(5)

giving the usual Variance $\mathop{\rm var}\nolimits (x)$. The covariance is also symmetric since

$\begin{displaymath} \mathop{\rm cov}\nolimits (x,y) = \mathop{\rm cov}\nolimits (y,x). \end{displaymath}$

(6)
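
As a rough numerical illustration of equations (1) to (6), the sketch below treats the expectation $\langle\cdot\rangle$ as a plain average over equally weighted observations (the population form, dividing by $N$ rather than $N-1$). The helper names `mean`, `cov`, `cov_centered`, `cor`, and `covariance_matrix`, as well as the sample data, are assumptions made for this example and do not come from the text above.

```python
from math import sqrt, isclose


def mean(values):
    """Plain average, standing in for the expectation <x>."""
    return sum(values) / len(values)


def cov(x, y):
    """Covariance via equation (3): <xy> - <x><y>."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return mean([a * b for a, b in zip(x, y)]) - mean(x) * mean(y)


def cov_centered(x, y):
    """Covariance via equation (2): <(x - mu_x)(y - mu_y)>."""
    mu_x, mu_y = mean(x), mean(y)
    return mean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])


def cor(x, y):
    """Correlation via equation (4); undefined if either variance is zero."""
    return cov(x, y) / (sqrt(cov(x, x)) * sqrt(cov(y, y)))


def covariance_matrix(variates):
    """Matrix V with V[i][j] = cov(x_i, x_j), as in equation (1)."""
    return [[cov(xi, xj) for xj in variates] for xi in variates]


# Hypothetical data: three sets of variates with four observations each.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 4.0, 6.0, 8.0]   # exactly 2 * x1
x3 = [4.0, 3.0, 2.0, 1.0]   # exactly 5 - x1

V = covariance_matrix([x1, x2, x3])

# Equation (5): the diagonal of V holds the variances.
variances = [V[i][i] for i in range(3)]
# Equation (6): V equals its own transpose.
symmetric = all(V[i][j] == V[j][i] for i in range(3) for j in range(3))
# Equations (2) and (3) agree up to floating-point rounding.
forms_agree = isclose(cov_centered(x1, x2), cov(x1, x2))
```

Because `x2` and `x3` are exact linear functions of `x1`, the correlations `cor(x1, x2)` and `cor(x1, x3)` come out at the extreme values $+1$ and $-1$ up to rounding.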

For two variables, the covariance is related to the Variance by

$\begin{displaymath} \mathop{\rm var}\nolimits (x+y) = \mathop{\rm var}\nolimits (x)+\mathop{\rm var}\nolimits (y)+2\mathop{\rm cov}\nolimits (x,y). \end{displaymath}$

(7)
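
The relation for $\mathop{\rm var}\nolimits (x+y)$ can be checked by expanding the variance of the sum directly. The intermediate steps below are a sketch that uses only the linearity of the expectation and the forms given in equations (3) and (5):

$\displaystyle \mathop{\rm var}\nolimits (x+y)$	$\textstyle =$	$\displaystyle \left\langle{(x+y)^2}\right\rangle{}-\left\langle{x+y}\right\rangle{}^2$
	$\textstyle =$	$\displaystyle \left\langle{x^2}\right\rangle{}+2\left\langle{xy}\right\rangle{}+\left\langle{y^2}\right\rangle{}-\left\langle{x}\right\rangle{}^2-2\left\langle{x}\right\rangle{}\left\langle{y}\right\rangle{}-\left\langle{y}\right\rangle{}^2$
	$\textstyle =$	$\displaystyle \left({\left\langle{x^2}\right\rangle{}-\left\langle{x}\right\rangle{}^2}\right)+\left({\left\langle{y^2}\right\rangle{}-\left\langle{y}\right\rangle{}^2}\right)+2\left({\left\langle{xy}\right\rangle{}-\left\langle{x}\right\rangle{}\left\langle{y}\right\rangle{}}\right)$
	$\textstyle =$	$\displaystyle \mathop{\rm var}\nolimits (x)+\mathop{\rm var}\nolimits (y)+2\mathop{\rm cov}\nolimits (x,y).$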

For two independent variates $x$ and $y$,

$\begin{displaymath} \mathop{\rm cov}\nolimits (x,y)=\left\langle{xy}\right\rangle{}-\mu_x\mu_y=\left\langle{x}\right\rangle{}\left\langle{y}\right\rangle{}-\mu_x\mu_y=0, \end{displaymath}$

(8)

so the covariance is zero. However, if the variables are correlated in some way, then their covariance will be Nonzero. In fact, if $\mathop{\rm cov}\nolimits (x,y) > 0$, then $y$ tends to increase as $x$ increases. If $\mathop{\rm cov}\nolimits (x,y) < 0$, then $y$ tends to decrease as $x$ increases.
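
A small numerical sketch of these three cases follows. It assumes the expectation is a plain average over equally likely observations; the helpers `mean` and `cov` and all of the data are hypothetical and chosen only for illustration. The independent case is built as a grid in which every pair of values is equally likely, so the joint distribution factorizes.

```python
def mean(values):
    """Plain average, standing in for the expectation <x>."""
    return sum(values) / len(values)


def cov(x, y):
    """Covariance as <xy> - <x><y>."""
    return mean([a * b for a, b in zip(x, y)]) - mean(x) * mean(y)


# Independent case: each x in {0, 1} is paired once with each y in {0, 1, 2},
# so <xy> = <x><y> and the covariance vanishes.
x_ind = [0, 0, 0, 1, 1, 1]
y_ind = [0, 1, 2, 0, 1, 2]
cov_independent = cov(x_ind, y_ind)

# Correlated cases: y_up tends to rise with x, y_down tends to fall with x.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 6]
y_down = [6, 4, 5, 4, 2]
cov_up = cov(x, y_up)
cov_down = cov(x, y_down)

print(cov_independent, cov_up > 0, cov_down < 0)
```

The sign of the result only indicates the direction of the overall trend; individual observations, such as the fourth entry of `y_up`, may still move against it.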

The covariance obeys the identity

$\displaystyle \mathop{\rm cov}\nolimits (x+z,y)$	$\textstyle =$	$\displaystyle \left\langle{(x+z)y}\right\rangle{}-\left\langle{x+z}\right\rangle{}\left\langle{y}\right\rangle{}$
	$\textstyle =$	$\displaystyle \left\langle{xy}\right\rangle{}+\langle zy\rangle -(\left\langle{x}\right\rangle{} +\left\langle{z}\right\rangle{})\left\langle{y}\right\rangle{}$
	$\textstyle =$	$\displaystyle \left\langle{xy}\right\rangle{}-\left\langle{x}\right\rangle{}\left\langle{y}\right\rangle{}+\left\langle{zy}\right\rangle{}-\left\langle{z}\right\rangle{}\left\langle{y}\right\rangle{}$
	$\textstyle =$	$\displaystyle \mathop{\rm cov}\nolimits (x,y)+\mathop{\rm cov}\nolimits (z,y).$	(9)

By induction, it therefore follows that

$\displaystyle \mathop{\rm cov}\nolimits \left({\sum_{i=1}^n x_i,y}\right)$	$\textstyle =$	$\displaystyle \sum_{i=1}^n \mathop{\rm cov}\nolimits (x_i,y)$	(10)
$\displaystyle \mathop{\rm cov}\nolimits \left({\sum_{i=1}^n x_i, \sum_{j=1}^m y_j}\right)$	$\textstyle =$	$\displaystyle \sum_{i=1}^n \mathop{\rm cov}\nolimits \left({x_i, \sum_{j=1}^m y_j}\right)$	(11)
	$\textstyle =$	$\displaystyle \sum_{i=1}^n \mathop{\rm cov}\nolimits \left({\sum_{j=1}^m y_j,x_i}\right)$	(12)
	$\textstyle =$	$\displaystyle \sum_{i=1}^n \sum_{j=1}^m \mathop{\rm cov}\nolimits (y_j,x_i)$	(13)
	$\textstyle =$	$\displaystyle \sum_{i=1}^n \sum_{j=1}^m \mathop{\rm cov}\nolimits (x_i, y_j).$	(14)
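
The additivity in equations (9) and (14) can be verified on concrete numbers. The sketch below assumes the expectation is a plain average over equally weighted observations and uses exact rational arithmetic so that both sides can be compared with `==`. The helpers `mean`, `cov`, and `elementwise_sum` and the integer data are assumptions introduced for this example.

```python
from fractions import Fraction


def mean(values):
    """Exact average of integer or Fraction data."""
    return sum(values, Fraction(0)) / len(values)


def cov(x, y):
    """Covariance as <xy> - <x><y>, computed exactly."""
    return mean([a * b for a, b in zip(x, y)]) - mean(x) * mean(y)


def elementwise_sum(variates):
    """Add several equal-length sets of variates observation by observation."""
    return [sum(observation) for observation in zip(*variates)]


# Hypothetical data: n = 3 sets x_i and m = 2 sets y_j, four observations each.
xs = [[1, 4, 2, 8], [3, 3, 5, 1], [0, 2, 7, 6]]
ys = [[2, 9, 4, 4], [5, 1, 0, 3]]

# Equation (9): cov(x + z, y) = cov(x, y) + cov(z, y).
pair_lhs = cov(elementwise_sum([xs[0], xs[1]]), ys[0])
pair_rhs = cov(xs[0], ys[0]) + cov(xs[1], ys[0])

# Equation (14): the covariance of two sums is the double sum of covariances.
lhs = cov(elementwise_sum(xs), elementwise_sum(ys))
rhs = sum(cov(xi, yj) for xi in xs for yj in ys)

print(pair_lhs == pair_rhs, lhs == rhs)
```

Using `Fraction` avoids floating-point rounding, so the two sides agree exactly rather than merely to within a tolerance.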