

For $N$ samples of a variate having a distribution with known Mean $\mu$, the ``population variance'' (usually called ``variance'' for short, although the word ``population'' should be added when needed to distinguish it from the Sample Variance) is defined by

$\displaystyle \mathop{\rm var}\nolimits (x)$ $\textstyle \equiv$ $\displaystyle {1\over N} \sum(x-\mu)^2 = \left\langle{x^2-2\mu x+\mu^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle \left\langle{x^2}\right\rangle{} -\left\langle{2\mu x}\right\rangle{}+\left\langle{\mu^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle \left\langle{x^2}\right\rangle{}-2\mu\left\langle{x}\right\rangle{}+\mu^2,$ (1)

where
\begin{displaymath}
\left\langle{x}\right\rangle{}\equiv {1\over N} \sum_{i=1}^N x_i.
\end{displaymath} (2)

But since $\left\langle{x}\right\rangle{}$ is an Unbiased Estimator for the Mean
\begin{displaymath}
\mu \equiv \left\langle{x}\right\rangle{},
\end{displaymath} (3)

it follows that the variance
\begin{displaymath}
\sigma^2\equiv\mathop{\rm var}\nolimits (x) = \left\langle{x^2}\right\rangle{}-\mu^2.
\end{displaymath} (4)

The population Standard Deviation is then defined as
\begin{displaymath}
\sigma \equiv \sqrt{\mathop{\rm var}\nolimits (x)} =\sqrt{\left\langle{x^2}\right\rangle{}-\mu^2}.
\end{displaymath} (5)
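
As a minimal numerical sketch of (1), (4), and (5), assuming the samples are held in a Python list (the function names and the toy data are illustrative, not part of the text), the definition and the shortcut $\left\langle{x^2}\right\rangle{}-\mu^2$ can be compared directly:

```python
import math

def mean(xs):
    """Arithmetic mean <x> of a non-empty sequence, as in (2)."""
    return sum(xs) / len(xs)

def population_variance(xs):
    """Population variance from the definition (1): <(x - mu)^2>."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def population_variance_shortcut(xs):
    """Population variance from (4): <x^2> - mu^2."""
    mu = mean(xs)
    return mean([x * x for x in xs]) - mu ** 2

def population_std(xs):
    """Population standard deviation (5)."""
    return math.sqrt(population_variance(xs))

# Toy data chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(population_variance(data))           # definition (1)
print(population_variance_shortcut(data))  # shortcut (4)
print(population_std(data))                # square root (5)
```

The two variance routines agree algebraically. In floating point, the shortcut subtracts two nearly equal numbers when the mean is large compared with the spread, so the definition is usually the safer form.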

A useful identity involving the variance, valid when $f(x)$ and $g(x)$ are uncorrelated, is
\begin{displaymath}
\mathop{\rm var}\nolimits (f(x)+g(x)) = \mathop{\rm var}\nolimits (f(x))+\mathop{\rm var}\nolimits (g(x)).
\end{displaymath} (6)

For constants $a$ and $b$,

$\displaystyle \mathop{\rm var}\nolimits (ax+b)$ $\textstyle =$ $\displaystyle \left\langle{[(ax+b)-\left\langle{ax+b}\right\rangle{} ]^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle \left\langle{(ax+b-a\left\langle{x}\right\rangle{} -b)^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle \left\langle{(ax-a\mu)^2}\right\rangle{} = \left\langle{a^2(x-\mu)^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle a^2\left\langle{(x-\mu)^2}\right\rangle{} = a^2\mathop{\rm var}\nolimits (x)$ (7)
$\displaystyle \mathop{\rm var}\nolimits (b)$ $\textstyle =$ $\displaystyle 0.$ (8)
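
The scaling rule (7) and the constant rule (8) can be checked with exact rational arithmetic. This is only a sketch: the helper `pvar`, the data, and the constants `a` and `b` are assumptions made for illustration.

```python
from fractions import Fraction

def pvar(xs):
    """Population variance <(x - <x>)^2> of a non-empty sequence."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

# Illustrative data and constants (not from the text).
xs = [Fraction(v) for v in (1, 2, 4, 7)]
a, b = Fraction(3), Fraction(-5)

var_x = pvar(xs)
var_linear = pvar([a * x + b for x in xs])   # var(ax + b)
var_constant = pvar([b] * len(xs))           # var(b)

print(var_x, var_linear, a ** 2 * var_x, var_constant)
```

Using `Fraction` avoids rounding, so the comparison between `var_linear` and `a ** 2 * var_x` is exact.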

If the population Mean is not known, using the sample mean $\bar x$ instead of the population mean $\mu$ to compute

\begin{displaymath}
s^2\equiv \hat\sigma^2_N \equiv {1\over N} \sum_{i=1}^N (x_i-\bar x)^2
\end{displaymath} (9)

gives a Biased Estimator of the population variance. In such cases, it is appropriate to use a Student's t-Distribution instead of a Gaussian Distribution. However, it turns out (as discussed below) that an Unbiased Estimator for the population variance is given by
\begin{displaymath}
s'^2\equiv \hat\sigma'^2_{N}\equiv {1\over N-1} \sum_{i=1}^N (x_i-\bar x)^2.
\end{displaymath} (10)
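
The bias of (9) and the unbiasedness of (10) can be illustrated exactly on a small example. The sketch below assumes a toy population of three equally likely values and enumerates every sample of size $N$ drawn with replacement; the function names and the population are hypothetical choices, not part of the text.

```python
from fractions import Fraction
from itertools import product

def biased_var(sample):
    """s^2 of (9): sum of squared deviations divided by N."""
    n = len(sample)
    xbar = sum(sample) / n
    return sum((x - xbar) ** 2 for x in sample) / n

def unbiased_var(sample):
    """s'^2 of (10): sum of squared deviations divided by N - 1."""
    n = len(sample)
    xbar = sum(sample) / n
    return sum((x - xbar) ** 2 for x in sample) / (n - 1)

def expectation(stat, values, n):
    """Exact expected value of stat over all equally likely samples of
    size n drawn with replacement from values."""
    samples = list(product(values, repeat=n))
    return sum(stat(s) for s in samples) / len(samples)

# Toy population (an assumption for illustration).
values = [Fraction(v) for v in (0, 1, 3)]
N = 3
mu = sum(values) / len(values)
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)

print(expectation(biased_var, values, N), Fraction(N - 1, N) * sigma2)
print(expectation(unbiased_var, values, N), sigma2)
```

Each printed pair should match: the first shows that (9) underestimates the population variance by the factor $(N-1)/N$, the second that (10) does not.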

The Mean and Variance of the sample variance ${s_N}^2$ for a distribution with population mean $\mu$ and Variance $\sigma^2$ are

$\displaystyle \mu_{{s_N}^2}$ $\textstyle =$ $\displaystyle {N-1\over N} \sigma^2$ (11)
$\displaystyle {\sigma_{{s_N}^2}}^2$ $\textstyle =$ $\displaystyle {N-1\over N^3} [(N-1)\mu_4-(N-3){\mu_2}^2],$ (12)

where $\mu_2=\sigma^2$ and $\mu_4$ are the second and fourth central Moments of the population. For a Gaussian parent distribution, the quantity $N{s_N}^2/\sigma^2$ has a Chi-Squared Distribution with $N-1$ degrees of freedom.

For multiple variables, the variance is given using the definition of Covariance,

$\displaystyle \mathop{\rm var}\nolimits \left({ \sum_{i=1}^n x_i}\right)$ $\textstyle =$ $\displaystyle \mathop{\rm cov}\nolimits \left({\,\sum_{i=1}^n x_i, \sum_{j=1}^n x_j}\right)$  
  $\textstyle =$ $\displaystyle \sum_{i=1}^n \sum_{j=1}^n \mathop{\rm cov}\nolimits (x_i,x_j)$  
  $\textstyle =$ $\displaystyle \sum_{i=1}^n \sum_{j=1 \atop j=i}^n \mathop{\rm cov}\nolimits (x_i,x_j) + \sum_{i=1}^n \sum_{j=1 \atop j\not= i}^n \mathop{\rm cov}\nolimits (x_i,x_j)$  
  $\textstyle =$ $\displaystyle \sum_{i=1}^n \mathop{\rm cov}\nolimits (x_i,x_i) + \sum_{i=1}^n \sum_{j=1\atop j\not = i}^n \mathop{\rm cov}\nolimits (x_i,x_j)$  
  $\textstyle =$ $\displaystyle \sum_{i=1}^n \mathop{\rm var}\nolimits (x_i) + 2\sum_{i=1}^n \sum_{j=i+1}^n \mathop{\rm cov}\nolimits (x_i,x_j).$ (13)

A linear sum has a similar form:
$\mathop{\rm var}\nolimits \left({\,\sum_{i=1}^n a_ix_i}\right)= \mathop{\rm cov}\nolimits \left({\sum_{i=1}^n a_ix_i, \sum_{j=1}^n a_jx_j}\right)$
$ = \sum_{i=1}^n \sum_{j=1}^n a_ia_j\mathop{\rm cov}\nolimits (x_i,x_j)$
$ = \sum_{i=1}^n {a_i}^2\mathop{\rm var}\nolimits (x_i) + 2\sum_{i=1}^n \sum_{j=i+1}^n a_ia_j\mathop{\rm cov}\nolimits (x_i,x_j).\quad$ (14)
These equations can be expressed using the Covariance Matrix.
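
The linear-sum identity (14) can be checked on a small data set. The following sketch assumes three variables observed jointly four times; the data, the weights `a`, and the helper names are illustrative only.

```python
from fractions import Fraction

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance <(x - <x>)(y - <y>)> of paired observations."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Three variables, four joint observations each (toy data, an assumption).
columns = [
    [Fraction(v) for v in (1, 2, 3, 4)],
    [Fraction(v) for v in (2, 1, 4, 3)],
    [Fraction(v) for v in (0, 5, 1, 2)],
]
a = [Fraction(2), Fraction(-1), Fraction(3)]
n = len(columns)
n_obs = len(columns[0])

# Left-hand side: variance of the combined variable sum_i a_i x_i.
combined = [sum(a[i] * columns[i][k] for i in range(n)) for k in range(n_obs)]
lhs = cov(combined, combined)

# Right-hand side of (14): weighted variances plus twice the
# weighted covariances over pairs with j > i.
rhs = sum(a[i] ** 2 * cov(columns[i], columns[i]) for i in range(n))
rhs += 2 * sum(a[i] * a[j] * cov(columns[i], columns[j])
               for i in range(n) for j in range(i + 1, n))

print(lhs, rhs)
```

Setting every weight to 1 reduces the same check to the unweighted sum of variances and covariances.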

To estimate the population Variance from a sample of $N$ elements with a priori unknown Mean (i.e., the Mean is estimated from the sample itself), we need an Unbiased Estimator for $\sigma^2$. This is given by the k-Statistic $k_2$, where

\begin{displaymath}
k_2 = {N\over N-1} m_2
\end{displaymath} (15)

and $m_2\equiv s^2$ is the Sample Variance
\begin{displaymath}
s^2\equiv {1\over N}\sum_{i=1}^N (x_i-\bar x)^2.
\end{displaymath} (16)

Note that some authors prefer the definition
\begin{displaymath}
s'^2\equiv {1\over N-1}\sum_{i=1}^N (x_i-\bar x)^2,
\end{displaymath} (17)

since this makes the sample variance an Unbiased Estimator for the population variance.

When computing numerically, the Mean must be computed before $s^2$ can be determined, which requires storing the set of sample values. However, it is possible to calculate $s'^2$ using a recursion relationship involving only the last sample, as follows. Here, use $\mu_j$ to denote $\mu$ calculated from the first $j$ samples (not the $j$th Moment),

\begin{displaymath}
\mu_j\equiv {\,\sum_{i=1}^j x_i\over j},
\end{displaymath} (18)

and ${s_j}^2$ denotes the value for the sample variance $s'^2$ calculated from the first $j$ samples. The first few values calculated for the Mean are
$\displaystyle \mu_1$ $\textstyle =$ $\displaystyle x_1$ (19)
$\displaystyle \mu_2$ $\textstyle =$ $\displaystyle {1\cdot \mu_1+x_2\over 2}$ (20)
$\displaystyle \mu_3$ $\textstyle =$ $\displaystyle {2\mu_2+x_3\over 3}.$ (21)

Therefore, for $j=2$, 3 it is true that
\begin{displaymath}
\mu_j={(j-1)\mu_{j-1}+x_j\over j}.
\end{displaymath} (22)

Therefore, by induction,
$\displaystyle \mu_{j+1}$ $\textstyle =$ $\displaystyle {[(j+1)-1]\mu_{(j+1)-1}+x_{j+1}\over j+1}$  
  $\textstyle =$ $\displaystyle {j\mu_j+x_{j+1}\over j+1}$ (23)
$\displaystyle \mu_{j+1}(j+1)$ $\textstyle =$ $\displaystyle (j+1)\mu_j+(x_{j+1}-\mu_j)$ (24)
$\displaystyle \mu_{j+1}$ $\textstyle =$ $\displaystyle \mu_j+{x_{j+1}-\mu_j\over j+1},$ (25)

and
\begin{displaymath}
{s_j}^2 = {\sum_{i=1}^j (x_i-\mu_j)^2\over j-1}
\end{displaymath} (26)

for $j\geq 2$, so

$\displaystyle j{s_{j+1}}^2$ $\textstyle =$ $\displaystyle j {\sum_{i=1}^{j+1} (x_i-\mu_{j+1})^2\over j} = \sum_{i=1}^{j+1} (x_i-\mu_{j+1})^2$  
  $\textstyle =$ $\displaystyle \sum_{i=1}^{j+1} [(x_i-\mu_j)+(\mu_j-\mu_{j+1})]^2$  
  $\textstyle =$ $\displaystyle \sum_{i=1}^{j+1} (x_i-\mu_j)^2+\sum_{i=1}^{j+1} (\mu_j-\mu_{j+1})^2+2\sum_{i=1}^{j+1}(x_i-\mu_j)(\mu_j-\mu_{j+1}).$ (27)

Working on the first term,
$\displaystyle \sum_{i=1}^{j+1} (x_i-\mu_j)^2$ $\textstyle =$ $\displaystyle \sum_{i=1}^j (x_i-\mu_j)^2+(x_{j+1}-\mu_j)^2$  
  $\textstyle =$ $\displaystyle (j-1){s_j}^2+(x_{j+1}-\mu_j)^2.$ (28)

Use (24) to write
\begin{displaymath}
x_{j+1}-\mu_j = (j+1)(\mu_{j+1}-\mu_j),
\end{displaymath} (29)

so
\begin{displaymath}
\sum_{i=1}^{j+1} (x_i-\mu_j)^2=(j-1){s_j}^2+(j+1)^2(\mu_{j+1}-\mu_j)^2.
\end{displaymath} (30)

Now work on the second term in (27),
\begin{displaymath}
\sum_{i=1}^{j+1} (\mu_j-\mu_{j+1})^2 = (j+1)(\mu_j-\mu_{j+1})^2.
\end{displaymath} (31)

Considering the third term in (27),
$\sum_{i=1}^{j+1} (x_i-\mu_j)(\mu_j-\mu_{j+1}) = (\mu_j-\mu_{j+1}) \sum_{i=1}^{j+1} (x_i-\mu_j)$
$ = (\mu_j-\mu_{j+1}) \left[{\sum_{i=1}^j (x_i-\mu_j)+(x_{j+1}-\mu_j)}\right]$
$ = (\mu_j-\mu_{j+1})\left({x_{j+1}-\mu_j-j\mu_j +\sum_{i=1}^j x_i}\right).\quad$ (32)
But
\begin{displaymath}
\sum_{i=1}^j x_i=j\mu_j,
\end{displaymath} (33)

so, using (24),
$\sum_{i=1}^{j+1} (x_i-\mu_j)(\mu_j-\mu_{j+1}) = (\mu_j-\mu_{j+1})(x_{j+1}-\mu_j)$
$ = (\mu_j-\mu_{j+1}) (j+1)(\mu_{j+1}-\mu_j)$
$ = -(j+1)(\mu_j-\mu_{j+1})^2.\quad$ (34)
Plugging (30), (31), and (34) into (27),
$\displaystyle j{s_{j+1}}^2$ $\textstyle =$ $\displaystyle [(j-1){s_j}^2+(j+1)^2(\mu_{j+1}-\mu_j)^2]$  
  $\textstyle \phantom{=}$ $\displaystyle +[(j+1)(\mu_j-\mu_{j+1})^2] +2[-(j+1)(\mu_j-\mu_{j+1})^2]$  
  $\textstyle =$ $\displaystyle (j-1){s_j}^2+(j+1)^2(\mu_{j+1}-\mu_j)^2$  
  $\textstyle \phantom{=}$ $\displaystyle -(j+1)(\mu_j-\mu_{j+1})^2$  
  $\textstyle =$ $\displaystyle (j-1){s_j}^2+(j+1)[(j+1)-1](\mu_{j+1}-\mu_j)^2$  
  $\textstyle =$ $\displaystyle (j-1){s_j}^2+j(j+1)(\mu_{j+1}-\mu_j)^2,$ (35)

so
\begin{displaymath}
{s_{j+1}}^2 = \left({1-{1\over j}}\right){s_j}^2+(j+1)(\mu_{j+1}-\mu_j)^2.
\end{displaymath} (36)
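
The recurrences (25) and (36) translate into a single-pass routine that never stores the samples. The sketch below is one possible arrangement, not a reference implementation; the generator name `running_mean_variance` and the test data are assumptions, and Python's `statistics.variance` (which divides by $N-1$) is used only as a cross-check.

```python
import statistics

def running_mean_variance(samples):
    """Yield (j, mu_j, s_j^2) after each sample, using (25) for the mean
    and (36) for the N-1 sample variance. s_1^2 is undefined and is
    reported as None."""
    mu = 0.0
    s2 = 0.0  # placeholder for s_1^2; its coefficient (1 - 1/1) in (36) is zero
    j = 0
    for x in samples:
        j += 1
        if j == 1:
            mu = float(x)
            yield j, mu, None
            continue
        # (25) with j -> j-1: mu_j = mu_{j-1} + (x_j - mu_{j-1}) / j
        new_mu = mu + (x - mu) / j
        # (36) with j -> j-1:
        # s_j^2 = (1 - 1/(j-1)) s_{j-1}^2 + j (mu_j - mu_{j-1})^2
        s2 = (1.0 - 1.0 / (j - 1)) * s2 + j * (new_mu - mu) ** 2
        mu = new_mu
        yield j, mu, s2

# Toy data chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
for state in running_mean_variance(data):
    last = state

print(last)                       # (count, running mean, running s'^2)
print(statistics.variance(data))  # two-pass value for comparison
```

Only the current count, mean, and variance are kept between samples, which is the point of the recursion: the data can be consumed as a stream.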

To find the variance of $s^2$ itself, remember that

\begin{displaymath}
\mathop{\rm var}\nolimits (s^2)\equiv\left\langle{s^4}\right\rangle{}-\left\langle{s^2}\right\rangle{}^2,
\end{displaymath} (37)

and
\begin{displaymath}
\left\langle{s^2}\right\rangle{}={N-1\over N}\mu_2.
\end{displaymath} (38)

Now find $\left\langle{s^4}\right\rangle{}$.

$\displaystyle \left\langle{s^4}\right\rangle{}$ $\textstyle =$ $\displaystyle \left\langle{(s^2)^2}\right\rangle{} = \left\langle{(\left\langle{x^2}\right\rangle{}-\left\langle{x}\right\rangle{}^2)^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle \left\langle{\left[{{1\over N}\sum {x_i}^2-\left({{1\over N} \sum x_i}\right)^2}\right]^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle {1\over N^2} \left\langle{\left({\sum {x_i}^2}\right)^2}\right\rangle{}-{2\over N^3} \left\langle{\sum {x_i}^2\left({\sum x_j}\right)^2}\right\rangle{}+{1\over N^4} \left\langle{\left({\sum x_i}\right)^4}\right\rangle{}.$ (39)

Working on the first term of (39),
$\displaystyle \left\langle{\left({\,\sum {x_i}^2}\right)^2}\right\rangle{}$ $\textstyle =$ $\displaystyle \left\langle{\sum {x_i}^4+\sum {x_i}^2{x_j}^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle \left\langle{\sum {x_i}^4}\right\rangle{} +\left\langle{\sum{x_i}^2{x_j}^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle N\left\langle{{x_i}^4}\right\rangle{} +N(N-1)\left\langle{{x_i}^2}\right\rangle{}\left\langle{{x_j}^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle N\mu'_4+N(N-1){\mu'_2}^2.$ (40)

The second term of (39) is known from the k-Statistic,
\begin{displaymath}
\left\langle{\sum {x_i}^2\left({\sum x_j}\right)^2}\right\rangle{} = N\mu'_4+N(N-1){\mu'_2}^2,
\end{displaymath} (41)

as is the third term,
$\displaystyle \left\langle{\left({\sum {x_i}}\right)^4}\right\rangle{}$ $\textstyle =$ $\displaystyle N\left\langle{{x_i}^4}\right\rangle{} +3N(N-1)\left\langle{{x_i}^2}\right\rangle{}\left\langle{{x_j}^2}\right\rangle{}$  
  $\textstyle =$ $\displaystyle N\mu'_4+3N(N-1){\mu'_2}^2.$ (42)

Both results take the origin at the population Mean, so that $\mu'_1=0$; this loses no generality because $s^2$ is unchanged when all the $x_i$ are shifted by a constant.

Combining (39)-(42) gives

$\displaystyle \left\langle{s^4}\right\rangle{}$ $\textstyle =$ $\displaystyle {1\over N^2} [N\mu'_4+N(N-1){\mu'_2}^2]-{2\over N^3} [N\mu'_4+N(N-1){\mu'_2}^2]$  
  $\textstyle \phantom{=}$ $\displaystyle \mathop{+} {1\over N^4} [N\mu'_4+3N(N-1){\mu'_2}^2]$  
  $\textstyle =$ $\displaystyle \left({{1\over N}-{2\over N^2}+{1\over N^3}}\right)\mu'_4$  
  $\textstyle \phantom{=}$ $\displaystyle +\left[{{N-1\over N}-{2(N-1)\over N^2}+{3(N-1)\over N^3}}\right]{\mu'_2}^2$  
  $\textstyle =$ $\displaystyle \left({N^2-2N+1\over N^3}\right)\mu'_4+{(N-1)(N^2-2N+3)\over N^3} {\mu'_2}^2$  
  $\textstyle =$ $\displaystyle {(N-1)[(N-1)\mu'_4+(N^2-2N+3){\mu'_2}^2]\over N^3},$ (43)

so plugging in (38) and (43) gives

$\displaystyle \mathop{\rm var}\nolimits (s^2)$ $\textstyle =$ $\displaystyle \left\langle{s^4}\right\rangle{}-\left\langle{s^2}\right\rangle{}^2$  
  $\textstyle =$ $\displaystyle {(N-1)[(N-1)\mu'_4+(N^2-2N+3){\mu'_2}^2]\over N^3}-{(N-1)^2N\over N^3}{\mu'_2}^2$  
  $\textstyle =$ $\displaystyle {N-1\over N^3} \{(N-1)\mu'_4+[(N^2-2N+3)-N(N-1)]{\mu'_2}^2\}$  
  $\textstyle =$ $\displaystyle {(N-1)[(N-1)\mu'_4-(N-3){\mu'_2}^2] \over N^3}.$ (44)
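
The result (44) can be checked exactly on a small case. The sketch below assumes a toy population of three equally likely values, enumerates every sample of size $N$ drawn with replacement, and compares the variance of $s^2$ with the formula evaluated using central moments; all names and data are hypothetical.

```python
from fractions import Fraction
from itertools import product

def biased_var(sample):
    """s^2: sum of squared deviations from the sample mean, divided by N."""
    n = len(sample)
    xbar = sum(sample) / n
    return sum((x - xbar) ** 2 for x in sample) / n

def central_moment(values, r):
    """r-th central moment of a population of equally likely values."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** r for v in values) / len(values)

def exact_var_of_s2(values, n):
    """var(s^2) = <s^4> - <s^2>^2 by enumerating all samples of size n."""
    samples = list(product(values, repeat=n))
    m1 = sum(biased_var(s) for s in samples) / len(samples)
    m2 = sum(biased_var(s) ** 2 for s in samples) / len(samples)
    return m2 - m1 ** 2

def formula_44(values, n):
    """Right-hand side of (44), with moments taken about the mean."""
    mu2 = central_moment(values, 2)
    mu4 = central_moment(values, 4)
    return Fraction(n - 1, n ** 3) * ((n - 1) * mu4 - (n - 3) * mu2 ** 2)

# Toy population (an assumption for illustration).
values = [Fraction(v) for v in (0, 1, 3)]
for n in (2, 3, 4):
    print(n, exact_var_of_s2(values, n), formula_44(values, n))
```

The population here is deliberately asymmetric, which shows that the formula does not depend on the third moment.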

Student calculated the Skewness and Kurtosis of the distribution of $s^2$ as
$\displaystyle \gamma_1$ $\textstyle =$ $\displaystyle \sqrt{8\over N-1}$ (45)
$\displaystyle \gamma_2$ $\textstyle =$ $\displaystyle {12\over N-1}$ (46)

and conjectured that the true distribution is a Pearson Type III Distribution
\begin{displaymath}
f(s^2) = C(s^2)^{(N-3)/2}e^{-Ns^2/2\sigma^2},
\end{displaymath} (47)

where
$\displaystyle \sigma^2$ $\textstyle =$ $\displaystyle {N\left\langle{s^2}\right\rangle{}\over N-1}$ (48)
$\displaystyle C$ $\textstyle =$ $\displaystyle {\left({N\over 2\sigma^2}\right)^{(N-1)/2}\over \Gamma\left({{\textstyle{N-1\over 2}}}\right)}.$ (49)

This was proven by R. A. Fisher.

The distribution of $s$ itself is given by

\begin{displaymath}
f(s)=2 {\left({N\over 2\sigma^2}\right)^{(N-1)/2}\over \Gamma\left({{\textstyle{N-1\over 2}}}\right)} e^{-Ns^2/2\sigma^2}s^{N-2},
\end{displaymath} (50)

with Mean
\begin{displaymath}
\left\langle{s}\right\rangle{}=\sqrt{2\over N} {\Gamma\left({{\textstyle{N\over 2}}}\right)\over \Gamma\left({{\textstyle{N-1\over 2}}}\right)} \sigma\equiv b(N) \sigma,
\end{displaymath} (51)

where
\begin{displaymath}
b(N)\equiv\sqrt{2\over N} {\Gamma\left({{\textstyle{N\over 2}}}\right)\over \Gamma\left({{\textstyle{N-1\over 2}}}\right)}.
\end{displaymath} (52)

The Moments about the origin are given by
\begin{displaymath}
\nu_r = \left({2\over N}\right)^{r/2} {\Gamma\left({{\textstyle{N-1+r\over 2}}}\right)\over \Gamma\left({{\textstyle{N-1\over 2}}}\right)} \sigma^r,
\end{displaymath} (53)

and the variance is
$\displaystyle \mathop{\rm var}\nolimits (s)$ $\textstyle =$ $\displaystyle \nu_2-{\nu_1}^2 = {N-1\over N} \sigma^2-[b(N)\sigma]^2$  
  $\textstyle =$ $\displaystyle {1\over N} \left[{N-1-{2\Gamma^2\left({{\textstyle{N\over 2}}}\right)\over \Gamma^2\left({{\textstyle{N-1\over 2}}}\right)}}\right]\sigma^2.$ (54)

An Unbiased Estimator of $\sigma$ is $s/b(N)$. Romanovsky showed that
\begin{displaymath}
b(N)=1-{3\over 4N}-{7\over 32N^2}-{9\over 128N^3}+\ldots.
\end{displaymath} (55)
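
The correction factor $b(N)$ of (52) is easy to evaluate numerically. The sketch below assumes $N\geq 2$ and uses `math.lgamma` so that the ratio of Gamma functions does not overflow for large $N$. The function names are illustrative, and the truncated series uses only the terms through $1/N^2$.

```python
import math

def b(n):
    """b(N) = sqrt(2/N) * Gamma(N/2) / Gamma((N-1)/2), for N >= 2.
    The Gamma ratio is formed from log-gamma values for stability."""
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / n) * math.exp(log_ratio)

def b_series(n):
    """Large-N expansion truncated after the 1/N^2 term:
    1 - 3/(4N) - 7/(32 N^2)."""
    return 1 - 3 / (4 * n) - 7 / (32 * n ** 2)

def unbiased_sigma(s, n):
    """Rescale a sample standard deviation s (the N-divisor form) by
    1/b(N) to remove its bias as an estimator of sigma."""
    return s / b(n)

for n in (2, 5, 10, 100):
    print(n, b(n), b_series(n))
```

For $N=2$ the exact value is $1/\sqrt{\pi}$, and the truncated series approaches the exact factor as $N$ grows.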

See also Correlation (Statistical), Covariance, Covariance Matrix, k-Statistic, Mean, Sample Variance


Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; and Vetterling, W. T. ``Moments of a Distribution: Mean, Variance, Skewness, and So Forth.'' §14.1 in Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd ed. Cambridge, England: Cambridge University Press, pp. 604-609, 1992.


© 1996-9 Eric W. Weisstein