What is variance?
Let's start with a relatively simple conceptual explanation of variance. The key ideas are expected value and error. Suppose you randomly select a single score from a population of possible values, and suppose furthermore that the population of values can be described by a normal distribution. (There is actually no need to assume a normal distribution, but it makes the explanation relatively easy to follow.)
As you probably know, the normal distribution is centered around its mean, which is equal to the parameter μ. We call this parameter the population mean.
Now, we select a single random value from the population. Let's call this value X. Because we know something about the probability distribution of the population values, we are also in a position to specify an expected value for the score X. Let's use the symbol E(X) for this expected value. The value of E(X) turns out (and can be proven) to be equal to the parameter μ. (Conceptually, the expectation of a variable can be thought of as its long-run average.)
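To make the long-run-average idea concrete, here is a minimal Python sketch (the values μ = 100 and σ = 15, and the seed, are just illustrative assumptions, not anything from the text): as the number of draws grows, their average settles near E(X) = μ.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
mu, sigma = 100.0, 15.0  # assumed population parameters, purely for illustration

# The long-run average of independent draws approaches E(X) = mu.
for n in (10, 1_000, 100_000):
    draws = rng.normal(mu, sigma, size=n)
    print(f"average of {n:>7} draws: {draws.mean():.3f}")
```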
Of course, the actual value obtained will in general not be equal to the expected value, at least not if we sample from continuous distributions like the normal distribution. Let's call the difference between the value X and its expectation E(X) = μ an error, deviation, or residual: e = X - E(X) = X - μ.
We would like to have some indication of the extent to which X differs from its expectation, especially when E(X) is estimated on the basis of a statistical model. Thus, we would like to have something like E(X - E(X)) = E(X - μ). The variance gives us such an indication, but does so in squared units, because working with the expected error itself always leads to the value 0: E(X - μ) = E(X) - E(μ) = μ - μ = 0, since the expectation of the constant μ is simply μ. (This simply says that on average the error is zero; the standard explanation is that negative and positive errors cancel out in the long run.)
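A quick sketch of this cancellation (same assumed μ and σ as above): averaged over many draws, the raw errors e = X - μ hover around zero.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
mu, sigma = 100.0, 15.0  # assumed population parameters

draws = rng.normal(mu, sigma, size=100_000)
errors = draws - mu    # e = X - mu
print(errors.mean())   # close to 0: positive and negative errors cancel
```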
The variance is the expected squared deviation (mean squared error) between X and its expectation: E((X - E(X))²) = E((X - μ)²), and the symbol for the population value is σ².
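Squaring the errors before averaging is what rescues the idea. Continuing the sketch (same assumed parameters), the mean squared error of many draws lands near σ² = 15² = 225.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
mu, sigma = 100.0, 15.0  # assumed population parameters

draws = rng.normal(mu, sigma, size=100_000)
squared_errors = (draws - mu) ** 2   # (X - mu)^2
print(squared_errors.mean())         # close to sigma**2 = 225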
Some examples of variances (remember we are talking conceptually here):
- the variance of the mean: the expected squared deviation between a sample mean and its expectation, the population mean (see the sketch after this list).
- the variance of the difference between two means: the expected squared deviation between the sample difference and the population difference between two means.
- the variance of a contrast: the expected squared deviation between the sample value of the contrast and the population value of the contrast.
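To make the first example concrete, here is a sketch of the variance of the mean (assumed values: σ = 15, sample size n = 25): simulate many samples, take each sample's mean, and average the squared deviations of those means around μ. Theory says this should come out near σ²/n.

```python
import numpy as np

rng = np.random.default_rng(seed=4)
mu, sigma, n = 100.0, 15.0, 25   # assumed population parameters and sample size

# Draw many samples of size n and compute each sample's mean.
sample_means = rng.normal(mu, sigma, size=(50_000, n)).mean(axis=1)

# Variance of the mean: the expected squared deviation between a sample mean
# and its expectation mu; theory gives sigma**2 / n = 225 / 25 = 9.
print(((sample_means - mu) ** 2).mean())
```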
It's really not that complicated, I believe.