Estimator of sample variance

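Why is the best estimator of the "true" population variance not equal to the variance of the sample? That is, why is the formula:

$$\hat{\sigma}^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

rather than being the same as the formula for the sample variance:

$$S^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

?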
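To make the difference concrete, here is a minimal sketch that computes both quantities for a small data set. The function names and the data values are made up for illustration; the only difference between the two functions is the divisor.

```python
def sample_variance(xs):
    """Mean squared deviation from the sample mean (divides by n)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def corrected_variance(xs):
    """Same sum of squares, but divided by n - 1 (Bessel's correction)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values, mean 5
print(sample_variance(data))     # 32 / 8 = 4.0
print(corrected_variance(data))  # 32 / 7, about 4.571
```

The corrected value is always the larger of the two, and the gap shrinks as n grows.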

The answer is on Wikipedia, but I've expanded it here for clarity. There is an even more beautiful explanation elsewhere on Wikipedia (this correction factor is called Bessel's correction).

What is the expected value of the sample variance $S^2$? Expected value is the tool from probability theory for calculating the mean of a random quantity over many repetitions. We're not doing probability in this class, but you can ask me if you're interested.
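One way to lay out such a derivation is sketched here. It is a reconstruction under stated assumptions and may not match the original working line for line: $x_1, \dots, x_n$ are independent samples from a population with mean $\mu$ and variance $\sigma^2$, $\bar{x}$ is their mean, and $S^2$ is the sample variance that divides by $n$. The sixth line uses $E[(x_i - \mu)^2] = \sigma^2$ together with the fact that deviations from the sample mean average to zero, $\frac{1}{n}\sum_i (x_i - \bar{x}) = 0$. The last line uses $E[(\bar{x} - \mu)^2] = \sigma^2 / n$.

$$
\begin{aligned}
E[S^2] &= E\left[\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2\right] \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n} \big((x_i - \mu) - (\bar{x} - \mu)\big)^2\right] \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n} \big((x_i - \mu)^2 - 2(x_i - \mu)(\bar{x} - \mu) + (\bar{x} - \mu)^2\big)\right] \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n} \big((x_i - \mu)^2 - 2(x_i - \bar{x})(\bar{x} - \mu) - (\bar{x} - \mu)^2\big)\right] \\
&= \frac{1}{n}\sum_{i=1}^{n} E\left[(x_i - \mu)^2\right] - 2\,E\left[(\bar{x} - \mu)\,\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})\right] - E\left[(\bar{x} - \mu)^2\right] \\
&= \sigma^2 - 0 - E\left[(\bar{x} - \mu)^2\right] \\
&= \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\,\sigma^2
\end{aligned}
$$

The fourth line follows from the third by writing $x_i - \mu = (x_i - \bar{x}) + (\bar{x} - \mu)$ inside the cross term.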

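The first 5 lines are just re-arrangement of terms. In the 6th line, we apply 2 identities:

$$E\left[(x_i - \mu)^2\right] = \sigma^2$$

(by the definition of population variance, where $\mu$ is the population mean and $\sigma^2$ is the population variance)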

and

(we expect the average difference between the samples and the mean to be 0)

Finally, because the samples are independent, we already know how far we expect the mean of a sample to fall from the mean of the true distribution. The expected squared distance is just the standard error of the mean, squared:

$$E\left[(\bar{x} - \mu)^2\right] = \left(\frac{\sigma}{\sqrt{n}}\right)^2 = \frac{\sigma^2}{n}$$
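If you would rather see this than take it on faith, a quick simulation can check both claims numerically. This is only a sketch under assumed settings (a normal population with mean 0 and standard deviation 1, samples of size 5, and a fixed random seed). None of these values come from the text above, and the variable names are made up.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

n = 5                 # size of each sample (assumed for illustration)
trials = 50_000       # number of repeated samples
mu, sigma = 0.0, 1.0  # "true" population mean and standard deviation

total_sq_err_of_mean = 0.0
total_s2 = 0.0
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    # squared distance of this sample's mean from the true mean
    total_sq_err_of_mean += (xbar - mu) ** 2
    # sample variance of this sample, dividing by n
    total_s2 += sum((x - xbar) ** 2 for x in xs) / n

mean_sq_err_of_mean = total_sq_err_of_mean / trials
mean_s2 = total_s2 / trials

print(mean_sq_err_of_mean)  # should land near sigma**2 / n = 0.2
print(mean_s2)              # should land near sigma**2 * (n - 1) / n = 0.8
```

The averages will not hit 0.2 and 0.8 exactly, but with this many trials they should be close, and the shortfall of the second number from 1.0 is the bias that the correction removes.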

Therefore, the best estimator of the population variance from a sample is:

$$\hat{\sigma}^2 = \frac{n}{n-1}\, S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

and
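In practice you rarely need to code the correction by hand. As a small sketch, Python's standard `statistics` module provides `pvariance` (divides by n) and `variance` (divides by n - 1), so rescaling the first by n / (n - 1) should reproduce the second. The data values here are made up for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values
n = len(data)

s2 = statistics.pvariance(data)        # divides by n
corrected = statistics.variance(data)  # divides by n - 1

print(s2 * n / (n - 1))  # sample variance rescaled by n / (n - 1)
print(corrected)         # the two should agree up to rounding
```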