stdevp( ) appears to be mislabeled
06-05-2016, 08:59 PM
Post: #29
RE: stdevp( ) appears to be mislabeled
The sample estimate of the population standard deviation - i.e., the one we are discussing, which has division by N-1 - is only one way to estimate the population standard deviation with a statistic taken from a sample of that population.

https://en.wikipedia.org/wiki/Standard_d..._deviation is a Wikipedia article that has a pretty good summary of the various ways one estimates the population standard deviation using a sample of that population.

https://en.wikipedia.org/wiki/Notation_i...statistics is a reference for some of the standard notation in statistics.

I have taught statistics for quite a number of years, and the notation has become increasingly standardized. For the introductory inferential statistics courses, the sample standard deviation has division by N-1 and the population standard deviation has division by N. There is no ambiguity here. One doesn't always have the option of taking large samples and has to make the best of the data one can gather.

Most statistics programs are moving toward having both a sample standard deviation, which has division by N-1, and a population standard deviation, which has division by N. The HP Prime Statistics 1 Var application has the notation correct; it uses sigma for the population standard deviation and sX for the sample standard deviation. If you check, you will find that sigma*sqrt(N/(N-1)) = sX.
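As a quick check of that relation away from the calculator, here is a minimal Python sketch. It assumes the standard library's statistics.pstdev (division by N) and statistics.stdev (division by N-1) play the roles of sigma and sX; the data values are made up purely for illustration.

```python
import math
import statistics

# Made-up data, for illustration only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)

sigma = statistics.pstdev(data)  # population form: divides by N
s_x = statistics.stdev(data)     # sample form: divides by N-1

# Rescaling the population form should reproduce the sample form.
rescaled = sigma * math.sqrt(n / (n - 1))

print("sigma            =", sigma)
print("sX               =", s_x)
print("sigma*sqrt(N/(N-1)) =", rescaled)
```

The last two printed values should agree to within floating-point rounding.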

Different platforms and statistical packages have slightly different names for each of these, but the differences are made clear in the help menus. The sample estimate of the population standard deviation - whatever you want to call it - has division by N-1. The population standard deviation - the one you get by actually counting every member of the population - has division by N.

The sample estimate is just that: an estimate of the population standard deviation that makes use of the data in your sample. If you need a better estimate, take larger samples if you can. But if you cannot, you have to divide by N-1 or use another correction to your sample data.
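To see why the correction matters for small samples, here is a rough simulation sketch in Python. It is an illustration under stated assumptions, not a proof: the population is assumed normal with sigma = 1, the sample size of 5 and the seed are arbitrary choices, and the function name is hypothetical. It compares variance estimates rather than standard deviations, because the variance is the quantity for which division by N-1 removes the bias exactly.

```python
import random

def average_variance_estimates(n, trials, sigma=1.0, seed=0):
    """Average the divide-by-N and divide-by-(N-1) variance estimates
    over many samples of size n drawn from a normal population."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations
        total_div_n += ss / n
        total_div_n1 += ss / (n - 1)
    return total_div_n / trials, total_div_n1 / trials

avg_n, avg_n1 = average_variance_estimates(n=5, trials=20000)
print("true variance          = 1.0")
print("average, division by N   =", avg_n)
print("average, division by N-1 =", avg_n1)
```

With samples of size 5, the division-by-N average should come out near (N-1)/N = 0.8 of the true variance, while the division-by-(N-1) average should sit close to 1.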

Statistics courses are also emphasizing the difference between the descriptive PARAMETERS of a population and the corresponding sample STATISTICS that are attempting to estimate those population parameters.

Sample statistics are estimates of population parameters. Confidence intervals and the probability calculations for the various null hypotheses are concepts that emerge from the behaviors of sample statistics as sample sizes become larger and larger.

For example, if samples are being taken from a population with a normal distribution, then the standard deviation of the sample means decreases as sigma/sqrt(N), and the standard deviation of the sample standard deviations decreases approximately as sigma/sqrt(2N) for large N.
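These two rates can be checked with a small simulation. The Python sketch below is only an illustration: it assumes a normal population with sigma = 1, and the sample size of 50, the number of trials, the seed, and the function name are all arbitrary or hypothetical choices.

```python
import math
import random
import statistics

def spread_of_sample_statistics(n, trials, sigma=1.0, seed=1):
    """Draw many samples of size n from a normal population and return
    the standard deviation of the sample means and of the sample
    standard deviations (the latter computed with division by N-1)."""
    rng = random.Random(seed)
    means = []
    sds = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)
        means.append(mean)
        sds.append(math.sqrt(ss / (n - 1)))
    return statistics.pstdev(means), statistics.pstdev(sds)

n = 50
sd_means, sd_sds = spread_of_sample_statistics(n=n, trials=5000)
print("SD of sample means:", sd_means, " vs sigma/sqrt(N)  =", 1 / math.sqrt(n))
print("SD of sample SDs:  ", sd_sds, " vs sigma/sqrt(2N) =", 1 / math.sqrt(2 * n))
```

For N = 50 the simulated values should land close to the two formulas; the sigma/sqrt(2N) figure is a large-sample approximation, so expect a slightly looser match there.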

Most good statistics courses these days have access to really nice videos that show these properties of increasing sample sizes, and these videos are crucial to teaching the concepts that lie behind the process of good sampling and using samples to calculate the probabilities that one has captured the population parameters with the sample statistics.

So the bottom line is that statistical packages should make these distinctions between population parameters and sample statistics very clear. It is an important pedagogical issue: it stresses the process of sampling and keeps the focus on trying to get samples that are truly representative of the population.


Messages In This Thread
RE: stdevp( ) appears to be mislabeled - Mike Elzinga - 06-05-2016 08:59 PM


