In statistics we examine one attribute at a time, and that attribute is measured separately for each observation (e.g. each individual). Each measurement differs from the others to a lesser or greater extent, and the series of measurements thus produces a distribution for the attribute concerned. The simplest distribution is obtained by using the tally method.
Population by age (5 year classification), 2006

Source: Statistics Finland, Population 2007
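The tally method can be sketched in a few lines of Python. The observations below are hypothetical values invented for illustration (number of children in ten households), not Statistics Finland data; the sketch simply counts how often each value occurs.

```python
from collections import Counter

# Hypothetical observations, invented for illustration:
# number of children in ten households.
observations = [0, 2, 1, 2, 3, 0, 2, 1, 2, 4]

# Counter tallies how many times each value occurs.
tally = Counter(observations)

for value in sorted(tally):
    # One mark per observation, like a hand-drawn tally sheet.
    print(f"{value}: {'|' * tally[value]} ({tally[value]})")
```

The counts per value are the distribution of the attribute; drawing them as columns gives the kind of figure described next.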
One way of describing a distribution on a scale is to use a histogram. In a histogram, cases from the population concerned are stacked upon one another in columns representing the category measured. The figure thus obtained describes how many cases fall into each category.

The peak of the distribution refers to the point with the highest number of observations. The peak is often near the mean of the distribution.
Distributions take on many different shapes. The best-known distribution is the normal distribution, or Gauss curve, where most observations concentrate around the mean to create a symmetrical pattern. This means that there are roughly the same number of observations deviating in either direction from the middle of the distribution. The further we move in the positive or negative direction from the mean, the smaller the number of observations. In a normal distribution about 68% of the observations are within one standard deviation of the mean, and about 95% are within two standard deviations of the mean.
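The shares of observations within one and two standard deviations of the mean can be checked with Python's standard library. This is a minimal sketch that assumes a standard normal distribution (mean 0, standard deviation 1); the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

# Assumption: a standard normal distribution (mean 0, standard deviation 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Share of observations within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_one = share_within(1)
within_two = share_within(2)

print(f"within 1 standard deviation:  {within_one:.1%}")   # 68.3%
print(f"within 2 standard deviations: {within_two:.1%}")   # 95.4%
```

Because the shares depend only on the distance from the mean measured in standard deviations, the same percentages hold for any normal distribution, whatever its mean and spread.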
Most statistical models and theories have been developed with the normal distribution in mind. The idea is that in large groups the variation is largely random, which tends to produce a bell-shaped normal distribution. For example, height in the adult population grouped by gender is often normally distributed around the mean height.
However, not all distributions are normal. A distribution may have two peaks, or it may be skewed. In a skewed distribution the peak occurs towards one end of the distribution. A two-peaked distribution is obtained by combining two very different groups; in the measurement of height, for example, by combining two different age groups. A good example of a skewed distribution is provided by the breakdown of incomes. The majority of income earners are at the lower end of the income distribution. In view of the overall income spread, the large groups in the middle income bracket do not earn very much. The income distribution has a long tail to the right, i.e. from the middle income bracket upwards, whereas there is no room for a tail to the left.
Taxable earned income by income category in 2005

Source: Statistics Finland, Income Distribution Statistics
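One common way to see the effect of a long right tail is to compare the mean with the median. The sketch below uses ten hypothetical annual incomes invented for illustration, not figures from the Income Distribution Statistics; a few high incomes pull the mean well above the median.

```python
from statistics import mean, median

# Hypothetical annual incomes in euros, invented for illustration only.
incomes = [12_000, 15_000, 18_000, 20_000, 22_000,
           25_000, 30_000, 40_000, 60_000, 150_000]

mean_income = mean(incomes)
median_income = median(incomes)

# Number of earners whose income is below the mean.
below_mean = sum(1 for income in incomes if income < mean_income)

print(f"mean:   {mean_income:,.0f}")      # 39,200
print(f"median: {median_income:,.0f}")    # 23,500
print(f"below the mean: {below_mean} of {len(incomes)}")  # 7 of 10
```

In this made-up sample most earners fall below the mean, which matches the description of a distribution whose bulk sits at the lower end with a tail stretching to the right.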
The attributes measured in a distribution are usually called variables. A variable may be continuous or categorical. In a continuous variable, each observation keeps its own exact value, whereas in a categorical variable the observations are placed into larger groups. In practice, distributions are usually presented in categorical format because that makes them easier to handle. In a sense the classification is a crude measure that overlooks minor details.
Measurement of height as a continuous variable
| Height (cm) | Observations |
| 165.5 | 1 |
| 167 | 1 |
| 169.3 | 1 |
| 170.7 | 1 |
| 172 | 1 |
| 175 | 1 |
| 176.5 | 1 |
| 180.5 | 1 |
| 181 | 1 |
| 183.7 | 1 |
Measurement of height as a categorical variable
| Height (cm) | Observations |
| 165-169 | 3 |
| 170-179 | 4 |
| 180- | 3 |
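The step from the continuous table to the categorical one can be sketched as follows. The heights are the ten values from the first table; the class boundaries are an assumption read off the second table, with each class including its lower bound and excluding its upper bound, so that 169.3 falls into "165-169".

```python
# Heights from the continuous table above.
heights_cm = [165.5, 167, 169.3, 170.7, 172, 175, 176.5, 180.5, 181, 183.7]

# Assumed class boundaries: lower bound inclusive, upper bound exclusive.
# The last class is open-ended, as in the table.
classes = [
    ("165-169", 165, 170),
    ("170-179", 170, 180),
    ("180-", 180, float("inf")),
]

# Count how many heights fall into each class.
counts = {label: 0 for label, _, _ in classes}
for height in heights_cm:
    for label, lower, upper in classes:
        if lower <= height < upper:
            counts[label] += 1
            break

for label, count in counts.items():
    print(f"{label}: {count}")
```

The ten individual values collapse into three counts, which is easier to read but no longer shows, for example, that the tallest person measured 183.7 cm.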