Central Tendency Measures

We often wish to provide summary numbers that best describe a set or distribution of scores. These include measures of location, which indicate a typical or average value for the scores; measures of dispersion, which indicate how spread out the scores are; and measures that describe the shape of the distribution.

A measure of central tendency is a measure of location; the goal is to provide a single number that best describes the values of a set of scores. The terms measure of central tendency and average are often used interchangeably, although some authors use average only to refer to the arithmetic mean. Although there are many measures of central tendency, those most commonly encountered are the mode, median, and mean.

Given a set of scores, the mode is simply the score that occurs most often. If scores are grouped into classes, the mode is considered to be the midpoint of the class that contains the largest number of scores. The mode is not a very useful summary measure because it does not take into account scores that do not have the modal value; also, there may be two or more values that occur with high frequency.
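As a minimal sketch of the definition above, the following Python snippet counts how often each score occurs and returns every score that shares the highest count. The function name modes and the two example lists are illustrative assumptions, not taken from the text; the second list shows the case in which two values tie for most frequent.

```python
from collections import Counter

def modes(scores):
    """Return every score that occurs with the highest frequency, sorted."""
    counts = Counter(scores)          # maps each score to its frequency
    top = max(counts.values())        # the highest frequency observed
    return sorted(value for value, count in counts.items() if count == top)

single = modes([4, 6, 6, 8, 9])       # one value is most frequent
tied = modes([2, 3, 3, 5, 7, 7, 9])   # two values tie for most frequent
print(single, tied)                   # [6] [3, 7]
```

Returning a list rather than a single number makes the ambiguity explicit: when several values tie, no single mode summarizes the scores.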

The median is the middle score of a distribution. If the scores are ordered from smallest to largest, the median is the middle score. If there is an even number of scores, the median is considered to be halfway between the two middle scores. For example, given sets of scores A (6, 8, 4, 9, 11) and B (6, 8, 4, 9, 11, 14), the median of A is 8 and the median of B is 8.5. For grouped data, the median is taken to be the 50th percentile point, that is, the value below which 50 percent of the scores fall. Because the median is insensitive to the values of scores at the extremes of the distribution, it is useful for characterizing distributions that include outliers (extreme scores that are quite different from the scores in the center of the distribution). The median for (6, 8, 4, 9, 11) is the same as for (6, 8, 4, 9, 11944).
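The rule for odd and even numbers of scores can be sketched in a few lines of Python. The function name median and the variable names are illustrative choices; the score sets are the ones used in the paragraph above.

```python
def median(scores):
    """Middle score after sorting; halfway between the two middle scores if n is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: the single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average of the two middle scores

a = [6, 8, 4, 9, 11]
b = [6, 8, 4, 9, 11, 14]
with_outlier = [6, 8, 4, 9, 11944]
print(median(a), median(b), median(with_outlier))  # 8 8.5 8
```

Replacing 11 with 11944 changes only the largest score, so the middle of the ordered list, and hence the median, is unchanged.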

The arithmetic mean is the most commonly encountered measure of central tendency. It is obtained by adding up all of the scores in the set and dividing by the number of scores. The mean depends on all the scores in the set; as a result, the mean is sensitive to extreme values. Although the medians for the sets (6, 8, 4, 9, 11) and (6, 8, 4, 9, 11944) are the same, 8, the means are 7.6 and 2394.2, respectively. Here the median does a better job of characterizing the typical value of the scores.
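The arithmetic in this paragraph can be checked with a short Python sketch. The helper name mean and the variable names are assumptions made for illustration; the standard-library function statistics.median is used for the comparison.

```python
from statistics import median

def mean(scores):
    """Arithmetic mean: the sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

typical = [6, 8, 4, 9, 11]
with_outlier = [6, 8, 4, 9, 11944]

print(mean(typical), median(typical))            # 7.6 8
print(mean(with_outlier), median(with_outlier))  # 2394.2 8
```

A single extreme score moves the mean from 7.6 to 2394.2, a value far from every score except the outlier itself, while the median stays at 8.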

It is useful to think of the mean of a distribution of scores as the balance point of the distribution. Imagine that scores are represented by weights placed on a balance beam at locations corresponding to their values. Then the location of the balance point of the set of scores corresponds to the value of the mean. For example, suppose that equal weights are placed at locations 4, 6, 8, 9, and 11 units from the left edge of a weightless, rigid beam; then, if the beam is placed on a fulcrum located 7.6 units from the left edge of the beam, it will balance. Another way of thinking about this is that if we find the deviation of each score from the mean and then add up all these deviations, they will sum to zero.
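The balance-point property can be verified numerically for the beam example. The sketch below uses exact fractions so that rounding error does not obscure the result; the variable names are illustrative assumptions.

```python
from fractions import Fraction

scores = [4, 6, 8, 9, 11]                      # weight locations on the beam
mean = Fraction(sum(scores), len(scores))      # exactly 38/5, i.e. 7.6

deviations = [score - mean for score in scores]
total = sum(deviations)                        # deviations from the mean sum to zero

# Total distance of the weights to the left and to the right of the fulcrum.
left = -sum(d for d in deviations if d < 0)
right = sum(d for d in deviations if d > 0)

print(float(mean), total, float(left), float(right))  # 7.6 0 5.2 5.2
```

The weights to the left of 7.6 lie a combined 5.2 units from the fulcrum and so do the weights to the right, which is why the beam balances.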

Another useful characteristic of the mean is that it is the value that minimizes the sum of squared deviations. That is, if we find the deviation of each score from a value M, square each deviation, and then add all these squared deviations together, the sum is smaller if M is the mean than if M is any other value. The median, by contrast, is the value that minimizes the sum of the absolute values of the deviations.
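One way to see both minimization properties is a brute-force search over candidate values of M. The sketch below is an illustration under stated assumptions, not a proof: the grid of candidates (0.0 to 15.0 in steps of 0.1) and the function names are arbitrary choices, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction

scores = [6, 8, 4, 9, 11]

def sum_squared(m):
    """Sum of squared deviations of the scores from m."""
    return sum((s - m) ** 2 for s in scores)

def sum_absolute(m):
    """Sum of absolute deviations of the scores from m."""
    return sum(abs(s - m) for s in scores)

# Candidate values of M: 0.0, 0.1, ..., 15.0
candidates = [Fraction(k, 10) for k in range(0, 151)]

best_squared = min(candidates, key=sum_squared)
best_absolute = min(candidates, key=sum_absolute)
print(float(best_squared), float(best_absolute))  # 7.6 8.0
```

On this grid the squared-deviation criterion is minimized at the mean (7.6) and the absolute-deviation criterion at the median (8).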

For a symmetrical distribution of scores, the mean and median coincide, and for a roughly symmetrical set of scores they will have similar values. If the distribution of scores is skewed to the right (that is, if the distribution is asymmetrical, with a short tail on the left side of center and a longer tail on the right), the mean will generally be larger than the median. If the distribution is skewed to the left, so that it has a longer tail on the left side, the mean will generally be smaller than the median.
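The relationship between skew and the ordering of mean and median can be illustrated with three small data sets. The data below are made up for illustration and are not from the text; the left-skewed set is simply the mirror image of the right-skewed one.

```python
from statistics import mean, median

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 4, 12]          # long tail on the right
left_skewed = [13 - x for x in right_skewed]   # mirror image: long tail on the left

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(name, round(mean(data), 2), median(data))
```

In the right-skewed set the single large score pulls the mean above the median; in the mirrored set the single small score pulls it below.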

We are very often interested in estimating the mean of a population of scores on the basis of samples of scores selected from the population. We can use a measure of central tendency of the sample, such as the sample mean or median, as an estimator of the population mean. If we take a number of samples of the same size, these sample means and medians will vary from sample to sample because of the variability in the scores selected to be in each sample. If the population is bell-shaped (i.e., if the scores are distributed like the normal distribution), it can be shown that the sample mean is a more efficient estimator than the sample median. The mean is more efficient in the sense that the means of samples can be shown to cluster more closely around the population mean than the sample medians, and so they tend to be better estimates of the population mean. If, on the other hand, the population is asymmetric or heavy-tailed (i.e., it tends to contain extreme scores), the sample mean may not be as good an estimator as the sample median or certain kinds of trimmed means (that is, sample means obtained after discarding some of the smaller and larger scores in the sample).
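The efficiency comparison can be explored with a small simulation. The following is a sketch under stated assumptions rather than a definitive study: the sample size (25), number of samples (2000), trimming amount (3 scores from each end), the seed, and the choice of a Laplace (double exponential) distribution as the heavy-tailed population are all illustrative choices not taken from the text. Both populations have mean 0, so the average squared estimate measures how tightly each estimator clusters around the population mean.

```python
import random
from statistics import fmean, median

rng = random.Random(0)  # fixed seed so the run is repeatable

def trimmed_mean(sample, k):
    """Mean after discarding the k smallest and k largest scores."""
    ordered = sorted(sample)
    return fmean(ordered[k:len(ordered) - k])

def mean_squared_error(draw, n=25, reps=2000, k=3):
    """Average squared distance of each estimator from the true mean (0)."""
    estimates = {"mean": [], "median": [], "trimmed": []}
    for _ in range(reps):
        sample = [draw() for _ in range(n)]
        estimates["mean"].append(fmean(sample))
        estimates["median"].append(median(sample))
        estimates["trimmed"].append(trimmed_mean(sample, k))
    return {name: fmean([e * e for e in values])
            for name, values in estimates.items()}

# Bell-shaped population: standard normal.
normal = mean_squared_error(lambda: rng.gauss(0, 1))
# Heavy-tailed population: the difference of two exponentials is Laplace-distributed.
laplace = mean_squared_error(lambda: rng.expovariate(1) - rng.expovariate(1))

print("normal ", normal)
print("laplace", laplace)
```

Under these assumptions the sample mean should show the smallest error for the normal population, while for the Laplace population the sample median should cluster more tightly around 0 than the sample mean.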