Measures of Central Tendency (College Board AP® Psychology): Study Guide
What are measures of central tendency?
Measures of central tendency are statistical tools used to describe the central or typical value of a data set
They summarize large amounts of data into a single representative score, making it easier to identify patterns and draw conclusions
There are three measures of central tendency:
The mean — the arithmetic average
The median — the middle value
The mode — the most frequently occurring value
Choosing the most appropriate measure depends on the nature of the data set and whether extreme scores (outliers) are present
Mean
The mean is calculated by adding up all the values in a data set and dividing the total by the number of values
The mean represents the arithmetic average of the data set
How to calculate the mean
Add up all values in the data set
Divide the total by the number of values
Example:
Data set: 4, 6, 7, 9
4 + 6 + 7 + 9 = 26
26 ÷ 4 = 6.5
Mean = 6.5
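The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `mean` and the variable `scores` are chosen here for the example and are not part of the original guide.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


scores = [4, 6, 7, 9]   # data set from the worked example
print(sum(scores))      # 26
print(mean(scores))     # 6.5
```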
How to interpret the mean
The mean tells us the average score across all participants
When scores are spread fairly evenly around the center, it indicates what a typical participant scored
Example:
If the mean score on an anxiety scale for a group receiving CBT is 12 and the mean for a control group is 24, this suggests that participants receiving CBT reported lower anxiety on average than those who did not
When comparing means across conditions, a larger difference between them may indicate a more meaningful finding, though statistical significance must also be considered
When to use the mean
Use the mean when the data set does not contain extreme scores (outliers) and when all scores are reasonably close together
The mean is the most appropriate measure when data is normally distributed
Avoid the mean when the data set contains outliers — extreme scores will distort the mean and make it unrepresentative of the data set as a whole
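To see how a single outlier can distort the mean, the sketch below reuses the data set 4, 6, 7, 9 and adds one extreme score. The value 74 is a hypothetical score invented for illustration, and the helper name `mean` is likewise an assumption of this example.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by the number of values."""
    return sum(values) / len(values)


scores = [4, 6, 7, 9]            # data set from the mean example
with_outlier = scores + [74]     # 74 is a hypothetical extreme score

print(mean(scores))              # 6.5
print(mean(with_outlier))        # 20.0
```

With the outlier included, the mean of 20.0 is higher than four of the five scores, so it no longer describes a typical participant.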
Evaluation of the mean
Strengths
The mean is the most sensitive measure of central tendency as it takes every score in the data set into account
This makes it the most representative and reliable measure of central tendency when data is normally distributed
Limitations
The mean is sensitive to extreme scores (outliers), so it is only representative when the scores are reasonably close together
This means that it would not be a suitable measure for some data sets
The mean score may not appear in the data set itself
E.g. a mean of 6.5 from the data set above does not correspond to any actual score in the set
Median
The median is the middle value of a data set when all values are arranged in numerical order
The median represents the positional average — the point that divides the data set exactly in half
How to calculate the median
For an odd number of values:
Arrange the values in ascending order
Identify the middle value
Example:
Data set: 20, 43, 56, 78, 92, 67, 48
Ordered: 20, 43, 48, 56, 67, 78, 92
Median = 56 (the 4th value in a set of 7)
For an even number of values:
Arrange the values in ascending order
Identify the two middle values
Add them together and divide by 2
Example:
Data set: 15, 16, 18, 19, 22, 24
Two middle values: 18 and 19
18 + 19 = 37, and 37 ÷ 2 = 18.5
Median = 18.5
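Both procedures (odd and even number of values) can be combined in one short function. The following is a minimal sketch; the function name `median` is chosen for this example only.

```python
def median(values):
    """Return the middle value of the sorted data, or the average of the two middle values."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: a single middle value
        return ordered[mid]
    # even count: add the two middle values and divide by 2
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([20, 43, 56, 78, 92, 67, 48]))  # 56
print(median([15, 16, 18, 19, 22, 24]))      # 18.5
```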
How to interpret the median
The median tells us the midpoint of the data set
Half of all scores fall above it and half fall below it
Example:
If the median score on a stress scale is 15 for one group and 28 for another, this suggests that the typical participant in the second group reported considerably higher stress levels than the typical participant in the first group
The median is particularly useful for interpretation when the data set contains outliers, as it gives a more accurate picture of the typical score than the mean would
When to use the median
Use the median when the data set contains outliers or extreme scores that would distort the mean
The median is the most appropriate measure when data is skewed — when scores are not evenly distributed around the center
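The sketch below compares the mean and median before and after an extreme score is added, using Python's standard `statistics` module. The data set is the even-numbered example from the calculation section; the extra score of 96 is hypothetical and included only to create skew.

```python
from statistics import mean, median

scores = [15, 16, 18, 19, 22, 24]   # data set from the median example
skewed = scores + [96]              # 96 is a hypothetical extreme score

for label, data in [("original", scores), ("with outlier", skewed)]:
    print(f"{label}: mean = {mean(data):.1f}, median = {median(data):.1f}")
# original: mean = 19.0, median = 18.5
# with outlier: mean = 30.0, median = 19.0
```

The outlier moves the mean by 11 points but the median by only half a point, which is why the median is usually preferred for skewed data.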
Evaluation of the median
Strengths
The median is largely unaffected by outliers
This means that it gives a more accurate picture of the typical score when extreme values are present in the data set
The median is more appropriate than the mean for skewed data sets
This is because the median is pulled far less than the mean toward the extreme end of a skewed distribution and therefore gives a more accurate picture of the center of the data
Limitations
The median does not take the value of every score into account
Because it only identifies the middle value, it ignores the actual values of all other scores
This makes it less sensitive than the mean
It can be time-consuming to calculate by hand with large data sets
This is because all values must be arranged in order before the median can be identified
Mode
The mode is the most frequently occurring value in a data set
The mode identifies the most common score rather than the average or middle value
How to calculate the mode
Count how many times each value appears in the data set
The value that appears most frequently is the mode
Example:
Data set: 3, 3, 3, 4, 4, 5, 6, 6, 6, 6, 7, 8
6 appears four times — more than any other value
Mode = 6
A data set may have:
No mode: if all values occur equally often
One mode: the most common scenario
Two modes (bimodal): if two values occur equally often and more frequently than all others
More than two modes (multimodal): if three or more values occur equally often and more frequently than all others
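The counting procedure and the four cases above can be sketched as a function that returns every mode as a list. The function name `modes` is hypothetical, and returning an empty list for "no mode" is a convention assumed for this example.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent value(s); an empty list means no mode."""
    counts = Counter(values)              # how many times each value appears
    if not counts:
        return []
    highest = max(counts.values())
    # If several distinct values all occur equally often, there is no mode
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([3, 3, 3, 4, 4, 5, 6, 6, 6, 6, 7, 8]))  # [6]        one mode
print(modes([1, 2, 3, 4]))                          # []         no mode
print(modes([1, 1, 2, 2, 3]))                       # [1, 2]     bimodal
print(modes([1, 1, 2, 2, 3, 3, 4]))                 # [1, 2, 3]  multimodal
```

Python's standard library also offers `statistics.multimode`, which returns all tied values but, unlike this sketch, lists every value when all occur equally often.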
How to interpret the mode
The mode tells us the most common score in the data set
This is the value that occurred most frequently among participants
Example:
If the modal response on a survey about sleep duration is 6 hours, this tells us that more participants reported sleeping 6 hours per night than any other amount
A bimodal distribution suggests that the data set contains two distinct clusters of scores
This may indicate that two different subgroups within the sample responded differently, which is worth investigating further
When to use the mode
Use the mode when the researcher is interested in the most common or most popular value rather than the average
The mode is the only measure of central tendency that can be used with categorical data — data that falls into named categories rather than numerical values
E.g. most common eye color, most frequently chosen answer on a multiple choice question
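Because the mode only requires counting, it works on labels as well as numbers. The sketch below finds the modal eye color in a small sample; the list `eye_colors` is made-up data used purely for illustration.

```python
from collections import Counter

# Hypothetical categorical responses (not from a real study)
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]

counts = Counter(eye_colors)
modal_color, frequency = counts.most_common(1)[0]   # most frequent category and its count
print(modal_color, frequency)                       # brown 3
```

A mean or median cannot be computed here, since categories such as "brown" and "blue" cannot be added together or placed in numerical order.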
Evaluation of the mode
Strengths
The mode is not affected by extreme values
The mode is the only measure of central tendency that can be applied to categorical data
Limitations
A data set may include two or more modes, which makes it difficult to identify a single typical value
This reduces the usefulness of the measure
The mode may be unrepresentative on small data sets
A value that happens to occur twice may be identified as the mode even if it does not reflect the typical score in the data set
Choosing the right measure
| | Mean | Median | Mode |
|---|---|---|---|
| What it measures | Arithmetic average | Middle value | Most frequent value |
| Best used when | Data is normally distributed, no outliers | Data is skewed or contains outliers | Data is categorical or frequency-based |
| Affected by outliers? | Yes | No | No |
| Uses all scores? | Yes | No | No |
| Sensitivity | Most sensitive | Moderate | Least sensitive |
Examiner Tips and Tricks
Ensure that you understand these key points:
The mean is not always the best measure of central tendency
When a data set contains outliers, the median is more appropriate because the mean will be distorted by the extreme scores
The mode is not limited to small data sets
It is the only appropriate measure for categorical data, regardless of sample size
A bimodal distribution does not mean the data is unreliable
It may simply indicate that two distinct subgroups within the sample responded differently, which is itself a meaningful finding worth investigating