Introduction to Sampling Distributions (College Board AP® Statistics): Study Guide

Written by: Mark Curtis

Reviewed by: Dan Finlay

Introduction to sampling distributions

What is the distribution of a population?

  • The population is all possible individuals that can be sampled

    • Recall that a sample of size n is taken from a population

  • You can display the population on a graph

    • e.g. a relative frequency chart, histogram, boxplot or probability distribution can be drawn

      • This shows the distribution of a population

The distribution of a population shown as a relative frequency diagram and as a boxplot.
Representing the distribution of a population
  • Parameters of the population can often be seen from the distribution of the population

    • e.g. the population mean, population standard deviation, population range, etc.
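
As a minimal sketch of what these parameters look like when every individual in the population is known, the snippet below computes them for a small population. The list of ages is hypothetical (made up for illustration), and the variable names are assumptions rather than anything from the course materials.

```python
import statistics

# Hypothetical population: the ages of all 8 members of a small club
# (made-up values, used only for illustration).
population = [21, 25, 25, 30, 34, 38, 42, 57]

# Parameters describe the whole population, not a sample.
population_mean = statistics.mean(population)
# pstdev divides by N (population formula), not by N - 1 (sample formula).
population_sd = statistics.pstdev(population)
population_range = max(population) - min(population)

print("mean:", population_mean)
print("standard deviation:", round(population_sd, 2))
print("range:", population_range)
```

Note the use of `statistics.pstdev` rather than `statistics.stdev`: the latter is the sample version and would give a slightly larger value.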

Examiner Tips and Tricks

Make sure to always talk about population parameters in context, for example replacing "the population mean" with "the mean age of all 500 students in the school".

What is the sampling distribution of a statistic?

  • Recall that, when taking samples, you need to specify

    • the sample size, n

    • the sample statistic

      • This is the quantity that you calculate from the sample

      • e.g. sample median, sample mean, sample range, etc.

  • Taking one sample of size n generates one value of the sample statistic

    • but taking many samples of size n generates many values of the sample statistic

  • If you could take all possible samples of size n from the population, you would have all possible values of the sample statistic

    • The distribution of all these possible values is called the sampling distribution of the statistic

  • The samples used to build the sampling distribution of a statistic must all be:

    • taken from the same population

    • the same size

      • Changing the sample size, n, will change the sampling distribution

  • Sampling distributions are often shown on graphs

    • e.g. relative frequency charts or histograms

Diagram illustrating the process of deriving the sampling distribution from a population distribution by taking all possible random samples of size n and calculating a sample statistic, y, for each.
The process of deriving the sampling distribution
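
For a very small population, the process described above can be carried out exactly. The sketch below is one possible illustration, assuming a hypothetical population of five values, a sample size of n = 2, sampling without replacement, and the sample mean as the statistic. All of these choices are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Hypothetical population of 5 values (made up for illustration).
population = [2, 4, 6, 8, 10]
n = 2  # sample size; every sample must have this same size

# Every possible sample of size n, drawn without replacement.
samples = list(combinations(population, n))

# One value of the statistic (here, the sample mean) per sample.
sample_means = [mean(sample) for sample in samples]

# Relative frequency of each possible value of the statistic:
# this is the exact sampling distribution of the sample mean.
counts = Counter(sample_means)
sampling_distribution = {
    value: count / len(samples) for value, count in sorted(counts.items())
}

for value, rel_freq in sampling_distribution.items():
    print(value, rel_freq)
```

Changing `n` or swapping `mean` for another statistic (such as `median`) produces a different sampling distribution from the same population.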

Examiner Tips and Tricks

When commenting on distributions in the exam, make it clear whether you are referring to the distribution of the population or to a sampling distribution for a particular statistic!

How can simulations be used to approximate sampling distributions?

  • In reality, it is often not possible to take all possible samples of size n to find all possible values of a sample statistic

    • This means the exact sampling distribution cannot always be generated

  • Instead, given a population, you can run simulations to approximate the sampling distribution

    • e.g. use a computer to select a random sample of size 5, find its sample median, then repeat this process 1000 more times

      • This gives an approximate sampling distribution of sample medians

      • The more repeats, the better the approximation

    • Distributions produced by simulations are sometimes called randomization distributions
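
The simulation idea above can be sketched in a few lines. This is only an illustration: the population of ages is randomly generated and hypothetical, and the function name `simulate_sample_medians` and its parameters are assumptions introduced for the example.

```python
import random
from statistics import median


def simulate_sample_medians(population, n, repeats, seed=None):
    """Return the medians of `repeats` random samples of size `n`.

    Each sample is drawn without replacement from `population`.
    """
    rng = random.Random(seed)
    return [median(rng.sample(population, n)) for _ in range(repeats)]


# Hypothetical population: 200 made-up ages between 10 and 80.
population_rng = random.Random(0)
population = [population_rng.randint(10, 80) for _ in range(200)]

# 1000 simulated samples of size 5, giving 1000 sample medians.
medians = simulate_sample_medians(population, n=5, repeats=1000, seed=1)

print("number of simulated medians:", len(medians))
print("smallest and largest:", min(medians), max(medians))
```

Plotting `medians` as a histogram would give an approximate sampling distribution of the sample median. Increasing `repeats` tends to make the approximation smoother.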

Worked Example

The ages of viewers in a movie theatre for a particular movie are recorded. A data analyst selects a random sample of 5 viewers and works out their median age.

(a) If the ages in the sample are 13, 38, 25, 50 and 42, calculate the median age.

Answer:

First write the ages in ascending order

13, 25, 38, 42, 50

Then select the middle value

The median age of the sample is 38
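
The same two steps (sort, then take the middle value) can be checked with a short script. The variable names are illustrative, and the comparison with `statistics.median` is included only as a cross-check.

```python
from statistics import median

ages = [13, 38, 25, 50, 42]

# Step 1: write the ages in ascending order.
ordered = sorted(ages)

# Step 2: with an odd number of values, the median is the middle one.
middle_value = ordered[len(ordered) // 2]

# Cross-check against the standard library.
assert middle_value == median(ages)

print(ordered)       # [13, 25, 38, 42, 50]
print(middle_value)  # 38
```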

(b) Explain how the data analyst could create the sampling distribution of the sample median, for samples of size 5.

Answer:

The data analyst would need to obtain every possible sample of 5 viewers from the population of all viewers of the movie and compute the median age of each sample

The collection of all possible sample medians gives the sampling distribution of the sample median

(c) The sampling distribution of the sample median is shown below. Explain whether or not the median age of the sample in part (a) is unusually old.

Relative frequency graph titled "Sampling distribution of the sample median" with x-axis labeled "Age" and y-axis labeled "Relative frequency." Bars peak around age 40 and taper off around age 80.

Answer:

The median age of 38 from the sample in part (a) is not unusually old

The sampling distribution of the sample median peaks at around age 40 and extends to around age 80, so a median of 38 lies close to the peak and samples with median ages of 38 or greater occur often
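
If the individual sample medians behind such a graph were available, "occurs fairly often" could be backed up with a proportion. The sketch below assumes a hypothetical list of sample medians. These values are made up for illustration and are NOT read from the graph in the question, and the helper name `proportion_at_least` is an assumption.

```python
def proportion_at_least(statistic_values, observed):
    """Fraction of the values that are greater than or equal to `observed`."""
    count = sum(1 for value in statistic_values if value >= observed)
    return count / len(statistic_values)


# Hypothetical sample medians (made up; not taken from the question's graph).
hypothetical_medians = [22, 30, 35, 38, 40, 41, 44, 52, 60, 71]

share = proportion_at_least(hypothetical_medians, 38)
print(share)  # 0.7
```

A large proportion like this would support the conclusion that the observed median is not unusually old, while a very small proportion would suggest the opposite.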

Author: Mark Curtis

Expertise: Maths Content Creator

Mark graduated twice from the University of Oxford: once in 2009 with a First in Mathematics, then again in 2013 with a PhD (DPhil) in Mathematics. He has had nine successful years as a secondary school teacher, specialising in A-Level Further Maths and running extension classes for Oxbridge Maths applicants. Alongside his teaching, he has written five internal textbooks, introduced new spiralling school curriculums and trained other Maths teachers through outreach programmes.

Reviewer: Dan Finlay

Expertise: Maths Subject Lead

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.