Sampling Distributions for Differences in Sample Proportions (College Board AP® Statistics): Study Guide

Written by: Mark Curtis

Reviewed by: Dan Finlay

Sampling distributions for differences in sample proportions

What is a one-sample problem?

  • So far we've only considered one random sample of size $n$ being taken from one population with a population proportion of $p$

    • The sample proportion is $\hat{p}$

    • This is a one-sample problem

What is a two-sample problem?

  • If one random sample of size $n_1$ is taken from one population with population proportion $p_1$

    • and a different random sample of size $n_2$ is taken from a different population (that is independent of the first population) with population proportion $p_2$

      • then this is a two-sample problem

      • The sample proportions are $\hat{p}_1$ and $\hat{p}_2$

What is the difference in sample proportions?

  • In a two-sample problem you can compare the sample proportions from separate samples of two independent populations

    • You can look at the difference in sample proportions, $\hat{p}_1 - \hat{p}_2$

      • e.g. if $\hat{p}_1 - \hat{p}_2 > 0$ then the proportion of successes in the first sample is greater than the proportion of successes in the second sample
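
As a quick illustration, the difference can be computed directly from the number of successes in each sample. This is a minimal Python sketch; the function name and the counts (18 out of 40 and 15 out of 50) are made up for the example and do not come from the guide.

```python
def diff_in_sample_proportions(x1, n1, x2, n2):
    """Return p-hat_1 - p-hat_2 from success counts x and sample sizes n."""
    p_hat_1 = x1 / n1  # proportion of successes in the first sample
    p_hat_2 = x2 / n2  # proportion of successes in the second sample
    return p_hat_1 - p_hat_2


# Hypothetical counts: 18 successes out of 40, and 15 successes out of 50
diff = diff_in_sample_proportions(18, 40, 15, 50)
print(round(diff, 2))  # a positive value means the first sample has the larger proportion
```

Swapping the order of the two samples only changes the sign of the result, so it helps to fix which population is "1" before starting.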

What is the sampling distribution for differences in sample proportions?

  • In a sample of size $n_1$ taken from the first population

    • let $X_1$ count the number of successes in the sample

      • so $X_1$ is the number of successes in $n_1$ trials

      • and each trial is either a success or a failure

  • $X_1$ follows a binomial distribution with probability of success $p_1$

    • where $p_1$ is the population proportion

  • The sample proportion, $\hat{p}_1$, is given by

    • $\hat{p}_1 = \dfrac{X_1}{n_1}$

      • The number of successes in the sample divided by the total number of individuals in the sample

  • Similarly, for a sample of size $n_2$ taken from the second population with $X_2$ successes

    • The sample proportion, $\hat{p}_2$, is given by

      • $\hat{p}_2 = \dfrac{X_2}{n_2}$

  • If the sample sizes are large enough that the conditions

    • $n_1 p_1 \geq 10$

    • $n_1 (1 - p_1) \geq 10$

    • $n_2 p_2 \geq 10$

    • $n_2 (1 - p_2) \geq 10$ are all satisfied

    • then the difference in sample proportions, $\hat{p}_1 - \hat{p}_2$, will follow:

      • an approximate normal distribution

      • with mean $p_1 - p_2$

      • and standard deviation $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$

    • This is the sampling distribution for the difference in sample proportions
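
The four conditions and the two formulas above can be bundled into one small helper. This is only a sketch: the function name, the returned tuple and the example values (0.5, 100, 0.4, 80) are assumptions made for illustration, and the tiny tolerance in the check is an added safeguard against floating-point rounding rather than part of the statistical rule.

```python
import math


def diff_sampling_distribution(p1, n1, p2, n2):
    """Mean, standard deviation and large-sample check for p-hat_1 - p-hat_2."""
    mean = p1 - p2
    sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # All four expected counts must be at least 10 for the normal approximation.
    # The 1e-9 tolerance only guards against floating-point rounding.
    expected_counts = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    approx_normal = all(count >= 10 - 1e-9 for count in expected_counts)
    return mean, sd, approx_normal


# Illustrative values only (hypothetical, not from the guide)
mean, sd, approx_normal = diff_sampling_distribution(0.5, 100, 0.4, 80)
print(mean, sd, approx_normal)
```

Returning the check alongside the mean and standard deviation makes it harder to use the normal approximation without first confirming that it applies.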

What else should I know about the sampling distribution for differences in sample proportions?

  • You need to know that

    • The standard deviation $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$ assumes sampling was done with replacement

      • If sampling without replacement, make sure both sample sizes are less than 10% of their population sizes to be able to use $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$

      • otherwise the actual standard deviation will be smaller than this formula gives

    • Because the distribution is approximately normal, you can use the normal distribution to calculate probabilities involving differences of sample proportions, $\hat{p}_1 - \hat{p}_2$

      • Its standardized z-statistic is $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$

      • $p_1$ and $p_2$, the population proportions, will be given in the question

    • If the sample sizes are not large enough (i.e. the four conditions are not all satisfied) then the sampling distribution is not approximately normal

      • but the mean and standard deviation formulas still hold
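
The last point can be checked by simulation. The sketch below draws many pairs of small samples for which the four conditions fail, then compares the simulated mean and standard deviation of the differences with the formulas. The function name, the seed, the number of repetitions and the sample sizes of 10 are all choices made for this illustration; each individual is drawn independently, which matches the "with replacement" assumption.

```python
import math
import random
import statistics


def simulate_differences(p1, n1, p2, n2, reps, seed=0):
    """Simulate reps values of p-hat_1 - p-hat_2 from independent samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        x1 = sum(rng.random() < p1 for _ in range(n1))  # successes in sample 1
        x2 = sum(rng.random() < p2 for _ in range(n2))  # successes in sample 2
        diffs.append(x1 / n1 - x2 / n2)
    return diffs


# Hypothetical small samples: n1 * p1 = 3.5 and n2 * p2 = 2, so the conditions fail
p1, n1, p2, n2 = 0.35, 10, 0.2, 10
diffs = simulate_differences(p1, n1, p2, n2, reps=20_000)

formula_mean = p1 - p2
formula_sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

print(statistics.fmean(diffs), formula_mean)  # simulated vs formula mean
print(statistics.pstdev(diffs), formula_sd)   # simulated vs formula standard deviation
```

The simulated values should land close to the formula values even though a histogram of `diffs` would look lumpy rather than smoothly bell-shaped.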

Examiner Tips and Tricks

The mean, $p_1 - p_2$, and the standard deviation, $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$, are given in the exam under 'Sampling distributions for proportions', in the row called 'For two populations'.

Worked Example

In Twiggy National Park, 35% of all eagles are male and in Dusty National Park, 20% of all eagles are male. A sample of 40 eagles is taken from Twiggy National Park and a sample of 50 eagles is taken from Dusty National Park.

Find the probability that the proportion of male eagles sampled in Twiggy National Park is less than the proportion of male eagles sampled in Dusty National Park.

Answer:

Start by labeling each population

Population 1 consists of all eagles in Twiggy National Park

Population 2 consists of all eagles in Dusty National Park

The question is about one sample proportion being less than another, $\hat{p}_1 < \hat{p}_2$

This can be rearranged into the difference of two sample proportions, $\hat{p}_1 - \hat{p}_2$

$P(\hat{p}_1 < \hat{p}_2) = P(\hat{p}_1 - \hat{p}_2 < 0)$

The difference in sample proportions follows an approximate normal distribution with mean $p_1 - p_2$ and standard deviation $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$ so long as $n_1 p_1 \geq 10$, $n_1(1 - p_1) \geq 10$, $n_2 p_2 \geq 10$ and $n_2(1 - p_2) \geq 10$

Test the four conditions with $n_1 = 40$, $p_1 = 0.35$, $n_2 = 50$ and $p_2 = 0.2$

$n_1 p_1 = 40 \times 0.35 = 14 \geq 10$
$n_1(1 - p_1) = 40 \times (1 - 0.35) = 26 \geq 10$
$n_2 p_2 = 50 \times 0.2 = 10 \geq 10$
$n_2(1 - p_2) = 50 \times (1 - 0.2) = 40 \geq 10$

The conditions are satisfied

Substitute $p_1 = 0.35$ and $p_2 = 0.2$ into $p_1 - p_2$

$p_1 - p_2 = 0.35 - 0.2 = 0.15$

Substitute $n_1 = 40$, $p_1 = 0.35$, $n_2 = 50$ and $p_2 = 0.2$ into $\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$

$\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = \sqrt{\dfrac{0.35 \times 0.65}{40} + \dfrac{0.2 \times 0.8}{50}} = 0.094273538\ldots$

From above, you want to find $P(\hat{p}_1 - \hat{p}_2 < 0)$

The difference in sample proportions follows (approximately) a normal distribution with mean 0.15 and standard deviation 0.094273538...

To find the probability that the difference in sample proportions is less than 0, first calculate the z-score for 0

$\dfrac{0 - 0.15}{0.094273538\ldots} = -1.5911\ldots$

Then find $P(Z < -1.5911\ldots)$, e.g. using the normal tables with $z = -1.59$

$P(Z < -1.5911\ldots) \approx 0.0559$

The probability that the proportion of male eagles sampled in Twiggy National Park is less than the proportion of male eagles sampled in Dusty National Park is approximately 0.0559
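
The calculation can be cross-checked with a few lines of Python, using `NormalDist` from the standard library's `statistics` module. This is a sketch for checking the arithmetic, not part of the exam method. Because software works with the unrounded z-score rather than a table entry, the probability it reports may differ slightly from the table value in the fourth decimal place.

```python
import math
from statistics import NormalDist

# Values from the worked example
p1, n1 = 0.35, 40  # Twiggy National Park
p2, n2 = 0.20, 50  # Dusty National Park

mean = p1 - p2
sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

z = (0 - mean) / sd                 # z-score for a difference of 0
prob = NormalDist(mean, sd).cdf(0)  # P(p-hat_1 - p-hat_2 < 0)

print(round(sd, 4), round(z, 4), round(prob, 3))
```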

