Population, Sampling & Collecting Data (Edexcel GCSE Statistics: Foundation): Exam Questions

Exam code: 1ST0

2 hours, 22 questions
1a
1 mark

A researcher is investigating how much the employees at a large company are paid.

One hypothesis she investigates is

“Men are paid more than women”.

The researcher could find it difficult to collect information to test her hypothesis.

Give one difficulty the researcher could have when trying to find out how much each employee is paid.

1b
1 mark

State the population for this investigation.

1c
3 marks

(i) Explain the difference between primary data and secondary data.

(2)

(ii) The researcher plans to collect primary data. Give a reason why.

(1)

1d
2 marks

The researcher plans to give a questionnaire to 60 employees of the company.

She asks the first 30 males and the first 30 females who come into work one morning to complete her questionnaire.

Give one advantage and one disadvantage of this sampling method.

Advantage.......................................

Disadvantage ..................................

2
5 marks

The directors of a company want to make changes to the company's pension scheme.

The directors want to find out what the employees think about the proposed changes to the pension scheme.

The directors will collect the information by using one of two data collection methods.

Method 1: each employee will be interviewed by one of the directors.

Method 2: each employee will complete a questionnaire without filling in their name.

There are 100 employees in the company.

Discuss how appropriate each of these two data collection methods is.

3a
2 marks

Kerry is investigating whether there is a difference in the lengths of the text messages sent by boys and sent by girls at her school.

She writes the following hypothesis for the investigation.

"The length of text messages sent by girls is greater than the length of text messages sent by boys".

Kerry decides to use a census of the 800 students in her school. She is going to ask each student to record the number of characters in their last text message.

Kerry then collects this information from each student through an online database.

Part of the database is shown below.

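|    | Gender | Length of text message |
| 1  | male   | 73          |
| 2  | F      | 68          |
| 3  | girl   | thirty five |
| 4  | boy    | 114,        |
| 5  | boy    | 85          |
| 6  | girl   |             |
| 7  | M      | 56          |
| 8  | 48     | boy         |
| 9  | girl   | 5           |
| 10 | G      | 75          |
| 11 | B      | 41          |
| 12 | girl   | 28          |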

Give two reasons why Kerry must clean the data before processing it.

Reason 1: .............................

Reason 2: ............................

3b
2 marks

Discuss how Kerry’s data collection plan could affect the reliability of her conclusions.

4a
1 mark

The table shows information about houses for sale in Oxford.

| Number of bedrooms        | 1   | 2   | 3   | 4   | 5 or more | Total |
| Number of houses for sale | 140 | 300 | 420 | 240 | 100       | 1200  |

(Source: adapted from rightmove.co.uk)

An estate agent says the mode of the number of bedrooms for these houses is 3.

Explain how she knows this.

4b
3 marks

The estate agent wants to investigate the prices of these houses.

She takes a stratified sample of 60 houses according to the number of bedrooms.

Work out the number of houses in her sample for each number of bedrooms.

| Number of bedrooms             | 1 | 2 | 3 | 4 | 5 or more |
| Number of houses in the sample |   |   |   |   |           |
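
As a rough illustration of the arithmetic behind a stratified sample, the sketch below shares a sample across strata in proportion to their sizes. The function name `proportional_allocation` and the stratum sizes are made up for illustration and are not the Oxford figures; rounded values may also need a small manual adjustment if they do not add up to the sample size.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Share sample_size across strata in proportion to each stratum's size."""
    total = sum(stratum_sizes.values())
    # Each stratum gets (stratum size / population size) * sample size,
    # rounded to the nearest whole number.
    return {
        name: round(size * sample_size / total)
        for name, size in stratum_sizes.items()
    }


# Hypothetical strata (made up for illustration, not the data in the question).
example_strata = {"flats": 200, "terraced": 150, "detached": 150}
allocation = proportional_allocation(example_strata, 50)
print(allocation)
```

With these made-up sizes the population is 500, so a sample of 50 takes one tenth of each stratum.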

4c
3 marks

Describe how to select the 60 houses in the sample.

5a
1 mark

Jenny wants to find out what students at her school think about the after-school clubs.

Jenny is going to use a questionnaire.

Here is one of the questions she wants to put on the questionnaire.

It is great that we have a range of clubs at school, isn't it?

Yes    ☐      No     ☐     Don't know     ☐

This is not a suitable question.

Explain why.

5b
2 marks

Here is another of the questions that Jenny wants to put on the questionnaire.

How many times a week do you go to an after-school club?

1–2 ☐          2–3   ☐       4–5     ☐

Discuss whether or not this is a suitable question for the questionnaire.

6a
1 mark

Tom and Samira want to collect data on the numbers of hours students at their school spend on homework.

There are 1100 students at their school.

Tom is planning to use a random sample of 50 students.

Explain what is meant by a random sample.

6b
3 marks

Describe how Tom could use random numbers to take a random sample of the students at his school.

6c
2 marks

Samira is planning to use a stratified sample that is stratified by school year.

Comment on whether Samira's plan is appropriate.

7a
1 mark

Razwan collected data about the methods used to get to work that morning by the 25 people who work at his company.

The bar chart shows information about the methods used.

Bar chart showing transport frequency: bike 4, bus 6, walk 4, car 8, train 3. Car usage is highest, train lowest.

Which method was used by the greatest number of people?

7b
1 mark

Which method was used by half as many people as got to work by bus?

7c
1 mark

Razwan concludes that travelling to work by car is the most popular method used to get to work in his city.

Give one limitation of Razwan's conclusion.

8a
1 mark

The incomplete pictogram gives information about the flavour and number of ice creams sold at Pradeep’s cafe one Saturday morning.

Table showing ice cream preferences: Vanilla has 18, Strawberry has 16, and Chocolate has 0, with each circle representing 8 ice creams.

20 chocolate ice creams were sold on Saturday morning.

Complete the pictogram for the number of chocolate ice creams sold.

8b
2 marks

Work out the total number of ice creams sold on Saturday morning.

8c
2 marks

The pictogram below gives information about the flavour and number of ice creams sold at Pradeep’s cafe one Sunday morning.

Pictogram showing ice cream flavours: Vanilla 45, Strawberry 40, Chocolate 60. A circle with a cross represents 20 ice creams; a quarter circle represents 5.

Compare the number of vanilla ice creams sold in the cafe on Saturday morning with the number of vanilla ice creams sold in the cafe on Sunday morning.

Give a reason for your answer.

8d
2 marks

Pradeep wants to use the collected data to estimate how many ice creams of each flavour she will sell for the whole of next week.

Considering Pradeep’s data, decide whether this is appropriate.

9a
2 marks

The tables show information about the number of episodes and viewing figures for two television programmes, Emmerdale and Eastenders, for the years 2015 to 2018.

Emmerdale

| Year | Total number of episodes | Highest viewing figure (millions) | Lowest viewing figure (millions) |
| 2015 | 291 | 6.53 | 4.04 |
| 2016 | 308 | 8.03 | 4.95 |
| 2017 | 302 | 7.54 | 5.01 |
| 2018 | 119 | 7.72 | 5.72 |

Eastenders

| Year | Total number of episodes | Highest viewing figure (millions) | Lowest viewing figure (millions) |
| 2015 | 209 | 9.87 | 5.43 |
| 2016 | 210 | 9.47 | 4.83 |
| 2017 | 209 | 8.41 | 4.19 |
| 2018 | 206 | 7.81 | 4.56 |

(i) In which of these years did Eastenders have its greatest number of episodes?

(1)

(ii) What was the highest viewing figure for Emmerdale between 2015 and 2018?

................... million

(1)

9b
1 mark

Explain why the viewing figures in the table may not be accurate.

9c
2 marks

Compare the number of episodes for Emmerdale in 2016 with the number of episodes for Eastenders in 2016.
Give a reason for your answer.

9d
2 marks

The incomplete graph shows the highest viewing figures for Emmerdale and for Eastenders between 2015 and 2018.

Use the values for the highest viewing figures for Emmerdale from the table to complete the graph.

Line graph showing the highest TV viewing figures for Emmerdale and EastEnders from 2015 to 2018, ranging from 7 to 10 million viewers.
9e
1 mark

Describe the trend for the highest viewing figures for Eastenders between 2015 and 2018.

10a
1 mark

The manager of a gym is reviewing the current opening times of the gym. The manager thinks that if the gym is open for more hours it will affect the number of people using the gym.

Suggest a hypothesis that the manager could use.

10b
1 mark

The manager wants to get the opinions of the people who have a membership at the gym by giving them a questionnaire.

The manager obtains a numbered list of the 1500 people with a membership and decides to take a sample of 10% of the gym members.

The manager chooses the person who is numbered 0004 as the random starting point on the list and then picks every 20th person.

Name the sampling method that the manager plans to use.

10c
3 marks

(i) Give one reason why this is a good plan.

(1)

(ii) Will the manager’s plan give a 10% sample of the gym members? Give a reason for your answer.

(2)

10d
2 marks

Here is one of the questions that the manager is considering for the questionnaire.

"Do you agree that the gym should stay open for 24 hours a day?"

Suggest two improvements to this question.

10e
3 marks

The manager decides to do a pre-test of the questionnaire by giving it to a small group of people.

(i) What is it called when a questionnaire is tested in this way?

(1)

(ii) Give two reasons why the manager might do this.

(2)

10f
1 mark

Following the full survey the manager concludes that if the gym is open for 24 hours a day it will not affect the number of people using the gym.

Give a reason why it would also be appropriate for the manager to find the opinions of people who do not have a gym membership.

11a
1 mark

Chris is a manager at a theme park.

He wants to find out what food options visitors would like to be able to buy in the theme park.

State the population for this investigation.

11b
1 mark

Chris decides that he will take a convenience sample of visitors in the section of the park selling food.

Explain what is meant by a convenience sample.

11c
1 mark

Give one disadvantage of using a convenience sample.

11d
2 marks

Chris plans to use the data collection sheet below.

| Type of food   | Tally |
| Pizza          |       |
| Chinese        |       |
| Curry          |       |
| Fish and chips |       |

Discuss whether this data collection sheet is appropriate.

You should consider how Chris might use the data and describe any problems he might have when he uses the data collection sheet.

11e
2 marks

Chris suggests using a stem and leaf diagram to represent the data that he collects.

Discuss whether or not this would be a suitable diagram to represent his data.

12a
2 marks

Mobeen is investigating whether there is a difference in the amount of time spent reading by pupils in Green Park school and pupils at Golden Plains school.

He uses a census of all of the pupils at each school.
Each pupil is asked to record the amount of time spent reading in a week.

Mobeen then collects this information from each student through an online database.

Part of the database is shown below.

|   | School            | Time spent reading     |
| 1 | Green Park        | 3 hours and 10 minutes |
| 2 | Golden            | 2.5 hours              |
| 3 | GP                | 45                     |
| 4 | GREEN PARK        | 1h30                   |
| 5 | Golden Plains     | 3½ h                   |
| 6 | Green park        | About 5 hours          |
| 7 | Green park school | None                   |
| 8 |                   | 90                     |
| 9 | Golden plains     | 1.5h                   |

Give two reasons why the data should be cleaned before processing.

12b
1 mark

Mobeen wants to compare the data for Green Park school with the data for Golden Plains school.

Once the data has been cleaned Mobeen plans to use all of the times to draw a single box plot.

Explain why this is not an appropriate thing to do.

13a
1 mark

Some researchers investigated the hand span, in centimetres, of adult pianists by their level – international, national and amateur.

The box plots below give information about the hand spans for national level and amateur level pianists.

Box plot comparing hand spans at amateur, national, and international levels, ranging from 16 to 28 cm, with sources cited below the graph.

Circle the word in the list below that describes hand span, in centimetres, as a type of data.

qualitative     ordinal     continuous     bivariate

13b
3 marks

The table gives information about the hand spans of the international level pianists.

| Greatest hand span  | 27.4 cm |
| Median hand span    | 23.9 cm |
| Lower quartile      | 23.2 cm |
| Range               | 5.1 cm  |
| Interquartile range | 1.1 cm  |

Using the information in the table, draw on the grid above a box plot for the hand spans of the international level pianists.

13c
5 marks

Compare the three distributions of hand spans.
Give three comparisons and interpret two of your comparisons.

13d
3 marks

Pavel owns a music shop.
He wants to investigate the keyboard sizes used by pianists with different hand spans.
He collects data about the hand spans of the pianists who use his shop.

The table gives information about the number of these pianists with hand spans in each of four size categories.

| Hand span (cm)     | A (less than 19) | B (19 ≤ span < 22) | C (22 ≤ span < 24) | D (24 or more) |
| Number of pianists | 24               | 65                 | 57                 | 14             |

Pavel plans to sample 20 of these pianists stratified by hand span size.

Explain how Pavel can obtain his stratified sample.
You should include details of any calculations he should use.

14a
2 marks

A town council is proposing to build a new leisure centre. Michelle is going to carry out a survey to find out what all the people in the town think of the proposal.

Michelle thinks that she should take a sample rather than a census.

Give two reasons why Michelle might think this.

14b
2 marks

Michelle plans to use the electoral register as the sampling frame.

(i) Explain what you understand by the term sampling frame.

(1)

(ii) Give one problem Michelle may have using the electoral register as the sampling frame.

(1)

14c
2 marks

Michelle intends to conduct a pilot study.

Give two reasons why it is a good idea to conduct a pilot study.

14d
6 marks

Michelle is writing a plan for her investigation into people’s views on the leisure centre proposal.

Write down what Michelle should include in her plan. You should include each of the following:

  • a sampling method

  • a question she could ask in her questionnaire

  • a statistical diagram she could use to show the results of the survey.

Explain why each of the things you have written down is appropriate.

15
5 marks

The table gives information about the ages of people on the electoral register in the West Midlands in December 2018.

| Age              | 17 years old | 18 years old and older |
| Number of people | 28 152       | 4 146 375              |

A researcher wanted to find out information about voting intentions in the West Midlands.

He sent a questionnaire to a sample of 10000 people on the electoral register in the West Midlands stratified by age of voter.

Describe how the researcher would have carried out this stratified sampling.

You should show any calculations that you use.

Discuss the appropriateness of this stratified sample.

16a
2 marks

The following is an extract from part of a row of a random number list.

68236 35335 71329

Use the random number list to complete the table for the first 5 random 2-digit numbers.

| 68 | ___ | ___ | ___ | ___ |

16b
1 mark

The most common blood type in the United Kingdom is O+.

The percentage of people in the United Kingdom with O+ blood type is 38%.

Asha uses a simulation method to estimate how many donors would be needed to find exactly 3 donors with O+ blood type.

Asha is going to use the following 2-digit numbers for her simulation.

| Blood type     | O+      | Not O+  |
| Random numbers | 00 – 37 | 38 – 99 |

Explain why this is an appropriate way to allocate the random numbers.

16c
2 marks

Asha runs trials using her simulation method.
The result of each trial is the number of random numbers used until Asha gets exactly 3 donors with O+ blood type.
The table below shows the results of her first 4 trials.

| Trial  | 1 | 2 | 3 | 4 |
| Result | 7 | 5 | 8 | 4 |

The set of random numbers used by Asha to complete the fifth trial is shown below.

60   13   12   86   73   10   98   95   43   46

Using this set of random numbers, find the result for the fifth trial.
You must make it clear how you obtain your answer.
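
The counting rule for one trial can be sketched in Python as below. This is only an illustration under stated assumptions: the function name `numbers_used` and the list `example_numbers` are made up and are not Asha's fifth-trial numbers; the cutoff follows the allocation table, where 00 to 37 stands for an O+ donor.

```python
def numbers_used(random_numbers, needed=3, cutoff=38):
    """Count how many 2-digit numbers are read until `needed` O+ donors appear.

    A number below `cutoff` (00 to 37) counts as an O+ donor; 38 to 99 does not.
    Returns None if the list runs out before enough O+ donors are found.
    """
    successes = 0
    for count, number in enumerate(random_numbers, start=1):
        if number < cutoff:
            successes += 1
        if successes == needed:
            return count
    return None


# Hypothetical random numbers (made up; not the numbers given in the question).
example_numbers = [52, 7, 91, 33, 18, 64]
trial_result = numbers_used(example_numbers)
print(trial_result)
```

In this made-up list the third number below 38 is the fifth number read, so the trial result is 5.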

16d
2 marks

Asha finds the mean of her 5 results and decides that the results of her simulation are sufficient to predict the number of donors needed to find at least 3 with O+ blood type in the next blood donation session.

Explain whether the method that Asha uses to predict the number of donors required is appropriate.

17a
2 marks

Norbert asked each of the students in his class to name their favourite fruit from Apple, Banana, Orange or Pear.

The results are shown below.

| Banana | Orange | Apple  | Banana | Pear   |
| Apple  | Apple  | Banana | Orange | Pear   |
| Apple  | Banana | Apple  | Apple  | Apple  |
| Orange | Apple  | Pear   | Banana | Banana |

Fill in the tally chart for this information and complete the frequency column.

| Fruit  | Tally | Frequency |
| Apple  |       |           |
| Banana |       |           |
| Orange |       |           |
| Pear   |       |           |

17b
1 mark

How many students are in the class?

17c
1 mark

A student from the class is chosen at random.

Find the probability that this student’s favourite fruit is Orange.

17d
1 mark

Compare the number of students whose favourite fruit is Apple to the number of students whose favourite fruit is Pear.

17e
1 mark

Norbert decides to find the favourite fruit that is the mode.

Explain why the mode is an appropriate average for Norbert to find for this type of data.

17f
1 mark

Give one advantage of the tally chart over the raw data.

17g
1 mark

Norbert wants to draw a diagram to represent his results.

Choose the type of diagram from the list below that is most suitable for him to draw.

  • Scatter diagram

  • Bar chart

  • Line graph

  • Time series

18a
1 mark

Rose is investigating the number of brothers and sisters that students in her secondary school have.

To investigate this she asks 10 students in Year 8 and 10 students in Year 11 how many brothers and sisters they each have.

Assess Rose’s method for her data collection.

18b
2 marks

The vertical line graph shows the data that she collected.

Bar chart showing the frequency of siblings. The most common number is 1 sibling, frequency 7. Other frequencies: 0 (3), 2 (3), 3 (2), 4 (1), 5+ (0).

How many students have 2 or more brothers and sisters?

18c
1 mark

Write down the mode.

18d
1 mark

Rose uses her vertical line graph to conclude that no student in her school has 5 or more brothers or sisters.

Assess whether or not Rose’s conclusion is appropriate.

19a
1 mark

A theme park in Staffordshire has around 30000 visitors per day.

(Source: www.statista.com)

Navine is a manager at the theme park. Navine is investigating what visitors think about the theme park.

He is going to do a survey of visitors at the theme park.

Navine decides to question

   30 people aged under 18 and

   30 people aged 18 and over

as they leave the theme park one day.

He plans to ask them face to face what their favourite ride was.

Name this sampling method.

19b
1 mark

Describe the population for this survey.

19c
1 mark

Assess Navine’s plan to get the opinions of the people who have visited the theme park.

20a
2 marks

A Science teacher wants to know the effects of revision on a student’s performance in an exam. She decides to carry out an experimental test on a group of 15 students to find out the effects of any revision.

Describe one way the teacher could carry out an experimental test.

20b
1 mark

Give one reason why the results of this experimental test could be unreliable.

21a
5 marks

Grace asked a sample of 60 people in her town if they had ever visited France or Spain.
17 people visited both France and Spain
23 people visited Spain only
33 people visited France

Draw a Venn diagram to represent this information.

21b
3 marks

Grace says

  • more than half of the people in her sample have visited France

  • therefore more than half of the people in her town have visited France

Discuss the validity of each of Grace’s comments.

22a
3 marks

The choropleth map below represents a park that has been divided into 25 squares of equal area.

Arthur has collected data about litter in the park.

The number of pieces of litter collected in each square on one Saturday morning is shown.

Grid map showing litter distribution. Key indicates litter amounts: white 0, striped 1, dotted 2, light grey 3-5, dark grey 6-8 pieces.

Use the information in the choropleth map to calculate an estimate of the total number of pieces of litter that were collected that day.

22b
2 marks

Arthur works in this park. He has been asked to decide where a new bin should be placed in the park to help reduce the amount of litter. He concludes that the new bin should be placed in the corner of the park represented by the bottom right of the choropleth map.

Assess the validity of Arthur’s conclusion with reference to the choropleth map.

22c
1 mark

Ian suggests that the method Arthur used to collect his data is not suitable to reach a reliable conclusion.

Assess whether Ian’s suggestion is correct.

Give a reason for your answer.