Investigative Questions (College Board AP® Statistics): Study Guide

Syllabus Edition

First teaching 2026

First exams 2027

Written by: Dan Finlay

Reviewed by: Lucy Kirkham

Investigative questions

What is an investigative question?

  • An investigative question is the question that a statistical study is designed to answer

  • It is the starting point of the whole statistical process

    • Every choice made later (how to collect data, what to calculate, what to conclude) flows from the investigative question

  • For example, an investigative question might be

    • "What is the mean daily screen time for teenagers aged 13 to 17 in this country?"

    • "Is there a difference between the proportion of male and female voters who support Candidate A in this state?"

What makes an investigative question valid?

  • A valid investigative question for a specific study should have a defined purpose

  • It should be clearly stated before any data are collected

    • It should not be changed based on the data analysis or results

  • A valid investigative question should be answerable

  • It should be posed so that the data needed to answer it can actually be collected and analyzed

    • An investigative question that requires data that cannot be collected (e.g. data about events that have not happened yet, or data that no one would ever truthfully report) is not a valid investigative question

Examiner Tips and Tricks

A common student mistake is to change the investigative question after looking at the data. For example, a student starts a study with the question "Is the mean greater than 50?", then switches to "Is the mean less than 50?" after seeing that the sample mean is 45. Changing the question to fit the data in this way is a form of data dredging (sometimes called p-hacking) and is not a valid use of statistics.
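
To see why switching the direction after looking at the data is a problem, here is a minimal simulation sketch. It assumes a one-sided z-test with a known population standard deviation, and every name and value in it (`false_positive_rates`, `mu0 = 50`, `sigma = 10`, `n = 25`) is a hypothetical choice made for illustration. The null hypothesis is true in every simulated study, so every rejection is a false positive.

```python
import random
from statistics import NormalDist, fmean


def false_positive_rates(trials=20_000, n=25, mu0=50.0, sigma=10.0,
                         alpha=0.05, seed=1):
    """Estimate how often a true null hypothesis is rejected.

    Returns (fixed, switched):
      fixed    -> direction "greater than" chosen before seeing the data
      switched -> direction chosen after seeing which side the sample mean is on
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    standard_error = sigma / n ** 0.5
    fixed = switched = 0
    for _ in range(trials):
        # The true mean equals mu0, so the null hypothesis is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / standard_error
        if z > z_crit:        # question fixed in advance
            fixed += 1
        if abs(z) > z_crit:   # question switched to match the data
            switched += 1
    return fixed / trials, switched / trials


fixed_rate, switched_rate = false_positive_rates()
print(fixed_rate, switched_rate)
```

Under these assumptions the first rate should be close to the stated significance level of 0.05, while the second should be roughly twice as large, because picking the direction after the fact gives the test two chances to reject.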

What are the three components of a valid investigative question?

  • A valid investigative question for an AP-style problem has three components that together guide the entire study

    1. The variables of interest

    2. The parameter and the direction of the alternative (or the goal of estimation, for a confidence interval)

    3. The population to which conclusions will apply

  • Each component plays a different role in the study

    • The first guides data collection

    • The second guides data analysis

    • The third guides the conclusion that can be drawn

What is the first component (variables of interest)?

  • The investigative question should be phrased in terms of the variable(s) of interest in the study

    • These are the variables that the researcher will actually measure or record

  • The first component should make clear what data will need to be collected

  • For example, if a researcher wants to compare the average daily commute time for adults in two different cities, the variables of interest are

    • the city in which the adult lives (the categorical variable, with two categories)

    • the daily commute time, in minutes (the quantitative variable)
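
As a rough sketch of what "the data that will need to be collected" could look like for the commute example, the records below hold one row per sampled adult with both variables of interest. The city labels, the commute values and the helper `mean_by_group` are all invented for illustration and are not real data.

```python
from statistics import fmean

# Hypothetical records: one row per sampled adult (values invented for illustration).
records = [
    {"city": "City A", "commute_min": 32.0},
    {"city": "City A", "commute_min": 41.5},
    {"city": "City B", "commute_min": 27.0},
    {"city": "City B", "commute_min": 35.5},
]


def mean_by_group(rows, group_key, value_key):
    """Group the quantitative variable by the categorical one and average each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[value_key])
    return {group: fmean(values) for group, values in groups.items()}


means = mean_by_group(records, "city", "commute_min")
print(means)
```

The categorical variable decides which group each observation belongs to, and the quantitative variable is what gets summarized within each group.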

What is the second component (parameter and direction)?

  • The second component guides the type of analysis that will be performed

  • For a hypothesis test, the investigative question should make clear

    • the parameter being investigated (e.g. a population mean, a population proportion, a difference between two means)

    • the direction of the alternative hypothesis, which is one of the following

      • not equal to (a two-sided test)

      • greater than (a one-sided test)

      • less than (a one-sided test)

      • association (for a chi-square test of independence)

      • not independent (for a chi-square test of independence)

  • For a confidence interval, the investigative question should make clear

    • the parameter being estimated

    • the goal of estimation (e.g. estimating the parameter within a range of potential values)
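
As an illustration of how the direction maps onto hypotheses, the sketch below uses a single population mean. The symbols $\mu$ (the population mean) and $\mu_0$ (the stated value) are notation assumed here and do not come from the text above.

```latex
\begin{align*}
H_0 &: \mu = \mu_0 \\
H_a &: \mu \neq \mu_0 \quad \text{(not equal to, two-sided)} \\
H_a &: \mu > \mu_0 \quad \text{(greater than, one-sided)} \\
H_a &: \mu < \mu_0 \quad \text{(less than, one-sided)}
\end{align*}
```

Only one of the three alternatives is used in any given test, and the investigative question determines which one. For a chi-square test of independence, the hypotheses are usually stated in words rather than symbols.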

What is the third component (the population)?

  • The investigative question should indicate the population to which the conclusions of the study will apply

  • This is the same population from which the sample is selected

  • The population should be described in specific terms

    • not vague terms such as "everyone" or "people"

  • For an experiment that uses random assignment of treatments, the third component should also indicate that a cause-and-effect conclusion is possible

Examiner Tips and Tricks

A useful template for writing a valid investigative question for a hypothesis-testing scenario is:

"Is there convincing statistical evidence to suggest that [the parameter for population A] is [direction of alternative such as different from / greater than / less than] [the parameter for population B / a stated value] for [the specified population]?"

Practice fitting different scenarios into this template until it feels natural.
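
One way to practice is to treat the template as a fill-in-the-blanks function. The sketch below is only an illustration: the class name `InvestigativeQuestion`, its field names and the stated value of "4 hours" are hypothetical and are not part of the AP course materials.

```python
from dataclasses import dataclass

# Directions allowed by the template for a hypothesis-testing scenario.
ALLOWED_DIRECTIONS = {"different from", "greater than", "less than"}


@dataclass
class InvestigativeQuestion:
    parameter: str   # e.g. "the mean daily screen time"
    direction: str   # one of ALLOWED_DIRECTIONS
    comparison: str  # a second parameter or a stated value
    population: str  # the specific population for the conclusion

    def render(self) -> str:
        """Fill the template with the stored components."""
        if self.direction not in ALLOWED_DIRECTIONS:
            raise ValueError(f"unsupported direction: {self.direction!r}")
        return (
            "Is there convincing statistical evidence to suggest that "
            f"{self.parameter} is {self.direction} {self.comparison} "
            f"for {self.population}?"
        )


# Hypothetical scenario; the stated value of 4 hours is invented.
question = InvestigativeQuestion(
    parameter="the mean daily screen time",
    direction="greater than",
    comparison="4 hours",
    population="teenagers aged 13 to 17 in this country",
)
print(question.render())
```

Rejecting an unrecognized direction mirrors the point that the direction must be fixed clearly before any data are collected.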

Worked Example

A nutrition researcher wants to compare the average daily protein intake (in grams) of two groups of adults in a region: those who follow a vegetarian diet and those who do not. The researcher believes the average daily protein intake is different for the two groups. The researcher will collect data from random samples of adults from each group in the region.

Determine a valid investigative question for the researcher's study. Ensure that all three components of a valid investigative question are included in your response.

Answer:

Identify the first component (the variables of interest)

  • The researcher is comparing two diet groups and measuring protein intake

First component: diet group (vegetarian or non-vegetarian) and daily protein intake (in grams)

Identify the second component (the parameter and the direction of the alternative)

  • The researcher believes the average (population mean) protein intake is different for the two groups

Second component: the difference between the population mean daily protein intakes of the two groups; the direction is "not equal to" (two-sided)

Identify the third component (the population)

  • The study covers adults in the region, separated into the two diet groups

Third component: all adults in the region who follow a vegetarian diet and all adults in the region who do not

Combine the three components into a single investigative question

Is there convincing statistical evidence to suggest that the mean daily protein intake, in grams, of all adults in the region who follow a vegetarian diet is different from the mean daily protein intake of all adults in the region who do not follow a vegetarian diet?
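
The second component of this question can also be written as a pair of hypotheses. The symbols $\mu_V$ and $\mu_N$ are notation assumed here for the population mean daily protein intake (in grams) of adults in the region who do and do not follow a vegetarian diet.

```latex
\begin{align*}
H_0 &: \mu_V - \mu_N = 0 \\
H_a &: \mu_V - \mu_N \neq 0
\end{align*}
```

The alternative is two-sided because the researcher believes the means are different, without specifying which group has the larger mean.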

Author: Dan Finlay

Expertise: Maths Subject Lead

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.

Reviewer: Lucy Kirkham

Expertise: Head of Content Creation

Lucy has been a passionate maths teacher for over 12 years, teaching maths across the UK and abroad and helping to engage, interest and develop confidence in the subject at all levels. Working as a Head of Department and then Director of Maths, Lucy has advised schools and academy trusts in both Scotland and the East Midlands, where her role was to support and coach teachers to improve maths teaching for all.