Investigative Questions (College Board AP® Statistics): Study Guide
Syllabus Edition
First teaching 2026
First exams 2027
Investigative questions
What is an investigative question?
An investigative question is the question that a statistical study is designed to answer
It is the starting point of the whole statistical process
Every choice made later (how to collect data, what to calculate, what to conclude) flows from the investigative question
For example, an investigative question might be one of the following:
"What is the mean daily screen time for teenagers aged 13 to 17 in this country?"
"Is there a difference between the proportion of male and female voters who support Candidate A in this state?"
What makes an investigative question valid?
A valid investigative question for a specific study should have a defined purpose
It should be clearly stated before any data are collected
It should not be changed based on the data analysis or results
A valid investigative question should be answerable
It should be posed so that the data needed to answer it can actually be collected and analyzed
A question that requires data that cannot be collected (e.g. data about events that have not happened yet, or data that no one would ever truthfully report) is not a valid investigative question
Examiner Tips and Tricks
A common mistake is to change the investigative question after looking at the data. For example, a student starts a study with the question "Is the mean greater than 50?", sees that the sample mean is 45, and then switches to "Is the mean less than 50?". Changing the question to fit the data is a form of what is sometimes called data dredging or p-hacking, and it is not a valid use of statistics.
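One way to see why the direction must be fixed in advance is a small simulation. The sketch below is illustrative only. It assumes a normal population whose true mean is exactly 50 with a known standard deviation of 10 (so a z-test applies), a sample size of 30, and a 5% significance level. All of these values, and the function name `false_positive_rate`, are assumptions chosen for the illustration and do not come from the syllabus.

```python
import random
from statistics import NormalDist


def false_positive_rate(pick_direction_after_data, n=30, trials=20000,
                        alpha=0.05, seed=1):
    """Estimate how often a one-sided z-test rejects a TRUE null hypothesis.

    Assumed setup (hypothetical): population mean 50, known sigma 10.
    """
    rng = random.Random(seed)
    mu0, sigma = 50.0, 10.0
    # One-sided critical value for the chosen significance level
    z_crit = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / n ** 0.5)
        if pick_direction_after_data:
            # Direction chosen to match whichever side the sample mean fell on
            reject = abs(z) > z_crit
        else:
            # Direction fixed beforehand: "Is the mean greater than 50?"
            reject = z > z_crit
        rejections += reject
    return rejections / trials


fixed = false_positive_rate(pick_direction_after_data=False)
switched = false_positive_rate(pick_direction_after_data=True)
print(f"direction fixed in advance:         {fixed:.3f}")
print(f"direction chosen after seeing data: {switched:.3f}")
```

Under these assumptions, fixing the direction beforehand rejects the true null in roughly 5% of samples. Choosing the direction after seeing the data roughly doubles that rate, because a sample mean in either tail can now trigger a rejection.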
What are the three components of a valid investigative question?
A valid investigative question for an AP-style problem has three components that together guide the entire study
The variables of interest
The parameter and the direction of the alternative hypothesis (or the goal of estimation, for a confidence interval)
The population to which conclusions will apply
Each component plays a different role in the study
The first guides data collection
The second guides data analysis
The third guides the conclusion that can be drawn
What is the first component (variables of interest)?
The investigative question should be phrased in terms of the variable(s) of interest in the study
These are the variables that the researcher will actually measure or record
The first component should make clear what data will need to be collected
For example, if a researcher wants to compare the average daily commute time for adults in two different cities, the variables of interest are
the city (the categorical variable, with two categories)
the daily commute time, in minutes (the quantitative variable)
What is the second component (parameter and direction)?
The second component guides the type of analysis that will be performed
For a hypothesis test, the investigative question should make clear
the parameter being investigated (e.g. a population mean, a population proportion, a difference between two means)
the direction (or form) of the alternative hypothesis, such as one of the following:
not equal to (a two-sided test)
greater than (a one-sided test)
less than (a one-sided test)
association (for a chi-square test of independence, where the alternative has no direction)
not independent (an equivalent wording for a chi-square test of independence)
For a confidence interval, the investigative question should make clear
the parameter being estimated
the goal of estimation (e.g. estimating a range of plausible values for the parameter)
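As a sketch of how the wording of the question maps onto hypotheses, consider the case of a single population mean compared against a stated value. The symbols $\mu$ (the population mean) and $\mu_0$ (the stated value) are notation introduced here for illustration:

```latex
\begin{align*}
\text{not equal to:} &\quad H_0: \mu = \mu_0, \quad H_a: \mu \neq \mu_0 \\
\text{greater than:} &\quad H_0: \mu = \mu_0, \quad H_a: \mu > \mu_0 \\
\text{less than:}    &\quad H_0: \mu = \mu_0, \quad H_a: \mu < \mu_0
\end{align*}
```

For a chi-square test of independence the hypotheses are usually stated in words instead: the null hypothesis is that the two variables are independent, and the alternative is that they are not independent (that is, there is an association).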
What is the third component (the population)?
The investigative question should indicate the population to which the conclusions of the study will apply
This is the same population from which the sample is selected
The population should be described in specific terms
not vague terms such as "everyone" or "people"
For an experiment that uses random assignment of treatments, the third component should also indicate that a cause-and-effect conclusion is possible
Examiner Tips and Tricks
A useful template for writing a valid investigative question for a hypothesis-testing scenario is:
"Is there convincing statistical evidence to suggest that [the parameter for population A] is [direction of alternative such as different from / greater than / less than] [the parameter for population B / a stated value] for [the specified population]?"
Practice fitting different scenarios into this template until it feels natural.
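The template can also be practiced mechanically. The following is a minimal sketch, not part of the syllabus, that fills in the template from its parts and rejects any direction that is not one of the three listed in it. The class name `InvestigativeQuestion`, its field names, and the comparison value of 4 hours are all hypothetical choices made for this illustration. The variable of interest appears inside the description of the parameter.

```python
from dataclasses import dataclass

# Directions taken from the template above
ALLOWED_DIRECTIONS = ("different from", "greater than", "less than")


@dataclass(frozen=True)
class InvestigativeQuestion:
    parameter: str    # second component: the parameter (names the variable)
    direction: str    # second component: direction of the alternative
    comparison: str   # another parameter or a stated value
    population: str   # third component: the specified population

    def __post_init__(self):
        # Refuse vague or unsupported directions
        if self.direction not in ALLOWED_DIRECTIONS:
            raise ValueError(f"direction must be one of {ALLOWED_DIRECTIONS}")

    def as_text(self) -> str:
        # Fill in the hypothesis-testing template
        return (
            "Is there convincing statistical evidence to suggest that "
            f"{self.parameter} is {self.direction} {self.comparison} "
            f"for {self.population}?"
        )


# Hypothetical scenario: the stated value of 4 hours is made up
question = InvestigativeQuestion(
    parameter="the mean daily screen time",
    direction="greater than",
    comparison="4 hours",
    population="teenagers aged 13 to 17 in this country",
)
print(question.as_text())
```

Writing the parts out separately like this makes it easy to check that none of the three components has been left out before the full sentence is assembled.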
Worked Example
A nutrition researcher wants to compare the average daily protein intake (in grams) of two groups of adults in a region: those who follow a vegetarian diet and those who do not. The researcher believes the average daily protein intake is different for the two groups. The researcher will collect data from random samples of adults from each group in the region.
Determine a valid investigative question for the researcher's study. Ensure that all three components of a valid investigative question are included in your response.
Answer:
Identify the first component (the variables of interest)
The researcher is comparing two diet groups and measuring protein intake
First component: diet group (vegetarian or non-vegetarian) and daily protein intake, in grams
Identify the second component (the parameter and the direction of the alternative)
The researcher believes the average (population mean) protein intake is different for the two groups
Second component: the population mean daily protein intake for each group; the direction of the alternative is "different" (two-sided)
Identify the third component (the population)
The study covers adults in the region, separated into the two diet groups
Third component: all adults in the region who follow a vegetarian diet and all adults in the region who do not
Combine the three components into a single investigative question
Is there convincing statistical evidence to suggest that the mean daily protein intake is different for all adults in the region who follow a vegetarian diet compared to all adults in the region who do not follow a vegetarian diet?
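In symbols, this two-sided question corresponds to hypotheses of the following form. The notation is introduced here for illustration: $\mu_V$ and $\mu_N$ denote the population mean daily protein intake, in grams, of all adults in the region who do and do not follow a vegetarian diet, respectively.

```latex
\begin{align*}
H_0 &: \mu_V - \mu_N = 0 \\
H_a &: \mu_V - \mu_N \neq 0
\end{align*}
```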