
Observational Design (College Board AP® Psychology): Study Guide

Written by: Raj Bonsor

Reviewed by: Claire Neeson

Updated on 29 March 2026

Structured & unstructured observations

Structured observation

A structured observation is used when the researcher wants to observe specific, predetermined behaviors in a large sample or busy environment where many different behaviors are likely to occur
- Rather than recording everything that happens, the researcher focuses only on a limited set of clearly defined behaviors of interest
The emphasis in structured observation is on gathering quantitative data, e.g.
- The number of times a child displays aggressive behavior toward a peer during a 30-minute recess period
- The frequency with which drivers stop at a crosswalk when a pedestrian is waiting
- The number of times a student raises their hand to answer a question during a one-hour class

Evaluation of structured observations

Strengths

Quantitative data can be easily analyzed, presented graphically, and converted to statistics
- This is a strength as it allows trends and frequencies of behavior to be identified across large samples
- This increases the reliability of the results
Using predetermined behavioral categories keeps the researcher focused
- They can disregard behaviors that fall outside the categories, ensuring that what is recorded is directly relevant to the research aim

Limitations

Quantitative data reveals what behavior occurred but not why
- This means that structured observations lack explanatory power as they produce findings that are limited in depth and insight
Predetermined categories mean the researcher cannot record behaviors that fall outside them, even if those behaviors are interesting and relevant
- This limits the usefulness and validity of structured observations

Unstructured observation

An unstructured observation is used when the researcher wants to observe the full range of behaviors occurring in a small sample or more intimate setting where interpersonal interaction is the focus
- The researcher does not use predetermined behavioral categories — instead they record everything that occurs during the observation session
The emphasis in unstructured observation is on gathering qualitative data, e.g.
- verbal and non-verbal communication between participants
- the quality and tone of conversation (e.g. light-hearted, serious, aggressive)
- how participants use and move within the environment
Examples of research scenarios suited to unstructured observation:
- Observing how young children interact when playing with gender-stereotyped toys
- Observing how couples communicate when discussing a source of conflict

Evaluation of unstructured observations

Strengths

Unstructured observations produce rich, detailed, in-depth qualitative data that captures the complexity of behavior
- Such data are high in ecological validity because they reflect the genuine, unfiltered experience of the participants
The flexible, open-ended nature of unstructured observation allows the researcher to follow unexpected or particularly significant behaviors as they emerge
- This can generate new insights and research questions

Limitations

The highly subjective nature of unstructured observations increases the risk that the researcher loses objectivity
- They may become too close to the participants, succumb to confirmation bias, or unconsciously overlook behaviors that do not align with their expectations
- This reduces the reliability of the findings
Analyzing the data from unstructured observations is time-consuming and depends heavily on the researcher's interpretation
- This introduces subjectivity into the findings and reduces the validity of the published conclusions

Examiner Tips and Tricks

Structured observation is not the same as a controlled observation. Structured refers to the use of predetermined behavioral categories to record data, whereas controlled refers to the level of control the researcher exerts over the setting and procedure.

Behavioral categories & inter-rater reliability

Behavioral categories

Behavioral categories are used in structured observations to define and record the specific behaviors the researcher is interested in
Behavioral categories must:
- only include directly observable behaviors — nothing that requires inference or interpretation
- be clearly operationally defined so that there is no ambiguity about what counts as an instance of that behavior
- be mutually exclusive — each behavior should fit into only one category
Examples of well-operationalized behavioral categories in a study on aggression include:
- "Physical aggression" = punching, kicking, or shoving another person
- "Verbal aggression" = shouting, name-calling, or threatening another person
- "Non-aggressive behavior" = smiling, sharing, or cooperating with another person
Behavioral categories can be further subdivided for greater precision, e.g.
- Physical aggression directed toward a peer of the same gender
- Physical aggression directed toward a peer of a different gender
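
The requirements above can be sketched as a small coding scheme. This is only an illustration: the names `CATEGORIES`, `overlapping_behaviors`, and `categorize` are hypothetical, and the behaviors are taken from the aggression example above. The sketch checks that the categories are mutually exclusive and looks up which single category an observed behavior belongs to.

```python
# Hypothetical coding scheme built from the aggression example above.
# Every entry is a directly observable behavior, not an inferred state.
CATEGORIES = {
    "physical_aggression": {"punching", "kicking", "shoving"},
    "verbal_aggression": {"shouting", "name-calling", "threatening"},
    "non_aggressive": {"smiling", "sharing", "cooperating"},
}


def overlapping_behaviors(categories):
    """Return behaviors that appear in more than one category."""
    first_seen_in = {}
    overlaps = set()
    for name, behaviors in categories.items():
        for behavior in behaviors:
            if behavior in first_seen_in:
                overlaps.add(behavior)
            else:
                first_seen_in[behavior] = name
    return overlaps


def categorize(behavior, categories):
    """Return the category a behavior belongs to, or None if it fits none."""
    for name, behaviors in categories.items():
        if behavior in behaviors:
            return name
    return None


# An empty set means the categories are mutually exclusive.
print(overlapping_behaviors(CATEGORIES))
print(categorize("kicking", CATEGORIES))
print(categorize("daydreaming", CATEGORIES))
```

In this sketch a behavior such as "daydreaming" returns `None`, meaning it fits no category and would go unrecorded in a structured observation.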

Inter-rater reliability

Even when behavioral categories are clearly defined, observations can still be affected by researcher bias
- Different observers may interpret the same behavior differently
Inter-rater reliability is the level of consistency between two or more trained observers recording the same observation independently
Inter-rater reliability is established in the following ways:
- All observers agree on the behavioral categories and how they will be recorded before the observation begins
- Each observer conducts the observation independently to avoid one observer influencing another
- After the observation, the two independent data sets are compared
- A correlation is calculated between the two sets of scores — a strong positive correlation indicates good inter-rater reliability
- If inter-rater reliability is low, the behavioral categories are reviewed and refined before the observation is conducted again
Establishing good inter-rater reliability reduces the risk that researcher bias has distorted the findings and increases confidence in the reliability of the conclusions
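
The comparison step can be illustrated with a short calculation. The sketch below assumes that the correlation used is Pearson's r, which the guide does not specify, and the tallies are invented for illustration. Each list holds one observer's count of physical aggression in six consecutive 10-minute blocks.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical tallies recorded independently by two trained observers.
observer_a = [3, 5, 2, 8, 6, 4]
observer_b = [4, 5, 2, 7, 6, 3]

r = pearson_r(observer_a, observer_b)
print(round(r, 2))  # roughly 0.94, a strong positive correlation
```

A value close to +1 suggests the two observers applied the categories consistently. A much lower value would signal that the categories need to be reviewed and refined.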

Evaluation of behavioral categories and inter-rater reliability

Strengths

The use of clearly defined, unambiguous behavioral categories allows the researcher to record behavior objectively
- Reducing subjectivity moves the process closer to the scientific method
Inter-rater reliability ensures that the findings are consistent across observers
- This strengthens the reliability of the data and reduces the likelihood that the findings will be challenged during peer review

Limitations

Predetermined behavioral categories may be too restrictive
- If behaviors occur during the observation that do not fit any of the categories, they cannot be recorded
- This means the findings may not accurately represent what actually occurred, reducing validity
Inter-rater reliability does not account for the possibility that observers simply guessed when scoring ambiguous behaviors
- High agreement between observers does not necessarily mean the categories are being applied correctly
- As a result, the reported level of agreement may overestimate the true reliability of the observation
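
The point about guessing can be made concrete with a worked example. The guide does not name a statistic for this. The sketch below assumes Cohen's kappa, one commonly used measure that adjusts raw agreement for the agreement expected by chance. The codes are invented: "A" stands for an aggressive behavior and "N" for a non-aggressive one.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of observations that both observers coded identically."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance alone.

    Assumes chance agreement is below 1, otherwise kappa is undefined.
    """
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a = Counter(codes_a)
    counts_b = Counter(codes_b)
    labels = set(codes_a) | set(codes_b)
    chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in labels)
    return (observed - chance) / (1 - chance)


# Hypothetical codes for the same ten behaviors from two observers.
observer_a = ["N", "N", "N", "N", "N", "N", "N", "N", "A", "A"]
observer_b = ["N", "N", "N", "N", "N", "N", "N", "A", "A", "N"]

print(percent_agreement(observer_a, observer_b))
print(round(cohens_kappa(observer_a, observer_b), 3))
```

Here the observers agree on 8 of 10 behaviors, yet because most behaviors are coded "N" by both, much of that agreement would be expected by chance, and the corrected figure is far lower than the raw 80%.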

Event sampling & time sampling

It can be difficult to observe and record all behaviors continuously throughout an observation session
Therefore researchers use sampling procedures to organize data collection
These include:
- event sampling
- time sampling

Event sampling

The researcher records every time a behavior from a specific behavioral category occurs throughout the entire observation session, e.g.
- recording every instance of physical aggression during a 60-minute recess period
- tallying every time a driver uses their phone while waiting at a traffic light
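
As a rough illustration, event sampling amounts to tallying every instance of the target behaviors across the whole session. The log, the minute values, and the name `event_sample` below are all hypothetical.

```python
from collections import Counter

# Hypothetical record of a 60-minute recess: (minute, behavior observed).
session_log = [
    (2, "shoving"), (5, "sharing"), (11, "kicking"), (19, "shouting"),
    (27, "shoving"), (44, "punching"), (58, "smiling"),
]

# Target category: physical aggression, as defined by observable behaviors.
TARGET_BEHAVIORS = {"punching", "kicking", "shoving"}


def event_sample(log, target_behaviors):
    """Tally every instance of a target behavior across the whole session."""
    return Counter(b for _, b in log if b in target_behaviors)


tally = event_sample(session_log, TARGET_BEHAVIORS)
total = sum(tally.values())
print(tally)
print(total)  # 4 instances of physical aggression
```

Behaviors outside the target category, such as "sharing" or "shouting" here, are ignored no matter when they occur.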

Time sampling

The researcher records all behaviors that occur during a set time interval at regular points throughout the observation session, e.g.
- recording all behaviors for 20 seconds every 10 minutes across a 2-hour observation
- recording all behaviors for 15 minutes every 3 hours across a two-day observation
The researcher determines which time interval is most appropriate for the specific study
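
A minimal sketch of the first example above (20 seconds every 10 minutes across a 2-hour observation) follows. It assumes each recording window opens at the start of its interval, which the guide does not state, and the log and the name `time_sample` are hypothetical. All times are in seconds.

```python
def time_sample(log, window_length, interval, session_length):
    """Keep every behavior that falls inside a recording window.

    Windows are assumed to open at 0, interval, 2 * interval, and so on,
    and each stays open for window_length seconds.
    """
    recorded = []
    for time, behavior in log:
        in_session = 0 <= time < session_length
        in_window = time % interval < window_length
        if in_session and in_window:
            recorded.append((time, behavior))
    return recorded


# Hypothetical log: (seconds into the session, behavior observed).
session_log = [
    (5, "smiling"), (15, "shoving"), (300, "kicking"), (610, "sharing"),
    (1195, "shouting"), (1805, "punching"), (7205, "smiling"),
]

recorded = time_sample(
    session_log, window_length=20, interval=600, session_length=7200
)
print(recorded)
```

Unlike event sampling, every behavior inside a window is kept, whatever its category. Behaviors that fall between windows, such as "kicking" at 300 seconds in this log, are not recorded.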

Evaluation of event sampling & time sampling

Strengths

Event sampling ensures that specific behaviors will not be missed or overlooked
- Every instance is recorded as it occurs, producing a complete and accurate frequency count
Time sampling gives the researcher flexibility to record any behaviors that occur within the time window
- It also offers researchers the opportunity to record unexpected behaviors which may generate new research questions

Limitations

If too many target behaviors occur simultaneously or are particularly complex, event sampling may fail to capture all of them accurately
- This limits the validity of the method as it would not provide a true reflection of what occurred during the observation session
Time sampling may miss any behaviors that occur outside of the designated time windows
- Some behaviors may be over- or underrepresented in the findings, which limits the validity of the conclusions drawn

Examiner Tips and Tricks

Behavioral categories must describe only what can be directly seen and measured: if you cannot observe it, you cannot record it. For example, a category like "anxious" is not acceptable, but "fidgets with hands" or "avoids eye contact" is.

Be able to distinguish between event sampling and time sampling. Event sampling records every instance of a target behavior, whereas time sampling records all behaviors within a set time window at regular intervals. Confusing the two is a common and easily avoidable mistake.
