Reliability (College Board AP® Psychology): Revision Note

Written by: Raj Bonsor

Reviewed by: Claire Neeson

Reliability

  • Reliability refers to the consistency of a measure or procedure

    • A study is reliable if it produces similar results when repeated under the same conditions

  • If a study is replicated and produces similar results, this demonstrates that the measure is consistent and not subject to significant fluctuation

  • There are two main types of reliability:

    • Internal reliability — the extent to which a measure is consistent within itself

    • External reliability — the extent to which a measure is consistent over time and across different occasions

  • Reliability is essential to the scientific process in psychology

    • Replication is the primary means by which researchers verify that findings are consistent and not the result of chance or error

  • Unreliable findings cannot be confidently used to draw conclusions about psychological phenomena, and are unlikely to survive the peer review process

Reliability across research methods

  • Different research methods vary in their level of reliability:

    • Lab experiments tend to be the most reliable

      • They use standardized procedures, controlled conditions, and random assignment, making them easier to replicate and producing quantitative data that can be directly compared across studies

    • Field experiments are less reliable than lab experiments

      • Although they implement an IV and produce quantitative data, they are subject to uncontrolled extraneous variables that are difficult to replicate exactly

    • Natural experiments and quasi-experiments are less reliable still

      • The IV occurs naturally (natural experiments) or is a pre-existing characteristic of participants (quasi-experiments), so the researcher cannot manipulate or recreate it, and conditions are unlikely to be identical across replications

    • Observational studies, surveys, and interviews vary in reliability depending on how well the procedure is standardized and how clearly variables are operationally defined

Measuring reliability

  • There are three main methods for measuring reliability, each suited to a different type of research:

    • the test-retest method

    • the split-half method

    • inter-rater reliability

Test-retest method

  • The test-retest method measures external reliability:

    • The same participants complete the same measure on two separate occasions, with a time gap between sessions (e.g. six months)

    • If each participant produces a similar score on both occasions (shown by a strong positive correlation between the two sets of scores), external reliability is established: the measure is consistent over time

    • Used to assess the reliability of surveys, questionnaires, and psychological scales
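As a rough illustration of the test-retest method described above, the sketch below correlates two sets of scores from the same participants. The scores are invented for the example, and `pearson_r` is a helper written here to show the calculation, not something taken from these notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for six participants on the same questionnaire,
# completed on two occasions some months apart
time_1 = [12, 18, 25, 30, 22, 15]
time_2 = [13, 17, 27, 29, 21, 16]

r = pearson_r(time_1, time_2)
print(f"Test-retest correlation: r = {r:.2f}")
```

A value of r close to +1 would suggest that participants who scored high the first time also scored high the second time, which is what consistency over time looks like in the data.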

Split-half method

  • The split-half method measures internal reliability:

    • The researcher divides the items on the measure into two halves (e.g. first half and second half, or odd-numbered and even-numbered items) and compares each participant's responses on one half with their responses on the other

    • If similar responses are given across both halves, internal reliability is established — the measure is consistent within itself

    • Used to assess the internal consistency of surveys and psychological scales

Inter-rater reliability

  • Inter-rater reliability measures the level of consistency between two or more trained observers independently recording the same observation

  • How it is established:

    • All observers agree on the behavioral categories and how they will be recorded before the observation begins

    • Each observer conducts the observation independently so that observers do not influence one another

    • After the observation, the observers' independent data sets are compared

    • A correlation is calculated between the observers' sets of scores: a strong positive correlation indicates good inter-rater reliability

    • If inter-rater reliability is low, behavioral categories are reviewed and refined before the observation is repeated

  • Good inter-rater reliability reduces the risk that researcher bias has distorted the findings

Improving reliability

  • If reliability is measured and found to be low, the researcher must take steps to improve it before the study is conducted or repeated

    • The appropriate improvement strategy depends on the research method being used

Lab and field experiments

  • Ensure all aspects of the procedure are fully standardized

    • Same instructions, same environment, same materials, same timing across all conditions

  • Ensure the IV and DV are clearly operationally defined so the study can be precisely replicated

Observational studies

  • Ensure behavioral categories are clearly operationally defined and measure only directly observable behavior

  • Ensure behavioral categories are mutually exclusive with no overlap or ambiguity

  • Use more than one observer and establish inter-rater reliability before the main observation begins

Surveys

  • Run the test-retest method and revise or remove any questions that produce inconsistent scores across sessions

  • Replace ambiguous open questions with clearly worded closed questions or Likert scale items that are less open to interpretation

Interviews

  • Use the same interviewer across all participants to reduce variability in delivery

  • Ensure interviewers are trained and follow a consistent approach

  • Remove leading questions, double-barreled questions, and ambiguous wording from the interview schedule

Reliability & the evolution of scientific conclusions

  • Reliability is fundamental to how psychological conclusions evolve through peer review and replication:

    • When a study is submitted for peer review, other experts in the field evaluate whether the methodology is sufficiently reliable to support the conclusions drawn

    • If a study's methods appear unreliable, reviewers may challenge or reject it during peer review; if later replications fail to reproduce its results, confidence in its findings falls

    • When multiple independent replications of a study produce consistent findings, confidence in those conclusions increases — this is how psychological knowledge is built and refined over time

    • Unreliable findings, even if statistically significant, cannot contribute meaningfully to the scientific evidence base, because they cannot be consistently reproduced

Examiner Tips and Tricks

Ensure that you understand these key points:

  • Reliability and validity are not the same thing — a measure can be reliable without being valid

    • E.g. a bathroom scale that consistently overestimates weight by 5 pounds is reliable but not valid

  • A study does not need to produce identical results to be considered reliable — some variation is expected

    • Reliability requires that results are similar, not identical, across replications

  • Inter-rater reliability does not guarantee validity — two observers can consistently agree on what they are recording while still recording the wrong thing if the behavioral categories are poorly designed
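The bathroom scale example above can be put into numbers. This is a small sketch with invented readings, assuming a true weight of 150 pounds: the readings barely vary (reliable), yet all of them are about 5 pounds too high (not valid).

```python
from statistics import mean

true_weight = 150.0  # hypothetical true weight in pounds
readings = [155.1, 154.9, 155.0, 155.2, 154.8]  # hypothetical repeated readings

# Small spread between readings: the scale is consistent (reliable)
spread = max(readings) - min(readings)

# Large systematic error: the scale does not measure true weight (not valid)
bias = mean(readings) - true_weight

print(f"Spread across readings: {spread:.1f} lb")
print(f"Average error: {bias:+.1f} lb")
```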
