Atul Gawande's The Checklist Manifesto is a compelling account of how the simple checklist can scaffold expertise and teamwork in complex domains. Good checklists, he says, nudge our memories, prompting us to do what we already know we should do but are at risk of missing.
Good checklists are also short: 5-9 "killer" items, says Gawande.
So I got to thinking: what would be on my checklist for experiment design? Additional suggestions welcome, by email or on Mastodon, but this is what I have so far:
Not sure who counts as an author? See the ICMJE authorship definition. Bonus points: start filling in the Contributor Roles Taxonomy (CRediT).
Most studies are underpowered; don't be one of them. Useful: Understanding mixed effects models through data simulation. What is your recruitment plan? How will participants be enticed to take part? How will they be rewarded for taking part? Will they be brought into the research process in any way?
More from me: Power analysis for a between-sample experiment, and Quantifying the benefits of using decision models with response time and accuracy data. A minimal simulation-based sketch follows below.
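If no off-the-shelf power analysis fits your design, you can estimate power by simulation: generate fake data under an assumed effect, run your planned test, and count how often it comes up significant. A minimal sketch in Python; the effect size (d = 0.5), alpha, and sample sizes are illustrative assumptions, not recommendations:

```python
# Simulation-based power analysis for a two-group between-subjects design.
# All numbers here (d = 0.5, alpha = 0.05, group sizes) are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def simulated_power(n_per_group, d=0.5, alpha=0.05, n_sims=2000):
    """Estimate power by repeatedly simulating the experiment."""
    hits = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n_per_group)
        treatment = rng.normal(d, 1.0, n_per_group)  # true effect of size d
        _, p = stats.ttest_ind(treatment, control)
        hits += p < alpha
    return hits / n_sims

for n in (20, 50, 100):
    print(f"n = {n:>3} per group: power ~ {simulated_power(n):.2f}")
```

The same approach scales to mixed effects designs: simulate data with the random-effects structure you expect, fit the planned model, and count how often the effect of interest is detected.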
Will you include a manipulation check? Probably you should. How will you know that participants were affected by your intervention in the way you think they should have been?
Pilot your experiment. This means you should do a full dry run through the experiment, noting how long it takes, and checking that the measures you need are recorded. Bonus points: simulate participant data (some platforms, like Qualtrics, will do this for you) and check that your proposed analysis runs on the output; see the sketch below. Bonus points: Exercises for Lab Groups to Prevent Research Mistakes.
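A minimal sketch of that dry-run check, assuming a simple two-condition design; the column names and the t-test are placeholders for whatever your platform exports and your analysis plan specifies:

```python
# Check your planned analysis runs end to end on simulated data,
# before any real data exist. Columns and model are placeholders.
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
n = 60  # illustrative pilot-sized sample

# Fake data in the shape your platform should export
fake = pd.DataFrame({
    "participant": range(n),
    "condition": rng.permutation(["control", "treatment"] * (n // 2)),
    "rt_ms": rng.lognormal(mean=6.5, sigma=0.3, size=n),
    "accuracy": rng.binomial(1, 0.8, size=n),
})

# If this errors, you have found the problem before collecting
# a single participant
grouped = fake.groupby("condition")["rt_ms"]
t, p = stats.ttest_ind(*[g.values for _, g in grouped])
print(fake.head())
print(f"t = {t:.2f}, p = {p:.3f}  (meaningless: data are simulated)")
```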
Even a statistically significant result might be meaningless. How will you calculate an effect size, and how will you gauge whether it is important? (A sketch follows below.) Bonus points: maximal positive controls.
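For a two-group comparison, one standard choice is Cohen's d. A minimal sketch; the simulated arrays stand in for your real data, and whether a given d matters is a judgement call for your field, made in advance:

```python
# Report a standardised effect size alongside the significance test.
# The simulated arrays below are placeholders for real data.
import numpy as np
from scipy import stats

def cohens_d(a, b):
    """Cohen's d with pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = np.sqrt(((na - 1) * np.var(a, ddof=1) +
                         (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2))
    return (np.mean(a) - np.mean(b)) / pooled_sd

rng = np.random.default_rng(1)
treatment = rng.normal(0.3, 1.0, 200)
control = rng.normal(0.0, 1.0, 200)

_, p = stats.ttest_ind(treatment, control)
print(f"p = {p:.3f}, d = {cohens_d(treatment, control):.2f}")
# A large sample can make a trivially small d "significant";
# decide in advance what size of d would actually matter.
```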
Maybe you made predictions. What will it mean if you get a null result? Or an intermediate result? Or any other unexpected outcome?
Aim for your final data to be FAIR: Findable, Accessible, Interoperable and Reusable.
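One small, concrete step towards Interoperable and Reusable: ship the data as plain CSV alongside a machine-readable codebook describing every column. A minimal sketch; the file names and fields are illustrative:

```python
# Save tidy data as CSV plus a JSON codebook describing each column.
# File names and descriptions here are illustrative.
import json
import pandas as pd

data = pd.DataFrame({
    "participant": ["p001", "p002"],
    "condition": ["control", "treatment"],
    "rt_ms": [512.3, 498.7],
})
data.to_csv("experiment1_data.csv", index=False)

codebook = {
    "participant": "Anonymised participant identifier",
    "condition": "Experimental condition: control or treatment",
    "rt_ms": "Response time in milliseconds",
}
with open("experiment1_codebook.json", "w") as f:
    json.dump(codebook, f, indent=2)
```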
Imagine what your strongest critic will say when presented with your final results. Plan your defence. You might want to consider List #2, below.
List #2: standard flaws, and standard criticisms that you might hear about your result
Does your sample align with the population you want to draw inferences about? Do the stimuli you use fairly represent the situations you want to talk about?
Reading on this: Yarkoni, T. (2019, November 22). The Generalizability Crisis. https://doi.org/10.31234/osf.io/jqw35
Participants responding to the situation, not your intended treatment. Address with an adequate control condition.
Participants responding to what they think you want. Address by keeping participants blind to which condition they are in. Advanced: your experiment will contain an incentive structure which participants respond to. It may be that errors are more costly than successes (so you are rewarding conservative behaviour), or it may be something as simple as being able to get through the experiment more quickly by taking a certain choice (so you are rewarding fast/inaccurate responding). You should know what the incentive structure of your experiment is. It may be driving behaviour as much as any deeper psychological desires or biases. Even if this isn't the case, how will you convince a critic that your experiment reveals something about human psychology beyond "participants do what they are rewarded for"?
Something to read on this: Corneille, O., & Lush, P. (2021). Sixty years after Orne’s American Psychologist article: A Conceptual Framework for Subjective Experiences Elicited by Demand Characteristics.
May not be deliberate: it could result from unintentional exploitation of researcher degrees of freedom, or from inadvertent signalling to participants. Partially address with preregistration.
Are the results distorted by how you recruited, or by how participants differentially dropped out during the experiment? Partially address with (truly) random assignment; a sketch follows below.
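A minimal sketch of truly random assignment, as opposed to assigning by arrival order or convenience. The participant IDs are illustrative; the seed is shown only so the allocation can be reproduced and audited:

```python
# Randomly assign participants to conditions: shuffle, then split,
# so every participant has an equal chance of each condition.
import numpy as np

rng = np.random.default_rng(2024)
participants = [f"p{i:03d}" for i in range(12)]  # illustrative IDs

shuffled = rng.permutation(participants)
half = len(shuffled) // 2
assignment = {p: ("control" if i < half else "treatment")
              for i, p in enumerate(shuffled)}
print(assignment)
```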
Particularly a problem if both your dependent and independent measures are of the same type (e.g. survey items).
Here's a quote from John Platt's 1964 article "Strong Inference" to keep in mind, if you do want to rest the value of the experiment on theory-testing:
But I will mention one severe but useful private test – a touchstone of strong inference – that removes the necessity for third-person criticism, because it is a test that anyone can learn to carry with him for use as needed. It is our old friend the Baconian “exclusion,” but I call it “The Question.” Obviously it should be applied as much to one’s own thinking as to others’. It consists of asking in your own mind, on hearing any scientific explanation or theory put forward, “But sir, what experiment could disprove your hypothesis?” ; or, on hearing a scientific experiment described, “But sir, what hypothesis does your experiment disprove?”
"You got lucky," they say. Address by replicating and/or pre-registering.
You may also enjoy
- Peter Norvig: Warning Signs in Experimental Design and Interpretation
- Julia Strand: Error Tight: Exercises for Lab Groups to Prevent Research Mistakes
- Samuel J. Lord: Checklist for Experimental Design (from a cell biology and microscopy researcher)
- Ritter, F. E., Kim, J. W., Morgan, J. H., & Carlson, R. A. (2011). Practical aspects of running experiments with human participants. In Universal Access in Human-Computer Interaction. Design for All and eInclusion: 6th International Conference, UAHCI 2011, Held as Part of HCI International 2011, Orlando, FL, USA, July 9-14, 2011, Proceedings, Part I 6 (pp. 119-128). Springer Berlin Heidelberg.
- Kane, J. V. (2024). More than meets the ITT: A guide for anticipating and investigating nonsignificant results in survey experiments. Journal of Experimental Political Science, 1-16. https://doi.org/10.1017/XPS.2024.1
Created: 2021-07-10
Latest update: 2024-06-11
Repo (contains citation widget): github.com/tomstafford/psy-checklist
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.