Tuesday, July 2, 2019

Science is not advertising...or shouldn't be

“Six stone lighter now. I have more energy and feel healthier than I have for a long time” 
- Shirley Hardy, Atkins diet
 
Back in November 2018, a few of my colleagues and I read a recently published article in Psychonomic Bulletin & Review. The article concerned the evidence for dissociable learning processes in comparative and cognitive psychology. We had all previously critiqued, in print, some part of the evidence presented. We had no particular reason to assume that the authors would agree with our critiques, and that's fine; it's all part of the continuing debate and dialogue of science. What was perturbing was that the review had largely been written as if no such critiques existed.

In our response (now accepted by PB&R), we coined the term testimonial review for this type of article. The term refers to a well-known technique in advertising, where a product is promoted by highlighting only the cases that put it in a good light. Of course, you can't scientifically evidence a claim simply by reporting the data that supports it. One has to consider both the evidence for and the evidence against. You weigh the evidence and come to a conclusion. Good science involves showing your working, so one would expect this process of weighing evidence to be part of any scientific review paper. We call this a balanced review.
 
Testimonial reviews are not good science. They are potentially misleading, and may result in others basing their own work around the incorrect assumption that a particular issue is resolved. Science isn't advertising ... or, at least, it shouldn't be.
 

Thursday, May 2, 2019

h = 22

My Google Scholar h-index has reached 22, about 10 months after it reached 21. Steady progress, I guess :-)

Thursday, February 14, 2019

Seven ways to fix the replication crisis

I gave a talk yesterday that was an opinionated survey of seven causes of the replication crisis in psychology, and seven actions we could all take today to avoid it in future. All the slides are on GitHub. In brief:

1. Publication bias
Publication bias comes in part from null results being meaningless with traditional statistics. Use Bayes Factors instead: they can provide evidence for the null, and are easy to do in R.

2. Small sample size
Most of us do not collect enough data in our experiments. Use a power calculation to work out an appropriate sample size. This is easy to do in R.

3. Misunderstanding statistics
No-one in psychology really understands p-values. Also, a p-value between .04 and .05 is strangely common in psychology, yet p-values in this range provide only very weak evidence. Use Bayes Factors instead.

4. Low reproducibility
If you run a different experiment to me, and do a different analysis, is it that surprising you get a different answer? Ensure your work is reproducible by publishing your raw data, analysis scripts, stimuli, and experiment code.

5. ‘p’ hacking
Common practices in flexible analysis, like testing for significance after every 10 participants, and stopping when it's significant, can lead to false positive rates of about 60%. Pre-register your next big study, so you don't fool yourself.

6. Poor project management
Most psychologists do not have adequate private archiving and recording within their own labs. Use a version control system (e.g. git, with a host such as GitHub) to improve project management in your lab.

7. Publication norms
Pressure to publish lots of papers leads to lots of poor outputs, rather than a few good ones. Publish fewer, better papers. If you are a manager, focus hiring, promotion, and appraisal less on volume and more on quality.
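
As a rough illustration of points 2 and 5, here is a minimal sketch in Python (standard library only, rather than the R mentioned above). Everything in it is an assumption made for illustration: the function names are hypothetical, the power calculation uses a normal approximation for a two-group comparison, and the simulation uses a simple z-test with a known standard deviation. The inflated false positive rate it prints depends on the number of looks and the maximum sample size, so it should not be read as a reproduction of the figure quoted in point 5.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and SD 1


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided, two-group comparison.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2.
    An exact t-test calculation gives a slightly larger n.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


def false_positive_rate(look_every=10, max_n=100, alpha=0.05,
                        n_sims=2000, seed=1):
    """Share of null-effect studies declared 'significant' when peeking.

    Every simulated score comes from N(0, 1), so the true effect is zero.
    After each batch of `look_every` participants a two-sided z-test is run,
    and the study stops at the first p < alpha.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        total = 0.0
        for n in range(1, max_n + 1):
            total += rng.gauss(0.0, 1.0)
            if n % look_every == 0:
                z = total / math.sqrt(n)  # sample mean divided by its SE
                p = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
                if p < alpha:
                    hits += 1
                    break
    return hits / n_sims


if __name__ == "__main__":
    print("n per group for d = 0.5:", n_per_group(0.5))
    print("one look at n = 100:", false_positive_rate(look_every=100))
    print("a look every 10 participants:", false_positive_rate(look_every=10))
```

Comparing the last two lines of output shows the point: the only difference between the two runs is how often the data are tested, yet the proportion of "significant" results from pure noise goes up.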

CC-BY-SA 4.0

Monday, February 11, 2019

PsychoPy on Raspberry Pi

The problem
 
My department is fortunate to have several multi-seat testing rooms for psychological research. The downside is the computers inside them are slightly ageing. They're several-year-old desktop machines with integrated graphics that originally ran Windows 7, but have since been converted to Windows 10.

Since that conversion, PsychoPy, a great open-source experiment generator, has been experiencing intermittent issues with graphics-related freezes. It's not all machines, and not all the time, but sporadically they hang for 4-5 seconds before updating the screen. This is bad news for some experiments. PsychoPy does not officially support integrated graphics, so our attempts to get this resolved with the developers have so far met with limited success.

Some solutions I didn't go for

1. Upgrade the machines

A £30 discrete graphics card would probably do the trick, but with the number of machines we have across the department, that's still quite a cost overall.

2. Boot to Linux

We've never been able to replicate this hanging issue on any Linux machine, so it seems Windows-specific. Unfortunately, booting from USB is disabled on these machines.


3. Use Linux laptops

Our lab is probably only testing six people at any one time, so we could buy a set of laptops for this purpose and just move them into the testing rooms when we test. This would work, but is potentially a bit expensive (perhaps £2000).

The solution I'm now trying:

4. Use Raspberry Pis

Raspberry Pis are cheap, and the testing rooms already have monitors, keyboards and mice in them (connected to the desktop machines). So, total cost per seat is £51.75. That's for a Raspberry Pi 3, case, 8GB SD card, official power supply, and HDMI to DVI cable.

The PsychoPy programs I've tested so far on this setup work fine.
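
For a rough sense of scale, the per-seat figure can be multiplied up and set against the laptop estimate from option 3. This is only a sketch: it assumes one Pi per participant and six seats (the number our lab tests at any one time), and the constant and function names are made up for the example.

```python
from decimal import Decimal

PI_COST_PER_SEAT = Decimal("51.75")     # per-seat figure quoted in this post
LAPTOP_SET_ESTIMATE = Decimal("2000")   # rough estimate from option 3
SEATS = 6                               # assumption: one Pi per participant


def pi_total(seats=SEATS, cost_per_seat=PI_COST_PER_SEAT):
    """Total cost of equipping `seats` testing positions with Raspberry Pis."""
    return cost_per_seat * seats


total = pi_total()
print(f"Raspberry Pi setup for {SEATS} seats: £{total}")
print(f"Difference from the laptop estimate: £{LAPTOP_SET_ESTIMATE - total}")
```

Under those assumptions the Pi route comes to £310.50 for six seats, about £1689.50 less than the laptop estimate.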


Monday, January 21, 2019

Please stop using this graph to argue your topic is popular

It seems to have become quite common to use this sort of "number of publications" graph to argue for the importance of one's own research area:

 

The graph shows that the number of papers including a particular search term in their title, abstract, or keywords has risen dramatically over the last few decades. In the example above, the search term is "meditation OR mindfulness", following an analysis reported by Van Dam et al. (2017). These were just some data I had easily to hand; there's no intention to imply here that mindfulness research is particularly prone to this kind of analysis.

One problem with this kind of analysis is that the number of scientific publications per year is also increasing for most disciplines. It's fairly easy to add this information to such a graph. For example, let's plot the number of papers including the word "psychology" in their title, abstract, or keywords, on the same axes:


This puts a slightly different perspective on things, and provides little support for Van Dam et al.'s claim that: "Over the past two decades, writings on mindfulness and meditation practices have saturated the public news media and scientific literature". The literature seems very far from saturated on this particular topic (depending on what 'saturated' is taken to mean; clearly not saturated in the same way that e.g. the US market for refrigerators is saturated).

If we express the number of "mindfulness OR meditation" papers as a percentage of the number of "psychology" papers, we get the following:


This graph gives us a different perspective to the one offered by Van Dam et al. The 'saturation' of mindfulness research in psychology sat at a fairly stable, low level of 1-2% from 1975 to 2000. It rose to a peak of about 6% in 2012, but has been declining since.
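
The normalisation itself is a one-line calculation per year. The sketch below shows one way it could be done in Python; the function name is hypothetical, and the counts are invented placeholder numbers, NOT the data behind the graphs in this post.

```python
def share_of_baseline(topic_counts, baseline_counts):
    """Return {year: topic papers as a percentage of baseline papers}.

    Years with a missing or zero baseline count are skipped.
    """
    shares = {}
    for year, topic_n in topic_counts.items():
        baseline_n = baseline_counts.get(year)
        if not baseline_n:
            continue
        shares[year] = 100.0 * topic_n / baseline_n
    return shares


# Invented placeholder counts for illustration only (not real data).
topic = {2000: 50, 2010: 400, 2018: 600}
baseline = {2000: 5000, 2010: 10000, 2018: 20000}

for year, pct in share_of_baseline(topic, baseline).items():
    print(f"{year}: {topic[year]} papers, {pct:.1f}% of baseline")
```

With these made-up numbers the raw count keeps rising between 2010 and 2018 (400 to 600) while the share falls (4% to 3%), which is exactly the kind of pattern a raw-count graph hides.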

Of course, there would be other, probably better, ways of calculating the 'market share' of a concept in the scientific literature than the method used here. The main point here is simply to demonstrate that raw counts are a very poor measure.
