Call for retraction of "Feeling the Future"

54 Replies, 7648 Views

There's a new post on that blog about his email exchange with Bem.

https://replicationindex.wordpress.com/2...he-future/
(2018-01-21, 07:58 AM)ersby Wrote: There's a new post on that blog about his email exchange with Bem.

https://replicationindex.wordpress.com/2...he-future/

Thanks for drawing this to our attention. I see this includes the data files for the experiments published in "Feeling the Future", which Daryl Bem has agreed to make freely available for analysis, and detailed information in the emails about the timing and composition of the experiments. 

There is also a comment from Bem about the issue of pilot experiments and selective reporting, flatly contradicting some of the suggestions that have been made:

"(One minor point: I did not spend $90,000 to conduct my experiments.  Almost all of the participants in my studies at Cornell were unpaid volunteers taking psychology courses that offered (or required) participation in laboratory experiments.  Nor did I discard failed experiments or make decisions on the basis of the results obtained.)

What I did do was spend a lot of time and effort preparing and discarding early versions of written instructions, stimulus sets and timing procedures.  These were pretested primarily on myself and my graduate assistants, who served repeatedly as pilot subjects. If instructions or procedures were judged to be too time consuming, confusing, or not arousing enough, they were changed before the formal experiments were begun on “real” participants.  Changes were not made on the basis of positive or negative results because we were only testing the procedures on ourselves.

When I did decide to change a formal experiment after I had started it, I reported it explicitly in my article. In several cases I wrote up the new trials as a modified replication of the prior experiment.  That’s why there are more experiments than phenomena in my article:  2 approach/avoidance experiments, 2 priming experiments, 3 habituation experiments, & 2 recall experiments.)"

Furthermore, it sounds as though Schimmack has plans to write up his analysis for formal publication, and as though a separate article on Bem's study 6 is about to appear on his blog.

It feels as though more useful information about Bem's experiments - particularly about the decline effect - has emerged in the last couple of weeks than in the previous six years of sterile wrangling. Having said that, Bem himself seems more interested in organising pre-registered studies in the future than in reconsidering his original work.
A note of caution about the calculations I posted of the required number of trials to produce 9 significant experiments:

Looking again at Bem’s results, of course the number of binary trials isn’t equal to the number of participants, but is some multiple of it. Therefore it wasn’t appropriate to use the exact binomial probabilities in my calculations. It would have been better to use the normal distribution as an approximation, in working out the probability of statistical significance being maintained in going from a pilot experiment to a completed experiment. I believe the results of the calculations I posted should be roughly correct, but they shouldn’t be taken as exact.
(2018-01-21, 10:10 AM)Chris Wrote: A note of caution about the calculations I posted of the required number of trials to produce 9 significant experiments:

Looking again at Bem’s results, of course the number of binary trials isn’t equal to the number of participants, but is some multiple of it. Therefore it wasn’t appropriate to use the exact binomial probabilities in my calculations. It would have been better to use the normal distribution as an approximation, in working out the probability of statistical significance being maintained in going from a pilot experiment to a completed experiment. I believe the results of the calculations I posted should be roughly correct, but they shouldn’t be taken as exact.

I checked how sensitive the calculation was to the total number of binary trials, by comparing:
(i) pilot experiment of 10 binary trials and full study of N=100, and
(ii) pilot experiment of 20 binary trials and full study of N=200.

The answer is that most of the sensitivity comes from the fact that in a discrete distribution the p=0.05 criterion doesn't correspond to a whole number of successes, so in effect the 0.05 value has to be raised or lowered a bit. Once that effect has been factored out, the probability of a significant pilot experiment remaining significant when extended to a full experiment depends only very weakly on the total number of trials N. (The relative difference in the value between N=100 and N=200 is only just over 1%.)
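The discreteness effect can be checked exactly. Here is a minimal sketch, assuming a one-sided exact binomial test at chance level p = 0.5 (the trial counts are just the two compared above; none of this is Bem's actual test procedure):

```python
# Exact illustration of the discreteness effect: for a one-sided
# binomial test at chance level p = 0.5, the critical number of
# successes must be a whole number, so the attained significance
# level at the nominal 0.05 criterion shifts with n.
from math import comb

def upper_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 1/2), computed exactly."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

for n in (100, 200):
    # smallest success count whose upper-tail probability is at most 0.05
    k_crit = next(k for k in range(n + 1) if upper_tail(k, n) <= 0.05)
    print(f"n={n}: critical count {k_crit}, "
          f"attained level {upper_tail(k_crit, n):.4f}")
```

For n=100, for instance, the criterion can only be attained at roughly 0.044 rather than 0.05, so the effective test is slightly stricter than the nominal level.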

But probably in estimating the required number of participants it would be fair to remove the effect of the discrete distributions on the p=0.05 criterion. That would have the effect of lowering somewhat the estimate of the total number of participants required, say by roughly 5% - from 18,000 to 17,000 or so.
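As a rough cross-check on this kind of calculation, here is a minimal Monte Carlo sketch. The numbers are illustrative assumptions matching case (i) above - a pilot of 10 binary trials extended to 100, chance level p = 0.5, one-sided exact binomial test at 0.05 - not a reconstruction of the actual calculation posted earlier:

```python
# Monte Carlo sketch: given a pilot of n_pilot binary trials that
# reaches one-sided significance at alpha = 0.05 under chance
# (p = 0.5), how often does the result remain significant when the
# experiment is extended to n_full trials?
import random
from math import comb

def binom_sf(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p) -- exact one-sided tail."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def survival_rate(n_pilot, n_full, alpha=0.05, reps=200_000, seed=1):
    rng = random.Random(seed)
    kept = sig_pilots = 0
    for _ in range(reps):
        pilot = sum(rng.random() < 0.5 for _ in range(n_pilot))
        if binom_sf(pilot, n_pilot) <= alpha:          # pilot significant
            sig_pilots += 1
            extra = sum(rng.random() < 0.5 for _ in range(n_full - n_pilot))
            if binom_sf(pilot + extra, n_full) <= alpha:
                kept += 1
    return sig_pilots / reps, kept / sig_pilots

p_sig, p_keep = survival_rate(10, 100)
print(f"P(pilot significant)                      ~ {p_sig:.3f}")
print(f"P(stays significant | pilot significant)  ~ {p_keep:.3f}")
```

Under these assumptions only a minority of significant pilots stay significant when extended, which is the effect driving the large estimate of the total number of participants required.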
(2018-01-21, 09:05 AM)Chris Wrote: Thanks for drawing this to our attention. I see this includes the data files for the experiments published in "Feeling the Future", which Daryl Bem has agreed to make freely available for analysis, and detailed information in the emails about the timing and composition of the experiments. 

The data files include the date on which each experimental session took place, and this provides a much clearer picture of the sequence in which the experiments were done.

It also sheds light on the accusation sometimes made against Bem that he combined studies that were originally intended to be separate. This suggestion was based either on variations of the protocol within each experiment noted in his 2011 paper, or on an earlier publication in 2003, in which the studies are broken down into smaller units, called 101, 102, 201, 202, 203 and so on.

My impression so far is that there is less force in this criticism than might have appeared previously. Admittedly there are problems with the earliest two studies, done in 2002, later known as Experiments 5 and 6. Experiment 5 does seem to have started out with several hypotheses in mind, and the first 50 sessions of Experiment 6 were originally associated with those of Experiment 5. 

But on the other hand, in several cases where the 2011 paper notes a change of protocol, the data files show that the sessions were done in a continuous series without a break. That is the case for the remainder of Experiment 6 (participants 51-150), which is made up of three sections with different protocols, called 201, 202 and 203 in the 2003 publication. It's also the case for Experiment 2 (150 participants), where there was a change of protocol after the first 100 participants. The lack of a break in the series of sessions does seem consistent with the intention that they should form part of a single experiment.

In the other cases where there was a change of protocol, this did coincide with a break in the series of sessions. But those breaks were in the summer - or in one case over Easter - and may have occurred because the participants, who were students, were away on vacation.

In considering the scope for large numbers of unreported sessions, it's also worth noting the time period covered by these data files. The recent Slate article claimed that Bem's studies spanned a decade, but that's not borne out by the dates in the files. The earliest two experiments were done in 2002, and the remainder between March 2005 and December 2009.
(2018-01-07, 07:29 PM)Chris Wrote: (1) You mention 10 attempted replications as collectively unsuccessful, but the meta-analysis you cite - Bem et al. (2015) - analysed 80 attempted replications, and claimed that as a whole they were successful even when Bem's own studies were excluded. Perhaps there's a reason you don't accept this claim, but shouldn't it be discussed?

Daryl Bem has now commented himself, referring in more detail to the meta-analysis in which the results remained significant when Bem's original results were excluded. It also mentions something I had missed before - that in the "exact replications" using software supplied by Bem, the data were encrypted to prevent them being modified by the experimenters or their assistants:
https://replicationindex.wordpress.com/2...mment-3448
I don't know how interested people are, but the discussion is continuing. 

Linda has compared the statistics of Bem's experiment 101 (published in 2003) and of the first 50 participants in his experiment 5 (published in 2011), and has concluded that - as suspected previously - they are the same. (There is a factor of 10 difference in one statistic, but I assume that's a typo.) 

Because 101 involved 6 classes of images, but the description of 5 made it sound as though there was only one class of images plus controls, Linda says "I think this may be sufficient evidence to consider calling for a retraction." 
https://replicationindex.wordpress.com/2...mment-3486
Looks like the retraction letter was sent.
(2018-02-02, 05:29 AM)Iyace Wrote: Looks like the retraction letter was sent.

Has that been announced somewhere? I can't see anything on Prof. Schimmack's blog, but I don't find it the easiest to navigate.
Yep, it's in an update right up the top of the original blog post calling for retraction.
