Does NSF know how to design an experiment?

The journal Science has a news piece on an experimental program in which the National Science Foundation varied the way grant proposals are submitted and evaluated to see if there was any effect on the outcomes:

They invited the applicants to supplement their standard, 15-page project descriptions and itemized cost estimates with two-page synopses that left out the details of each proposal but underscored the central idea. All of the applicants agreed to participate.

The division assembled two peer-review panels. One rated the traditional full proposals, while the other evaluated the two-page versions, which omitted the names and affiliations of applicants.

The two panels came up with quite different ratings.

The Science piece goes on to speculate that making the applicants anonymous makes a big difference in how proposals are evaluated. They have to base this on anecdotal evidence, though: one person from a lesser-known institution got funded under the anonymized protocol but had previously been rejected under the onymized protocol. (“Onymized” isn’t a word, as far as I know, but it should be.)

They have to rely on anecdata rather than using the actual data from this experiment, for a reason that should be obvious to any middle-school science fair entrant: the experimental protocol changed two important things at once. There’s no way to tell from the results of this experiment whether the change in outcomes was due to the applicants’ anonymity or to the shortening of the proposals from 15 pages to 2.

Radically shortening proposals like this is a huge change. There’s no way for a 2-page proposal to contain a significant amount of information about the proposed research. I’d be astonished if you got similar results from reviews of the 15-page and 2-page proposals, even if you leave anonymity aside. But because of what appears to be a bizarrely flawed experimental design, I don’t know for sure, and neither does anyone else.
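To make the confounding concrete, here is a minimal sketch under a made-up assumption: that each change shifts a proposal's score by some fixed amount, and the shifts simply add. The function names, the 0/1 flags, and the effect sizes are all hypothetical and have nothing to do with NSF's actual data. The point is only that two opposite explanations predict exactly the same thing in the two conditions that were actually run.

```python
# Hypothetical additive model (an assumption for illustration only):
# score shift = length_effect * short + anon_effect * anonymous,
# where short and anonymous are 0/1 flags describing a review condition.
def predicted_shift(short, anonymous, length_effect, anon_effect):
    return length_effect * short + anon_effect * anonymous


# The two conditions in the experiment as described:
# (full, named) and (short, anonymous). Both flags flip together.
CONFOUNDED_DESIGN = [(0, 0), (1, 1)]


def predictions(design, length_effect, anon_effect):
    return [predicted_shift(s, a, length_effect, anon_effect) for s, a in design]


# Explanation A: anonymity does everything, length does nothing.
only_anonymity = predictions(CONFOUNDED_DESIGN, length_effect=0.0, anon_effect=1.0)
# Explanation B: length does everything, anonymity does nothing.
only_length = predictions(CONFOUNDED_DESIGN, length_effect=1.0, anon_effect=0.0)

# The two explanations are indistinguishable in this design.
print(only_anonymity == only_length)  # True
```

Any split of the total shift between the two effects gives the same predictions, which is why no amount of data from these two panels alone can say which change mattered.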

In fairness, Science does note that NSF plans to do another round of the study to try to separate out the two effects. But I’m baffled by the choice to put them together in the first place.
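I don't know what design NSF has in mind for the follow-up, but the textbook way to separate two effects is a 2x2 factorial: run all four combinations of full/short and named/anonymous. The sketch below assumes a hypothetical mean panel score for each of the four cells (the numbers are invented purely for illustration) and shows how each effect can then be estimated on its own.

```python
def main_effects(cell_means):
    """Estimate main effects from a 2x2 factorial.

    cell_means maps (short, anonymous) 0/1 flags to a mean score.
    Each effect is the average change from flipping one flag
    while holding the other flag fixed.
    """
    length_effect = (
        (cell_means[(1, 0)] - cell_means[(0, 0)])
        + (cell_means[(1, 1)] - cell_means[(0, 1)])
    ) / 2
    anon_effect = (
        (cell_means[(0, 1)] - cell_means[(0, 0)])
        + (cell_means[(1, 1)] - cell_means[(1, 0)])
    ) / 2
    return length_effect, anon_effect


# Invented cell means, NOT real data: (short, anonymous) -> mean score.
illustrative_means = {
    (0, 0): 3.0,  # full proposal, named applicants
    (1, 0): 3.5,  # short proposal, named applicants
    (0, 1): 4.0,  # full proposal, anonymous applicants
    (1, 1): 4.5,  # short proposal, anonymous applicants
}

print(main_effects(illustrative_means))  # (0.5, 1.0)
```

The original experiment only filled in the (0, 0) and (1, 1) cells. The other two cells are exactly what you need to tell the effects apart.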

Another stupid comment from the Science piece:

Two divisions within NSF’s Directorate for Biological Sciences are already applying one insight gained from the Big Pitch: Shorter might be better.

Well, I suppose it might be, but as far as I can tell this experiment provided precisely no evidence for this claim. It may or may not have shown that shorter is different: different proposals rose to the top in the two protocols. (I think it would have been astonishing if this had not been the case.) But as far as I can tell there’s no reason for thinking that either is better.

Published by

Ted Bunn

I am chair of the physics department at the University of Richmond. In addition to teaching a variety of undergraduate physics courses, I work on a variety of research projects in cosmology, the study of the origin, structure, and evolution of the Universe. University of Richmond undergraduates are involved in all aspects of this research. If you want to know more about my research, ask me!