Are scientists underrepresented in Congress?

Various scientists I know have been linking to a NY Times blog post bemoaning the fact that scientists are underrepresented in the US government. The author, John Allen Paulos, is a justly celebrated advocate for numeracy, so you’d expect him to get the numbers right. But as far as I can tell, his central claim is numerically unjustified.

As evidence for this underrepresentation, Paulos writes

Among the 435 members of the House, for example, there are one physicist, one chemist, one microbiologist, six engineers and nearly two dozen representatives with medical training.

To decide if that’s underrepresentation, we have to know what population we’re comparing to. And there are three different categories mentioned here: scientists, engineers, and medical people. Let’s take them in turn.

“Pure” science.

The physicist, chemist, and microbiologist are in fact two people with Ph.D.’s and one with a master’s degree.

Two Ph.D. scientists is actually an overrepresentation compared to the US population as a whole. Eyeballing a graph from a Nature article here, there were fewer than 15000 Ph.D.’s per year awarded in the sciences in the US back in the 1980s and 1990s (when most members of Congress were presumably being educated). The age cohort of people in their 50s (which I take to be the typical age of a member of Congress) has about 5 million people per year (this time eyeballing a graph from the US Census). So if all of those Ph.D.’s went to US citizens, about 0.3% of the relevant population has Ph.D.’s in science. A lot of US Ph.D.’s go to foreigners, so the real number is significantly less. Two out of 435 is about 0.45%, so there are too many Ph.D. scientists in Congress.
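
If you want to redo that back-of-the-envelope arithmetic yourself, here is a minimal sketch in Python. The inputs are just the rough, eyeballed figures quoted above, so the outputs are equally rough:

    # Rough comparison of Ph.D. scientists in the population vs. in the House.
    # Both input numbers are eyeballed from graphs, not precise statistics.
    phds_per_year = 15_000        # science Ph.D.s awarded per year (rough)
    cohort_per_year = 5_000_000   # people per one-year age cohort (rough)

    phd_share_of_population = phds_per_year / cohort_per_year
    phd_share_of_house = 2 / 435  # two Ph.D. scientists in the House

    print(f"{phd_share_of_population:.2%}")  # about 0.30%
    print(f"{phd_share_of_house:.2%}")       # about 0.46%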

Presumably more people have master’s degrees than Ph.D.’s, so if you define “scientist” as someone with either a master’s or a Ph.D. in science, then it might be true that scientists are underrepresented in Congress. I couldn’t quickly find the relevant data on numbers of master’s degrees in the sciences. In physics, it’s very few — about as many Ph.D.’s are granted as master’s degrees in any given year, according to the American Institute of Physics. But it’s probably more in other disciplines.

So I’m quite prepared to believe that having 3 out of 435 members of Congress in the category of “people with master’s or Ph.D.’s in the sciences” means that that group is underrepresented. But I’m not convinced that that’s an interesting group to talk about. In particular, if you’re trying to count the number of people with some sort of advanced scientific training, it makes no sense to exclude physicians from the count.

Engineers.

The Bureau of Labor Statistics says that there are about 1.6 million engineering jobs in the US. The work force is probably something like 200 million workers, so engineers constitute less than 1% of the work force, but they’re more than 1% of the House (6/435). So engineers are overrepresented too.

Physicians.

Doctors are even more heavily overrepresented: there are about a million doctors in the US, which is about 0.5% of the work force, but “people with medical training” are about 5% of Congress. (Some of those aren’t physicians — for instance, one is a veterinarian — but most are.)
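
Here is the same sort of quick check for the engineers and the physicians, again just plugging in the rough numbers quoted above:

    # Rough share of the work force vs. share of the House.
    workforce = 200_000_000     # very rough number of US workers
    engineers = 1_600_000       # BLS figure for engineering jobs (rough)
    doctors = 1_000_000         # rough number of US physicians

    print(f"Engineers: {engineers / workforce:.1%} of workers, "
          f"{6 / 435:.1%} of the House")    # about 0.8% vs. 1.4%
    print(f"Medical:   {doctors / workforce:.1%} of workers, "
          f"{24 / 435:.1%} of the House")   # about 0.5% vs. 5.5% (two dozen)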

So what?

As a simple statement of fact, it is not true that scientists are underrepresented in Congress. What, then, is Paulos claiming? I can only guess that he intends to make a normative rather than a factual statement (that is, an “ought” rather than an “is”). Scientists are underrepresented in comparison to what he thinks the number ought to be. Personally, my instinct would be to be sympathetic to such a claim. Unfortunately, he neither states this claim clearly nor provides much of an argument in support of it.

The only thing I know about the Super Bowl

(And I didn’t even know this until yesterday.)

Apparently the NFC has won the coin toss in all of the last 14 Super Bowls. As Sean Carroll points out, there’s a 1 in 8192 chance of 14 coin flips all coming out the same way, which via the usual recipe translates into a 3.8 sigma result. In the standard conventions of particle physics, you could get that published as “evidence for” the coin being unfair, but not as a “detection” of unfairness. (“Detection” generally means 5 sigmas. If I’ve done the math correctly, that requires 22 coin flips.)
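
If you want to check the sigma-counting yourself, here is a quick sketch (assuming Python with scipy installed), using the usual two-sided convention:

    # Convert the coin-flip probability into "sigmas" and find how many
    # flips in a row you'd need for a 5-sigma result.
    from scipy.stats import norm

    p14 = 2 * 0.5**14             # all 14 flips the same way: 1/8192
    print(norm.isf(p14 / 2))      # about 3.8 sigma (two-sided convention)

    p5sigma = 2 * norm.sf(5)      # two-sided p-value corresponding to 5 sigma
    n = 1
    while 2 * 0.5**n > p5sigma:
        n += 1
    print(n)                      # 22 flips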

But this fact isn’t really as surprising as that 1/8192 figure makes it sound. The problem is that we notice when strange things happen but not when they don’t. There’s a pretty long list of equally strange coin-flip coincidences that could have happened but didn’t:

  • The coin comes up heads (or tails) every time
  • The team that calls the toss wins (or loses) every time
  • The team that wins the toss wins (or loses) the game every time

etc. (Yes, the last one’s not entirely fair: presumably winning the toss confers some small advantage, so you wouldn’t expect 50-50 probabilities. But the advantage is surely small, and I doubt it’d be big enough to have a dramatic effect over a mere 14 flips.)

So the probability of some anomaly happening is many times more than 1/8192.

Incidentally, this sort of problem is at the heart of one of my current research interests. The question is how —  or indeed whether — to explain certain “anomalies” that people have noticed in maps of the cosmic microwave background radiation. The problem is that there’s no good way of telling whether these anomalies need explanation: they could just be chance occurrences that our brains, which have evolved to notice patterns, picked up on. Just think of all the anomalies that could have occurred in the maps but didn’t!

The 14-in-a-row statistic is problematic in another way: it involves going back into the past for precisely 14 years and then stopping. The decision to look at 14 years instead of some other number was made precisely because it yielded a surprising low-probability result. This sort of procedure can lead to very misleading conclusions. It’s more interesting to look at the whole history of the Super Bowl coin toss.

According to one Web site, the NFC has won 31 out of 45 tosses. (I’m too lazy to confirm this myself. I found lists of which team has won the toss over the years, but my ignorance of football prevents me from making use of these lists: I don’t know from memory which teams are NFC and which are AFC, and I didn’t feel like looking them all up.) That imbalance isn’t as unlikely as 14 in a row: you’d expect an imbalance at least this severe about 1.6% of the time. But that’s well below the 5% p-value that people often use to delimit “statistical significance.” So if you believe all of those newspaper articles that report a statistically significant benefit to such-and-such a medicine, you should believe that the Super Bowl coin toss is rigged.
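
If you don’t want to take my word for the 1.6% figure, here is a quick check, assuming Python with a reasonably recent scipy (older versions have binom_test instead of binomtest):

    # Exact two-sided binomial test: 31 NFC wins out of 45 tosses.
    from scipy.stats import binomtest

    print(binomtest(31, n=45, p=0.5, alternative='two-sided').pvalue)
    # about 0.016, i.e. roughly 1.6%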

Contrapositively (people always say “conversely” here, but Mr. Jorgensen, my high school math teacher, would never let me get away with such an error), if you don’t believe that the Super Bowl coin toss is rigged, you should be similarly skeptical about news reports urging you to take resveratrol, or anti-oxidants, or whatever everyone’s talking about these days. (Unless you’re the Tin Man — he definitely should have listened to the advice about anti-oxidants.)


Elsevier responds

In a piece in the Chronicle of Higher Education, an Elsevier spokesperson defends their pricing practices:

“Over the past 10 years, our prices have been in the lowest quartile in the publishing industry,” said Alicia Wise, Elsevier’s director of universal access. “Last year our prices were lower than our competitors’.”

It’d be interesting to know what metric they’re using. If you take their entire catalog of journals and average together the subscription prices, you get something like $2500 per year. That is indeed in line with typical academic journal costs. But that average is potentially misleading, since it includes a couple of thousand cheap journals that you probably don’t care about, along with a few very high-impact journals that are priced many times higher. Want Brain Research? It’ll cost you $24,000 per year.

Of course, in a free market system, Elsevier is allowed to charge whatever it wants. And I’m allowed to decide whether I want to participate, as a customer and more importantly as a source of free labor (writing and refereeing articles for them).

Of course, even “normal” journal prices seem kind of exorbitant. Authors submit articles for free (and in some cases pay page charges), and referees review them for free. So why do we have to pay thousands of dollars per year to subscribe to a journal? I admit that I don’t understand the economics of scholarly journals at all.

If you’re an academic, you really don’t have much choice about participating in the system, but you do have a choice about where to donate your free labor. I tend to publish in and referee for journals run by the various scholarly professional societies (American Physical Society, American Astronomical Society, Royal Astronomical Society). That way, even if the journals are the sources of exorbitant profits, at least those funds are going toward a nonprofit organization that does things I believe in.


Boycott Elsevier

I’ve mentioned before some of the reasons for academics not to do business with the publisher Elsevier. A bunch of scientists have organized a boycott of Elsevier journals. I’ve just signed onto the boycott myself, and I urge my colleagues to do so too.

To be honest, this is an easy stand for me to take, since there’s pretty much never a time when refusing to publish in an Elsevier journal imposes a significant cost on me. For people in fields in which Elsevier journals are clearly better or more prestigious than the alternatives, the situation’s a bit different, I suppose.

Who knows what evil lurks in the hearts of men? The Bayesian doesn’t care.

Let me tell you a story (originally inspired by this post on Allen Downey’s blog).

Frank and Betsy are wondering whether a particular coin is a fair coin (i.e., comes up heads and tails equally often when flipped).  Frank, being a go-getter type, offers to do some tests to find out. He takes the coin away, flips it a bunch of times, and eventually comes back to Betsy to report his results.

“I flipped the coin 3022 times,” he says, “and it came up heads 1583 times. That’s 72 more heads than you’d expect with a fair coin. I worked out the p-value — that is, the probability of this large an excess occurring if the coin is fair — and it’s under 1%. So we can conclude that the coin is unfair at a significance level of  1% (or ‘99% confidence’ as physicists often say).”

You can take my word for it that Frank’s done the calculation correctly (or you can check it yourself if you like). Now, I want you to consider two different possibilities:

  1. Frank is an honest man, who has followed completely orthodox (frequentist) statistical procedure. To be specific, he decided on the exact protocol for his test (including, for some reason, the decision to do 3022 trials) in advance.
  2. Frank is a scoundrel who, for some reason, wants to reach the conclusion that the coin is unfair. He comes up with a nefarious plan: he keeps flipping the coin for as long as it takes to reach that 1% significance threshold, and then he stops and reports his results.

(I thought about making up some sort of backstory to explain why scoundrel Frank would behave this way, but I couldn’t come up with anything that wasn’t stupid.)
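
Incidentally, if you’d rather check Frank’s arithmetic than take my word for it, here is a quick sketch using the normal approximation to the binomial (Python with scipy assumed):

    # Check Frank's p-value: 1583 heads in 3022 flips.
    from math import sqrt
    from scipy.stats import norm

    n, heads = 3022, 1583
    excess = heads - n / 2         # 72 more heads than expected
    z = excess / sqrt(n * 0.25)    # standard deviations from the mean
    print(z, 2 * norm.sf(z))       # z about 2.62, p about 0.009 (under 1%)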

Here are some questions for you:

  • What should Betsy conclude on the basis of the information Frank has given her?
  • Does the answer depend on whether Frank is an honest man or a scoundrel?

I should add one more bit of information: Betsy is a rational person — that is, she draws conclusions from the available evidence via Bayesian inference.

As you can guess, I’m asking these questions because I think the answers are surprising. In fact, they turn out to be surprising in two different ways.

There’s one thing we can say immediately: if Frank is a scoundrel, then the 1% significance figure is meaningless. It turns out that, if you start with a fair coin and flip it long enough, you will (with probability 1) always eventually reach 1% significance (or, for that matter, any other significance you care to name). So the fact that he reached 1% significance conveys no information in this scenario.
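
If that claim sounds implausible, here is a rough simulation sketch of the scoundrel’s procedure (Python with numpy and scipy assumed). The cap on the number of flips is only there to keep the simulation finite; the longer you let it run, the larger the fraction of fair coins that eventually cross the threshold:

    # Flip a fair coin and see whether the running two-sided p-value
    # ever drops below 1% within the first max_flips flips.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    n_runs, max_flips = 1000, 100_000
    z_threshold = norm.isf(0.005)   # |z| above this means two-sided p < 1%

    stopped = 0
    for _ in range(n_runs):
        flips = rng.integers(0, 2, size=max_flips)   # 0 = tails, 1 = heads
        heads = np.cumsum(flips)
        n = np.arange(1, max_flips + 1)
        z = np.abs(heads - n / 2) / np.sqrt(n * 0.25)
        if np.any(z[19:] > z_threshold):   # skip the first flips, where the
            stopped += 1                   # normal approximation is shaky

    print(stopped / n_runs)   # a substantial fraction already, and it keeps
                              # growing as max_flips is increased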

On the other hand, the fact that he reached 1% significance after 3022 trials does still convey some information, which Betsy will use when she performs her Bayesian inference. In fact, the conclusion Betsy draws will be exactly the same whether Frank is an honest man or a scoundrel. The reason is that, either way, the evidence Betsy uses in performing her Bayesian inference is the same, namely that there were 1583 heads in 3022 flips.

[Technical aside: if Frank is a scoundrel, and Betsy knows it, then she has some additional information about the order in which those heads and tails occurred. For instance, she knows that Frank didn’t start with an initial run of 20 heads in a row, because if he had he would have stopped long before 3022 flips. You can convince yourself that this doesn’t affect the conclusion.]

That’s surprise #1. (At least, I think it’s kind of surprising. Maybe you don’t.) From a frequentist point of view, the p-value is the main thing that matters. Once we realize that the p-value quoted by scoundrel Frank is meaningless, you might think that the whole data set is useless. But in fact, viewed rationally (i.e., using Bayesian inference), the data set means exactly the same thing as if Frank had produced it honestly.

Here’s surprise #2: for reasonable assumptions about Betsy’s prior beliefs, she should regard this evidence as increasing the probability that the coin is fair, even though Frank thinks the evidence establishes (at 1% significance) the coin’s unfairness. Moreover, even if Frank’s results had ruled out the coin’s fairness at a more stringent significance (0.1%, 0.00001%, whatever), it’s always possible that he’ll wind up with a result that Betsy regards as evidence in favor of the coin’s fairness.

Often, we expect Bayesians and frequentists to come up with different conclusions when the evidence is weak, but we expect the difference to go away when the evidence is strong. But in fact, no matter how strong the evidence is from a frequentist point of view, it’s always possible that the Bayesian will view it in precisely the opposite way.

I’ll show you that this is true with some specific assumptions, although the conclusion applies more generally.

Suppose that Betsy’s initial belief is that 95% of coins are fair — that is, the probability P that they come up heads is exactly 0.5. Betsy has no idea what the other 5% of coins are like, so she assumes that all values of P are equally likely for them. To be precise, her prior probability density on P, the probability that the given coin comes up heads, is

Pr[P] = 0.95 δ(P-0.5) + 0.05

over the range 0 < P < 1. (I’m using the Dirac delta notation here.)

The likelihood function (i.e., the probability of getting the observed evidence for any given P) is

Pr[E | P] = A P^1583 (1-P)^1439.

Here A is a constant whose value doesn’t matter. (To be precise, it’s the number of possible orders in which heads and tails could have arisen.) Turning the Bayes’s theorem crank, we find that the posterior probability distribution is

Pr[P | E] = 0.964 δ(P-0.5) + B P^1583 (1-P)^1439.

Here B is some other constant I’m not bothering to tell you because it doesn’t matter. What does matter is the factor 0.964 in front of the delta function, which says that, in this particular case, Betsy regards Frank’s information as increasing the probability that the coin is fair from 95% to 96.4%. In other words, she initially thought that there was a 5% chance the coin was unfair, but based on Frank’s results she now thinks there’s only a 3.6% chance that it is.
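
For concreteness, here is a sketch of that posterior calculation (Python with scipy assumed). The one simplification worth noting is that, under the uniform part of Betsy’s prior, the marginal probability of getting k heads in n flips works out to 1/(n+1), whatever k is:

    # Betsy's posterior probability that the coin is fair, given Frank's data.
    from scipy.stats import binom

    n, k = 3022, 1583
    prior_fair = 0.95

    like_fair = binom.pmf(k, n, 0.5)   # probability of the data if P = 0.5
    like_unfair = 1 / (n + 1)          # marginal probability under uniform P

    posterior_fair = (prior_fair * like_fair /
                      (prior_fair * like_fair + (1 - prior_fair) * like_unfair))
    print(posterior_fair)              # about 0.964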

It’s not surprising that a Bayesian and frequentist interpretation of the same result give different answers, but I think it’s kind of surprising that Frank and Betsy interpret the same evidence in opposite ways: Frank says it rules out the possibility that the coin is fair with high significance, but Betsy says it increases her belief that the coin is fair.  Moreover, as I mentioned before, even if Frank had adopted a more stringent criterion for significance — say 0.01% instead of 1% — the same sort of thing could happen.

If Betsy had had a different prior, this evidence might not have had the same effect, but it turns out that  you’d get the same kind of result for a pretty broad range of priors. In particular, you  could change the 95% in the prior to any value you like, and you’d still find that the evidence increases the probability that the coin is fair. Also, you could decide that the assumption of a uniform prior for the unfair coins is unrealistic. (There probably aren’t any coins that come up heads 99% of the time, for instance.) But if you changed that uniform prior to any reasonably smooth, not too sharply peaked function, it wouldn’t change the result much.

In fact, you can prove a general theorem that says essentially the following:

No matter what significance level s Frank chooses, and what Betsy’s prior is, it’s still possible to find a number of coin flips and a number of heads such that Frank rules out the possibility that the coin is fair at significance s, while Betsy regards the evidence as increasing the probability that the coin is fair.

I could write out a formal proof of this with a few equations, but instead I’ll just sketch the main idea. Let n be the number of flips and k be the number of heads. Suppose Frank is a scoundrel, flipping the coin until he reaches the desired significance and then stopping. Imagine listing all the possible pairs (n,k) at which he might stop. If you just told Betsy that Frank had stopped at one of those points, but not which one, then you’d be telling Betsy no information at all (since Frank is guaranteed to stop eventually). With that information, therefore, her posterior probability distribution would be the same as her prior. But that posterior probability distribution is also a weighted average of the posterior  probability distributions corresponding to each of the possible pairs (n,k), with weights given by the probability that Frank stops at each of those points. Since the weighted average comes out the same as the prior, some terms in the average must give a probability of the coin being fair which is greater than the prior (and some must be less).
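
Here is a numerical illustration of that argument, reusing Betsy’s prior from above. I use the normal approximation to decide which results Frank would call significant, so the cutoffs are slightly approximate, but that doesn’t affect the point:

    # For each n, take (roughly) the smallest number of heads that is
    # significant at the 1% level, and see what it does to Betsy's posterior.
    import numpy as np
    from scipy.stats import binom, norm

    prior_fair = 0.95
    z_threshold = norm.isf(0.005)   # two-sided 1% cutoff

    for n in [100, 1000, 3022, 10_000]:
        k = int(np.ceil(n / 2 + z_threshold * np.sqrt(n * 0.25)))
        like_fair = binom.pmf(k, n, 0.5)
        like_unfair = 1 / (n + 1)
        post = prior_fair * like_fair / (prior_fair * like_fair +
                                         (1 - prior_fair) * like_unfair)
        print(n, k, round(post, 3))
    # For small n the posterior drops below 0.95; for large enough n it
    # rises above it, even though every row is "significant" in Frank's sense.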

Incidentally, in case you’re wondering, Betsy and Frank are my parents’ names, which fortuitously have the same initials as Bayesian and frequentist. My father Frank probably did learn statistics from a frequentist point of view (for which he deserves pity, not blame), but he would certainly never behave like a scoundrel.

sigmas : statistics :: __________ : astronomy

Americans above a certain age will remember that the SAT used to include a category of “analogy questions” of the form “puppy : dog :: ______ : cow.” (This is pronounced “puppy is to dog as blank is to cow,” and the answer is “calf.”) Upon reading Peter Coles’s snarky and informative blog post about the recent quasi-news about the Higgs particle, I thought of one of my own. (By the way, Peter’s post is worth reading for several reasons, not least of which is his definition of the word “compact” as it is used in particle physics.)

Answer after the jump.


Faster than light neutrinos

My brother Andy asked me what I thought of the news that the faster-than-light neutrino result had been confirmed. Like pretty much all physicists, I was very skeptical of the original result, and I’m still skeptical. Here’s what I told him:

This is a confirmation by the same group, using essentially the same technique. They’ve improved the setup in one way that should eliminate one possible source of error (specifically, they made the neutrino pulses narrower, which makes it easier to compare arrival and departure times). That is an improvement, but that wasn’t the only possible source of error, and in fact I never thought it was the most likely one. I’m still waiting for confirmation by an independent group.

Is the wavefunction physically real?

To be honest, I hate this sort of question. I don’t know what “real” means, and I always have a suspicion that the people advocating for one answer or another to this question don’t know either.

There’s a new preprint by Pusey, Barrett, and Rudolph that is being described as shedding light on this question. According to Nature News, “The wavefunction is a real physical object after all, say researchers.”

From Nature:

The debate over how to understand the wavefunction goes back to the 1920s. In the ‘Copenhagen interpretation’ pioneered by Danish physicist Niels Bohr, the wavefunction was considered a computational tool: it gave correct results when used to calculate the probability of particles having various properties, but physicists were encouraged not to look for a deeper explanation of what the wavefunction is.

Albert Einstein also favoured a statistical interpretation of the wavefunction, although he thought that there had to be some other as-yet-unknown underlying reality. But others, such as Austrian physicist Erwin Schrödinger, considered the wavefunction, at least initially, to be a real physical object.

The Copenhagen interpretation later fell out of popularity, but the idea that the wavefunction reflects what we can know about the world, rather than physical reality, has come back into vogue in the past 15 years with the rise of quantum information theory, Valentini says.

Rudolph and his colleagues may put a stop to that trend. Their theorem effectively says that individual quantum systems must “know” exactly what state they have been prepared in, or the results of measurements on them would lead to results at odds with quantum mechanics. They declined to comment while their preprint is undergoing the journal-submission process, but say in their paper that their finding is similar to the notion that an individual coin being flipped in a biased way — for example, so that it comes up ‘heads’ six out of ten times — has the intrinsic, physical property of being biased, in contrast to the idea that the bias is simply a statistical property of many coin-flip outcomes.

As far as I can tell, the result in this paper looks technically correct, but it’s important not to read too much into it. In particular, this paper has precisely nothing to say, as far as I can tell, on the subject known as the “interpretation of quantum mechanics.”

When people argue about different interpretations of quantum mechanics, they generally agree about the actual physical content of the theory (specifically about what the theory predicts will happen in any given situation) but disagree about what the predictions mean. In particular, the wavefunction-is-real camp and the wavefunction-isn’t-real camp would do the exact same calculations, and get the exact same results, for any specific experimental setup.

This paper considers a class of theories that are physically distinct from quantum mechanics — to be specific, a certain class of “hidden-variables theories,” although not the ones that were considered in most earlier hidden-variables work — and shows that they lead to predictions that are different from quantum mechanics. Therefore, we can in principle tell by experiment whether these alternative theories are right.

This is a nice result, but it seems to me much more modest than you’d think from the Nature description. I don’t think that people in the wavefunction-isn’t-real camp believe that one of these hidden-variables theories is correct, and therefore I don’t see how this argument can convince anyone that the wavefunction is real.

I admit that I’m not up-to-date on the current literature in the foundations of quantum mechanics, but I don’t know of anyone who was advocating in favor of the particular class of theories being described in this paper, and so to me the paper has the feel of a straw-man argument.

Personally, to the limited extent that I think the question is meaningful, I think that the wavefunction is real (in the ontological sense — mathematically, everyone knows it’s complex, not real!). But this preprint doesn’t seem to me to add significantly to the weight of evidence in favor of that position.

Faster-than-light neutrino results explained?

Note: The original version of this post was completely, embarrassingly wrong. I replaced it with a new version that says pretty much the opposite. Then Louis3 in the comments pointed out that I had misunderstood the van Elburg preprint yet again, but, if I’m not mistaken, that misinterpretation on my part doesn’t fundamentally change the argument. I hope I’ve got it right now, but given my track record on this I wouldn’t blame you for being skeptical!

If you’re reading this, you almost certainly know about the recent announcement by the OPERA group of experimental results showing that neutrinos travel slightly faster than light. I didn’t write about the original result here, because I didn’t have anything original to say. I pretty much agreed with the consensus among physicists: Probably something wrong with the experiments, extraordinary claims require extraordinary evidence, Bayesian priors, wait for replication, etc.

Recently, there’s been some buzz about a preprint being circulated by Ronald van Elburg claiming to have found an error in the OPERA analysis that would explain everything. If you don’t want to slog through the preprint itself (which is short but has equations), this blog post does a good job summarizing it.

van Elburg’s claim is that the OPERA people have incorrectly calculated the time of flight of a light signal between the source and detector in the experiment. (This is a hypothetical light signal, used for reference — no actual light signal went from one place to the other.) He goes through a complicated special-relativity calculation involving switching back and forth between an Earth-fixed (“baseline”) reference frame and a reference frame attached to a GPS satellite. I don’t understand why he thinks this complicated procedure is necessary: the final result is a relationship between baseline-frame quantities, and I don’t see why you can’t just calculate it entirely in the baseline frame. But more importantly, his procedure contains an error in the application of special relativity. When this error is corrected, the discrepancy he claims to have found goes away.

As a mea culpa for getting this completely wrong initially (and also for the benefit of the students in a course I’m teaching now), I’ve written up a critique of the van Elburg preprint, in which I try to explain the error in detail. I find it cumbersome to include equations in blog posts (maybe I just haven’t installed the right tools to do it), so I’ve put the critique in a separate PDF document. I’ll just summarize the main points briefly here.

van Elburg calculates the time of flight between source and detector in the following complicated way:

  1. He relates the satellite-frame source-detector distance to the baseline-frame distance via Lorentz contraction.
  2. He calculates the flight time in the satellite frame (correctly accounting for the fact that the detector is moving in this frame — which is what he claims OPERA didn’t do).
  3. He transforms back to the baseline frame.

At the very least, this is unnecessarily complicated. The whole point of special relativity is that you can work in whatever inertial frame you want, so why jump back and forth this way, rather than just doing the calculation in the Earth frame? In fact, I originally (incorrectly) thought that he’d done the calculation correctly but in an unnecessarily cumbersome way. It turns out that it’s worse than that, though: his calculation is just plain wrong.

The main error is in his equation (5), the step in which he relates the time of flight in the satellite frame to the time of flight in the Earth frame by the simple time-dilation rule, that is, by just multiplying by gamma. But the time-dilation rule doesn’t apply in this situation. It’s only correct to calculate time dilation in this simple way (multiply by gamma) if you’re talking about events that are at the same place in one of the two reference frames. The standard example is two birthdays of one of the two twins in the twin paradox. When you’re considering two birthdays of the rocket-borne twin, you’re considering two events that are at the same place in the rocket frame, and the multiply-by-gamma rule is fine.

But in this case the time intervals under consideration are times of flight. That means that they’re time intervals between one event at one place (radio wave leaves the source) and another event at another place (radio wave arrives at detector). To properly relate time intervals of this sort in two different frames, you need the full machinery of the Lorentz transformation. If you use that full machinery to convert from satellite frame to Earth frame, you find that the time of flight comes out just the way you’d expect it to if you’d done the whole calculation in the Earth frame to begin with. (Of course it had to be that way — that’s the whole point of the principle of relativity.)
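
Just to spell out what “full machinery” means here: if Δt and Δx are the time and position separations between the emission and detection events, the transformation from the satellite frame (primed) back to the Earth frame is

Δt = γ (Δt′ + v Δx′ / c²),

with the sign of the second term depending on the direction of the relative velocity v. This reduces to the simple multiply-by-gamma rule only when Δx′ = 0, that is, only when the two events happen at the same place in the satellite frame, which is exactly what fails for a time of flight.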

Now if the OPERA people had done their analysis the way van Elburg does (jumping back and forth with wild abandon between Earth and satellite frames), and if when they were in the satellite frame they had calculated a time of flight without accounting for the detector’s motion, then they would have been making an error of essentially the sort van Elburg describes. But as far as I can tell there’s no credible evidence, either in this preprint or in the OPERA paper, that they did the analysis this way at all, let alone that they made this error.

So this explanation of the OPERA results is a non-starter. Sorry for originally stating otherwise.