## What is a Kelvin?

I admit to a perverse fascination with metrology, specifically with the question of how fundamental units of measure are defined, so I enjoyed this article on the possibility of redefining the kelvin.  (Among my other forms of geekiness, I also like knowing about grammar and usage trivia, so yes, it is “kelvin,” not “Kelvin”.)

In the old days, of course, the unit of temperature was defined by setting 0 and 100 degrees Celsius to be the freezing and boiling points of water.  That’s far from ideal, since those temperatures (especially the boiling point) depend on atmospheric pressure.  We can do much better by defining the temperature scale in terms of a different pair of temperatures, namely absolute zero and the temperature of the triple point of water.  The triple point is the specific combination of temperature and pressure at which the liquid, gas, and solid phases coexist.  Since there’s only one pair (T,P) at which this happens, it gives a unique temperature value that we can use to pin down our temperature scale. That’s how things are done right now: the kelvin is defined such that the triple point is exactly 273.16 K, and of course absolute zero is exactly 0 K.

So what’s wrong with this?  According to the article,

“It’s a slightly bonkers way to do it,” says de Podesta. According to the metrological code of ethics, it is bad form to grant special status to any single physical object, such as water.

This is certainly true in general.  For instance, the kilogram is defined in terms of an object, a particular hunk of metal in Paris whose mass is defined to be exactly 1 kg.  That’s clearly a bad thing: what if someone lost it?  (Or, more likely, rubbed a tiny bit of it off when removing dust?)

But this doesn’t really make sense to me as a problem with the kelvin.  Water isn’t an “object”; it’s a substance, or better yet a molecule.  At the moment, the unit of time is defined in terms of a particular type of atom, namely cesium-133, and that definition is regarded as the gold standard to which others aspire.  Why is cesium-133 OK, but H2O bad?

Although the objection-in-principle above doesn’t seem right, apparently there are some important pragmatic reasons:

‘Water’ needs to be qualified further: at present, it is defined as ‘Vienna standard mean ocean water’, a recipe that prescribes the fractions of hydrogen and oxygen isotopes to at least seven decimal places.

OK, I’ll admit that that’s a problem.

The proposed solution is to define the kelvin in relation to the joule, using the familiar relationship E = (3/2) kT for the average kinetic energy per atom of a monatomic ideal gas.  This is fine, as long as you know Boltzmann’s constant k sufficiently accurately.  Researchers quite a while ago found a clever way of doing this, and they’re so confident about it that they wrote in their paper

If by any chance our value is shown to be in error by more than 10 parts in 10^6, we are prepared to eat the apparatus.

This boast is all the more impressive when you consider that the apparatus contains a lethal quantity of mercury.

The method involves ultraprecise measurements of the speed of sound, which is proportional to the rms speed of atoms and hence gives the average energy per atom of the gas.  The original work wasn’t precise enough to justify changing the definition of the kelvin, but the hope is that with improvements it will be.
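To make the link between sound speed and Boltzmann's constant concrete, here is a minimal sketch using the ideal-gas relations c = sqrt(γkT/m) and v_rms = sqrt(3kT/m). The choice of argon, the constant values (modern reference figures, not numbers from the article), and the function names are all my own assumptions for illustration; a real measurement has to correct for many non-ideal effects that this ignores.

```python
import math

# Assumed reference values (not taken from the article).
K_B = 1.380649e-23                      # Boltzmann constant, J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27    # kg
ARGON_MASS = 39.948 * ATOMIC_MASS_UNIT  # kg per atom; argon is an illustrative choice
GAMMA_MONATOMIC = 5.0 / 3.0             # heat-capacity ratio of a monatomic ideal gas
T_TRIPLE = 273.16                       # triple point of water, K


def sound_speed(temperature, mass=ARGON_MASS, k=K_B):
    """Ideal-gas speed of sound, c = sqrt(gamma * k * T / m)."""
    return math.sqrt(GAMMA_MONATOMIC * k * temperature / mass)


def rms_speed(temperature, mass=ARGON_MASS, k=K_B):
    """Root-mean-square atomic speed, from (1/2) m v^2 = (3/2) k T."""
    return math.sqrt(3.0 * k * temperature / mass)


def boltzmann_from_sound_speed(c, temperature, mass=ARGON_MASS):
    """Invert the sound-speed relation to recover k from a measured c."""
    return mass * c**2 / (GAMMA_MONATOMIC * temperature)


if __name__ == "__main__":
    c = sound_speed(T_TRIPLE)
    print(f"sound speed: {c:.1f} m/s")
    print(f"c / v_rms:   {c / rms_speed(T_TRIPLE):.4f}")
    print(f"recovered k: {boltzmann_from_sound_speed(c, T_TRIPLE):.6e} J/K")
```

The ratio c / v_rms is the fixed number sqrt(γ/3), which is why a sound-speed measurement at a known temperature pins down k once the atomic mass is known.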

Of course, then the kelvin will be defined in terms of the joule, which is itself defined in terms of the kilogram, which depends on that hunk of metal in Paris.  People are working hard on finding better ways to define the kilogram, though, so we hope that that problem will go away.

## Bunn of the North

As he did last year, my brother Andy is heading off to Siberia to study the climate of the Arctic, specifically

the transport and transformations of carbon and nutrients as they move with water from terrestrial uplands to the Arctic Ocean, a central issue as scientists struggle to understand the changing Arctic.

You should check out the web site and blog.  Among other things, you can find out why he has to travel 11000 miles to end up a mere 3000 miles from home.

## Risk-averse science funding?

Today’s New York Times has an article headlined “Grant system leads cancer researchers to play it safe,” discussing the thesis that, in competing for grant funds, high-risk, potentially transformative ideas lose out to low-risk ideas that will lead to at most incremental advances.  A couple of comments:

Although the article focuses on cancer research, people talk about this problem in other branches of science too.  When I served on a grant review panel for NSF not too long ago, we were explicitly advised to give special consideration to “transformative” research proposals.  If I recall correctly, NSF has started tracking the success rate of such transformative proposals, with the goal of increasing their funding rate.

Personally, I think this is a legitimate concern, but it’s possible to make too much of it.  In particular, in the fairy-tale version of science history that people (including scientists) like to tell, we tend to give much too much weight to the single, Earth-shattering experiment and to undervalue the “merely” incremental research.  The latter is in fact most of science, and it’s really, really important. It’s probably true that the funding system is weighted too much against high-risk proposals, but we shouldn’t forget the value of the low-risk “routine” stuff.

For instance, here’s how the Times article describes one of its main examples of low-risk incremental research:

Among the recent research grants awarded by the National Cancer Institute is one for a study asking whether people who are especially responsive to good-tasting food have the most difficulty staying on a diet.

Despite the Times’s scrupulous politeness, the tone of the article seems to be mocking this sort of research (and in fact this research in particular).  And it’s easy to do: Expect John McCain to tweet about this proposal the next time he wants to sneer at the idea of funding science at all.  But in fact this is potentially a useful sort of thing to study, which may lead to improvements in public health.  Yes, the improvement will be incremental, but when you put lots of increments together, you get something called progress.

## Do statistics reveal election fraud in Iran?

Some folks had the nice idea of looking at the data from the Iran election returns for signs of election fraud.  In particular, they look at the last and second-to-last digits of the totals for different candidates in different districts, to see if these data are uniformly distributed, as you’d expect.  Regarding the last digit, they conclude

The ministry provided data for 29 provinces, and we examined the number of votes each of the four main candidates — Ahmadinejad, Mousavi, Karroubi and Mohsen Rezai — is reported to have received in each of the provinces — a total of 116 numbers.

The numbers look suspicious. We find too many 7s and not enough 5s in the last digit. We expect each digit (0, 1, 2, and so on) to appear at the end of 10 percent of the vote counts. But in Iran’s provincial results, the digit 7 appears 17 percent of the time, and only 4 percent of the results end in the number 5. Two such departures from the average — a spike of 17 percent or more in one digit and a drop to 4 percent or less in another — are extremely unlikely. Fewer than four in a hundred non-fraudulent elections would produce such numbers.

The calculations are correct.  There’s about a 20% chance of getting a downward fluctuation as large as the one seen, about a 10% chance of getting an upward fluctuation as large as the one seen, and about a 3.5% chance of getting both simultaneously.
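If you want to check numbers like these yourself, a quick Monte Carlo does the job. The sketch below is only my own illustration, not the authors' method: it assumes 116 independent, uniformly distributed last digits, and it assumes that "17 percent or more" and "4 percent" correspond to counts of at least 20 and at most 5 out of 116. The function name, thresholds, and seed are all hypothetical choices.

```python
import random

DIGITS = range(10)


def simulate_last_digits(n_counts=116, low=5, high=20, trials=20000, seed=0):
    """Estimate how often uniformly random last digits look 'suspicious'.

    Returns the fraction of simulated elections in which
    (a) some digit appears at most `low` times,
    (b) some digit appears at least `high` times,
    (c) both happen at once.
    The thresholds are assumptions read off the quoted percentages.
    """
    rng = random.Random(seed)
    n_low = n_high = n_both = 0
    for _ in range(trials):
        sample = rng.choices(DIGITS, k=n_counts)
        counts = [sample.count(d) for d in DIGITS]
        has_low = min(counts) <= low
        has_high = max(counts) >= high
        n_low += has_low
        n_high += has_high
        n_both += has_low and has_high
    return n_low / trials, n_high / trials, n_both / trials


if __name__ == "__main__":
    p_low, p_high, p_both = simulate_last_digits()
    print(f"some digit this rare:   {p_low:.3f}")
    print(f"some digit this common: {p_high:.3f}")
    print(f"both at once:           {p_both:.3f}")
```

Note that the question being asked is whether any digit is that rare or that common, not whether the digit 5 or the digit 7 in particular is. Under these assumptions the estimates should land near the figures quoted above, within Monte Carlo noise.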

The authors then go on to consider patterns in the last two digits.

Psychologists have also found that humans have trouble generating non-adjacent digits (such as 64 or 17, as opposed to 23) as frequently as one would expect in a sequence of random numbers. To check for deviations of this type, we examined the pairs of last and second-to-last digits in Iran’s vote counts. On average, if the results had not been manipulated, 70 percent of these pairs should consist of distinct, non-adjacent digits.

Not so in the data from Iran: Only 62 percent of the pairs contain non-adjacent digits. This may not sound so different from 70 percent, but the probability that a fair election would produce a difference this large is less than 4.2 percent.
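The 70 percent figure is easy to reproduce by brute force, and the tail probability follows from a binomial sum. This sketch rests on assumptions of mine: that 0 and 9 count as adjacent (which is what makes the answer come out to 70 percent), that the 116 pairs are independent, and that "62 percent" means 72 of 116 pairs. The function names are hypothetical.

```python
from math import comb


def is_non_adjacent(a, b):
    """True if two digits are distinct and not neighbours (0 and 9 treated as neighbours)."""
    return abs(a - b) not in (0, 1, 9)


def non_adjacent_fraction():
    """Fraction of all 100 equally likely digit pairs that are distinct and non-adjacent."""
    pairs = [(a, b) for a in range(10) for b in range(10)]
    return sum(is_non_adjacent(a, b) for a, b in pairs) / len(pairs)


def binomial_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


if __name__ == "__main__":
    p = non_adjacent_fraction()
    print(f"expected non-adjacent fraction: {p:.2f}")
    # 72 of 116 is an assumed reading of the quoted "62 percent".
    print(f"P(at most 72 of 116): {binomial_lower_tail(72, 116, p):.4f}")
```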

Each of these tests alone is of marginal statistical significance, but in combination they start to look significant.  I don’t think that’s a fair conclusion to draw, though.  It seems to me that this analysis is an example of the classic error of a posteriori statistical significance.  (This fallacy must have a catchy name, but I can’t come up with it now.  If you know it, please tell me.)

This error goes like this: you notice a surprising pattern in your data, and then you calculate how unlikely that particular pattern is to have arisen.  When that probability is low, you conclude that there’s something funny going on.  The problem is that there are many different ways in which your data could look funny, and the probability that one of them will occur is much larger than the probability that a particular one of them will occur.  In fact, in a large data set, you’re pretty much guaranteed to find some sort of anomaly that, taken in isolation, looks extremely unlikely.

In this case, there are lots of things that one could have calculated instead of the probabilities for these particular outcomes.  For instance, we could have looked at the number of times the last two digits were identical, or the number of times they differed by two, three, or any given number. Odds are that at least one of those would have looked surprising, even if there’s nothing funny going on.
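A back-of-the-envelope version of this point: if you could have run several different tests, each with a small chance of a false alarm, the chance that at least one of them fires grows quickly. The sketch below assumes the tests are independent, which the digit tests above certainly are not exactly, so treat it as a rough guide. The function name and the numbers fed to it are hypothetical.

```python
def chance_of_some_anomaly(n_tests, alpha):
    """Probability that at least one of n independent tests gives a false alarm.

    Each test is assumed to have false-alarm probability `alpha`
    when nothing funny is going on.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    for n_tests in (1, 2, 5, 10, 20):
        p = chance_of_some_anomaly(n_tests, alpha=0.04)
        print(f"{n_tests:2d} possible tests -> {p:.2f}")
```

With a 4 percent false-alarm rate per test, ten candidate tests already give roughly a one-in-three chance that something looks "suspicious" in a perfectly clean data set.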

By the way, this is an issue that I’m worrying a lot about these days in a completely different context.  There are a number of claims of anomalous patterns in observations of the cosmic microwave background radiation.  It’s of great interest to know whether these anomalies are real, but any attempt to quantify their statistical significance runs into exactly the same problem.

## Solar sailing

I just got back from vacation, which included some long plane trips that gave me a chance to catch up on my magazine reading.  So just a couple of months late, I read this article in the Atlantic on the Planetary Society’s attempts to get funding to build a prototype solar sailing spacecraft.  For those who don’t know, the idea is to propel the ship using big sails that reflect sunlight.  Since photons carry momentum, all of those photons bouncing off of the sail will impart momentum, making the craft go.
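To get a feel for the size of the push, here is a minimal sketch of the radiation force on a flat sail facing the Sun. It uses the standard result that light of intensity I exerts pressure I/c when absorbed and 2I/c when perfectly reflected. The irradiance value (roughly the solar constant at Earth's distance), the 100 square meter area, and the function name are my own assumptions, not figures from the article.

```python
SPEED_OF_LIGHT = 299_792_458.0   # m/s
SOLAR_IRRADIANCE_1AU = 1361.0    # W/m^2, approximate value at Earth's distance


def sail_force(area_m2, irradiance=SOLAR_IRRADIANCE_1AU, reflectivity=1.0):
    """Force in newtons on a flat sail at normal incidence.

    reflectivity=1.0 is a perfect mirror (momentum transfer doubled),
    reflectivity=0.0 is a perfect absorber.
    """
    return (1.0 + reflectivity) * irradiance * area_m2 / SPEED_OF_LIGHT


if __name__ == "__main__":
    area = 100.0  # hypothetical sail area in m^2
    print(f"perfect mirror:   {sail_force(area):.2e} N")
    print(f"perfect absorber: {sail_force(area, reflectivity=0.0):.2e} N")
```

The force is tiny, just under a millinewton for this hypothetical sail, but it acts continuously and costs no fuel, which is the whole appeal.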

It’s a pretty good article, but there’s one bit of it that baffles me:

Not everyone concedes even the basics. The late Thomas Gold, of Cornell's Center for Radiophysics and Space Research, had insisted that solar sailing would never work, for the same reasons you cannot have a perpetual-motion machine: Carnot's rule and the iron second law of thermodynamics. No machine can extract an unlimited supply of free energy from any source; a certain "degradation" has to occur. And the problem is even more fundamental than that, Gold argued: the beautiful Mylar blades of Cosmos 1, or 2, will be too splendid to function, period. With "a perfect mirror, the two temperatures" (of the sails and the sun) "will be the same," Gold reasoned. "And it follows that the mirror cannot act as a heat engine at all: no free energy can be obtained from the light."

Any time you read a description of a technical argument in a nontechnical article, you have to reverse-engineer the details of the argument from the general description.  I can’t do that here: I can’t imagine any way that the argument imputed to Gold could make sense.  First, a “perfect mirror” is precisely the sort of thing that will not reach thermal equilibrium with the Sun, since it never absorbs thermal energy from the sunlight.  Second, even if you don’t have a perfect mirror, it’s not true that such a system will eventually reach the same temperature as the Sun.  For instance, the Earth has been absorbing 6000-degree sunlight for 4 billion years and is still at a relatively comfortable 300 K.  Something similar applies to the solar sail.  The point is that the Earth-Sun system is not a closed system: both are constantly radiating energy into the much colder deep-space environment.  The entire closed system (i.e., the Universe) is very gradually tending toward thermal equilibrium, but the idea that one small part of it (the Sun and the solar sail) will itself reach equilibrium independent of the rest of the universe is nonsense.
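The Earth example can be checked with a textbook energy balance. The sketch below assumes a perfectly black, rapidly rotating sphere that absorbs sunlight on its cross-section and radiates from its whole surface into cold space, which gives T = T_sun * sqrt(R_sun / (2 d)). The radius and distance are standard reference values I've supplied, the 6000 K default is the round figure used above, and the function name is hypothetical. Real planets (and sails) have albedo, atmospheres, and so on, so this is only a rough estimate.

```python
import math

SUN_RADIUS = 6.957e8       # m, assumed reference value
EARTH_ORBIT = 1.496e11     # m, mean Earth-Sun distance, assumed reference value


def equilibrium_temperature(t_sun=6000.0, distance=EARTH_ORBIT, r_sun=SUN_RADIUS):
    """Steady-state temperature (K) of a black sphere heated by the Sun.

    Balances absorbed power (proportional to T_sun^4 * (r_sun / distance)^2
    over the sphere's cross-section) against power radiated from the
    sphere's full surface (proportional to 4 * T^4).
    """
    return t_sun * math.sqrt(r_sun / (2.0 * distance))


if __name__ == "__main__":
    print(f"at Earth's distance: {equilibrium_temperature():.0f} K")
```

The answer comes out within a few percent of 300 K, nowhere near the temperature of the Sun, because the body sheds energy to deep space as fast as it takes it in.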

If there’s a serious argument lurking in there, I’d love to know what it is.

## Take that, Larry Summers

Remember when Larry Summers (then President of Harvard) stirred up a lot of anger by suggesting that the gender gap in science was due to innate differences in ability between men and women?  I was mad about his comments for reasons that had nothing to do with sexism or (to use a phrase that has long ago transitioned to meaninglessness) political correctness; what bothered me was his crime against empiricism.

Summers’s claim was an empirical one, that is, one that’s amenable to testing via data.  Not surprisingly, lots of people have looked at it over the years and tried to bring data to bear to answer it.  Summers was speculating with pristine ignorance of this, which is just plain irresponsible.  Speculating in the absence of data is a bit like sexual fantasizing: it’s fun, and everyone does it, but you really shouldn’t talk about it in public.* This is especially true if you’re a high-profile figure speaking in a public forum about a controversial subject.

Anyway, that’s all ancient history now.  Why bring it up?  Because I just saw some results from a new study that bears on this question.  It turns out that the level of unconscious stereotyping about gender and science in a country is a good predictor of the gender gap in students’ scientific performance in that country.  That’s not what Summers’s hypothesis would predict.  Thanks to Sean Carroll for drawing my attention to this.

* Joke stolen from John Baez, who used it in a completely different context, if memory serves.

## Elsevier journals

A number of scientists I know refuse to deal with journals published by Reed Elsevier, for a couple of different reasons.  First, they’re ridiculously expensive, and Elsevier sometimes adopts pricing schemes where libraries have to purchase large bundles of journals rather than just the ones they want.  As John Baez put it a while ago,

There is really no reason for us to donate our work [i.e. authorship and refereeing] to profit-making corporations who sell it back to us at exorbitant prices!

The second reason some academics were boycotting Elsevier is that the company had a sideline sponsoring international arms fairs, a business which many people find repugnant.  That’s no longer a reason to shun the company, though: they’re out of that line of work.

All of the above is at least a couple of years old.  But here’s a new reason not to like the company: for about five years, they published at least six fake journals.  These were made to look like peer-reviewed scientific journals, but they weren’t.  At least one was owned and operated by Merck, and published only articles promoting Merck’s interests.  Not surprisingly, librarians and others don’t like this.

Oh, and by the way, the company also ran New Scientist, which used to be a good pop-science magazine, into the ground.  Less important than some other considerations, but still annoying.

Should there be an organized boycott over something like this fake-journal scandal?  I don’t really know.  But I do know that I have a choice when donating my labor to journals, and I’m fully entitled to take this sort of practice into account when making that choice.  Other things being equal, I’m certainly going to steer clear of this company.  If there were an occasion in which publishing in an Elsevier journal was far better than any other option for some reason, I’d have to decide how to weigh the various factors.  Fortunately, that situation pretty much never arises for me: the main journals it makes sense for me to publish in are published by professional societies.  They’re reasonably priced (compared to other scientific journals) and as far as I know are free from this sort of corruption.