Someone needs to teach Will Shortz about emf

As far as I can tell, quality control for the New York Times crossword puzzle is very good, but errors do creep in.

This past Friday, the clue for 37 Across was “Symbol of electromotive force.”  The intended answer: epsilon.  I’ll admit that the symbol for emf looks kind of like an oversized epsilon, but it’s not one: it’s a capital E in a script font.  In over 20 years of doing and teaching physics, I’ve never seen emf denoted by an epsilon.

Oddly enough, the last error I noticed in a Times crossword also had to do with emf.  In that case, the error was even more clear-cut.  The clue was “Energy expressed in volts”, which is actually meaningless: volts aren’t a unit of energy.

The end of the shuttle era

Because I’m an astrophysicist, people often seem to think I must be a big fan of the space shuttle.  As the shuttle program comes to an end, several people have asked me whether I’m sad about its going away.

Nope!  The end of the space shuttle (and its partner in codependency, the International Space Station) can’t come soon enough, as far as I’m concerned.

As far as science is concerned, the shuttle did one great thing: it put up and maintained the Hubble Space Telescope.  I’m grateful to it for that.  But as terrific as HST is, it doesn’t go a long way towards justifying the estimated $174 billion we’ve spent on the shuttle program.

That’s not really a fair criticism, because the shuttle program’s not primarily about science, and never has been.  (People often seem not to realize this, because anything to do with space sounds all sciencey.)  There are reasons for human space flight, but scientific research isn’t a good one: sending people up into space is not and never has been a cost-effective way to do science. So if we want to evaluate whether the shuttle program has been worthwhile, we should judge it based on other goals.

One goal people often mention is the intrinsic value of exploration of the unknown.  We should send people into space for the same reason that explorers charted new continents, went to the South Pole, explored Everest, etc.  I agree with this in principle, although it’s only fair to ask about the cost-benefit ratio in any particular instance.  But shuttle launches are not exploration in any sense of that word.

We sent people to the Moon in the 60s and 70s.  Since then, we’ve sent people repeatedly back and forth to low-Earth orbit.  Discussions in recent years about the possibility of going back to the Moon have made clear how far we are from being able to do that: it’d take years and cost huge amounts of money just to repeat what we did in the 60s.

Suppose that, after Columbus returned to Portugal from the Americas, Europeans did nothing but sail back and forth between the Portuguese mainland and the island of Berlenga, a few miles off the coast.  Suppose they did that for so long that they forgot how to sail across the ocean.  That’s pretty much the kind of “exploration” we did when we came back from the Moon and spent the next four decades shuttling back and forth to low-Earth orbit.

Actually, that’s not quite a fair comparison: Berlenga’s too far offshore.  I should have found an island closer to the Portuguese coast.

The other reason people often cite for supporting human space flight is that it inspires young people.  It’s always said that the Apollo missions inspired a whole generation of scientists, engineers, etc.  That’s before my time, and I haven’t studied the history, but  I’m prepared to stipulate that that’s true.

Has the shuttle had a similar effect? I don’t know of any data suggesting that it has, and I’d be very surprised if it did.  I’d be willing to bet that the Mars rovers Spirit and Opportunity have been far more inspiring to the public than the shuttle program, for less than the cost of a single shuttle launch.

(According to Wikipedia, the mission that landed Spirit and Opportunity cost less than a billion.  The cost of a shuttle launch is $174 billion / 134 launches, or $1.3 billion.  NASA says that “the average cost to launch a Space Shuttle is about $450 million per mission.”  I don’t know how this is calculated, but it clearly doesn’t include all costs.)
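If you want to check my arithmetic, here’s the back-of-the-envelope version in a few lines of Python (all the figures are the rough ones quoted above, and the $0.9 billion rover number is just my stand-in for “less than a billion”):

```python
# Rough cost comparison, using the figures quoted above (none of these are precise).
total_shuttle_cost = 174e9   # dollars: estimated total cost of the shuttle program
launches = 134

per_launch = total_shuttle_cost / launches
print(f"${per_launch / 1e9:.2f} billion per launch")   # about $1.3 billion

rover_mission = 0.9e9        # dollars: rough cost of the mission that landed Spirit and Opportunity
print(f"Rover mission is {rover_mission / per_launch:.0%} of one shuttle launch")
```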

There’s an interesting debate to be had over whether human space flight is a worthwhile thing for the US government to be doing.  With the end of the shuttle program and the International Space Station, maybe we can actually have that debate.

All photons are created equal

Here’s a puzzling question that Matt Trawick asked me today.  (By the way, I also lifted the title of this post from him.)

In a comparison of incandescent bulbs with compact fluorescent lights (CFL), the government tells us that

Incandescents also project light further [than CFLs].

The question is simple: What could this possibly mean?

Obviously one bulb will seem to “project further” than another if it’s brighter, but this seems to mean something different: they’ve already talked about the comparison of brightnesses, and then they raise this as a separate point.  So it seems to mean that, for equal brightness (however that’s defined), the incandescent projects further. Matt and I can’t think of a sense in which that’s true.

Matt says that this claim is widespread on the Web.  Here’s another example, which may shed light on what’s meant:

Since the light source is a single point, incandescents also project light further than CFLs that project a more diffuse light.

As far as I’m concerned, this just makes things worse.  If “A projects further than B” means anything, it must mean this: if the two bulbs have equal apparent brightness when seen up close, then A looks brighter than B when seen far away.  But by that definition the diffuse source will project further than the point source.

(The reason is that the point source intensity dies away like the inverse square of distance, but the diffuse source dies away more slowly.  At large distances, the difference will be small, but it always works in favor of the diffuse source.)
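Here’s a toy numerical version of that argument, in case it helps.  The disk radius and normalization distance below are made-up values, and the “diffuse” source is modeled as a uniformly bright disk seen on axis, whose intensity falls as R^2 / (R^2 + d^2) rather than 1/d^2:

```python
# Toy comparison: point source (inverse-square falloff) vs. a uniformly bright
# disk of radius R seen on axis (falloff R^2 / (R^2 + d^2)).  R and d0 are made up.
R = 0.05    # disk radius, meters
d0 = 0.5    # distance at which the two sources are normalized to equal brightness

def point(d):
    return 1.0 / d**2

def disk(d):
    return R**2 / (R**2 + d**2)

for d in [1, 2, 5, 10, 50]:
    p = point(d) / point(d0)   # brightness relative to its value at d0
    f = disk(d) / disk(d0)
    print(f"d = {d:3d} m   point: {p:.3e}   disk: {f:.3e}   disk/point: {f/p:.4f}")

# The disk/point ratio is always (slightly) greater than 1, approaching 1 + R^2/d0^2
# at large d: the diffuse source is the one that "projects further."
```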

So does anybody know what this claim means?

Where’s the antimatter?

Something happened today that I’ll bet is unprecedented: Two unrelated articles on the front page of the New York Times were about physicists.  One has important policy and national security implications, so I’m going to talk about the other one, which was written by Dennis Overbye.  In the print edition that Luddites like me have, it’s headlined “Why Are We Here?  Let’s Just Thank Our Lucky Muons We Are”.  The online article is headlined “A New Clue to Explain Existence.”

The article is about a new result out of Fermilab, showing roughly that matter and antimatter behave differently from each other.  (Here are slides from a talk on the results, and a technical article about them.)  I don’t have the expertise for a technical analysis of the result, so instead I’ll say a bit about why it might be interesting.

First, though, a word on antimatter.  In teaching and in casual conversations, I’ve noticed that many people seem to think that antimatter is a science-fiction term like dilithium or the flux capacitor.  Not so.  Antimatter exists.  We make it in the lab all the time.  It even exists in nature, although in “ordinary” environments like the Earth it exists only in very small quantities (some natural radioactive isotopes emit positrons, which are the antimatter version of electrons).

In fact, in almost every way antiparticles behave just like ordinary particles, just with the opposite charges: an antiproton behaves like a proton, except that the proton is positively charged and the antiproton is negatively charged; and similarly for all the other particles you can think of.   (Some uncharged particles, like the photon, are their own antiparticles — that is, they’re simultaneously matter and antimatter.  Some, like the neutron, have antiparticles that are distinct.)

This has led to a longstanding puzzle: If the laws of nature treat matter and antimatter the same way, how did the Universe come to contain huge amounts of matter and virtually no antimatter?  Why isn’t there an anti-Earth, made of anti-atoms consisting of antiprotons, antineutrons, and antielectrons (also known as positrons)?

One response to this question is to decide you don’t care!  In general, the laws of physics tell us how a system evolves from some given initial state; they don’t tell us what that initial state had to be.  Maybe the Universe was just born this way, with matter and not antimatter.

Most physicists reject that response as unsatisfying.  Here’s one reason.  We have strong evidence that the early Universe was extremely hot.  At these ultrahigh temperatures, particle collisions should have been producing tons of pairs of particles and antiparticles, and other collisions were causing those particle-antiparticle pairs to annihilate each other.  In this hot early state, there were, on average, about a billion times as many protons as there are today, and an almost exactly equal number of antiprotons.  Eventually, as the Universe cooled, collisions stopped producing these particle-antiparticle pairs, and pretty much all of the matter and antimatter annihilated each other.

This means that, if you want to believe that the Universe was born with an excess of matter over antimatter, you have to believe that it was born with a very slight excess: for every billion antiprotons, there were a billion and one protons.  That way, when the Universe cooled, and the Great Annihilation occurred, there were enough protons (and electrons and neutrons) left over to explain what we see today.  That’s a possible scenario, but it doesn’t sound very likely that the Universe would have been constructed in that way.
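The arithmetic, spelled out (pure bookkeeping, using the one-part-in-a-billion figure quoted above):

```python
# Toy bookkeeping for the "billion and one" scenario (order of magnitude only).
antiprotons = 10**9
protons = antiprotons + 1      # the posited one-part-in-a-billion excess of matter

# The Great Annihilation: each antiproton annihilates with one proton.
leftover = protons - antiprotons
print(leftover)                # 1: the lone surviving proton...
print(leftover / protons)      # ...about 1e-9 of the originals, which is all of today's matter
```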

(By the way, this is what physicists like to call a “naturalness” or a “fine-tuning” argument. It’s a dangerous way to argue. Endless, pointless disputes can be had for the asking, simply by claiming that someone else’s pet theory is “unnatural” or “fine-tuned.”  In this case, pretty much everyone agrees about the unnaturalness, but maybe pretty much everyone is wrong!)

It’s much more satisfying to imagine that there is a cause for the prevalence of matter over antimatter.  Such a cause would have to involve a difference in the laws of physics between matter and antimatter.  There are such differences: in some ways antiparticles do behave very slightly differently from their particle counterparts.  (That’s why I threw in weaselly language such as “In almost every way” earlier.)

To explain the prevalence of matter over antimatter, it turns out that you need (among other things) a specific kind of difference between matter and antimatter known as CP violation.  CP violation was first observed in the 1960s.  (Incidentally, I had a summer research job when I was in college working for one of the discoverers.) Very roughly speaking, certain types of particles can oscillate back and forth between matter and antimatter versions.  That sounds strange, but all by itself it wouldn’t be CP violation, as long as there was a symmetry between both halves of the oscillation (matter → antimatter and antimatter → matter).  But there’s not.  The particles spend slightly more time in one form than in the other.  That’s CP violation.

So what’s the new result?  Basically, it’s a new example of CP violation, which doesn’t fit into the framework established in the “standard model” of particle physics.   Certain particles (a type of B meson), when they decay, are slightly more likely to decay into negatively charged muons (“matter”) than into positively charged muons (“antimatter”).  In the standard model you’d expect these particles to be equally happy to produce matter and antimatter, at least to much higher precision than the observed asymmetry.

This is important to particle physicists, but why is it important to Dennis Overbye and the New York Times?  The main reason is that the type of asymmetry seen here is a necessary step in solving the puzzle of the matter-antimatter asymmetry in the Universe — that is, why any protons and electrons were left over after the Great Annihilation.  But it’s worth emphasizing that this discovery does not actually solve that problem.  As far as I can tell, this experiment doesn’t lead to a model of the early Universe in which the matter-antimatter asymmetry is explained.  It just means there’s a glimmer of hope that, at some point in the future, such a model will be established.

To particle physicists, the result is important even if it doesn’t solve the matter asymmetry problem, simply because it appears to be “new physics.”  Since the 1970s, there has been a “standard model” of particle physics that has successfully explained the result of every experiment in the field.  While it’s great to have such a successful theory, it’s hard to make further progress without some unexplained results to try to explain.  There are lots of reasons to believe that the standard model is not the complete word on the subject of particle physics, but without experimental guidance it’s been hard to know how to proceed in finding a better theory.

One final note: As with all new results, it’s entirely possible that this one will prove to be wrong!  If the Fermilab physicists have done their analysis right, the result is significant at about 3.2 standard deviations, which means there’s less than a 0.14% chance of it occurring as a chance fluctuation.  That’s a pretty good result, but we’ll have to wait and see if it’s confirmed by other experiments.  Presumably the LHC at CERN can make a similar measurement.
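If you want to check the 0.14% figure, it’s just the two-tailed Gaussian tail probability for a 3.2-standard-deviation result; for example, with scipy:

```python
from scipy.stats import norm

sigma = 3.2
p = 2 * norm.sf(sigma)    # probability of a fluctuation at least this large, in either direction
print(f"{p:.4%}")         # about 0.14%
```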

I for one welcome our new alien overlords

Stephen Hawking says that we shouldn’t try to contact aliens, lest they come and attack us for our resources:

Hawking believes that contact with such a species could be devastating for humanity.

He suggests that aliens might simply raid Earth for its resources and then move on: "We only have to look at ourselves to see how intelligent life might develop into something we wouldn't want to meet. I imagine they might exist in massive ships, having used up all the resources from their home planet. Such advanced aliens would perhaps become nomads, looking to conquer and colonise whatever planets they can reach."

He concludes that trying to make contact with alien races is "a little too risky". He said: "If aliens ever visit us, I think the outcome would be much as when Christopher Columbus first landed in America, which didn't turn out very well for the Native Americans."

I can’t get too worried about this.  It seems to me that any alien civilization with the technology to get here and attack us would also have the technology to search telescopically for planets with useful resources.  We’ll probably be able to do a decent job on that ourselves within the next decade or two.  To be specific, we’ll be able to do spectroscopy on the atmospheres of lots of planets, which would give us a good idea of which ones to go to and mine —  if only we could get there.

For anyone who wants to find us, get to us, and exploit us, finding will be by far the easiest step, so this doesn’t strike me as a good argument for hiding.

Of course, there may be other reasons for not broadcasting our presence to aliens, the most obvious being that it’s a poor use of resources.  It all comes down to a cost-benefit analysis: Hawking doesn’t want to do it because of the potential cost (alien attack); I’m more concerned about the (overwhelmingly likely) lack of benefit.

v is not equal to dx/dt

In a discussion of David Hogg’s and my quixotic quest to convince people that it’s OK to think of the redshifts of distant galaxies as being due to the galaxies’ motion (that is, as a Doppler shift), Phillip Helbig writes

I think we all must agree on the following statement: Using the relativistic Doppler formula to calculate the velocity of an object at high redshift does not yield a meaningful answer in that the velocity so derived is not the temporal derivative of ANY distance used for other purposes in cosmology.

I replied to him in the comments, but I think that this point needs a longer response and might be of more general interest.

I agree with the beginning and end of Phillip’s statement, but not the middle.  To be precise, I agree that the velocity derived from the Doppler formula is not the derivative of a distance, but I don’t agree that that means it’s not a meaningful velocity.
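For reference, the formula at issue is the usual special-relativistic Doppler relation between a measured redshift z and a recession speed \beta = v/c:

1 + z = \sqrt{\frac{1+\beta}{1-\beta}}, \qquad \mbox{so} \qquad \beta = \frac{(1+z)^2 - 1}{(1+z)^2 + 1}.

That \beta is a perfectly well-defined number for any measured redshift; the dispute is over whether it deserves to be called a velocity, given that it isn’t the time derivative of any of the standard cosmological distances.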

That’s right: I’m saying a velocity is not necessarily the rate of change of a distance.  That sounds crazy: isn’t that the definition of velocity?

Well, sometimes.  But there are other times in astrophysics when a Doppler shift is measured, and nobody objects to calling the resulting quantity a velocity, even though that quantity is not the rate of change of a distance (or more generally of a position).  The clearest example I know of is a binary  star.

Here’s a cartoon spacetime diagram of an observation of a binary star.

[Figure: binary star spacetime diagram]

Time increases upward on this diagram.  The blue curve represents the Earth.  The curve wobbles back and forth as the Earth orbits the Sun.  The red curve represents a star, which is orbiting another star (not shown).  The dashed curve shows the path of a photon going from the star to the observer.

This is a situation that occurs all the time in astronomy.  The observer sees the photon (many photons, actually), measures a redshift, and calls the result the velocity of the star relative to us.

Now riddle me this: What is the position function x(t) such that this velocity is dx/dt?  For that matter, at what t should this derivative be evaluated?

There is no good answer to this question.  The velocity in question is not equal to the time derivative of a position, in any useful sense.  The main reason is that the velocity in question is a relative velocity, relating motion at two different times.

If you insist on describing the measured velocity of the star as a dx/dt, here’s the best way I can think of to do it.  Define an inertial reference frame in which the Earth is at rest at the moment of observation.  Then the measured velocity is dx/dt, where (x,t) are the coordinates of the star in this frame, and the derivative is evaluated at the time of emission.  But this doesn’t meet Phillip’s criterion: the quantity x in this expression is not a “distance used for any other purpose.”  It’s certainly not in any sense the distance from the Earth to the star, for instance: at the time the derivative is evaluated, the Earth was nowhere near the right location for this to be true.

The velocity of the Earth, in some chosen reference frame, is a dx/dt, and the velocity of the star is also a dx/dt.  (Each of these two is represented by an arrow in the picture above.)  But the relative velocity of the two isn’t.  If you’re unwilling to call this quantity a velocity, then I guess you should be unwilling to call the quantity derived from a cosmological redshift a velocity.  But this seems to me a bit of a Humpty Dumpty way to talk.

More on the cosmological redshift

George Musser sent me and David Hogg an email with some questions about the paper Hogg and I wrote about the interpretation of the redshift (which I’ve written about before).  The discussion may help to clarify a bit what Hogg and I are and are not claiming, so here it is (with Musser and Hogg’s permission, of course).

Musser’s original question:

I’m still absorbing your paper from a couple of years ago on the cosmological redshift, being one of those people who has made the distinction with Doppler shift and, more generally, between “expansion of space” and “motion through space”.

If these are equivalent and, in fact, the latter is preferred, then should I think of the big bang as spraying out galaxies through space like a conventional explosion — i.e. the very picture cosmologists have been telling us is wrong all these years? If the rubber-sheet model of space is so problematic, then what picture should I keep in my head?

Also, if the photon only ever sees locally flat spacetime, is that why the cosmological redshift does not entail a loss of energy?

Hogg’s reply:

The only cosmologist saying that the “explosion” picture is wrong is Harrison (who himself is very wrong), although others think it is uncomfortable, like Sean Carroll (who is not wrong). Empirically, there is no difference; what is definitely wrong is the idea that the space is “rubber” or has dynamics of its own. There is no absolute space–the investigator has coordinate freedom, and the empty space has no dynamics, so this rubber sheet picture is very misleading. And no, the photon does not “lose energy” in any sense. It just has different energies for different observers, and we are all different observers on different galaxies.

Musser:

That is helpful, but I am still confused. An explosion goes off at a certain position in space and matter shoots outward in every direction. Is that really a valid picture of the big bang? What do I make of the presence of horizons?

And finally me:

It’s still true that there is no spatial center to the expansion. That is, there is no point in space that is “really” at rest with everything moving away from it. Space is homogeneous, which means that whatever point you pick looks as much like the center as any other point.

One thing that can be said for the expanding-rubber-sheet picture: at least in its form as an expanding balloon, it conveys this idea of homogeneity tolerably well. (Well, except that it’s hard for people to remember that only the surface of the balloon counts as “space” in this metaphor. People always want to think of the center of the balloon’s volume as “where” the Big Bang happened.)

So I’d rather you not think of the Big Bang as an explosion “at a certain position in space”. It’s still true that it happened everywhere rather than somewhere. There’s no preexisting space into which stuff expands. For instance, if we imagine a closed Universe (i.e., one that has a finite  volume today), its volume was smaller in the past, approaching zero as you get closer to the Big Bang. So in that sense space really is expanding.

[A bit of fine print: All of the above is true as applied to the standard model of the Universe, in which homogeneity is assumed. Whether it’s true of our actual Universe is of course an empirical question. The answer is yes, as far as we can tell so far. But there’s no way to tell — and there probably never will be any way to tell — what space is like outside of our horizon. But anyway, this point is independent of the question of interpretation that we’re discussing at the moment. So it’s safe to ignore this point for the present discussion.]

As Hogg says, the main thing we object to is the idea that the rubber sheet has its own dynamics and interacts with the stuff in the Universe — that is, that the stretching of the rubber sheet tends to pull things apart, or that it “stretches” the wavelengths of light. As far as I’m concerned, the main reason for objecting to this language is not because it gives the wrong idea about cosmology, but because it gives the wrong idea about relativity. The most important point about relativity is that space doesn’t have any such powers and abilities. If you’re a small particle whizzing through space, at every moment space looks to you just like ordinary, gravity-free, non-expanding space.

So if you’re going to abandon the heresy of the rubber sheet, what should you replace it with? I don’t have anything as catchy as the rubber sheet, unfortunately. What I visualize when I visualize the expanding Universe is just a bunch of small neighborhoods, each one of which is completely ordinary gravity-free space, but each of which is moving away from its neighbors.

In this picture, the redshift is easy to understand. If a guy in one neighborhood tosses a ball to his neighbor, the speed of the ball as measured by the catcher will be less than the speed as measured by the thrower. That is, the two measure different energies for the ball, not because there’s some phenomenon taking energy away, but just because they’re in different reference frames. If the catcher then turns around and throws again to his neighbor, the same thing happens again, and so on. That’s all the redshift is. It’s not some mysterious “stretching.”
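If you like, you can check numerically that this hand-off picture reproduces the usual relativistic Doppler shift.  Here’s a little sketch (my own toy setup: many neighborhoods, each receding from the next at the same small speed); chaining the individual Doppler factors gives the same total redshift as a single Doppler shift at the relativistically-summed relative velocity.

```python
import math

def doppler_factor(beta):
    """Special-relativistic Doppler factor 1+z for recession speed beta = v/c."""
    return math.sqrt((1 + beta) / (1 - beta))

def add_velocities(b1, b2):
    """Relativistic velocity addition."""
    return (b1 + b2) / (1 + b1 * b2)

n_hops = 1000        # number of neighbor-to-neighbor hand-offs (toy value)
beta_step = 0.001    # each neighborhood recedes from the next at 0.1% of c (toy value)

# Redshift accumulated one hand-off at a time: the factors just multiply.
one_plus_z_chain = doppler_factor(beta_step) ** n_hops

# Compare with a single Doppler shift at the relativistically-summed velocity.
beta_total = 0.0
for _ in range(n_hops):
    beta_total = add_velocities(beta_total, beta_step)
one_plus_z_single = doppler_factor(beta_total)

print(one_plus_z_chain, one_plus_z_single)   # the two agree (up to rounding error)
```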

Good or bad Bayes?

My brother Andy pointed me to this discussion on Tamino’s Open Mind blog of Bayesian vs. frequentist statistical methods.  It’s focused on a nice, clear-cut statistics problem from a textbook by David MacKay, which can be viewed in either a frequentist or Bayesian way:

We are trying to reduce the incidence of an unpleasant disease called microsoftus. Two vaccinations, A and B, are tested on a group of volunteers. Vaccination B is a control treatment, a placebo treatment with no active ingredients. Of the 40 subjects, 30 are randomly assigned to have treatment A and the other 10 are given the control treatment B. We observe the subjects for one year after their vaccinations. Of the 30 in group A, one contracts microsoftus. Of the 10 in group B, three contract microsoftus. Is treatment A better than treatment B?

Tamino reproduces MacKay’s analysis and then proceeds to criticize it in strong terms.  Tamino’s summary:

Let \theta_A be the probability of getting "microsoftus" with treatment A, while \theta_B is the probability with treatment B. He adopts a uniform prior, that all possible values of \theta_A and \theta_B are equally likely (a standard choice and a good one). "Possible" means between 0 and 1, as all probabilities must be.

He then uses the observed data to compute posterior probability distributions for \theta_A,~\theta_B. This makes it possible to compute the probability that \theta_A < \theta_B (i.e., that you’re less likely to get the disease with treatment A than with B). He concludes that the probability is 0.990, so there’s a 99% chance that treatment A is superior to treatment B (the placebo).
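For concreteness, here’s a minimal sketch of the calculation being described (my own reconstruction, not MacKay’s or Tamino’s code): with uniform priors, the posteriors for \theta_A and \theta_B are Beta distributions, and the probability that \theta_A < \theta_B can be estimated by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Data: 1 of 30 contracted the disease with treatment A; 3 of 10 with the placebo B.
# With a uniform prior, the posterior for each rate is Beta(cases + 1, non-cases + 1).
theta_A = rng.beta(1 + 1, 29 + 1, n)   # Beta(2, 30)
theta_B = rng.beta(3 + 1, 7 + 1, n)    # Beta(4, 8)

print(np.mean(theta_A < theta_B))      # roughly 0.99, matching the number quoted above
```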

Tamino has a number of objections to this analysis, which I think I agree with, although I’d express things a bit differently.  To me, the problem with the above analysis is precisely the part that Tamino says is “a standard choice and a good one”: the choice of prior.

MacKay’s choice of prior expresses the idea that, before looking at the data, we thought that all possible pairs of probabilities (\theta_A,~\theta_B) were equally likely.   That prior is very unlikely to be an accurate reflection of our actual prior state of belief regarding the drug.  Before you looked at the data, you probably thought there was a non-negligible chance that the drug had no significant effect at all — that is, that the two probabilities were exactly (or almost exactly) equal. So in fact your prior probability was surely not a constant function on the  (\theta_A,~\theta_B) plane — it had a big ridge running down the line \theta_A = \theta_B.  An analysis that assumes a prior without such a ridge is an analysis that assumes from the beginning that the drug has a significant effect with overwhelming probability.  So the fact that he concludes the drug has an effect with high probability is not at all surprising — it was encoded in his prior from the beginning!

The nicest way to analyze a situation like this from a Bayesian point of view is to compare two different models: one where the drug has no effect and one where it has some effect.    MacKay analyzes the second one.  Tamino goes on to analyze both cases and compare them.   He concludes that the probability of getting the observed data is 0.00096 under the null model (drug has no effect) and 0.00293 under the alternative model (drug has an effect).
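Here’s how those two numbers can be reproduced, if you’re curious (a sketch of the standard beta-binomial integrals, not necessarily Tamino’s exact calculation): under the null model a single \theta with a uniform prior generates both groups; under the alternative model \theta_A and \theta_B get independent uniform priors.

```python
from scipy.special import beta, comb

# Data: group A, 1 of 30 contracted the disease; group B (placebo), 3 of 10.
nA, kA = 30, 1
nB, kB = 10, 3

# Null model: one common theta with a uniform prior, integrated out analytically.
p_null = comb(nA, kA) * comb(nB, kB) * beta(kA + kB + 1, (nA - kA) + (nB - kB) + 1)

# Alternative model: independent thetas, each with its own uniform prior.
p_alt = (comb(nA, kA) * beta(kA + 1, nA - kA + 1)) * \
        (comb(nB, kB) * beta(kB + 1, nB - kB + 1))

print(p_null, p_alt, p_alt / p_null)   # ~0.00096, ~0.00293, and a ratio of ~3
```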

How do you interpret these results?  The ratio of these two probabilities is about 3.  This ratio is sometimes called the Bayesian evidence ratio,  and it tells you how to modify your prior probability for the two models.  To be specific,

Posterior probability ratio = Prior probability ratio x evidence ratio.

For instance, suppose that before looking at the data you thought that there was a 1 in 10 chance that the drug would have an effect.  Then the prior probability ratio was (1/10) / (9/10), or 1/9.  After you look at the data, you “update” your prior probability ratio to get a posterior probability ratio of 1/9 x 3, or 1/3.  So after looking at the data, you now think there’s a 1/4 chance that the drug has an effect and a 3/4 chance that it doesn’t.
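In code, the whole update is just a couple of lines (using the 1-in-10 prior from this example and the evidence ratio of about 3 from above):

```python
prior_prob = 0.10                             # prior probability that the drug has an effect
prior_odds = prior_prob / (1 - prior_prob)    # 1/9

evidence_ratio = 0.00293 / 0.00096            # about 3, from the two model probabilities above

posterior_odds = prior_odds * evidence_ratio            # about 1/3
posterior_prob = posterior_odds / (1 + posterior_odds)
print(posterior_prob)                                   # about 0.25: a 1-in-4 chance of a real effect
```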

Of course, if you had a different prior probability, then you’d have a different posterior probability.  The data can’t tell you what to believe; it can just tell you how to update your previous beliefs.

As Tamino says,

Perhaps the best we can say is that the data enhance the likelihood that the treatment is effective, increasing the odds ratio by about a factor of 3. But, the odds ratio after this increase depends on the odds ratio before the increase — which is exactly the prior we don't really have much information on!

People often make statements like this as if they’re pointing out a flaw in the Bayesian approach, but it isn’t a bug — it’s a feature!  You shouldn’t expect the data to tell you the posterior probabilities in a way that’s independent of the prior probabilities.  That’s too much to ask.  Your final state of belief will be determined by both the data and your prior belief, and that’s the way it should be.

Incidentally, my research group’s most recent paper has to do with a problem very much like this situation:  we’re considering whether a particular data set favors a simple model, with no free parameters, or a more complicated one.  We compute Bayesian evidence ratios just like this, in order to tell you how you should update your probabilities for the two hypotheses as a result of the data.  But we can’t tell you which theory to believe — just how much your belief in one should go up or down as a result of the data.