You may have noticed that recently the Tevatron, the particle accelerator at Fermilab about an hour west of Chicago, keeps showing up in the news, while the Large Hadron Collider (LHC) has been noticeably absent. The CDF and DZero experiments, the two primary detectors at the Tevatron, have both reported multiple measurements that deviate from the prediction of the Standard Model of Particle Physics. In fact there have been far more such reports in the last twelve months than in any previous year that I can recall. What’s up?
Well, it would be great if this represented a flood of evidence that the Standard Model, under whose tyranny we particle physicists have been operating for over 30 years, were on the verge of collapse. But that still seems doubtful. For one thing, most of the signs of possible new physics are just hints, none of them very compelling individually … with perhaps one exception. (The one compelling case involves the “top-quark forward-backward asymmetry”, which deserves an article of its own.) For another, the various hints don’t yet really fit together into any obvious pattern (though many theorists are trying to find one). And for a third, we are in the final phase of these experiments. The Tevatron is shutting down in September. It is a well-known factoid among particle physicists that hints of new physics — most of which turn out to be ephemeral — tend to appear more often as an experiment nears its end.
This tendency to spew exciting results in old age might seem disturbing — perhaps a sign of senility — and I have heard a variety of opinions from colleagues regarding its cause. Some suggestions are quite scandalous, verging on nasty. I’ve heard it said that it is all about funding — that the best way to convince a government agency to continue funding your experiment is to make a big deal about a small hint that you see in your data. Another suggestion is that experiments, as they end, often lose key personnel, who want to move on to a newer and better experiment, and this leads to a drop in quality control. And a third argument is that as an experiment sees its younger competitors gaining on it, and knows that obsolescence is a certainty, it stakes its claims to as many hints of discoveries as it can, on the argument that if the hints go away no one will remember, while if one of the hints remains, no one will forget.
Are any of these relevant at the Tevatron? Certainly not the first — the funding agencies have made the end of the Tevatron a certainty this September — and to the extent that the second and third might be operating, I doubt they are significant. My personal opinion is that there’s a much simpler and far less tawdry reason for this late-in-the-day flurry of announcements. It rests on a simple fact of statistics, combined with a correspondingly simple bit of psychology and sociology.
We all know the probability of a flipped coin landing on one of its sides (“heads”) is 50%. Let’s call that the prediction of the Standard Model of Coin Physics. But suppose you wanted to actually check whether this is true experimentally? Or more interestingly — and more similarly to what is done at the Tevatron and LHC — suppose you wanted to see whether the coin you have in hand is a weird one, one that does not behave according to the 50% rule? How many times would you have to flip that coin before you would know that your coin deviated from the Standard Model?
This is a badly posed question. You never know something with absolute certainty. Knowledge doesn’t turn on like a light. It grows on you imperceptibly, like dawn on a cloudy day. The more evidence you have, the more confident you are. At some point you’re darn sure, but you can’t say precisely when it happens. So the question has to be: for a definite level of certainty, how many coin flips do you need? Obviously, the more certainty you want, the more flips you’ll require.
A couple of simple observations from statistics 101. Suppose you have a special coin (but you don’t know it yet) that comes up heads 100% of the time — and you simply want to know: “is this a standard coin?” After 5 flips you have 5 heads and no tails. Are you convinced yet? You shouldn’t be; the probability of 5 heads in a row is about 3% (1/32, to be precise) for a standard coin. How about 10 flips? The likelihood of a standard coin giving 10 straight heads is about 1/1000. 20 tries, and 20 heads, is one in a million for a standard coin. So you might say that twenty heads in a row makes a standard coin a darned untenable hypothesis, while ten is unlikely but not something you’d bet your house on, and at five you shouldn’t even take 100-to-1 odds.
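If you want to check these numbers yourself, here is a tiny Python sketch of the arithmetic (the function name is my own invention, just for illustration):

```python
# Probability that a fair ("Standard Model") coin gives n heads in a row.
# Each flip is independent, so the answer is simply (1/2)**n.
def prob_all_heads(n_flips: int) -> float:
    return 0.5 ** n_flips

for n in (5, 10, 20):
    p = prob_all_heads(n)
    print(f"{n:2d} heads in a row: p = {p:.2e}  (1 in {1/p:,.0f})")
```

Running this reproduces the figures above: 1 in 32 for five heads, 1 in 1,024 for ten, and 1 in 1,048,576 — one in a million — for twenty.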
On the other hand, the less weird your coin is, the more data you need to prove it isn’t standard. If your weird coin only has a 51% chance of showing heads and 49% of showing tails, you’ll need at least a hundred thousand flips to see that it differs from a standard coin!
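Where does a number like that come from? A rough estimate, under the usual normal approximation to the binomial: after n flips of a fair coin, the observed heads fraction fluctuates by about 0.5/√n, so to see a 1% offset stand out at k standard deviations you need n ≈ (k × 0.5 / 0.01)². A small Python sketch of this back-of-the-envelope estimate (my own illustration; the function name and the choice of significance thresholds are assumptions, not from the original):

```python
import math

def flips_needed(p_weird: float, n_sigma: float, p_fair: float = 0.5) -> int:
    """Roughly how many flips before a coin with heads-probability p_weird
    stands out from a fair coin by n_sigma standard deviations
    (normal approximation to the binomial)."""
    sigma_per_flip = math.sqrt(p_fair * (1 - p_fair))  # 0.5 for a fair coin
    delta = abs(p_weird - p_fair)                      # size of the deviation
    return round((n_sigma * sigma_per_flip / delta) ** 2)

print(flips_needed(0.51, 3))  # ~3-sigma evidence: roughly 22,500 flips
print(flips_needed(0.51, 5))  # ~5-sigma "discovery": roughly 62,500 flips
```

So tens of thousands of flips for mild evidence, and on the order of a hundred thousand before you’d be truly confident — exactly the regime where more data makes all the difference.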
These are the basic rules. The smaller the deviation from expectations that you want to be sensitive to, the more data you need. And if you see a hint of a deviation, the only way to become more certain about it is to acquire yet more data. But any given experiment only operates for a finite time! The government will only fund you to flip that coin a fixed number of times, or for a fixed number of years. When time runs out, there’s nothing you can do to obtain more certainty, or to increase your measurement’s sensitivity to small effects. You have to make the best of the data you have. So if and when you see a hint of a deviation, your reaction is going to be different, psychologically and scientifically, depending on whether you’ll soon have lots more data or whether you’re not expecting to get any more.
Let’s now go back three or four years. The CDF and DZero experiments had only about 1/10 of the data they have now [data at hadron colliders is always collected at a faster and faster rate as time goes by], and they knew they were going to get a lot more. So now imagine the conversation when one of their measurements showed a small deviation from expectations — a small hint of new physics. I wasn’t there to listen, but I can guess. The collaboration of hundreds of physicists had to decide, collectively, what to do next. Some in the experiment would surely have argued for publishing the result, saying “Why not announce a hint of something new? We’ll state clearly that it is just a hint.” But others would have raised concerns that they would rather not have their experiment cry wolf very often. They’d have said, “If we get in the habit of publishing every hint we see, we’ll lose our reputation, within and outside the scientific community, in no time.” And they would have argued, “Why rush? We’re going to get a lot more data very soon, and if nature really does deviate from our expectations, this hint will be a lot stronger with two or three times more data!”
It’s similar to wondering whether to call the newspapers after your coin gives 5 heads in a row. If you made a big deal out of it, and then it turned out that string of heads was just a statistical fluke, you might lose a lot of credibility. Instead, you could be a little cautious, and double-check your result by flipping the coin 5 more times, or maybe 15, until you’re really sure. If the coin is normal, those extra flips are very, very likely to dilute the unusual result of the first 5. If it’s a weird coin, the extra flips may turn out just as weird as the original ones, adding credence to the hint.
So now you can guess what can happen at a running experiment. Often, when a barely discernible deviation from a prediction shows up in the data, the experimental collaboration, instead of publishing it right away, “sits” on it, and waits for more data. A bias ensues; controversial results tend to see the light of day later than non-controversial ones, and many of the controversial hints disappear with more data and only appear in public once they are no longer controversial. (There is a counter-bias, of course — controversial results are more exciting. But in big scientific experiments, it seems the more cautious majority often outvotes the exuberant minority. There have been exceptions to this rule, and they often involve interesting stories of their own.)
In short, I personally suspect that there are no more hints of new physics at the end of an experiment than at the beginning or in the middle. There are always many hints in large new data sets. But the sociology within the experiment normally makes the bar to publicizing a deviating result very high. Toward the end of an experiment, with no more data coming, the argument for waiting for more data dissolves, and the bar to making any deviations public is lowered. And so we see many more controversial results than we did earlier.
This is where the Tevatron experiments are now. There is no point sitting on the evidence when no more data will be collected.
No more data will be collected internally, I should say. But that doesn’t mean there’s no way forward. If, say, the CDF experiment announces a deviating measurement, there is the opportunity to effectively double the data set by asking what its sibling DZero observes. If both experiments see a hint of the same phenomenon, that gives the hint considerable weight. (This is what has happened with the top-quark forward-backward asymmetry.) Conversely, if one experiment sees it but the other doesn’t, that strongly undermines the hint. (This is what happened with the excess seen in events with an electron or muon and two jets, irresponsibly reported in the press as the discovery of a new particle, even though the evidence allowed many other interpretations.) Since most hints are false alarms, we should expect to see the two experiments contradicting each other more often than not.
The LHC experiments ATLAS, CMS and LHCb can also contribute to the discussion, as soon as they have enough data to do the same (or similar) measurement. And we’re now at the point, here in July 2011, where they can often compete with their Tevatron counterparts. At the EuroPhysics conference this week, I am sure we will learn of measurements intended to confirm or refute the hints of new physics seen by CDF and DZero.
Meanwhile, the LHC experiments are now where the Tevatron experiments were many years ago. It is their turn to sit on their hints of new physics! So although we will hear about any can’t-miss-it discoveries, I expect the conference talks will show very few, if any, hints of new phenomena. Not that there aren’t such hints in the data! Surely there are, most of them false alarms. But we won’t see them at this conference; they won’t be made public. The experiments will sit on them for a bit longer, checking their methods and waiting for more data, just as the Tevatron experiments did. For unlike their aging Tevatron counterparts, the LHC experiments have the luxury of knowing their best data lie ahead.