Matt Strassler July 31, 2011
[UPDATE: This page is now out of date, because the hints referred to went away, as so often happens. However, the story told here is still very relevant, both because (as of December 2011) we’re still dealing with some of the same issues, and because (even after we’re through with this phase) I think it is historically interesting. So I will leave the article up, at least for now.]
My experimental colleagues at the LHC experiments ATLAS and CMS (supported to a degree by data from the Tevatron experiments DZero and CDF) have recently reported a hint of a signal of a Higgs particle. But they, and I, and many others have emphasized that it is far too early to be sure that this hint is more than a mirage. My aim here is to explain to you, in a largely non-technical fashion, why the situation is so ambiguous. [As always, please feel free to comment on the clarity of this article! I will try to improve it over time.]
Signal and Background
First, let me define a couple of concepts. A “signal” is the thing you’re looking for. “Background” is everything else that resembles your signal and makes it difficult for you to find it. [Sometimes people call this “noise”, but noise is random, and background can include, in addition to noise, effects that are not random. Background is basically everything that gets in your way.] Think of trying to find a group of your friends in a crowd: your friends’ faces are the signal, and everyone else’s faces (including, for example, tricky challenges, such as the identical twin of one of your friends who might also be in the crowd) are background. Or imagine looking for a four-leaf clover in a field of ordinary three-leaf clover. Your brain has to sift through all that background to find the signal.
Obviously, how hard it is to find the signal depends crucially on how much you know about the signal, how much you know about the background, and how much the signal and background resemble each other. Suppose you’re trying to find a group of your friends, and you think they might be in a large crowd. If you know your friends are all wearing red jackets, and people in the crowd are wearing all sorts of different things, your task is easy: look for the cluster of red jackets. If you see such a cluster, it is surely your friends; if you don’t see a cluster, your friends must be elsewhere.
But if lots of people in the crowd are wearing red jackets too, that’s going to make finding your friends a lot tougher. A little bunch of red jackets might be your friends, but it also might just be a random clustering of red-jacketed strangers. You’ll need to go up close and get more data.
And where it really gets hard is when you believe very few people in the crowd will be wearing red jackets, but you’re wrong. Then, when you see a cluster of red jackets, you will come confidently to the wrong conclusion that you’ve found your friends. Not knowing your background well is a very common cause of making you think you’ve seen a signal when you have not.
Another way to make a mistake is to be a bit colorblind. Colorblindness often makes it difficult to distinguish red and green; so a cluster of green jackets may appear to you to be your friends in their red jackets. If you know you’re colorblind, you’ll compensate for this somehow; but if you don’t know you’re colorblind, you might make a mistake. In short, when there’s something about the way you collect data that you don’t fully understand, it can cause you to draw a wrong conclusion.
Enough analogies. At the LHC and the Tevatron, several types of searches are going on simultaneously. There are easy ones that take a while, for which the signal is rather easy to identify — red jackets in a multicolored crowd, or even no crowd at all — but for which we don’t yet have enough data; and there are hard ones that happen faster — red jackets in a red-jacketed crowd — which are generating the current hints, but are much more likely to suffer from a subtlety or an outright error. That’s the most profound reason why it is far too early to draw confident conclusions. Fortunately the easy searches will come into their own soon enough, and they will be much less ambiguous. So within 6 to 12 months, we should know whether current hints are the real thing or not.
Sketches of the Easy Search Strategies
Let me tell you first about the easier searches. [I’ve explained in part how these easy searches work in one of my video clips, the third excerpt from my public talk at the Secret Science Club… though you should watch the first two excerpts before the third one; it will then be much easier to follow.] There are two relatively simple ones.
Suppose the Standard Model Higgs particle is relatively light (explicitly, suppose its mass energy M c-squared [where c is the speed of light and M is the mass of the Higgs particle] is between about 115 and 140 GeV or so). [A GeV is explained here, but you don’t really need to know anything except that a proton has a mass energy of 0.938 GeV and an atom of tin has a mass energy somewhat above 100 GeV.] Then a clean, unambiguous way to look for it is to use the fact that sometimes it decays (i.e., disintegrates) into two particles of light, called “photons”, which are easy to detect and measure precisely. It’s rare, though; roughly 1 in 1000 of the disintegrating Higgs particles turns into two photons, and the number decreases sharply for a heavier Higgs. I’ve drawn the process schematically in Fig. 1.
What the experiments do is detect the two photons and measure their energies. I’ve drawn a sketch of what the ATLAS or CMS detectors would observe in Fig. 2. [At some point you may want to read about how these detectors work.]
Then [roughly speaking] if you add up their energies, you’ll find [thank you Emmy Noether] that the sum of their energies equals [thank you Albert Einstein] M c-squared, the mass energy corresponding to the mass M of the Higgs particle. [Technically, you have to do something slightly more complicated than adding the energies, but it’s just simple algebra, nothing fancy. Click here for details.] That’s the signal, two photons whose total energy adds up to M c-squared. The only problem is that we don’t know M. And there’s background, though that will turn out not to be such a big problem.
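[For readers who would like to see the "simple algebra": here is a minimal Python sketch of the standard formula for the invariant mass of two massless particles, in units where c = 1. The function name and the example numbers are my own illustrative choices, not anything taken from the experiments' software.]

```python
import math

def diphoton_invariant_mass(e1, e2, opening_angle):
    """Invariant mass (GeV) of two photons, treated as massless.

    e1, e2: the measured photon energies in GeV.
    opening_angle: angle between the two photons' directions, in radians.
    Uses m^2 = 2 * e1 * e2 * (1 - cos(opening_angle)), with c set to 1.
    """
    return math.sqrt(2.0 * e1 * e2 * (1.0 - math.cos(opening_angle)))

# Illustrative case: a Higgs decaying at rest sends the photons out
# back-to-back, and then the mass energy is simply the sum of the energies.
print(diphoton_invariant_mass(60.0, 60.0, math.pi))  # prints 120.0
```

When the parent particle is moving, the photons are no longer back-to-back and the plain sum of energies overshoots; the angle-dependent factor is what corrects for that.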
The background comes from the fact that there are other ways that two photons can be produced in a proton-proton collision at the LHC. (For example, two photons can be generated, without ever making a Higgs particle, when an up quark and an up antiquark strike each other.) This happens much, much more often than Higgs particles are made! And these processes look the same to the ATLAS and CMS detectors as does the signal.
But for these background processes, the sum of the energies of the two photons is nearly random.
So what the experimentalists do is collect all the collisions in which they detect two photons. For each collision, they calculate the sum of the energies of the two photons [technically, the invariant mass of the two photons]. Then they make a plot counting the number of events where the total energy was 100 GeV, 101 GeV, 102 GeV, etc. And if there is no Higgs particle decaying often enough to two photons, what they will find is a plot that looks like the left-hand picture in Fig. 3… a smooth, undistinguished curve, showing that the number of collisions that make two photons with 115 GeV is almost the same as the number with 120, or 125, etc. [The blue curve is meant to symbolize what we expect the background to look like; the black dots are meant to represent data, which just from statistics will jump around a bit, sometimes being a little above or a little below expectations.] But if there is a Higgs particle of mass energy 120 GeV or so decaying often enough to two photons, the plot will look something like the right-hand picture. There is an excess of events at the mass energy of the Higgs particle! In addition to the random background, whose expected shape is the blue curve, there is a signal that is not at all random, and differs sharply from expectation. These are the red jackets in the sea of multi-colored clothing.
This is rather easy. You can see the signal looks completely different from the background, and you can pick it out by eye. Even if you made some mistakes in your guess for what the background should look like (if the background were perhaps a little larger, or a little more tilted, than you expected), you’d have no problem seeing the signal. And one more bonus: The location of the bump will be at an energy of M c-squared, thereby telling us what the mass M of the Higgs is! But unfortunately we will probably need 5 to 10 times more data than we have right now before this type of search will reveal something.
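[For the programmers in the audience, here is a rough Python sketch of the "look for a bump" idea. All the numbers are invented for illustration, and the method (comparing each bin to a fixed expectation) is a deliberate simplification; real analyses fit the background shape to the data and treat the statistics far more carefully.]

```python
import math

# Invented numbers for illustration only: the centre of each mass bin (GeV),
# the smooth background expectation, and the observed event counts.
mass_bins = [110, 112, 114, 116, 118, 120, 122, 124, 126, 128]
expected = [400, 385, 370, 356, 342, 329, 316, 304, 292, 281]
observed = [412, 377, 381, 349, 350, 398, 309, 311, 285, 290]

def bin_pulls(observed, expected):
    """Excess in each bin, in units of the expected statistical jitter.

    For a count with expectation e, the typical fluctuation is about sqrt(e).
    """
    return [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]

def most_significant_bin(mass_bins, observed, expected):
    """Return (mass, pull) for the bin with the largest upward excess."""
    pulls = bin_pulls(observed, expected)
    best = max(range(len(pulls)), key=lambda i: pulls[i])
    return mass_bins[best], pulls[best]

mass, pull = most_significant_bin(mass_bins, observed, expected)
print(f"Largest excess at {mass} GeV: {pull:.1f} standard deviations")
```

With these made-up counts, every bin sits within one standard deviation of the expectation except the one at 120 GeV, which stands out at about 3.8. That is the numerical version of picking out the bump by eye.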
There’s an even easier search that involves a decay of a Higgs particle to four particles called charged leptons (specifically electrons, muons, and their antiparticles). [In more detail, the Higgs decays to two Z particles, and each Z particle then decays to a charged lepton and its antiparticle. See Fig. 4.] [Experts: taus are leptons too, but are largely useless here because some of their energy is lost in their decay to neutrinos.]
These events can look something like Fig. 5, which shows what electrons and muons (and their antiparticles) look like in the detector. It is quite easy to see these particles and measure their properties very precisely.
The experimentalists do something very similar to what they do in the case of two photons described above. They select all collisions that look like Fig. 5, as well as all of the ones that have two electron-positron pairs or two muon-antimuon pairs. Then they plot the sum of the energies of the two charged leptons and two anti-leptons [again, technically, the invariant mass] and look for a peak over what is in this case a very small background. Since the signal is much larger than the background, this is the easiest of all the searches, in fact.
Unfortunately, getting two leptons and two anti-leptons from a Higgs particle is very rare! The Z particle often decays to other things. So this search will be effective mainly if the Higgs particle is heavier than about 140 GeV, where the decay of Higgs particles to Z particles is common. Even so, at most 1 in 1000 Higgs particles decays in this way.
Fig. 6 shows what the experimentalists might see if there is a Higgs particle in the range where current hints suggest it might be found… and if they had several times more data than they have now (July 2011)! Note the background curve in blue and the imagined data in black differ much more than in Fig. 3. This is because the number of events is expected to be so tiny that for most possible masses the number of events observed is zero! And the number of events can only be 0, 1, 2, 3, etc., so the data points jump around much more than in Fig. 3, where the number of events expected is quite large. But you can see how the Higgs signal sits much further above the curve than do any other data points. This is like looking for your friends with their red jackets in a nearly empty parking lot! If they are there, you can’t miss them!
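[A small Python sketch may help show why a handful of events can be convincing when the background is tiny. It uses the standard Poisson formula for counting rare, independent events; the expected background of 0.2 events per bin is a number I made up for illustration.]

```python
import math

def poisson_prob(n, mean):
    """Probability of observing exactly n events when `mean` are expected."""
    return math.exp(-mean) * mean**n / math.factorial(n)

def prob_at_least(n, mean):
    """Probability of observing n or more events when `mean` are expected."""
    return 1.0 - sum(poisson_prob(k, mean) for k in range(n))

background = 0.2  # hypothetical expected background events in one mass bin

print(f"P(0 events)  = {poisson_prob(0, background):.3f}")
print(f"P(>=3 events) = {prob_at_least(3, background):.5f}")
```

With so little background, most bins are expected to be empty (about an 82% chance each), while three or more background events landing in the same bin would happen only about once in a thousand tries. In Fig. 3, by contrast, three extra events on top of hundreds would be invisible.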
A Sketch of a Difficult Search Strategy
The most powerful, but most challenging, search strategy — the one which has generated most of the current hints in the low-mass range, as well as helping exclude much of the high-mass range for the Higgs particle — involves a case where the Higgs particle decays to a W+ particle and a W- particle, or two Z particles, and from their decays emerge a charged lepton and a charged anti-lepton along with a neutrino and an anti-neutrino. One of the several ways this can happen is sketched in Fig. 7.
This is much harder because neutrinos and anti-neutrinos cannot be detected directly. Their presence can only be inferred from the fact that nothing observable balances the momentum of the charged lepton-antilepton pair, as in Fig. 8. We cannot measure their energies or directions of motion individually. And so we are not able to invoke the same method used in the two searches just described. We cannot play the game we played in Figs. 3 and 6; we lack too much information to compute the mass of the putative Higgs particle parent.
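[Here is a minimal Python sketch of how the presence of invisible particles is inferred from a momentum imbalance. I am assuming the usual practice at proton colliders of checking the balance only in the plane perpendicular to the beams; the function name and the numbers are purely illustrative.]

```python
import math

def missing_transverse_momentum(visible):
    """Infer the combined sideways momentum of undetected particles.

    visible: list of (px, py) pairs in GeV, one per detected particle,
    measured in the plane perpendicular to the beams. Before the collision
    the total sideways momentum is essentially zero, so whatever is needed
    to balance the visible particles must have been carried off unseen.
    Returns (mx, my, magnitude).
    """
    mx = -sum(px for px, _ in visible)
    my = -sum(py for _, py in visible)
    return mx, my, math.hypot(mx, my)

# Illustrative event: a lepton and an anti-lepton, nothing else detected.
lepton = (30.0, 10.0)
antilepton = (10.0, 20.0)
print(missing_transverse_momentum([lepton, antilepton]))  # (-40.0, -30.0, 50.0)
```

Notice what this does and does not give you: a single combined imbalance for all the invisible particles together, and nothing about how it is shared between the neutrino and the anti-neutrino. That is exactly the missing information that prevents us from computing the mass of the parent.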
Instead the experimenters do something more difficult. They take all the collisions that have a charged lepton-antilepton pair and strong signs of invisible particles, as in the picture shown in Fig. 8, and measure the energies and directions of motion of the two observed particles. Then they make plots of various quantities, such as the sum of the energies of the lepton and anti-lepton [or rather, yet again, their invariant mass], or the angle between their directions of motion. Next, the experimentalists have to carefully determine what they expect; in other words, they have to estimate the background. They use the fact that theorists with special expertise have made detailed, but somewhat uncertain, predictions for how often collisions like this occur in the Standard Model from non-Higgs processes. They use other data to check whether the theorists seem to have done it right. And they also have to use what they know about how their detectors work (no detector is perfect, nor can its imperfections be perfectly known). Theorists also have predictions for the signal: an estimate of what a collection of collisions that make Higgs particles of a given mass will look like.
And then the experimentalists ask — does the data look more like our expected background, or more like the sum of the expected background and the expected signal?
In Fig. 9 is shown, on the left, what we would expect the data to look like if (a) there is no signal, only background, and (b) the estimate of background, the blue curve, is correct. And on the right is shown what we would expect the data to look like if (a) there is a signal, of size and shape shown in the red curve, that should be added to the background blue curve, and (b) the estimate of background is correct. In the latter case, the data will be above the expectation in the region where there is a signal.
The problem with this search is that determining the background is very, very difficult, and the signal does not look that different from the background. It’s looking for your friends in various shades of red, pink and orange jackets within a crowd of people, quite a few of whom (but you aren’t sure how many) also wear jackets of those colors.
The real data from the ATLAS and CMS experiments is shown in Fig. 10. Currently the data looks like the right-hand plot in Fig. 9, not the left-hand plot. I have marked the excesses. They are not very impressive by eye, but notice that they are present over several nearby data points, which is what makes them significant. What do these excesses mean? Either (a) there is a signal, or (b) the background curve is wrong, or (c) it’s just a statistical fluke that will go away with more data. (ATLAS and CMS have made a number of other plots that look at other properties of the lepton and anti-lepton, and their features are somewhat similar; they see small excesses, but it is hard to know what they mean. But both experiments see them, which is tantalizing.)
As the number of excess events is still small, the difference of data from expectation might be a statistical accident. Over time, the possibility of a statistical accident will die away. But the question of whether the background is correctly understood may remain for quite some time. The determination of the backgrounds, which involves a complex combination of both the calculations by theorists and various measurements by experimentalists, might well be wrong, by as much as 20% or even more. That would be enough to muck up the measurement and create an apparent signal. And there’s no simple way, for this search strategy, to check what the experimentalists have done. There are many cross-checks to perform, some of which will require more data, and it isn’t clear how convincing they will become.
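[To make the danger concrete, here is a toy Python calculation, with invented numbers, of how a 20% underestimate of the background can look just like a signal. The function and its parameters are hypothetical, and it ignores statistical fluctuations entirely; it only shows the size of the effect on average.]

```python
import math

def apparent_excess(estimated_background, true_scale=1.0, signal=0.0):
    """Average excess over the estimated background, in standard deviations.

    estimated_background: events the experimenters expect from background.
    true_scale: the true background divided by the estimate
                (1.2 means the estimate is 20% too low).
    signal: average number of genuine signal events, if any.
    """
    average_observed = true_scale * estimated_background + signal
    excess = average_observed - estimated_background
    return excess / math.sqrt(estimated_background)

# Hypothetical case A: no Higgs at all, but the background is 20% too low.
print(apparent_excess(400, true_scale=1.2))
# Hypothetical case B: the background is right, and there are 80 signal events.
print(apparent_excess(400, signal=80))
```

Both cases give an excess of about 4 standard deviations, and collecting more data makes both grow in the same way. More data can eliminate a statistical accident, but on its own it cannot tell these two cases apart.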
One more problem with this search strategy compared to the easy ones is that there is no way to get a precise measurement of M, the mass of the Higgs particle, even if it is there. The signals (red curves) in Fig. 10 are drawn for a particular value of the Higgs mass. If you imagine a Higgs with a larger mass, the shape of the signal would change gradually, but it would remain something of a blob, shifted over to the right. So it’s not so easy to see the difference between the signal from a Higgs of mass energy 140 GeV and one of 155 GeV. In contrast, the easy searches get you a nice sharp signal that looks very different for a Higgs of 140 GeV and one of 143 GeV (see Fig. 6). This is why the hints right now allow a Higgs particle over quite a wide range of masses; the shape of the excess in Fig. 10 could easily be consistent with a particle of mass energy 130 GeV or 145 GeV.
So where do things stand right now, at the end of July 2011? ATLAS and CMS at the LHC have ruled out much of the high-mass region above 160 GeV, and have seen a hint of a signal in the low-mass region below 160 GeV. The Tevatron experiments may also see a hint in the low-mass region. But all of them are currently forced to rely mainly on the very difficult search technique, where even the best scientists might make errors or fall into subtle traps… and they all might be making the same mistake or be caught in the same trap. Moreover, the data is limited enough right now that it is even possible that simple statistical accidents are at the heart of the current excitement.
Again, though, the good news is that over the next 6 to 12 months this will change. With more data, if the hints become more solid, statistical accident will no longer be an acceptable explanation. Also, it will become easier to check whether the more difficult search strategy is working properly. And perhaps most importantly, the easier search techniques will become much more powerful, eventually giving strong and unambiguous evidence for or against the presence of the Standard Model Higgs particle in the mass range that could generate the current hints. If the Standard Model Higgs particle is ruled out, the LHC experimentalists will have to redouble their efforts to look for other possible types of Higgs particles. If instead convincing evidence emerges in favor of a signal, then, after the champagne bottles are quaffed and the hangovers have diminished, the hard work of studying the new particle in detail — to check that it really is a Standard Model Higgs particle and not a look-alike — will begin.
Some more details
Experts will note that I have left out many interesting and in some cases important details. Over time I will supplement this article with some more plots and interpretation, or link it to new ones that do the same.