Category Archives: LHC Background Info

The Importance and Challenges of “Open Data” at the Large Hadron Collider

A little while back I wrote a short post about some research that some colleagues and I did using “open data” from the Large Hadron Collider [LHC]. We used data made public by the CMS experimental collaboration — about 1% of their current data — to search for a new particle, using a couple of twists (as proposed over 10 years ago) on a standard technique.  (CMS is one of the two general-purpose particle detectors at the LHC; the other is called ATLAS.)  We had two motivations: (1) Even if we didn’t find a new particle, we wanted to prove that our search method was effective; and (2) we wanted to stress-test the CMS Open Data framework, to assure it really does provide all the information needed for a search for something unknown.

Recently I discussed (1), and today I want to address (2): to convey why open data from the LHC is useful but controversial, and why we felt it was important, as theoretical physicists (i.e. people who perform particle physics calculations, but do not build and run the actual experiments), to do something with it that is usually the purview of experimenters.

The Importance of Archiving Data

In many subfields of physics and astronomy, data from experiments is made public as a matter of routine. Usually this occurs after a substantial delay, to allow the experimenters who collected the data to analyze it first for major discoveries. That’s as it should be: the experimenters spent years of their lives proposing, building and testing the experiment, and they deserve an uninterrupted opportunity to investigate its data. To force them to release data immediately would create a terrible disincentive for anyone to do all the hard work!

Data from particle physics colliders, however, has not historically been made public. More worrying, it has rarely been archived in a form that is easy for others to use at a later date. I’m not the right person to tell you the history of this situation, but I can give you a sense for why this still happens today.

Breaking a Little New Ground at the Large Hadron Collider

Today, a small but intrepid band of theoretical particle physicists (professor Jesse Thaler of MIT, postdocs Yotam Soreq and Wei Xue of CERN, Harvard Ph.D. student Cari Cesarotti, and myself) put out a paper that is unconventional in two senses. First, we looked for new particles at the Large Hadron Collider in a way that hasn’t been done before, at least in public. And second, we looked for new particles at the Large Hadron Collider in a way that hasn’t been done before, at least in public.

And no, there’s no error in the previous paragraph.

1) We used a small amount of actual data from the CMS experiment, even though we’re not ourselves members of the CMS experiment, to do a search for a new particle. Both ATLAS and CMS, the two large multipurpose experimental detectors at the Large Hadron Collider [LHC], have made a small fraction of their proton-proton collision data public, through a website called the CERN Open Data Portal. Some experts, including my co-authors Thaler, Xue and their colleagues, have used this data (and the simulations that accompany it) to do a variety of important studies involving known particles and their properties. [Here’s a blog post by Thaler concerning Open Data and its importance from his perspective.] But our new study is the first to look for signs of a new particle in this public data. While our chances of finding anything were low, we had a larger goal: to see whether Open Data could be used for such searches. We hope our paper provides some evidence that Open Data offers a reasonable path for preserving priceless LHC data, allowing it to be used as an archive by physicists of the post-LHC era.

2) Since only a tiny fraction of CMS’s data was available to us, about 1% by some counts, how could we have done anything useful compared to what the LHC experts have already done? Well, that’s why we examined the data in a slightly unconventional way (one of several methods that I’ve advocated for many years, but which has not been used in any public study). This allowed us to explore some ground that no one had yet swept clean, and even gave us a tiny chance of an actual discovery! But the larger scientific goal, absent a discovery, was to prove the value of this unconventional strategy, in hopes that the experts at CMS and ATLAS will use it (and others like it) in future. Their chance of discovering something new, using their full data set, is vastly greater than ours ever was.

Now don’t all go rushing off to download and analyze terabytes of CMS Open Data; you’d better know what you’re getting into first. It’s worthwhile, but it’s not easy going. LHC data is extremely complicated, and until this project I’ve always been skeptical that it could be released in a form that anyone outside the experimental collaborations could use. Downloading the data and turning it into a manageable form is itself a major task. Then, while studying it, there are an enormous number of mistakes that you can make (and we made quite a few of them) and you’d better know how to make lots of cross-checks to find your mistakes (which, fortunately, we did know; we hope we found all of them!) The CMS personnel in charge of the Open Data project were enormously helpful to us, and we’re very grateful to them; but since the project is new, there were inevitable wrinkles which had to be worked around. And you’d better have some friends among the experimentalists who can give you advice when you get stuck, or point out aspects of your results that don’t look quite right. [Our thanks to them!]

All in all, this project took us two years! Well, honestly, it should have taken half that time — but it couldn’t have taken much less than that, with all we had to learn. So trying to use Open Data from an LHC experiment is not something you do in your idle free time.

Nevertheless, I feel it was worth it. At a personal level, I learned a great deal more about how experimental analyses are carried out at CMS, and by extension, at the LHC more generally. And more importantly, we were able to show what we’d hoped to show: that there are still tremendous opportunities for discovery at the LHC, through the use of (even slightly) unconventional model-independent analyses. It’s a big world to explore, and we took only a small step in the easiest direction, but perhaps our efforts will encourage others to take bigger and more challenging ones.

For those readers with greater interest in our work, I’ll put out more details in two blog posts over the next few days: one about what we looked for and how, and one about our views regarding the value of open data from the LHC, not only for our project but for the field of particle physics as a whole.

LHC Starts Collisions; and a Radio Interview Tonight

In the long and careful process of restarting the Large Hadron Collider [LHC] after its two-year nap for upgrades and repairs, another milestone has been reached: protons have once again collided inside the LHC’s experimental detectors (named ATLAS, CMS, LHCb and ALICE). This is good news, but don’t get excited yet. It’s just one small step. These are collisions at the lowest energy at which the LHC operates (450 GeV per proton, to be compared with the 4000 GeV per proton in 2012 and the 6500 GeV per proton they’ve already achieved in the last month, though in non-colliding beams). Also the number of protons in the beams, and the number of collisions per second, are still very, very small compared to what will be needed. So discoveries are not imminent!  Yesterday’s milestone was just one of the many little tests that are made to assure that the LHC is properly set up and ready for the first full-energy collisions, which should start in about a month.
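For scale, the collision energy of two equal, head-on beams is just the sum of the two per-proton energies. Here is a quick sketch of the arithmetic, using only the numbers quoted above (an illustration, not an LHC tool):

```python
def collision_energy_tev(e_per_proton_gev):
    """Center-of-mass energy of a head-on collision of two equal-energy
    proton beams, in TeV: twice the energy per proton, converted from GeV."""
    return 2 * e_per_proton_gev / 1000.0

# Per-proton beam energies mentioned in the text, in GeV.
for e in (450, 4000, 6500):
    print(f"{e} GeV per proton -> {collision_energy_tev(e):g} TeV per collision")
```

This is why the 2012 run is described as an 8 TeV run, and why the 6500 GeV beams point toward 13 TeV collisions.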

But since full-energy collisions are on the horizon, why not listen to a radio show about what the LHC will be doing after its restart is complete? Today (Wednesday May 6th), Virtually Speaking Science, on which I have appeared a couple of times before, will run a program at 5 pm Pacific time (8 pm Eastern). Science writer Alan Boyle will be interviewing me about the LHC’s plans for the next few months and the coming years. You can listen live, or listen later once they post it.  Here’s the link for the program.

More on Dark Matter and the Large Hadron Collider

As promised in my last post, I’ve now written the answer to the second of the three questions I posed about how the Large Hadron Collider [LHC] can search for dark matter.  You can read the answers to the first two questions here. The first question was about how scientists can possibly look for something that passes through a detector without leaving any trace!  The second question is how scientists can tell the difference between ordinary production of neutrinos — which also leave no trace — and production of something else. [The answer to the third question — how one could determine this “something else” really is what makes up dark matter — will be added to the article later this week.]

In the meantime, after Monday’s post, I got a number of interesting questions about dark matter, why most experts are confident it exists, etc.  There are many reasons to be confident; it’s not just one argument, but a set of interlocking arguments.  One of the most powerful comes from simulations of the universe’s history.  These simulations

  • start with what we think we know about the early universe from the cosmic microwave background [CMB], including the amount of ordinary and dark matter inferred from the CMB (assuming Einstein’s gravity theory is right), and also including the degree of non-uniformity of the local temperature and density;
  • and use equations for known physics, including Einstein’s gravity, the behavior of gas and dust when compressed and heated, the effects of various forms of electromagnetic radiation on matter, etc.

The output of these simulations is a prediction for the universe today — and indeed, it roughly has the properties of the one we inhabit.
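The gravitational heart of such simulations can be sketched in a few dozen lines. The toy below is emphatically not Illustris: it ignores cosmic expansion, gas, and radiation, and simply evolves a handful of “cold” (initially at rest) point masses under softened Newtonian gravity with a standard leapfrog integrator. All units and parameter values are arbitrary choices for illustration.

```python
import random

G = 1.0           # gravitational constant in arbitrary code units
SOFTENING = 0.05  # prevents infinite forces when two particles come very close

def accelerations(positions, masses):
    """Softened Newtonian gravitational acceleration on each particle."""
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + SOFTENING**2
            inv_r3 = r2 ** -1.5
            for k in range(3):
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
    return acc

def leapfrog(positions, velocities, masses, dt, steps):
    """Kick-drift-kick leapfrog: a symplectic integrator standard in N-body codes."""
    acc = accelerations(positions, masses)
    for _ in range(steps):
        for i in range(len(positions)):
            for k in range(3):
                velocities[i][k] += 0.5 * dt * acc[i][k]  # half kick
                positions[i][k] += dt * velocities[i][k]  # drift
        acc = accelerations(positions, masses)
        for i in range(len(positions)):
            for k in range(3):
                velocities[i][k] += 0.5 * dt * acc[i][k]  # second half kick
    return positions, velocities

random.seed(0)
n = 30
pos = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(n)]  # near-uniform start
vel = [[0.0, 0.0, 0.0] for _ in range(n)]                            # "cold": initially at rest
mass = [1.0 / n] * n
pos, vel = leapfrog(pos, vel, mass, dt=0.01, steps=200)
print("sample particle position after gravitational clumping:", pos[0])
```

Even in this cartoon, an initially near-uniform distribution develops clumps under gravity alone; the real codes add the physics listed above on top of this gravitational skeleton.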

Here’s a video from the Illustris collaboration, which has done the most detailed simulation of the universe so far.  Note the age of the universe listed at the bottom as the video proceeds.  On the left side of the video you see dark matter.  It quickly clumps under the force of gravity, forming a wispy, filamentary structure with dense knots, which then becomes rather stable; moderately dense regions are blue, highly dense regions are pink.  On the right side is shown gas.  You see that after the dark matter structure begins to form, that structure attracts gas, also through gravity, which then forms galaxies (blue knots) around the dense knots of dark matter.  The galaxies then form black holes with energetic disks and jets, and stars, many of which explode.   These much more complicated astrophysical effects blow clouds of heated gas (red) into intergalactic space.

Meanwhile, the distribution of galaxies in the real universe, as measured by astronomers, is illustrated in this video from the Sloan Digital Sky Survey.   You can see by eye that the galaxies in our universe show a filamentary structure, with big nearly-empty spaces, and loose strings of galaxies ending in big clusters.  That’s consistent with what is seen in the Illustris simulation.

Now if you’d like to drop the dark matter idea, the question you have to ask is this: could the simulations still give a universe similar to ours if you took dark matter out and instead modified Einstein’s gravity somehow?  [Usually this type of change goes under the name of MOND.]

In the simulation, gravity causes the dark matter, which is “cold” (cosmo-speak for “made from objects traveling much slower than light speed”), to form filamentary structures that then serve as the seeds for gas to clump and form galaxies.  So if you want to take the dark matter out, and instead change gravity to explain other features that are normally explained by dark matter, you have a challenge.   You are in danger of not creating the filamentary structure seen in our universe.  Somehow your change in the equations for gravity has to cause the gas to form galaxies along filaments, and do so in the time allotted.  Otherwise it won’t lead to the type of universe that we actually live in.

Challenging, yes.  Challenging is not the same as impossible. But everyone should understand that the arguments in favor of dark matter are by no means limited to the questions of how stars move in galaxies and how galaxies move in galaxy clusters.  Any implementation of MOND has to explain a lot of other things that, in most experts’ eyes, are efficiently taken care of by cold dark matter.

Dark Matter: How Could the Large Hadron Collider Discover It?

Dark Matter. Its existence is still not 100% certain, but if it exists, it is exceedingly dark, both in the usual sense — it doesn’t emit light or reflect light or scatter light — and in a more general sense — it doesn’t interact much, in any way, with ordinary stuff, like tables or floors or planets or  humans. So not only is it invisible (air is too, after all, so that’s not so remarkable), it’s actually extremely difficult to detect, even with the best scientific instruments. How difficult? We don’t even know, but certainly more difficult than neutrinos, the most elusive of the known particles. The only way we’ve been able to detect dark matter so far is through the pull it exerts via gravity, which is big only because there’s so much dark matter out there, and because it has slow but inexorable and remarkable effects on things that we can see, such as stars, interstellar gas, and even light itself.

About a week ago, the mainstream press was reporting, inaccurately, that the leading aim of the Large Hadron Collider [LHC], after its two-year upgrade, is to discover dark matter. [By the way, on Friday the LHC operators made the first beams with energy-per-proton of 6.5 TeV, a new record and a major milestone in the LHC’s restart.]  There are many problems with such a statement, as I commented in my last post, but let’s leave all that aside today… because it is true that the LHC can look for dark matter.   How?

When people suggest that the LHC can discover dark matter, they are implicitly assuming

  • that dark matter exists (very likely, but perhaps still with some loopholes),
  • that dark matter is made from particles (which isn’t established yet) and
  • that dark matter particles can be commonly produced by the LHC’s proton-proton collisions (which need not be the case).

You can question these assumptions, but let’s accept them for now.  The question for today is this: since dark matter barely interacts with ordinary matter, how can scientists at an LHC experiment like ATLAS or CMS, which is made from ordinary matter of course, have any hope of figuring out that they’ve made dark matter particles?  What would have to happen before we could see a BBC or New York Times headline that reads, “Large Hadron Collider Scientists Claim Discovery of Dark Matter”?

Well, to address this issue, I’m writing an article in three stages. Each stage answers one of the following questions:

  1. How can scientists working at ATLAS or CMS be confident that an LHC proton-proton collision has produced an undetected particle — whether this be simply a neutrino or something unfamiliar?
  2. How can ATLAS or CMS scientists tell whether they are making something new and Nobel-Prizeworthy, such as dark matter particles, as opposed to making neutrinos, which they do every day, many times a second?
  3. How can we be sure, if ATLAS or CMS discovers they are making undetected particles through a new and unknown process, that they are actually making dark matter particles?

My answer to the first question is finished; you can read it now if you like.  The second and third answers will be posted later during the week.

But if you’re impatient, here are highly compressed versions of the answers, in a form which is accurate, but admittedly not very clear or precise.

  1. Dark matter particles, like neutrinos, would not be observed directly. Instead their presence would be indirectly inferred, by observing the behavior of other particles that are produced alongside them.
  2. It is impossible to directly distinguish dark matter particles from neutrinos or from any other new, equally undetectable particle. But the equations used to describe the known elementary particles (the “Standard Model”) predict how often neutrinos are produced at the LHC. If the number of neutrino-like objects is larger than the prediction, that will mean something new is being produced.
  3. To confirm that dark matter is made from LHC’s new undetectable particles will require many steps and possibly many decades. Detailed study of LHC data can allow properties of the new particles to be inferred. Then, if other types of experiments (e.g. LUX or COGENT or Fermi) detect dark matter itself, they can check whether it shares the same properties as LHC’s new particles. Only then can we know if LHC discovered dark matter.

I realize these brief answers are cryptic at best, so if you want to learn more, please check out my new article.
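The logic of answer 1 can be made concrete: momentum transverse to the beams is conserved and starts at essentially zero, so if the detected particles’ transverse momenta fail to balance, something invisible must have carried off the difference. Here is a minimal sketch; the event and all its numbers are invented for illustration, and real analyses sum over far more objects and apply many corrections.

```python
import math

def missing_transverse_momentum(visible_particles):
    """Given the (px, py) transverse momenta of every detected particle in a
    collision, return the magnitude of the missing transverse momentum:
    whatever is needed to balance the event, attributed to invisible
    particles (neutrinos, or possibly something new)."""
    sum_px = sum(px for px, _ in visible_particles)
    sum_py = sum(py for _, py in visible_particles)
    # The invisible recoil is minus the visible sum; we report its magnitude.
    return math.hypot(sum_px, sum_py)

# Invented event: two jets and a muon, transverse momenta in GeV.
event = [(120.0, 10.0), (-60.0, -80.0), (-20.0, 30.0)]
met = missing_transverse_momentum(event)
print(f"missing transverse momentum: {met:.1f} GeV")  # large imbalance -> invisible particle(s)
```

A perfectly balanced event gives zero; a large imbalance is the indirect signature that an undetected particle was produced alongside the visible ones.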

How a Trigger Can Potentially Make or Break an LHC Discovery

Triggering is an essential part of the Large Hadron Collider [LHC]; there are so many collisions happening each second at the LHC, compared to the number that the experiments can afford to store for later study, that the data about most of the collisions (99.999%) have to be thrown away immediately, completely and permanently within a second after the collisions occur.  The automated filter, partly hardware and partly software, that is programmed to make the decision as to what to keep and what to discard is called “the trigger”.  This all sounds crazy, but it’s necessary, and it works.   Usually.
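As a cartoon of what such a filter does, consider the sketch below; the threshold, the event format, and the rates are all invented for illustration, whereas a real trigger combines hundreds of hardware and software conditions within about a second.

```python
import random

PT_THRESHOLD = 100.0  # GeV; an invented threshold for illustration

def trigger(event):
    """Toy trigger decision: keep an event only if it contains at least one
    object with high transverse momentum. `event` is a list of the
    transverse momenta (GeV) of the reconstructed objects in one collision."""
    return any(pt > PT_THRESHOLD for pt in event)

random.seed(1)
# Simulate many collisions; most produce only low-momentum debris
# (an exponential with mean ~15 GeV stands in for the steeply falling
# true momentum spectrum).
events = [[random.expovariate(1 / 15.0) for _ in range(8)] for _ in range(100_000)]
kept = [e for e in events if trigger(e)]
print(f"kept {len(kept)} of {len(events)} events "
      f"({100 * len(kept) / len(events):.3f}%); the rest are discarded forever")
```

Even this one crude condition throws away the vast majority of collisions, which is exactly the point: the art is in choosing conditions that discard the routine without discarding the discovery.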

Let me give you one very simple example of how things can go wrong, and how the ATLAS and CMS experiments [the two general purpose experiments at the LHC] attempted to address the problem.  Before you read this, you may want to read my last post, which gives an overview of what I’ll be talking about in this one.


Day 2 At CERN

Day 2 of my visit to CERN (host laboratory of the Large Hadron Collider [LHC]) was a pretty typical CERN day for me. Here’s a rough sketch of how it panned out:

  • 1000: after a few chores, arrived at CERN by tram. Worked on my ongoing research project #1. Answered an email about my ongoing research project #2.
  • 1100: attended a one-hour talk, much of it historical, by Chris Quigg, one of the famous experts on “quarkonium” (atom-like objects made from a quark and a matching anti-quark, the term generally referring specifically to charm and bottom quarks). Charmonium (charm quark/antiquark atoms) was discovered 40 years ago this week, in two very different experiments.
  • 1200: Started work on the talk that I am giving on the afternoon of Day 3 to some experimentalists who work at ATLAS. [ATLAS and CMS are the two general-purpose experimental detectors at the LHC; they were used to discover the Higgs particle.] It involves some new insights concerning the search for long-lived particles (hypothesized types of new particles that would typically travel a distance of at least a millimeter, and possibly a meter or more, before decaying to other particles).
  • 1230: Working lunch with an experimentalist from ATLAS and another theorist, mainly discussing triggering, and other related issues, concerning long-lived particles. Learned a lot about the new opportunities that ATLAS will have starting in 2015.
  • 1400: In an extended discussion with two other theorists, got a partial answer to a subtle question that arose in my research project #2.
  • 1415: Sent an email to my collaborators on research project #2.
  • 1430: Back to work on my talk for Day 3. Reading some relevant papers, drawing some illustrations, etc.
  • 1600: Two-hour conversation over coffee with an experimentalist from CMS, yet again about triggering, regarding long-lived particles, exotic decays of the Higgs particle, and both at once. Learned a lot of important things about CMS’s plans for the near-term and medium-term future, as well as some of the subtle issues with collecting and analyzing data that are likely to arise in 2015, when the LHC begins running again.

[Why triggering, triggering, triggering? Because if you don’t collect the data in the first place, you can’t analyze it later!  We have to be working on triggering in 2014-2015, before the LHC takes data again in 2015-2018.]

  • 1800: An hour to work on the talk again.
  • 1915: Skype conversation with two of my collaborators in research project #1, about a difficult challenge which had been troubling me for over a week. Subtle theoretical issues and heavy duty discussion, but worth it in the end; most of the issues look like they may be resolvable.
  • 2100: Noticed the time and that I hadn’t eaten dinner yet. Went to the CERN cafeteria and ate dinner while answering emails.
  • 2130: More work on the talk for Day 3.
  • 2230: Left CERN. Wrote blog post on the tram to the hotel.
  • 2300: Went back to work in my hotel room.

Day 1 was similarly busy and informative, but had the added feature that I hadn’t slept since the previous day. (I never seem to sleep on overnight flights.) Day 3 is likely to be as busy as Day 2. I’ll be leaving Geneva before dawn on Day 4, heading to a conference.

It’s a hectic schedule, but I’m learning many things!  And if I can help make these huge and crucial experiments more powerful, and give my colleagues a greater chance of a discovery and a reduced chance of missing one, it will all be worth it.

If It Holds Up, What Might BICEP2’s Discovery Mean?

Well, yesterday was quite a day, and I’m still sifting through the consequences.

First things first.  As with all major claims of discovery, considerable caution is advised until the BICEP2 measurement has been verified by some other experiment.   Moreover, even if the measurement is correct, one should not assume that the interpretation in terms of gravitational waves and inflation is correct; this requires more study and further confirmation.

The media is assuming BICEP2’s measurement is correct, and that the interpretation in terms of inflation is correct, but leading scientists are not so quick to rush to judgment, and are thinking things through carefully.  Scientists are cautious not just because they’re trained to be thoughtful and careful but also because they’ve seen many claims of discovery withdrawn or discredited; discoveries are made when humans go where no one has previously gone, with technology that no one has previously used — and surprises, mistakes, and misinterpretations happen often.

But in this post, I’m going to assume that BICEP2’s results are correct, or essentially correct, and are being correctly interpreted.  Let’s assume that [here’s a primer on yesterday’s result that defines these terms]

  • they really have detected “B-mode polarization” in the “CMB” [Cosmic Microwave Background, the photons (particles of light) that are the ancient, cool glow left over from the Hot Big Bang], and
  • this B-mode polarization really is a sign of gravitational waves generated during a brief but dramatic period of cosmic inflation that immediately preceded the Hot Big Bang.

Then — IF BICEP2’s results were basically right and were being correctly interpreted concerning inflation — what would be the implications?

Well… Wow…  They’d really be quite amazing.