Today, a small but intrepid band of theoretical particle physicists (professor Jesse Thaler of MIT, postdocs Yotam Soreq and Wei Xue of CERN, Harvard Ph.D. student Cari Cesarotti, and myself) put out a paper that is unconventional in two senses. First, we looked for new particles at the Large Hadron Collider in a way that hasn’t been done before, at least in public. And second, we looked for new particles at the Large Hadron Collider in a way that hasn’t been done before, at least in public.
And no, there’s no error in the previous paragraph.
1) We used a small amount of actual data from the CMS experiment, even though we’re not ourselves members of the CMS experiment, to do a search for a new particle. Both ATLAS and CMS, the two large multipurpose experimental detectors at the Large Hadron Collider [LHC], have made a small fraction of their proton-proton collision data public, through a website called the CERN Open Data Portal. Some experts, including my co-authors Thaler, Xue and their colleagues, have used this data (and the simulations that accompany it) to do a variety of important studies involving known particles and their properties. [Here’s a blog post by Thaler concerning Open Data and its importance from his perspective.] But our new study is the first to look for signs of a new particle in this public data. While our chances of finding anything were low, we had a larger goal: to see whether Open Data could be used for such searches. We hope our paper provides some evidence that Open Data offers a reasonable path for preserving priceless LHC data, allowing it to be used as an archive by physicists of the post-LHC era.
2) Since only a tiny fraction of CMS’s data was available to us, about 1% by some count, how could we have done anything useful compared to what the LHC experts have already done? Well, that’s why we examined the data in a slightly unconventional way (one of several methods that I’ve advocated for many years, but that has not been used in any public study). This approach allowed us to explore some ground that no one had yet swept clean, and even to have a tiny chance of an actual discovery! But the larger scientific goal, absent a discovery, was to prove the value of this unconventional strategy, in hopes that the experts at CMS and ATLAS will use it (and others like it) in future. Their chance of discovering something new, using their full data set, is vastly greater than ours ever was.
Now don’t all go rushing off to download and analyze terabytes of CMS Open Data; you’d better know what you’re getting into first. It’s worthwhile, but it’s not easy going. LHC data is extremely complicated, and until this project I’ve always been skeptical that it could be released in a form that anyone outside the experimental collaborations could use. Downloading the data and turning it into a manageable form is itself a major task. Then, while studying it, there are an enormous number of mistakes that you can make (and we made quite a few of them), and you’d better know how to make lots of cross-checks to find your mistakes (which, fortunately, we did know; we hope we found all of them!). The CMS personnel in charge of the Open Data project were enormously helpful to us, and we’re very grateful to them; but since the project is new, there were inevitable wrinkles which had to be worked around. And you’d better have some friends among the experimentalists who can give you advice when you get stuck, or point out aspects of your results that don’t look quite right. [Our thanks to them!]
All in all, this project took us two years! Well, honestly, it should have taken half that time — but it couldn’t have taken much less than that, with all we had to learn. So trying to use Open Data from an LHC experiment is not something you do in your idle free time.
Nevertheless, I feel it was worth it. At a personal level, I learned a great deal more about how experimental analyses are carried out at CMS, and by extension, at the LHC more generally. And more importantly, we were able to show what we’d hoped to show: that there are still tremendous opportunities for discovery at the LHC, through the use of (even slightly) unconventional model-independent analyses. It’s a big world to explore, and we took only a small step in the easiest direction, but perhaps our efforts will encourage others to take bigger and more challenging ones.
For those readers with greater interest in our work, I’ll put out more details in two blog posts over the next few days: one about what we looked for and how, and one about our views regarding the value of open data from the LHC, not only for our project but for the field of particle physics as a whole.