Breaking a Little New Ground at the Large Hadron Collider

Today, a small but intrepid band of theoretical particle physicists (professor Jesse Thaler of MIT, postdocs Yotam Soreq and Wei Xue of CERN, Harvard Ph.D. student Cari Cesarotti, and myself) put out a paper that is unconventional in two senses. First, we looked for new particles at the Large Hadron Collider in a way that hasn’t been done before, at least in public. And second, we looked for new particles at the Large Hadron Collider in a way that hasn’t been done before, at least in public.

And no, there’s no error in the previous paragraph.

1) We used a small amount of actual data from the CMS experiment, even though we’re not ourselves members of the CMS experiment, to do a search for a new particle. Both ATLAS and CMS, the two large multipurpose experimental detectors at the Large Hadron Collider [LHC], have made a small fraction of their proton-proton collision data public, through a website called the CERN Open Data Portal. Some experts, including my co-authors Thaler, Xue and their colleagues, have used this data (and the simulations that accompany it) to do a variety of important studies involving known particles and their properties. [Here’s a blog post by Thaler concerning Open Data and its importance from his perspective.] But our new study is the first to look for signs of a new particle in this public data. While our chances of finding anything were low, we had a larger goal: to see whether Open Data could be used for such searches. We hope our paper provides some evidence that Open Data offers a reasonable path for preserving priceless LHC data, allowing it to be used as an archive by physicists of the post-LHC era.

2) Since only a tiny fraction of CMS’s data was available to us, about 1% by some count, how could we have done anything useful compared to what the LHC experts have already done? Well, that’s why we examined the data in a slightly unconventional way (one of several methods that I’ve advocated for many years, but that has not been used in any public study). This allowed us to explore some ground that no one had yet swept clean, and even have a tiny chance of an actual discovery! But the larger scientific goal, absent a discovery, was to prove the value of this unconventional strategy, in hopes that the experts at CMS and ATLAS will use it (and others like it) in future. Their chance of discovering something new, using their full data set, is vastly greater than ours ever was.

Now don’t all go rushing off to download and analyze terabytes of CMS Open Data; you’d better know what you’re getting into first. It’s worthwhile, but it’s not easy going. LHC data is extremely complicated, and until this project I’ve always been skeptical that it could be released in a form that anyone outside the experimental collaborations could use. Downloading the data and turning it into a manageable form is itself a major task. Then, while studying it, there are an enormous number of mistakes that you can make (and we made quite a few of them), and you’d better know how to make lots of cross-checks to find your mistakes (which, fortunately, we did know; we hope we found all of them!). The CMS personnel in charge of the Open Data project were enormously helpful to us, and we’re very grateful to them; but since the project is new, there were inevitable wrinkles which had to be worked around. And you’d better have some friends among the experimentalists who can give you advice when you get stuck, or point out aspects of your results that don’t look quite right. [Our thanks to them!]

All in all, this project took us two years! Well, honestly, it should have taken half that time — but it couldn’t have taken much less than that, with all we had to learn. So trying to use Open Data from an LHC experiment is not something you do in your idle free time.

Nevertheless, I feel it was worth it. At a personal level, I learned a great deal more about how experimental analyses are carried out at CMS, and by extension, at the LHC more generally. And more importantly, we were able to show what we’d hoped to show: that there are still tremendous opportunities for discovery at the LHC, through the use of (even slightly) unconventional model-independent analyses. It’s a big world to explore, and we took only a small step in the easiest direction, but perhaps our efforts will encourage others to take bigger and more challenging ones.

For those readers with greater interest in our work, I’ll put out more details in two blog posts over the next few days: one about what we looked for and how, and one about our views regarding the value of open data from the LHC, not only for our project but for the field of particle physics as a whole.

37 responses to “Breaking a Little New Ground at the Large Hadron Collider”

  1. Nice to have you back… please don’t leave such a long gap till your next posting! Always good to hear from you.

  2. Kudos to your team and you! It is great to hear about your work!

  3. Muhammad Adeel Ajaib

    Prof. Matt, if I am not mistaken, the two senses you wrote in the first paragraph seem to be the same.

  4. Hans van der Valk

    Dear Professor Matt.
    Yes, I also fell into your trap. But into a trap you set years ago, when you wrote for the first time about the difference between normal springs and quantum springs. That made me study Elementary and Composite Particles, followed by all the basics of Quantum Mechanics. I am 73 and have been addicted to science all my life. My study is an every-morning kitchen table study using documents from MIT, CIT, and many other universities.
    And … please promise me not to let these big gaps between your postings anymore.

    • Thank you for this message… it is very gratifying to have excited you about the science. And I will try to write more over the coming year than I did in 2018; finishing this paper may help with that.

  5. I strongly agree with you that LHC should share their data with more public.

    I hope even with my laptop at home I can start to analyze the data in a way I want.

    they are using public money and so much money. and yet so far no physics beyond standard model.

    let us try the data.

    • I think you will need more than your laptop at home. I was only able to use my laptop after much larger computers were used to reduce the data down to a smaller subset.

  6. I’ll be following these posts…

    I understand the sheer mass of data, which is unsurprising. But is this really the sole nature of its inaccessibility? Also pretty familiar with “proprietary data” in research… sometimes just a good way to keep your cards close when there’s something to be made with what it might produce.

    • It’s not the amount of data, which is only a minor problem. It’s complexity. The LHC environment is constantly changing, and particles are produced in very messy collisions. The detectors are excellent, but the small errors that they make can easily look like signs of new particles to the novice, so you have to understand and remove all those small errors before you can even start. 99.9999% of the data is thrown away as it comes in, using a filter called the “trigger”; you must correct for its effects before you measure anything. And it goes on from there….

  8. Matt,
    With LS2 going on they should be able to raise the energy above 6499 GeV per beam, or 12998 GeV total. It was somewhat amusing that they did not want to have the total be 13 TeV. Sounds like some were superstitious.

  9. I always enjoy hearing from you Matt. You taught me everything I know about Field Theory. Your posts are at an excellent level for me. Susan J. Feingold Composer & sometime Physicist.

  10. Professor,
    Thanks for your post and public work.
    Correct me, if I am wrong, there exists math. theory of rare events, and you somehow use this theory. I am very glad you are back with us and inform us of your work. I wish you and your team discover a new particle using public data.
    Yours, bob-2

    • It’s not a theory of rare events, no. Theories in particle physics are much more specific than that, of the form: let’s imagine there are new particles X and Y with new forces A and B, here are the equations, here are the predictions for the rate at which X is created, and here is the fraction of them that would be observed at the LHC; is this possibility excluded by current data, or is there a sign of it in current data, or neither? It’s all very precise and concrete when you carry it out, with detailed calculations that need to be accurate to within a few percent.
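    To make the arithmetic in that reply concrete, here is a minimal sketch of how a predicted production rate becomes an expected number of observed events. It assumes only the standard relation that expected events equal the cross-section times the integrated luminosity times the fraction observed. The function name and every number below are made-up placeholders for illustration, not values from the paper or from any real model.

```python
def expected_events(cross_section_fb, luminosity_fb_inv, observed_fraction):
    """Expected observed events = cross-section * integrated luminosity * fraction observed.

    cross_section_fb   : hypothetical production cross-section, in femtobarns
    luminosity_fb_inv  : integrated luminosity of the data set, in inverse femtobarns
    observed_fraction  : fraction of produced particles that would actually be observed
    """
    if not 0.0 <= observed_fraction <= 1.0:
        raise ValueError("observed_fraction must lie between 0 and 1")
    return cross_section_fb * luminosity_fb_inv * observed_fraction


# Placeholder inputs chosen only to show the calculation.
signal = expected_events(cross_section_fb=50.0,
                         luminosity_fb_inv=2.0,
                         observed_fraction=0.1)
print(f"expected signal events: {signal:.1f}")
```

    Whether such a possibility is excluded then depends on comparing this number with the expected background and with what is actually seen in the data, which is where the few-percent accuracy mentioned above matters.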

  11. It sounds like the people at CERN are overwhelmed with the monumental task of sorting out through all the data and are yelling out for help, true? I think they need better management, assembling the “right” team instead of asking people “outside” the LHC community to find the errors first so it will save them time and effort. I bet they, CERN, have already created a matrix to map the errors as they come in from all sources. Maybe it will help to converge to the right paths?

    On a separate note, I think I’ve found the unified theory! 🙂 Stop laughing and hear me out.

    Space is all that is required to create this magnificent universe. Ah yes, but what is space? Space is the single field, that creates all the rest. It will tend to contract because that is the simplest form, a sphere. But it will expand if it stretches and it will stretch because it’s unstable, nothing is. It’s these oscillations within the field that will create the other fields. It’s these oscillations within the fundamental field (gravity?) that will create “knots”, trapped resonances and hence the particles that define the other fields. Yes, the fundamental field is gravity and gravity is just space. In this model, I do believe that the graviton is not necessary because a particle is not necessary, it’s the “curvature” of space that creates the forces that drive all other fields.

    Please, no profanity. 🙂

    Happy Valentine’s day.

  12. I’m glad you’re blogging again. Regarding the past/future findings of LHC check the link https://www.reddit.com/r/ScienceUncensored/comments/aqcuqu/can_big_science_be_too_big/egf4xi5

  13. /* It sounds like the people at CERN are overwhelmed with the monumental task of sorting out through all the data and are yelling out for help, true? */

    I’m not sure about that; they primarily want to justify building a new collider. If they’re looking for supersymmetry, they should focus on the diphoton decay channel of low-dimensional (collinear) collisions at high luminosities.

    • Not exactly. The story really has nothing to do with that and is very complicated; you can bet they do not think they need help. Nor do I. They just need a push, in my opinion, to try some new strategies.

  14. Matt, it’s great to hear that you’re currently fully occupied at the cutting edge of particle physics, analysing data from the LHC. But meanwhile …..

    If you google ‘proton’, it’s the same old Wikipedia three-quark graphic, 7+ years after you seemed to have conclusively demolished that model in favour of the ‘pandemonium’ picture: gazillions of quarks and gluons accounting both for proton mass and the preponderance of very low energy jets recorded from proton collisions. Do you still stand by that theory? Why is it not more widely accepted? I understand you had to leave ‘Of Particular Significance’ incomplete due to personal and/or professional circumstances, but I think it’s still the best, clearest and most thorough resource for the public understanding of quantum physics. So am I right in thinking the science itself is in the doldrums, or is it just not filtering through to us interested amateurs?

    And now … I’m sure he’s not the only person questioning the very existence of quarks and gluons, but there’s a chap called John Duffield, with his own Topological approach, whose Quora answer to ‘What’s happening inside of a proton …?’ is calling you out personally! Quote: “See my Physics Detective articles on ‘what the proton is not’ and ‘the proton’ for details and references. But don’t tell Matt Strassler!”
    So there: now you’ve been told. Please don’t shoot the messenger!

    • it’s not “a theory”, it’s certainly not “my theory”, and the three quark picture of the proton was demolished by 1973, though not everyone understood it until the late 80s! It’s not the slightest bit controversial: See for example https://www.bnl.gov/newsroom/news.php?a=212163 . Or go to slide 4 of this one: https://slideplayer.com/slide/13025765/ If a proton only had quarks in it you’d never produce top quarks or Higgs bosons at the rate at which they are seen at the LHC. Wikipedia continues to propagate a wrong image, and so do lots of other non-experts.

      • Thanks very much for responding, Matt, and thanks for the two links – very helpful. Surely the problem is not so much that non-experts ‘propagate a wrong image’ but that too many experts (other than yourself!) popularise the three-quark model, presumably in the mistaken belief that the interested public can’t cope with anything more complicated. Just one example of many, from Stephen Hawking’s ‘The Grand Design’: “Protons and neutrons are each composed of three quarks.”
        I’m assuming you think the topological model, which seems to dispense with quarks and gluons altogether, is not even worth commenting on!

        • Actually, the topological model deserves lots of comment! Here’s the surprise: it’s equivalent to the quark/gluon picture!! But this is a highly advanced topic, not easy to explain. Essentially, one has to translate the math of quarks and gluons into an equivalent math of pions and other hadrons, using a poorly-understood technique called a “duality transformation”. And if you do that, the proton becomes a topological object built out of pions. The math behind this transformation seems to be beyond what mathematicians know, so our understanding, while clearly correct, is disappointingly vague. [I covered just a little of this vast subject in an unfortunately unfinished series of articles, https://profmattstrassler.com/2013/09/23/quantum-field-theory-string-theory-and-predictions/ and following posts. Maybe I’ll have time someday to finish them.]

          • Thanks for the link to your “QFT, ST and predictions” thread. I notice that in Parts 5 and 6 of that thread you began to discuss how the method of ‘imaginary’ simplification might elucidate the properties of pions and other lightweight mesons. I was hoping that might lead on to consideration of what still seems a great ‘missing link’ in QT, the Residual Strong Nuclear Force, binding protons to neutrons. I’ve read that pions are exchanged, but explanations are vague and/or inconsistent. Computer power has massively increased since 2013, so do you know of any progress in modelling this process? Or, as you candidly cautioned in ‘What holds nuclei together?’, is it still considered too complicated?

          • I’m actually not entirely sure what the status is on nuclei. I believe there’s been a lot of progress but I have not followed the studies closely enough to know the current situation. Maybe we can find a review talk on the subject that would explain where things are. As for “pion exchange”, this is equivalent (under the quark-hadron duality) to quark exchange between protons and neutrons. The computers can reproduce this phenomenon, but I’m not sure there’s any conceptual picture that comes out of the calculations; it may be that it remains a conceptual mess, and there’s no guarantee that there’s a simplifying view.

  15. Always good to hear from you Matt, your imparted wisdom has grown sparse of late and is greatly missed.

  16. Schröshire Cat

    Nice. Hopefully, you can post some more instructions on how to do that. I started working on primary datasets with ROOT a year ago, but found it difficult to figure out a) what data sets to use and b) determine what’s actually in them, c) what to look for and d) how to look for it.
    Needless to say, I am an amateur with some knowledge but with much more to learn. So, it would be interesting to learn how you physicists go about this.

    • I’m afraid that you basically need to be a near-expert in particle physics to use it, and I do not see a way around that. You should start by learning about the trigger system at ATLAS and CMS, because you cannot hope to understand what is and isn’t in the data if you don’t know about that. And you should learn about the ten or so most common processes at the LHC, if you don’t know them already. You also need to know how to use simulation data (which they provide) to correct the real data for detector effects. On top of that, the Open Data people haven’t yet made things user friendly, so even I don’t understand many of the things you’d need to do to carry out a complete analysis of anything other than muons. But maybe at some point we’ll write a tutorial of how to measure the Z boson cross-section at the LHC, and at that point the level of the challenge can become clearer to everyone.

      • Schröshire Cat

        Thanks Matt.
        The trigger system and the whole DAQ thing is actually where I am right now. A tutorial of some sort would definitely be helpful. I certainly can find all answers on my own, it just takes longer. So every help is appreciated.

        • I don’t think I would even know how to make a tutorial at this point. In any case I don’t think you need the DAQ info; that you can just accept. But a key question: if you look at our paper, can you understand why figures 1, 2 and 5 look the way they do, in detail? If you don’t understand the shapes of those plots, then you are not ready to download any data. In fact I would spend a lot of time with papers and talks from the LHC and try to make sure you understand all the plots. Maybe just focus on those that involve “dielectron” or “dimuon” [collectively “dilepton”] searches. Only when the fraction of plots that confuse you drops below 25% would I say you were ready to understand what real data passed through an LHC trigger should look like. In a similar vein, consider https://www.science20.com/quantum_diaries_survivor/blog/atlas_vs_cms_dimuon_resonances-75594: what are all the bumps, what’s between the bumps, why does the “between the bumps” stuff change slope rather than rise smoothly to the left, and so on.
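        The bumps in dimuon plots like these sit at particular values of the invariant mass of the muon pair. As a minimal sketch, the mass can be computed from each muon’s transverse momentum, pseudorapidity, and azimuthal angle with the standard formula that neglects the muon’s own mass. The function name and the sample numbers are illustrative assumptions, not taken from any CMS or ATLAS code.

```python
import math


def dimuon_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass (GeV) of two muons, treating each muon as massless.

    Uses m^2 = 2 * pt1 * pt2 * (cosh(eta1 - eta2) - cos(phi1 - phi2)),
    with pt in GeV, eta the pseudorapidity, and phi the azimuthal angle in radians.
    """
    m_squared = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m_squared, 0.0))


# Illustrative pair: two back-to-back muons at eta = 0, each with pt = 45.6 GeV.
mass = dimuon_mass(45.6, 0.0, 0.0, 45.6, 0.0, math.pi)
print(f"dimuon mass: {mass:.1f} GeV")
```

        A histogram of this quantity over many events gives the kind of spectrum shown in those plots; it does not by itself account for trigger or detector effects, which still have to be understood separately.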

  17. Michael John Sarnowski

    I’m looking forward to finding out your discovery.

    • 🙂 No discoveries. Not surprising, since we had only 1% of the data. The question is whether ATLAS and CMS, using the same method, could make a discovery in the full data set!

  18. It’s great to see you writing on this blog again, and however often you choose to do so, I’ll happily wait. Congratulations on the paper, and for getting through 2018.
