The Trigger: Discarding All But the Gold

37 Responses

Pingback: The Importance and Challenges of “Open Data” at the Large Hadron Collider – I Fuckin' Love Science Teams Up With The Science Channel To Curate The Best Science Content On The Web | I Fuckin' Love Science
Pingback: An Interesting Result from CMS, and its Implications – I Fuckin' Love Science Teams Up With The Science Channel To Curate The Best Science Content On The Web | I Fuckin' Love Science
Pingback: Hunting for Higgs | Daily Zooniverse
Adrian says:

June 29, 2015 at 6:07 AM

Hi Matt.

Very clear and interesting post as always.

Maybe all the computing problems presented here will change dramatically when quantum computing is mature enough to be used at the LHC (or at whatever we use after it is “dead”).
I looked for links to see whether something like this is already in development for the LHC, but I found nothing. Do you know if such projects exist at the LHC?
I could find only a few interesting articles, like this one (it’s worth a look):
http://www.enterrasolutions.com/2013/11/quantum-computing-and-big-data.html

Thank you for all you are building here.

Doc says:

April 20, 2015 at 5:49 PM

Kudos Matt. You have a real gift for creating such beautiful and helpful analogies in your writing. They help immensely!

Pingback: Non-Standard-Model Higgs Particle Decays: What We Found | Of Particular Significance
Pingback: At a CMS/Theory Workshop in Princeton | Of Particular Significance
Pingback: Visiting the Host Lab of the Large Hadron Collider | Of Particular Significance
Pingback: Final Day of SEARCH 2013 | Of Particular Significance
E.Chaniotakis says:

August 12, 2013 at 9:46 AM

Thanks for the great post professor!
I have a question:
When measuring the trigger turn-on, one can do it in two ways (leaving tag and probe out of the discussion):
1) Either measure the absolute efficiency using orthogonal triggers – e.g. if you want to measure how many jets you have above 80 GeV pT, you can use a muon trigger and measure the ratio jets(>80)/jets(all).
2) Or measure the relative efficiency with respect to a lower-threshold trigger – e.g. to measure the efficiency of triggering on 80 GeV jets, you can use a 30 GeV jet trigger and measure the ratio jets(>80)/jets(>30).

Now the question is: what is the advantage of using the second method?
The only one that I have found is that method 1 may be biased. E.g. if you use a muon trigger, some muons will come from semileptonic b-decays. Therefore the jets you measure will not be inclusive but will have some level of contamination from heavy-flavor decays.
Is this correct? And is it the only one?
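
The two ratios in the question can be sketched in a few lines of Python. This is only a toy illustration, not how ATLAS or CMS actually do it: the event records, the field names (`jet_pt`, `fired_muon`, `fired_jet30`, `fired_jet80`) and the bin edges are all hypothetical. In both methods the efficiency in each pT bin is the fraction of events selected by a reference trigger that also fired the 80 GeV jet trigger; only the choice of reference changes.

```python
from bisect import bisect_right

def turn_on(events, reference, probe, edges):
    """Per-bin fraction of reference-triggered events that also fired the probe trigger.

    `events` is a list of dicts with hypothetical keys: "jet_pt" (GeV) plus one
    boolean per trigger. `edges` are ascending pT bin edges. A bin with no
    reference-triggered events is reported as None.
    """
    passed = [0] * (len(edges) - 1)
    total = [0] * (len(edges) - 1)
    for ev in events:
        if not ev[reference]:
            continue  # only events selected by the reference trigger count
        i = bisect_right(edges, ev["jet_pt"]) - 1
        if 0 <= i < len(total):
            total[i] += 1
            passed[i] += int(bool(ev[probe]))
    return [p / t if t else None for p, t in zip(passed, total)]

# Toy events, invented purely for illustration.
toy_events = [
    {"jet_pt": 70, "fired_muon": True, "fired_jet30": True, "fired_jet80": False},
    {"jet_pt": 75, "fired_muon": True, "fired_jet30": True, "fired_jet80": True},
    {"jet_pt": 90, "fired_muon": True, "fired_jet30": True, "fired_jet80": True},
    {"jet_pt": 95, "fired_muon": False, "fired_jet30": True, "fired_jet80": True},
    {"jet_pt": 110, "fired_muon": False, "fired_jet30": False, "fired_jet80": True},
]
edges = [60, 80, 100, 120]

# Method 1: orthogonal (muon) reference trigger.
absolute = turn_on(toy_events, "fired_muon", "fired_jet80", edges)
# Method 2: lower-threshold jet reference trigger.
relative = turn_on(toy_events, "fired_jet30", "fired_jet80", edges)
```

In this sketch the two methods differ only in which events enter the denominator, which is where a bias such as the heavy-flavor contamination mentioned above would come in.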

Thank you very much

Pingback: Higgs Workshop in Princeton | Of Particular Significance
Calvin says:

February 21, 2013 at 10:31 AM

Hey, would you mind letting me know which web host you’re using? I’ve loaded your blog in 3 different browsers
and I must say this blog loads a lot faster than most.

Can you suggest a good internet hosting provider at a reasonable price?
Thanks, I appreciate it!

Don D. says:

August 13, 2012 at 10:26 AM

Fascinating stuff! Thanks Matt Strassler for the post.
Some who want to know more about the triggers or the collider or about physics in general for that matter may want to view a few of the Summer school lectures at http://indico.cern.ch/scripts/SSLPdisplay.py?stdate=2012-07-02&nbweeks=7 .
My question is: seeing that computer technology is still advancing, and storage and computing power are still getting faster and cheaper, do you foresee the triggers being opened up a little more to store more events that may have been discarded under the current plan? Also, when the LHC is finally up to full design spec, with higher energy, the maximum number of bunches colliding at the 25 ns timing, and the pile-up problem even worse, is the physics of how each part of the detector works going to become the limiting factor on how much useful data can be extracted, or are the IT and software still the main bottleneck?
Thanks again for your interesting and informative articles.

1. Matt Strassler says:
  
  August 13, 2012 at 10:49 AM
  
  I’m sure there will be advances implemented during the 2013-2014 shutdown, but the details are not known to me; you’d need to ask an expert within the experiments, of which there aren’t that many. I’m not even sure decisions have been made; with computers, it always pays to make the decision at almost the last minute, to benefit from the most recent technology. Your second question doesn’t really have an answer; even now, the physics of each part of the detector is a limiting factor that determines what the trigger can be asked to do, on top of the question of how many events the trigger can select. So there are hardware and software and IT issues now, and there will be later; the balance may change a bit but I don’t think it’s going to be a qualitative shift. It’s worth keeping in mind that things are not going to get that much worse: the machine is already operating within a factor of 2 of its maximum collision rate, maximum energy, and pile-up. [By the way, operating at 25 nanoseconds makes the pile-up situation *better*, not worse, because you spread the collisions out over twice as many bunch-crossings] The big challenge will be keeping enough Higgs events, and other low-energy processes that are difficult for the trigger system, when the collision energy goes from 8 TeV to 13 or 14.
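
The bracketed remark about 25 nanoseconds can be checked with a back-of-the-envelope sketch. The total collision rate used here is a hypothetical round number, held fixed for both bunch spacings, and gaps in the beam are ignored; only the factor of two between the results matters.

```python
# Illustrative arithmetic only: the collision rate below is a hypothetical
# round number, not a measured LHC figure.

def crossings_per_second(bunch_spacing_ns):
    """Bunch crossings per second for a given spacing, ignoring gaps in the beam."""
    return 1_000_000_000 // bunch_spacing_ns

def mean_pileup(collisions_per_second, bunch_spacing_ns):
    """Average number of simultaneous collisions in one bunch crossing."""
    return collisions_per_second / crossings_per_second(bunch_spacing_ns)

COLLISIONS_PER_SECOND = 400_000_000  # assumed, held fixed for both spacings

pileup_50ns = mean_pileup(COLLISIONS_PER_SECOND, 50)
pileup_25ns = mean_pileup(COLLISIONS_PER_SECOND, 25)
print(pileup_50ns, pileup_25ns)
```

With the same number of collisions per second, halving the spacing doubles the number of bunch crossings and so halves the average pile-up per crossing.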
  
Tony says:

August 13, 2012 at 9:30 AM

That’s really interesting!
I have a couple of questions about the software/hardware used to analyze the data. Do they use normal PCs to run the software? If so, maybe something like SETI@home would be an interesting way to analyze more data, using people’s computers over the internet. Is there any plan for that?
Do they use GPUs in some way? Since they are much faster than normal processors at mathematical computation, that could be interesting; I think the SETI software can use GPUs to analyze data.

Many thanks Professor

1. Matt Strassler says:
  
  August 13, 2012 at 9:54 AM
  
  Regarding the trigger: I believe [Edited by host: what I said here earlier was wrong, they don’t use anything like ordinary PCs. See my further reply below.]
  
  The other bottleneck in the data (not described here but mentioned in http://profmattstrassler.com/articles-and-posts/lhcposts/triggering-advances-in-2012/data-parking-at-cms/ ) comes later, when reconstructing in full detail, with no time constraint, what happened in a collision. That too involves many computers and very complex software. I do not yet know why they cannot farm out that task to off-site computers belonging to the public, but I can try to find out.
  
  1. Matt Strassler says:
    
    August 13, 2012 at 10:41 AM
    
    Regarding the High-Level Trigger (which is the only stage that could possibly use off-site computers) the extreme complexity and detailed needs of the system are described in section 3 of http://arxiv.org/ftp/arxiv/papers/0908/0908.1065.pdf . There’s just no way you’re going to be able to integrate off-site PCs of non-identical structures into this.
    
  2. Tony says:
    
    August 13, 2012 at 11:19 AM
    
    Yes, I was thinking about off-site computing for the stage with no time constraints, but, as I can see from a first quick read of the paper you sent, the system is amazingly complex indeed. That’s probably the reason why there are no GPUs present, as they’re good for brute-force numeric computation, but not so good when there are so many complex data transfers.
    I will carefully read it later to try to understand the details.
    Many thanks Professor.
    
Pingback: The Trigger and the Parking Lot | Of Particular Significance
Richard Goldhor says:

February 9, 2012 at 8:57 PM

Thanks for another wonderful, clear, article, Matt! I found myself wondering whether there was any way to increase the “yield” from each bunch-crossing. Of course, I assume that if the proton density in the bunch (do you call it “luminosity”?) was higher, the yield would be higher. But what I was really wondering was whether there has been any brainstorming (maybe over a beer or two) on ways to steer or tune or otherwise guide the protons in the colliding bunches into interesting collisions. (Another assumption, I guess, is that head-on collisions produce more interesting fireworks than off-center collisions. Correct?)

1. Matt Strassler says:
  
  February 10, 2012 at 9:42 AM
  
  The problem is that we simply cannot control subatomic particles at the level of accuracy that would do what you suggest. Particles can be controlled at the micrometer scale (millionth of a meter, 1/30,000 of an inch) and below, even down to a few nanometers, but to try to get two protons to line up perfectly you’d have to do more than a million times better.
  
  For scale: imagine you are throwing two sacks worth of sand at each other. And now you want to get more of the particles of sand to hit each other head on. Not easy.
  
Bob says:

January 29, 2012 at 7:33 PM

Wow, what a clear explanation of the pile-up thing they kept talking about at the 13th December conference. Thanks a lot, Professor.

1. Matt Strassler says:
  
  January 30, 2012 at 3:49 PM
  
  Thanks!
  
Pingback: Por qué el 99,9999% de las colisiones del LHC se pierden para siempre « Francis (th)E mule Science's News
Matt says:

November 15, 2011 at 2:24 PM

There is an insane amount of data being thrown away for the reasons you stated. I was wondering if there was any other possible use for this un-triggered data? For example, using the data to increase the number of collisions in the current machine or give some insight on how to improve on future particle colliders. Just thinking of uses for this useless data. Thanks for the articles.

1. Matt Strassler says:
  
  November 18, 2011 at 7:07 AM
  
  It’s not an issue as to whether that data is or isn’t useful. It would be great if one could keep all the data on all the collisions, but there’s no practical way to do it. Fortunately, most of that data is useless for the LHC’s main goals, so there’s no harm done. But it would still be a lot better if one could keep it, because when you throw so much away, you still have to keep your fingers crossed that none of it is critical.
  
xrcat says:

November 7, 2011 at 9:13 AM

How are different “occasions” extracted from the data of a single bunch crossing? How do you determine whether two jets belong to the same occasion or not?

1. Matt Strassler says:
  
  November 7, 2011 at 7:37 PM
  
  The experimentalists knew they would face these conditions, so they designed the experiments to be able to separate different collisions from one another using precise measurements. Take a look at http://www.lhc-closer.es/img/subidas/3_9_1_2.png : It illustrates the ability of the ATLAS experiment to separate all of the charged particles coming from four separate proton-proton collision vertices; the particles from each collision have been drawn in the same color to make it easy for you to see this. The experiments can deal with 30-40 vertices at a time. A jet contains many charged particles, so you just have to look to see which collision vertex those particles point back to, and then you know which proton-proton collision made the jet. And if you see two jets, you can check whether they point back to the same vertex or not.
  
  Obviously nasty things happen sometimes in this environment. Sometimes two collision vertices are too close together to distinguish. And it is very hard to tell which vertex an electrically-neutral particle came from, so often photons or neutrons are assigned to the wrong collision. But the experimenters have a lot of clever techniques to cope with this… imperfect, but good enough for most measurements. I do worry, however, about certain specific measurements which are a lot harder in the presence of all these simultaneous collisions!
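
The idea of checking which vertex a jet’s charged particles point back to can be illustrated with a toy Python sketch. It is a drastic simplification and not the experiments’ actual reconstruction: each track is assumed to be summarized by a single hypothetical number, the position along the beam line that it points back to, and a jet is assigned to the vertex that most of its tracks are closest to.

```python
from collections import Counter

def nearest_vertex(track_z, vertices_z):
    """Index of the collision vertex closest to where a track points back to."""
    return min(range(len(vertices_z)), key=lambda i: abs(track_z - vertices_z[i]))

def jet_vertex(jet_track_z, vertices_z):
    """Vertex chosen by majority vote of a jet's charged tracks.

    Assumes the jet has at least one charged track; neutral particles
    carry no such information, as noted above.
    """
    votes = Counter(nearest_vertex(z, vertices_z) for z in jet_track_z)
    return votes.most_common(1)[0][0]

def same_collision(jet_a, jet_b, vertices_z):
    """True if both jets are assigned to the same vertex."""
    return jet_vertex(jet_a, vertices_z) == jet_vertex(jet_b, vertices_z)

# Hypothetical positions along the beam line (arbitrary units).
vertices = [-3.0, 0.5, 4.2]
jet_a = [0.4, 0.6, 0.45]
jet_b = [0.55, 0.5, 4.1]   # one stray track points at another vertex
jet_c = [4.0, 4.3, 4.25]

print(same_collision(jet_a, jet_b, vertices))
print(same_collision(jet_a, jet_c, vertices))
```

The stray track in `jet_b` stands in for the misassignments mentioned above; the majority vote tolerates it here, but two vertices that sit very close together would defeat this simple scheme.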
  
Pawel says:

November 4, 2011 at 8:03 PM

So on average how many collisions are stored, let’s say per second? I tried to count based on what you have said and I got something close to 30/sec. Is this more or less right? Still, that’s a lot of data. How much data is produced and stored for one collision? I guess we are talking here about at least tens of gigabytes?

BTW: fantastic blog and website. I got a science degree (in theoretical chemistry) 15 years ago, but since then I have never actually worked in science; my life went other ways. I was always interested in particle physics. Thanks to you I can at least feel up to date with the latest proceedings and discoveries. And you have amazing teaching skills; I am maybe not a total layperson, but very close, and I can understand absolutely everything. My teachers back then at the university weren’t even close to that level, I must say 🙂

1. Matt Strassler says:
  
  November 5, 2011 at 5:05 PM
  
  Hmm — did I miss a zero in my text somewhere? The data storage rates are about 400 per second (but I don’t have the precise numbers, and ATLAS and CMS are slightly different.) It’s a few billion collisions stored per year [actually, as described at the end of the article, a few billion bunch crossings, each of which typically contains one interesting collision and a couple of dozen dull ones] . Note that the overall collision rate has been increasing by more than a factor of 10 during 2011 and will probably increase again in 2012 by a small additional factor.
  
  Each collision is much less data than what you suggested — in each collision only a small fraction of the detector is particularly active, with most of the detector elements just registering electronic noise, so in a sense there are a lot of close-to-zeroes, and so data compression can reduce the size by a lot. In the end I am told it is about 10 megabytes per collision [actually, again, per bunch crossing]. I don’t know as much about this as I should; maybe one of my experimental colleagues can comment, if they happen to see me floundering a bit here.
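
Taking the rough figures quoted above at face value (about 400 stored bunch crossings per second, about 10 megabytes each), the implied data volume can be worked out in a few lines of Python. The number of seconds of data-taking per year is an assumption added here, chosen so that the yearly count comes out at the “few billion” mentioned above; it is not a figure from the experiments.

```python
# Rough arithmetic from the approximate figures quoted in this reply.
STORED_PER_SECOND = 400             # bunch crossings kept per second (quoted above)
MEGABYTES_EACH = 10                 # size per stored bunch crossing (quoted above)
LIVE_SECONDS_PER_YEAR = 10_000_000  # assumption, not a figure from the reply

megabytes_per_second = STORED_PER_SECOND * MEGABYTES_EACH
stored_per_year = STORED_PER_SECOND * LIVE_SECONDS_PER_YEAR
# 1 petabyte = 10^9 megabytes (decimal units)
petabytes_per_year = stored_per_year * MEGABYTES_EACH / 1_000_000_000

print(megabytes_per_second, stored_per_year, petabytes_per_year)
```

Under these assumptions that is 4 gigabytes per second while running, 4 billion stored bunch crossings per year, and about 40 petabytes per year, all per experiment and all only as reliable as the inputs.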
  
  And thanks for the kind words!
  
Pam says:

November 4, 2011 at 10:01 AM

Thank you, Professor. My Physics I professor worked (well, still does) at the LHC and we have seen several presentations on the topic, so it’s so good to be -finally- able to understand a post 🙂
So the information from the collisions that were triggered is analyzed by scientists or by a computer program? How soon after?
Pam

1. Matt Strassler says:
  
  November 4, 2011 at 10:23 AM
  
  There are two levels (at least) of analysis.
  
  The first (“reconstruction”) involves just figuring out what all the electronic signals mean in terms of what particles were produced, how much energy they carried and where they were heading. A very good effort (though preliminary, and potentially subject to reconsideration) is made automatically by computer shortly after the data is taken. But this is just at the level of saying what happened in that particular collision.
  
  Searches for new phenomena (“analysis”) of course involve study of large classes of collisions, thousands to millions of them, typically. The decision of what to search for and how, the selection of the relevant subset of the data, and the actual study of the data is done by humans, aided of course by computers. Humans can always, at such a stage, revisit and override what the computers did at the level of “reconstruction”.
  
  Each such search is very labor intensive, requiring numerous cross-checks to avoid errors. It is not unusual for 10 to 20 people to be involved in a single search.
  
  By the way, questions from non-experts are encouraged. If you can’t follow a post because a couple of points aren’t clear, ask for clarification. I won’t always be able to provide it, but often I can help, either with a quick comment, a revision of the post, or a later article. And your question can help me make the site better down the line.
  
Sergei Petrov says:

November 4, 2011 at 9:28 AM

One way of evaluating trigger procedures would be to choose at random to keep a small part of collision results that would have been discarded and see if anything interesting shows up over time. This will not be sufficient to prove that trigger procedures do not miss some rare and previously unknown phenomenon, but could work as low cost sanity check for some systemic error.

1. Matt Strassler says:
  
  November 4, 2011 at 10:17 AM
  
  Indeed, this is standard procedure.
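
A minimal sketch of the idea in Python, assuming a simple random-sampling scheme: the reply above confirms the practice but gives no details, so the sampling fraction, the function name and the labels here are all hypothetical.

```python
import random

def keep_event(passes_trigger, rng, sample_fraction=0.001):
    """Decide whether to store an event, and why.

    Returns "triggered" for events the trigger selected, "random_sample" for
    the small random fraction of rejected events kept as a cross-check, and
    None for events that are discarded. `sample_fraction` is a made-up value.
    """
    if passes_trigger:
        return "triggered"
    if rng.random() < sample_fraction:
        return "random_sample"
    return None

# Toy run: 100,000 events that all fail the trigger.
rng = random.Random(0)
decisions = [keep_event(False, rng) for _ in range(100_000)]
kept = decisions.count("random_sample")
print(kept)
```

Because the kept events are chosen at random, whatever is found among them can be scaled up by `1 / sample_fraction` to estimate what the trigger as a whole is throwing away.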
  