For non-expert readers who want to dig a bit deeper. This is the first post of two, the second of which will appear in a day or two:
In my last post I described, for the general reader and without using anything more than elementary fractions, how we know that each type of quark comes in three “colors” — a name which refers not to something that you can see by eye, but rather to the three “versions” of strong nuclear charge. Strong nuclear charge is important because it determines the behavior of the strong nuclear force between objects, just as electric charge determines the electric forces between objects. For instance, elementary particles with no strong nuclear charge, such as electrons, W bosons and the like, aren’t affected by the strong nuclear force, just as electrically neutral elementary particles, such as neutrinos, are immune to the electric force.
But a big difference is that there’s only one form or “version” of electric charge: in the language of professional physicists, protons have +1 unit of this charge, electrons have -1 unit of it, a nucleus of helium has +2 units of it, etc. By contrast, the strong nuclear charge comes in three versions, which are sometimes referred to as “redness”, “blueness” and “greenness” (because of a vague and highly imprecise analogy with the inner workings of the human eye). These versions of the charge combine in novel ways we don’t see in the electric context, and this plays a major role in the protons and neutrons found in every atom. It’s the math that lies behind this that I want to explain today; we’ll only need a little bit of trigonometry and complex numbers, though we’ll also need some careful reasoning.
The Intricacies of “Color”
At any time, a particular quark must [roughly] have either redness +1 or blueness +1 or greenness +1 (and is said to be red, blue or green); any anti-quark has redness, blueness or greenness -1 (and is said to be either anti-red, anti-blue or anti-green). Gluons, unlike photons which are electrically neutral, are not neutral under the strong nuclear force: they carry a color and an anti-color (i.e. a +1 charge and a -1 charge under two of the colors). The details of gluons can easily become confusing, so I’m going to save them until the next post. But this fact implies the gluon field, in which the gluons are ripples, also has color and anti-color.
This color and anti-color of the gluon field has a big impact on the quarks. It means that just by interacting with the gluon field around it, a red quark can turn green, or blue, over and over and over again at an extremely high rate. This makes it impossible to say what color it is. All we can really say safely is that quarks have color (i.e., charge +1 under one of the three strong nuclear charges) and anti-quarks have anti-color (i.e. charge -1 under one of the three) but we cannot say which color at any given time. Metaphorically speaking, as I suggested last time, a quark is almost like a light bulb that is always lit but whose color flickers randomly, continuously and rapidly between red, green and blue. There’s no point in assigning it a definite color.
Meanwhile, to make matters worse, hadrons, the particles in which quarks, anti-quarks and gluons are found in nature, never have any color at all; they are always neutral under the strong nuclear force, because the “colors” of all the particles inside cancel. That’s not so unfamiliar: an atom’s particles have total electric charge zero, too. But it is possible to ionize an atom (for instance by removing an electron, leaving the remainder with a net charge) allowing us to study objects with electric charge in a simple way. It is not, however, possible to color-ionize a hadron; the strong force is simply too strong. Objects with color can never be isolated and studied in detail.
So we can’t ever easily observe objects with color, and even if we did, it would be changing all the time anyway. Clearly, making sense of color is not going to be as simple as making sense of electric charge.
But here’s a fact that we could hope to understand. The most familiar hadrons seen in nature are (Figure 1)
- baryons, which appear at first to be made from three quarks [examples are protons, neutrons, and Lambdas]
- anti-baryons, which appear at first to be made from three anti-quarks [examples are anti-protons, anti-neutrons, and anti-Lambdas], or
- mesons, which appear at first to be made from a quark and an anti-quark [examples are pions, kaons and rho mesons]
Why these combinations and not others? What does this tell us about the strong nuclear force and the three versions of strong nuclear charge? When it was first proposed that the strong nuclear force might have three types of charge, it was believed that the proton is simply made from two up quarks and a down quark, and perhaps nothing else. This is quite different from the modern picture, where the hadrons also have gluons and additional quark-antiquark pairs.
But the baby steps that were made in those days got the basic math of color right. That’s what I want to explain today.
Math of Two Ordinary Dimensions
The math of the three colors is similar to the math of the three directions of space. So let’s start with the math of two-dimensional space… the math of lines on a plane. That will put us on the right track.
If we take a sheet of paper, we can draw coordinates on it however we want. Cartesian coordinates — the usual x and y coordinates, with axes that are perpendicular to one another — are often convenient. But Cartesian coordinates on paper can be chosen in any orientation we like; the x axis could point eastward and the y axis northward, but it’s just as good to have x point northwestward and y point southwestward. More generally, I could rotate my x and y axes however I like; as long as I rotate them together, they will still make a good Cartesian coordinate system.
Let’s now draw a line on this piece of paper. The line stretches a certain extent Lx in the x direction and a certain extent Ly in the y direction; see Fig. 2 left. But Lx and Ly depend on my choice for the x and y axes; for example, if I chose my x axis to lie right along my chosen line, then Ly would be zero. If instead I choose my y axis to lie along the line, then it’s Lx that will be zero. By rotating the axes, as in Fig. 2 center, I can change both Lx and Ly. For the same reason, neither Lx nor Ly will stay the same if I rotate the line itself (keeping the coordinate axes fixed), as in Fig. 2 right. In technical language, neither Lx nor Ly is rotationally-invariant.
But clearly there is something rotationally-invariant about the line: its length, “L”, which I could measure by placing a ruler along the line, a process that doesn’t care about coordinates at all. Nevertheless, using the Pythagorean theorem, I can relate L to Lx and Ly, as in Figure 2.
- L² = Lx² + Ly²
The fact that this statement must remain true even if I rotate the axes or I rotate the line is a remarkable limitation on how Lx and Ly can change.
It is to capture the math of this miracle that trigonometry was invented; Lx can be written as L (cos θ), where θ is the angle between the line and the x axis; similarly Ly can be written as L (sin θ). The fact that cos²θ + sin²θ = 1 for any and all possible θ then assures that no matter how we rotate the line (or the axes), and thus change θ, the length L remains the same. Lx and Ly depend on our coordinates and so are not rotationally invariant, but L itself is rotationally invariant.
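As a quick numerical check of the statement above, here is a minimal sketch in Python. The function names `components` and `length_from_components` are made up for this illustration; the only input is the relation Lx = L cos θ, Ly = L sin θ from the text.

```python
import math

def components(length, theta):
    """Extents (Lx, Ly) of a line of the given length at angle theta (radians) to the x axis."""
    return length * math.cos(theta), length * math.sin(theta)

def length_from_components(lx, ly):
    """Pythagoras: recover L from Lx and Ly."""
    return math.sqrt(lx**2 + ly**2)

L = 2.0
for theta_deg in (0, 30, 75, 140):
    lx, ly = components(L, math.radians(theta_deg))
    # Lx and Ly change with the angle; the recovered length does not.
    print(f"theta={theta_deg:3d} deg  Lx={lx:+.3f}  Ly={ly:+.3f}  "
          f"L={length_from_components(lx, ly):.3f}")
```

Every row prints different values of Lx and Ly, while the last column stays at the length we started with.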
This is not the only rotationally-invariant quantity arising in the context of two-dimensional space. If I have two lines, and then I rotate my axes, or rotate both lines together, the angle between the lines won’t change; it’s something I can measure with a protractor, and it doesn’t care how my chosen coordinates on the space are laid out.
Still, if I want to measure the angle using coordinates, there’s a famous way to do it. For simplicity, let’s first imagine that both our lines have length 1. Then if the angle between the two lines is φ12 (Figure 3 left), its cosine is just the dot product: if Lx1 and Ly1 are the coordinate lengths of the first line, and Lx2 and Ly2 are the same for the second line, then
- cos φ12 = Lx1 Lx2 + Ly1 Ly2 [dot product for lines of length one]
More generally, for lines of lengths L1 and L2,
- cos φ12 = (Lx1 Lx2 + Ly1 Ly2) / (L1 L2) [general dot product]
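The general dot-product formula can be tried out directly. The sketch below assumes each line is stored as a pair (Lx, Ly); `rotate` and `angle_between` are names chosen here for illustration. It rotates both lines by the same angle and compares the angle between them before and after.

```python
import math

def rotate(line, angle):
    """Rotate a line (Lx, Ly) counterclockwise by the given angle in radians."""
    lx, ly = line
    c, s = math.cos(angle), math.sin(angle)
    return (c * lx - s * ly, s * lx + c * ly)

def angle_between(line1, line2):
    """Angle from the general dot product: cos(phi) = (line1 . line2) / (L1 L2)."""
    dot = line1[0] * line2[0] + line1[1] * line2[1]
    cos_phi = dot / (math.hypot(*line1) * math.hypot(*line2))
    # Clamp to [-1, 1] to guard against tiny floating-point overshoots.
    return math.acos(max(-1.0, min(1.0, cos_phi)))

line1, line2 = (3.0, 1.0), (1.0, 2.0)
phi_before = angle_between(line1, line2)
phi_after = angle_between(rotate(line1, 0.7), rotate(line2, 0.7))
print(math.degrees(phi_before), math.degrees(phi_after))
```

The individual coordinates of both lines change under the rotation, but the two printed angles agree up to rounding.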
There’s yet another rotationally invariant quantity in two dimensions. If I have two lines, I can view them as the edges of a parallelogram (Figure 3 right), and the area of that parallelogram can’t depend on my coordinates or how I rotate the pair of lines. I can compute it from the coordinates using the two-dimensional cross-product (without assuming the lines have length 1):
- area = | Lx1 Ly2 – Ly1 Lx2 | [cross product]
where the absolute value around the right hand side assures the area is a positive number. (As a check: if the parallelogram is a rectangle oriented along the x and y axes, then Lx1 = L1, Ly1 = 0, Lx2 = 0, Ly2 = L2, and the formula tells us the rectangle then has area L1 L2, which is correct.)
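The rectangle check in the parenthesis, and the claim that the area survives a rotation, can both be verified in a few lines. This is only a sketch; `rotate` and `parallelogram_area` are illustrative names, and lines are again stored as (Lx, Ly) pairs.

```python
import math

def rotate(line, angle):
    """Rotate a line (Lx, Ly) counterclockwise by the given angle in radians."""
    lx, ly = line
    c, s = math.cos(angle), math.sin(angle)
    return (c * lx - s * ly, s * lx + c * ly)

def parallelogram_area(line1, line2):
    """Two-dimensional cross product: | Lx1 Ly2 - Ly1 Lx2 |."""
    return abs(line1[0] * line2[1] - line1[1] * line2[0])

# A 3-by-2 rectangle aligned with the axes, then the same rectangle tilted.
rectangle = ((3.0, 0.0), (0.0, 2.0))
tilted = tuple(rotate(line, 0.9) for line in rectangle)
print(parallelogram_area(*rectangle), parallelogram_area(*tilted))
```

Both numbers should come out as L1 L2 = 6, up to rounding in the tilted case.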
What we have just done is understand how we create quantities that are invariant under the rotations of two ordinary coordinates. These rotations form a set, or rather a more structured set called a “group”, that goes by the name SO(2).
Math of Three Ordinary Dimensions
Now let’s move to SO(3): the rotations of three ordinary coordinates. This will bring us much closer to the math of color.
The generalization of lengths and of dot products to three dimensions (x,y,z) is completely straightforward; there’s a three-dimensional generalization of Pythagoras’s theorem and of the usual rules of trigonometry, whose details we don’t need here. The effect is that if I have one line, its length is
- L² = Lx² + Ly² + Lz²
and the angle between two lines of length L1 and L2 is again given by a dot product:
- cos φ12 = (Lx1 Lx2 + Ly1 Ly2 + Lz1 Lz2) / (L1 L2)
What generalizes the cross product? Being now in three dimensions, we will focus not on the area of an object whose edges are formed from two lines but instead on the volume of an object whose edges are formed by three lines, as in Figure 4; such an object is often called a parallelepiped. This volume is given by a triple product: it can be viewed as the dot product of the first line with the cross-product of the second and third, or as the dot product of the third with the cross-product of the first and second, and so on. When the dust settles:
- Volume = | Lx1 Ly2 Lz3 – Lx1 Lz2 Ly3 + Ly1 Lz2 Lx3 – Ly1 Lx2 Lz3 + Lz1 Lx2 Ly3 – Lz1 Ly2 Lx3 |
Now this is already quite remarkable. Looking back at Figure 1, mesons seem somewhat analogous to dot products, baryons to triple products. Could there be a connection? Yes, there could; but ordinary dimensions are not enough.
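To make the six-term volume formula above concrete, here is a small sketch that evaluates it for a simple box and then again after rotating all three edges together. The helper names (`triple_product`, `volume`, `rotate_z`, `rotate_x`) are invented for this example; lines are stored as (Lx, Ly, Lz) triples.

```python
import math

def triple_product(a, b, c):
    """The six-term expression from the text, with a, b, c as lines 1, 2, 3."""
    return (a[0] * b[1] * c[2] - a[0] * b[2] * c[1]
            + a[1] * b[2] * c[0] - a[1] * b[0] * c[2]
            + a[2] * b[0] * c[1] - a[2] * b[1] * c[0])

def volume(a, b, c):
    return abs(triple_product(a, b, c))

def rotate_z(v, angle):
    """Rotate about the z axis (mixes x and y)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])

def rotate_x(v, angle):
    """Rotate about the x axis (mixes y and z)."""
    c, s = math.cos(angle), math.sin(angle)
    return (v[0], c * v[1] - s * v[2], s * v[1] + c * v[2])

# A 2 x 3 x 4 box aligned with the axes.
box = ((2.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 4.0))
turned = tuple(rotate_x(rotate_z(edge, 0.5), 1.2) for edge in box)
print(volume(*box), volume(*turned))
```

For the aligned box only the first term survives and gives 2 × 3 × 4; after the rotation all six terms contribute, yet the total is the same up to rounding.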
Math of Three Complex Dimensions
Let’s now imagine that instead of x, y, z being real numbers, we allowed them each to be complex numbers. This space of three complex dimensions is no longer one we can draw; we have to rely mainly on math to understand it.
Keep in mind, also, that these complex dimensions are not to be confused with the dimensions of empty space that we actually live in. They are just useful for keeping track of math, and not a concrete part of physical space, in which you and I and other real objects move around and can bump into one another. Only the strong nuclear charges of particles will (later) be associated with this space and change within it; the particles themselves will move around, as always, in our ordinary and familiar three-dimensional space.
Back to the math. With three complex instead of ordinary coordinates, there are now even more rotations than before. Not only could we rotate the x and y axes into each other as we ordinarily do, we could also multiply all the coordinates by a complex phase, such as the basic imaginary number ℑ, the square root of -1. This larger set of rotations constitutes the group SU(3).
What’s really new about having complex coordinates is that we now have lines and anti-lines; lines have coordinates, and anti-lines have complex conjugate coordinates. The complex conjugate of a line is an anti-line, and vice versa. This is important, because the length of a line can no longer be given by
- L² = Lx² + Ly² + Lz²
because if I multiply x, y, and z by ℑ, then Lx → ℑ Lx, and Lx² → – Lx², and the same for Ly and Lz, with the effect that L² becomes negative, and L becomes imaginary! A length defined this way would not be rotationally-invariant, or even meaningful.
Instead, the right formula for a rotationally-invariant length combines a line with the anti-line that is its complex conjugate:
- L² = Lx Lx* + Ly Ly* + Lz Lz*
where Lx*, Ly*, Lz* are the coordinates of the anti-line that is the complex conjugate of the original line (specifically, Lx* is the complex conjugate of Lx, etc.). This length won’t change if we do a phase rotation; for instance, if we multiply Lx, Ly, Lz by ℑ, then Lx*, Ly*, Lz* are multiplied by –ℑ, and since (ℑ) times (-ℑ) = – ℑ² = +1, that leaves L unchanged.
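The difference between the two candidate length formulas is easy to see with Python's built-in complex numbers, where the square root of -1 is written `1j`. In this sketch the function names and the sample coordinates are arbitrary choices made for illustration.

```python
def naive_length_sq(line):
    """Sum of squares of the coordinates: fine for real numbers, not for complex ones."""
    return sum(c * c for c in line)

def invariant_length_sq(line):
    """Each coordinate times its complex conjugate: Lx Lx* + Ly Ly* + Lz Lz*."""
    return sum(c * c.conjugate() for c in line).real

line = (1 + 2j, 0.5 - 1j, 3j)
rotated = tuple(1j * c for c in line)  # multiply every coordinate by the imaginary unit

print(naive_length_sq(line), naive_length_sq(rotated))          # flips sign
print(invariant_length_sq(line), invariant_length_sq(rotated))  # unchanged
```

The naive sum of squares is not even a real number here, and it changes sign under the phase rotation. The line-times-anti-line version gives the same real, positive value both times.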
Similarly, we can no longer take a dot product between two lines. We can only take it between a line and an anti-line. If a line has coordinates Lx1, Ly1, Lz1 and an anti-line has coordinates Lx2*, Ly2*, Lz2*, then a rotationally invariant quantity is
- Lx1 Lx2* + Ly1 Ly2* + Lz1 Lz2*
Although I can’t illustrate this in 3 complex dimensions, I can illustrate it in 1 complex dimension (a single complex plane, shown in Figure 5). If x1 is a complex number, and x2* is the complex conjugate of a complex number x2, then x1 x2 is not invariant under rotation of the complex plane (i.e. rotation of all complex numbers by the same phase), but x1 x2* is invariant because the phase cancels out.
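The one-dimensional statement can also be checked with numbers. In this sketch `rotate_plane` is an illustrative helper that multiplies a complex number by the phase e^(iα); the values of x1, x2 and α are arbitrary.

```python
import cmath

def rotate_plane(z, alpha):
    """Rotate a complex number by the phase exp(i * alpha)."""
    return cmath.exp(1j * alpha) * z

x1, x2 = 2 + 1j, 1 - 3j
alpha = 0.6

plain_before = x1 * x2
plain_after = rotate_plane(x1, alpha) * rotate_plane(x2, alpha)

conj_before = x1 * x2.conjugate()
conj_after = rotate_plane(x1, alpha) * rotate_plane(x2, alpha).conjugate()

print(plain_before, plain_after)  # differ: the product picks up the phase twice
print(conj_before, conj_after)    # agree: the two phases cancel
```

The plain product x1 x2 is multiplied by e^(2iα), while in x1 x2* the phase from x1 is undone by the opposite phase from x2*.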
For the triple product, however, it turns out we need either three lines or three anti-lines. The formula is the same as I quoted for SO(3):
- | Lx1 Ly2 Lz3 – Lx1 Lz2 Ly3 + Ly1 Lz2 Lx3 – Ly1 Lx2 Lz3 + Lz1 Lx2 Ly3 – Lz1 Ly2 Lx3 |
That this quantity is rotationally invariant isn’t obvious, but it is true, for the same reason it was true for three ordinary dimensions. The complex-conjugate formula, built from three anti-lines, is invariant also. The same is not true of a combination of two lines and an anti-line.
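Since the invariance isn’t obvious, a numerical experiment may help. The sketch below builds one example rotation as a product of an x-y rotation, a set of phases whose product is 1, and a y-z rotation; `example_su3` is a hypothetical helper written for this illustration, not a general recipe for every SU(3) rotation. One caveat, assuming the standard definition of SU(3) as the rotations with determinant 1: such a rotation leaves the six-term expression itself unchanged, whereas multiplying all three coordinates by ℑ multiplies the expression by ℑ³ = –ℑ, so that only its absolute value survives.

```python
import cmath
import math

def triple(a, b, c):
    """The six-term triple product, now with complex coordinates."""
    return (a[0] * b[1] * c[2] - a[0] * b[2] * c[1]
            + a[1] * b[2] * c[0] - a[1] * b[0] * c[2]
            + a[2] * b[0] * c[1] - a[2] * b[1] * c[0])

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(u, line):
    """Rotate a line by the 3x3 matrix u."""
    return tuple(sum(u[i][k] * line[k] for k in range(3)) for i in range(3))

def conj(line):
    """Turn a line into its anti-line (complex-conjugate coordinates)."""
    return tuple(x.conjugate() for x in line)

def example_su3(theta, a, b, psi):
    """Hypothetical example: x-y rotation, phases with product 1, y-z rotation."""
    c1, s1 = math.cos(theta), math.sin(theta)
    c2, s2 = math.cos(psi), math.sin(psi)
    rot_xy = [[c1, -s1, 0], [s1, c1, 0], [0, 0, 1]]
    phases = [[cmath.exp(1j * a), 0, 0],
              [0, cmath.exp(1j * b), 0],
              [0, 0, cmath.exp(-1j * (a + b))]]
    rot_yz = [[1, 0, 0], [0, c2, -s2], [0, s2, c2]]
    return matmul(rot_yz, matmul(phases, rot_xy))

u = example_su3(0.4, 0.9, -0.3, 1.1)
q1, q2, q3 = (1 + 2j, 0.5j, -1), (0.3, 1 - 1j, 2j), (2, -1j, 0.7 + 0.2j)

# Three lines: compare the triple product before and after the rotation.
before = triple(q1, q2, q3)
after = triple(apply(u, q1), apply(u, q2), apply(u, q3))
print(abs(after - before))

# Two lines and an anti-line: the same comparison.
mixed_before = triple(q1, q2, conj(q3))
mixed_after = triple(apply(u, q1), apply(u, q2), conj(apply(u, q3)))
print(abs(mixed_after - mixed_before))
```

The first printed difference should be zero up to rounding; the second generally is not. A case that can be done by hand: take the three coordinate axes as the lines and multiply x by ℑ and z by –ℑ. The three-line product stays at 1, while the two-lines-and-an-anti-line version goes from 1 to –1.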
So in SU(3), as in SO(3), we have dot products and triple products, but now
- dot products can be formed between a line and an anti-line, while
- triple products can be formed between three lines or between three anti-lines.
This now seems strikingly suggestive of mesons, baryons and anti-baryons — see Figure 1 — and the question is how to bring this math to bear on the actual physics.
The Mesons, Baryons and Anti-Baryons of the 1960s
The three complex dimensions x,y,z we’ve just encountered should be identified, in the context of the strong nuclear force, as redness, greenness and blueness. A quark, therefore, is like a line in this space: perhaps it points along the redness direction (making it “red”), or along the greenness direction, or perhaps it points in a general direction (making it a combination of blue and red, or perhaps of all three colors). To specify which direction it points in requires us to choose coordinates, that is, to define what we mean by “redness” and “blueness” and “greenness”. That choice is arbitrary, so we can’t meaningfully say that a quark is red; we could change coordinates and make it half-green/half-blue. However, we can always say that a quark has color: it corresponds to a line in this three-complex-dimensional space, and although the line’s coordinates aren’t rotationally invariant, the fact that it has non-zero length most certainly is.
Meanwhile, if we take a quark and an anti-quark, we can create something from them which is truly colorless: it is completely independent of any choice of our color coordinates. A naive meson consists of a quark and an antiquark which have been combined using the dot product. If quark 1 has red, green and blue coordinates R1, G1, B1, and antiquark 2 has the corresponding anti-coordinates R2*, G2*, B2*, the following combination is “color-less”:
- R1 R2* + G1 G2* + B1 B2*
Physically speaking, whatever redness the quark has is balanced by that of the anti-quark, whatever greenness it has is similarly balanced, and so for blueness. This remains true if we redefine what we mean by redness, greenness and blueness. It also remains true as the quark and anti-quark interact with each other; the quark might change from red to green, but when that happens the anti-quark will change from anti-red to anti-green. This kind of synchrony is essential to assure the meson is always colorless.
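The claim that the meson combination survives a redefinition of the colors can be tested numerically. In this sketch a quark is a triple of complex (red, green, blue) coordinates and an antiquark a triple of anti-coordinates; `example_recoloring` is a hypothetical helper that mixes red with green and applies phases whose product is 1. These names and sample numbers are assumptions made for illustration only.

```python
import cmath
import math

def meson(quark, antiquark):
    """Dot product of a quark's color coordinates with an antiquark's anti-coordinates."""
    return sum(q * qbar for q, qbar in zip(quark, antiquark))

def recolor(u, quark):
    """Quark coordinates after redefining the colors with the matrix u."""
    return tuple(sum(u[i][k] * quark[k] for k in range(3)) for i in range(3))

def recolor_anti(u, antiquark):
    """Antiquark coordinates change with the complex-conjugate matrix."""
    return tuple(sum(u[i][k].conjugate() * antiquark[k] for k in range(3))
                 for i in range(3))

def example_recoloring(t, a, b):
    """Hypothetical example: mix red and green by angle t, then apply phases with product 1."""
    c, s = math.cos(t), math.sin(t)
    pa, pb, pc = cmath.exp(1j * a), cmath.exp(1j * b), cmath.exp(-1j * (a + b))
    return [[pa * c, -pa * s, 0], [pb * s, pb * c, 0], [0, 0, pc]]

quark = (1 + 1j, 2, -0.5j)
antiquark = (0.3j, 1, 2 - 1j)
u = example_recoloring(0.8, 0.5, 1.7)

print(meson(quark, antiquark))
print(meson(recolor(u, quark), recolor_anti(u, antiquark)))
```

The individual coordinates of the quark and the antiquark all change, but because the antiquark changes in the conjugate way, the two printed values agree up to rounding.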
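In the same way, a naive baryon is made from three quarks combined using the triple product. Writing the red, green and blue coordinates of quark 1 as R1, G1, B1, and similarly for quarks 2 and 3, the combination is:
- R1 G2 B3 – R1 B2 G3 + G1 B2 R3 – G1 R2 B3 + B1 R2 G3 – B1 G2 R3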
In this case the combined redness, greenness and blueness of the three quarks is colorless, and is independent of how we choose to define the “colors”, as long as all six terms in the above expression are present with the precise choice of plus and minus signs. (An anti-baryon is defined analogously.) What this means is that no two of the quarks in the baryon ever have the same color; and if you know the colors of two of the quarks, the third’s color is automatically determined.
We can now, naively, explain Figure 1 as representing these expressions, shown schematically in Figure 6.
If you learned linear algebra, you will recognize the meson as the product of a quark, written as a three-component vector, with an anti-quark, a three-component conjugate vector; and you may recognize the baryon as the determinant of a 3×3 matrix where the columns are the quarks and the rows are their colors. See Figure 7. These are the simplest SU(3)-rotationally-invariant objects that can be constructed from vectors and conjugate vectors.
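For readers who like to see the determinant at work, here is a minimal sketch of the baryon combination written as a sum over the six ways of assigning the three colors to the three quarks, each with a plus or minus sign. The names `sign`, `baryon`, `RED`, `GREEN` and `BLUE` are illustrative, and each quark is a triple of (red, green, blue) coordinates.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one (counted by inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def baryon(q1, q2, q3):
    """Determinant of the 3x3 array whose columns are the quarks and whose rows are colors."""
    quarks = (q1, q2, q3)
    total = 0
    for perm in permutations(range(3)):  # six assignments of colors to quarks
        term = sign(perm)
        for quark, color in zip(quarks, perm):
            term *= quark[color]
        total += term
    return total

RED, GREEN, BLUE = (1, 0, 0), (0, 1, 0), (0, 0, 1)

print(baryon(RED, GREEN, BLUE))   # all three colors present
print(baryon(RED, RED, BLUE))     # two quarks share a color: the combination vanishes
print(baryon(GREEN, RED, BLUE))   # swapping two quarks flips the sign
```

The middle case shows in numbers why no two quarks in the naive baryon can share a color: any such arrangement contributes nothing to the combination.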
But these naive baryons, made from three quarks and nothing else, aren’t the real thing. How do we go from here to real baryons — real protons, with three quarks along with a horde of gluons and quark-antiquark pairs? We only need one small bit of additional math… for next time.