Before completing this post, I need to acknowledge that my goal in writing about modern physics was to create a milieu for more talking about Western Zen. However, as I’ve proceeded, the goal has somewhat changed. I want you, as a reader, to become, if you aren’t already, a physics buff, much in the way I became a history buff after finding history incredibly boring and hateful throughout high school and college. The apotheosis of my history disenchantment came at Stanford in a course taught by a highly regarded historian. The course was entitled “The High Middle Ages” and I actually took it as an elective thinking that it was likely to be fascinating. It was only gradually over the years that I realized that history at its best although based on factual evidence, consists of stories full of meaning, significance and human interest. Turning back to physics, I note that even after more than a hundred years of revolution, physics still suffers a hangover from 300 years of its classical period in which it was characterized by a supposedly passionless objectivity and a mundane view of reality. In fact, modern physics can be imagined as a scientific fantasy, a far-flung poetic construction from which equations can be deduced and the fantasy brought back to earth in experiments and in the devices of our age. When I use the word “fantasy” I do not mean to suggest any lack of rigorous or critical thinking in science. I do want to imply a new expansion of what science is about, a new awareness, hinting at a “reality” deeper than what we have ever imagined in the past. However, to me even more significant than a new reality is the fact that the Quantum Revolution showed that physics can never be considered absolute. The latest and greatest theories are always subject to a revolution which undermines the metaphysics underlying the theory. Who knows what the next revolution will bring? 
Judging from our understanding of the physics of our age, a new revolution will not change the feeling that we are living in a universe which is an unimaginable miracle.
In what follows I’ve included formulas and mathematics whose significance can easily be talked about without going into the gory details. The hope is that these will help clarify the excitement of physics and the metaphysical ideas lying behind it. Of course, the condensed treatment here can be further explicated in the books I mention and in Wikipedia.
My last post, about the massive revolution in physics of the early 20th century, ended by describing the situation in early 1925 when it became abundantly clear, in the words of Max Jammer (Jammer, p 196), that physics of the atom was “a lamentable hodgepodge of hypotheses, principles, theorems, and computational recipes rather than a logical consistent theory.” Metaphysically, physicists clung to classical ideas such as particles whose motion consisted of trajectories governed by differential equations and waves as material substances spread out in space and governed by partial differential equations. Clearly these ideas were logically inconsistent with experimental results, but the deep classical metaphysics, refined over 300 years, could not be abandoned until there was a consistent theory which allowed something new and different.
Werner Heisenberg, born Dec 5, 1901, was 23 years old in the summer of 1925. He had been a brilliant student at Munich studying with Arnold Sommerfeld, had recently moved to Göttingen, a citadel of math and physics, and had made the acquaintance of Bohr in Copenhagen where he became totally enthralled with doing something about the quantum mess. He noted that the electron orbits of the current theory were purely theoretical constructs and could not be directly observed. Experiments could measure the wavelengths and intensity of the light atoms gave off, so following the Zeitgeist of the times as expounded by Mach and Einstein, Heisenberg decided to try to make a direct theory of atomic radiation. One of the ideas of the old quantum theory that Heisenberg used was Bohr’s “Correspondence” principle which notes that as electron orbits become large along with their quantum numbers, quantum results should merge with the classical. Classical physics failed only when things became small enough that Planck’s constant h became significant. Bohr had used this idea in obtaining his formula for the hydrogen atom’s energy levels. In various “old quantum” results the Correspondence Principle was always used, but in different, creative ways for each situation. Heisenberg managed to incorporate it into his ultimate vector-matrix construction once and for all. Heisenberg’s first paper in the Fall of 1925 was jumped on by him and many others and developed into a coherent theory. The new results eliminated many slight discrepancies between theory and experiment but, more important, showed great promise during the last half of 1925 of becoming an actual logical theory.
In January, 1926, Erwin Schrödinger published his first great paper on wave mechanics. Schrödinger, working from classical mechanics, but following de Broglie’s idea of “matter waves”, and using the Correspondence Principle, came up with a wave theory of particle motion, a partial differential equation which could be solved for many systems such as the hydrogen atom, and which soon duplicated Heisenberg’s new results. Within a couple of months Schrödinger closed down a developing controversy by showing that his and Heisenberg’s approaches, though based on seemingly radically opposed ideas, were, in fact, mathematically isomorphic. Meanwhile starting in early 1926, PAM Dirac introduced an abstract algebraic operator approach that went deeper than either Heisenberg or Schrödinger. A significant aspect of Dirac’s genius was his ability to cut through mathematical clutter to a simpler expression of things. I will dare here to be specific about what I’ll call THE fundamental quantum result, hoping that the simplicity of Dirac’s notation will enable those of you without a background in advanced undergraduate mathematics to get some of the feel and flavor of QM.
In ordinary algebra a new level of mathematical abstraction is reached by using letters such as x,y,z or a,b,c to stand for specific numbers, numbers such as 1,2,3 or 3.1416. Numbers, if you think about it, are already somewhat abstract entities. If one has two apples and one orange, one has 3 objects and the “3” doesn’t care that you’re mixing apples and oranges. With algebra, if I use x to stand for a number, the “x” doesn’t care that I don’t know the number it stands for. In Dirac’s abstract scheme what he calls c-numbers are simply symbols of the ordinary algebra that one studies in high school. Along with the c-numbers (classic numbers) Dirac introduces q-numbers (quantum numbers) which are algebraic symbols that behave somewhat differently than those of ordinary algebra. Two of the most important q-numbers are p and s, where p stands for the momentum of a moving particle, mv, mass times velocity in classical physics, and s stands for the position of the particle in space. (I’ve used s instead of the usual q for position to try to avoid confusion with the q of q-number.) Taken as q-numbers, p and s satisfy
ps – sp = h/(2πi)
which I’ll call the Fundamental Quantum Result in which h is Planck’s constant and i the square root of -1. Actually, Dirac, observing that in most formulas or equations involving h, it occurs as h/2π, defined what is now called h bar or h slash using the symbol ħ = h/2π for the “reduced” Planck constant. If one reads about QM elsewhere (perhaps in Wikipedia) one will see ħ almost universally used. Rather than the way I’ve written the FQR above, it will appear as something like
pq – qp = ħ/i
where I’ve restored the usual q for position. What this expression is saying is that in the new QM if one multiplies something first by position q and then by momentum p, the result is different from the multiplications done in the opposite order. We say these q-numbers are non-commutative: the order of multiplication matters. Boldface type is used because position and momentum are vectors and the equation actually applies to each of their 3 components. Furthermore, the FQR tells us the exact size of the non-commutation. In usual human-sized physical units (joule seconds) ħ is .00…001054… where there are 33 zeros between the decimal point and the 1054. If we can ignore the size of ħ and set it to zero, p and q then commute and can be considered c-numbers, and we’re back to classical physics. Incidentally, Heisenberg, Born and Jordan obtained the FQR using p and q as infinite matrices and it can be derived also using Schrödinger’s differential operators. It is interesting to note that by using his new abstract algebra, Dirac not only obtained the FQR but could calculate the energy levels of the hydrogen atom. Only later did physicists obtain that result using Heisenberg’s matrices. Sometimes the deep abstract leads to surprisingly concrete results.
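For readers who like to tinker, here is a minimal sketch of the matrix version of the FQR. It is only an illustration under stated assumptions: it uses the standard harmonic-oscillator "ladder" matrices (a textbook choice, not necessarily the one Heisenberg, Born and Jordan worked with), it sets ħ to 1, and it keeps only the first N rows and columns of matrices that are really infinite. The names HBAR, N, lower and raise_ are made up for this example.

```python
import math

HBAR = 1.0  # assumption: units in which the reduced Planck constant equals 1
N = 6       # assumption: keep only the first N rows and columns of the infinite matrices


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Ladder matrix of a harmonic oscillator: lower[n][n+1] = sqrt(n+1).
lower = [[0.0] * N for _ in range(N)]
for n in range(N - 1):
    lower[n][n + 1] = math.sqrt(n + 1)
raise_ = [[lower[j][i] for j in range(N)] for i in range(N)]  # its transpose

# Position and momentum matrices built from the ladder matrices.
scale = math.sqrt(HBAR / 2)
q = [[scale * (lower[i][j] + raise_[i][j]) for j in range(N)] for i in range(N)]
p = [[1j * scale * (raise_[i][j] - lower[i][j]) for j in range(N)] for i in range(N)]

# The q-number combination pq - qp.
pq = matmul(p, q)
qp = matmul(q, p)
commutator = [[pq[i][j] - qp[i][j] for j in range(N)] for i in range(N)]

print("hbar / i =", HBAR / 1j)
for i in range(N):
    print("diagonal entry", i, "of pq - qp:", commutator[i][i])
```

Every diagonal entry except the last agrees with ħ/i, and the off-diagonal entries vanish. The odd last entry is a side effect of chopping the matrices off at N; it moves further out as N grows and is absent for the true infinite matrices.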
For most physicists in 1926, the big excitement was Schrödinger’s equation. Partial differential equations were a familiar tool, while matrices were at that time known mainly to mathematicians. The “old quantum theory” had made a few forays into one or another area, leaving the fundamentals of atomic physics and chemistry pretty much in the dark. With Schrödinger’s equation, light was thrown everywhere. One could calculate how two hydrogen atoms were bound in the hydrogen molecule. Then using that binding as a model one could understand various bindings of different molecules. All of chemistry became open to theoretic treatment. The helium atom with its two electrons couldn’t be dealt with at all by the old quantum theory. Using various approximation methods, the new theory could understand in detail the helium atom and other multielectron atoms. Electrons in metals could be modeled with Schrödinger’s equation, and soon the discovery of the neutron opened up the study of the atomic nucleus. The old quantum theory was helpless in dealing with particle scattering where there were no closed orbits. Such scattering was easily accommodated by the Schrödinger equation though the detailed calculations were far from trivial. Over the years quantum theory revealed more and more practical knowledge and most physicists concentrated on experiments and theoretic calculations that led to such knowledge with little concern about what the new theory meant in terms of physical reality.
However, back in the first few years after 1925 there was a great deal of concern about what the theory meant and the question of how it should be interpreted. For example, under Schrödinger’s theory an electron was represented by a “cloud” of numbers which could travel through space or surround an atom’s nucleus. These numbers, called the wave function and typically named ψ, were complex, of the form a + ib, where i is the square root of -1. By multiplying such a number by its conjugate a – ib, one gets a positive (strictly speaking, non-negative) number which can perhaps be physically interpreted. Schrödinger himself tried to interpret this “real” cloud as a negative electric charge density, a blob of negative charge. For a free electron, outside an atom, Schrödinger imagined that the electron wave could form what is called a “wave packet”, a combination of different frequencies that would appear as a small moving blob which could be interpreted as a particle. This idea definitely did not fly. There were too many situations where the waves were spread out in space, before an electron suddenly made its appearance as a particle. The question of what ψ meant was resolved by Max Born (see Wikipedia), starting with a paper in June, 1926. Born interpreted the non-negative numbers ψ*ψ (ψ* being the complex conjugate of the ψ numbers) as a probability distribution for where the electron might appear under suitable physical circumstances. What these physical circumstances are and the physical process of the appearance are still not completely resolved. Later in this or another blog post I will go into this matter in some detail. In 1926 Born’s idea made sense of experiment and resolved the wave-particle duality of the old quantum theory, but at the cost of destroying classical concepts of what a particle or wave really was. Let me try to explain.
A simple example of a classical probability distribution is that of tossing a coin and seeing if it lands heads or tails. The probability distribution in this case is the two numbers, ½ and ½, the first being the probability of heads, the second the probability of tails. The two probabilities add up to 1 which represents certainty, in probability theory. (Unlike the college students who are trying to decide whether to go drinking, go to the movies or to study, I ignore the possibility that the coin lands on its edge without falling over.) With the wave function product ψ*ψ, calculus gives us a way of adding up all the probabilities, and if they don’t add up to 1, we simply define a new ψ by dividing by the square root of the sum we obtained, so that the new ψ*ψ is the old one divided by the sum itself. (This is called “normalizing” the wave function.) Besides the complexity of the math, however, there is a profound difference between the coin and the electron. With the coin, classical mechanics tells us in theory, and perhaps in practice, precisely what the position and orientation of the coin is during every instant of its flight; and knowing about the surface the coin lands on allows us to predict the result of the toss in advance. The classical analogy for the electron would be to imagine it is like a BB moving around inside the non-zero area of the wave function, ready to show up when conditions are propitious. With QM this analogy is false. There is no trajectory for the electron, there is no concept of it having a position, before it shows up. Actually, it is only fairly recently that the “BB in a tin can model” has been shown definitively to be false. I will discuss this matter later, talking briefly about Bell’s theorem and “hidden” variable ideas. However, whether or not an electron’s position exists prior to its materialization, it was simply the concept of probability that Einstein and Schrödinger, among others, found unacceptable. As Einstein famously put it, “I can’t believe God plays dice with the universe.”
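The bookkeeping behind ψ*ψ and normalizing can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: a one-dimensional grid, a made-up Gaussian cloud with an arbitrary factor of 3 in front so that it starts out unnormalized, and a plain sum times the grid spacing standing in for the calculus. The names psi_raw, density and DX are hypothetical.

```python
import cmath
import math

# Assumption: a one-dimensional grid from -10 to 10 in steps of DX.
DX = 0.01
xs = [-10 + DX * k for k in range(2001)]


def psi_raw(x):
    """Made-up cloud of complex numbers: a Gaussian bump times a phase factor."""
    return 3.0 * math.exp(-x * x / 2) * cmath.exp(2j * x)


def density(values):
    """psi* psi at each grid point; always a non-negative real number."""
    return [(v.conjugate() * v).real for v in values]


raw = [psi_raw(x) for x in xs]
total = sum(density(raw)) * DX  # "adding up all the probabilities" as a simple sum

# Dividing psi by the square root of the total divides psi* psi by the total.
psi = [v / math.sqrt(total) for v in raw]
prob = density(psi)

print("total before normalizing:", total)
print("total after normalizing: ", sum(prob) * DX)
print("chance of showing up between x = -1 and x = 1:",
      sum(pr for x, pr in zip(xs, prob) if -1 <= x <= 1) * DX)
```

Note that the phase factor drops out of ψ*ψ entirely, which is one reason the complex numbers themselves resisted any direct physical reading.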
Max Born, who introduced probability into fundamental physics, was a distinguished physics professor in Göttingen and Heisenberg’s mentor after the latter first came to Göttingen from Munich in 1922. Heisenberg got the breakthrough for his theory while escaping from hay fever in the spring of 1925, walking the beaches of the bleak island of Helgoland in the North Sea off Germany. Returning to Göttingen, Heisenberg showed his work to Born who recognized the calculations as being matrix multiplication and who saw to it that Heisenberg’s first paper was immediately published. Born then recruited Pascual Jordan from the math department at Göttingen and the three wrote a famous follow-up paper, Zur Quantenmechanik II, Nov, 1925, which gave a complete treatment of the new theory from a matrix mechanics point of view. Thus, Born was well placed to come up with his idea of the nature of the wave function.
Quantum Mechanics came into being during the amazingly short interval between mid-1925 and the end of 1926. As far as the theory went, only “mopping up” operations were left. As far as the applications were concerned there was a plethora of “low hanging fruit” that could be gathered over the years with Schrödinger’s equation and Born’s interpretation. However, as 1927 dawned, Heisenberg and many others were concerned with what the theory meant, with fears that it was so revolutionary that it might render ambiguous the meaning of all the fundamental quantities on which both the new QM and old classical physics depended. In 1925 Heisenberg began his work on what became the matrix mechanics because he was skeptical about the existence of Bohr orbits in atoms, but his skepticism did not include the very concept of “space” itself. As QM developed, however, Heisenberg realized that it depended on classical variables such as position and momentum which appeared not only in the pq commutation relation but as basic variables of the Schrödinger equation. Had the meaning of “position” itself changed? Heisenberg realized that earlier, with Einstein’s Special Relativity, the meaning of both position and time had indeed changed. (Newton assumed that coordinates in space and the value of time were absolutes, forming an invariable lattice in space and an absolute time which marched at an unvarying pace. Einstein’s theory was called Relativity because space and time were no longer absolutes. Space and time lost their “ideal” nature and became simply what one measured in carefully done experiments. Curiously enough, though Einstein showed that results of measuring space and time depended on the relative motion of different observers, these quantities changed in such an odd way that measurements of the speed c of light in vacuum came out precisely the same for all observers. There was a new absolute. A simple exposition of special relativity is N. David Mermin’s Space and Time in Special Relativity.)
The result of Heisenberg’s concern and the thinking about it is called the “Uncertainty Principle”. Roughly stated, the principle is the relation ΔqΔp ≈ ħ; in its precise modern form it is the inequality ΔqΔp ≥ ħ/2. The variables q and p are the same q and p of the Fundamental Quantum Result and, indeed, it is not difficult to derive the uncertainty principle from the FQR. The symbol delta, Δ, when placed in front of a variable means a difference, that is an interval or range of the variable. Experimentally, a measurement of a variable quantity like position q is never exact. The amount of the uncertainty is Δq. The uncertainty relation above thus says that the uncertainty of a particle’s position times the uncertainty of the same particle’s momentum is about ħ and can never be smaller than ħ/2. In QM what is different from an ordinary error of measurement is that the uncertainty is intrinsic to QM itself. In a way, this result is not all that surprising. We’ve seen that the wave function ψ for a particle is a cloud of numbers. Similarly, a transformed wave function for the same particle’s momentum is a similar cloud of numbers. The Δ’s are simply a measure of the size of these two clouds and the principle says that as one becomes smaller, the other gets larger in such a way that their product stays on the order of h bar, whose numerical value I’ve given above.
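The trade-off between the sizes of the two clouds can be checked numerically. The sketch below is an illustration under assumptions, not a derivation: it takes a real Gaussian cloud of adjustable width sigma (a made-up example), sets ħ to 1, measures each Δ as a root-mean-square spread, and gets the momentum spread from the slope of ψ rather than from an explicit transform. In the precise modern form of the uncertainty relation the product ΔqΔp can never fall below ħ/2, and a Gaussian cloud sits exactly at that floor. The function name spreads is hypothetical.

```python
import math

HBAR = 1.0  # assumption: units in which the reduced Planck constant equals 1
DX = 0.002  # grid spacing for the numerical sums


def spreads(sigma):
    """Return (delta_q, delta_p) for a real Gaussian cloud of width sigma."""
    half = 12 * sigma
    n = int(2 * half / DX) + 1
    xs = [-half + DX * k for k in range(n)]
    psi = [math.exp(-x * x / (4 * sigma * sigma)) for x in xs]
    norm = sum(v * v for v in psi) * DX
    # Position spread: the average position is 0 by symmetry, so only <q^2> is needed.
    mean_q2 = sum(x * x * v * v for x, v in zip(xs, psi)) * DX / norm
    # Momentum spread: for a real psi, <p> = 0 and <p^2> = hbar^2 * sum of (dpsi/dx)^2.
    dpsi = [(psi[k + 1] - psi[k - 1]) / (2 * DX) for k in range(1, n - 1)]
    mean_p2 = HBAR ** 2 * sum(d * d for d in dpsi) * DX / norm
    return math.sqrt(mean_q2), math.sqrt(mean_p2)


for sigma in (0.5, 1.0, 2.0):
    dq, dp = spreads(sigma)
    print("sigma =", sigma, " delta_q =", dq, " delta_p =", dp, " product =", dq * dp)
```

Squeezing the position cloud (small sigma) fattens the momentum cloud by exactly the compensating amount, so the product does not budge.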
In fact, back in 1958 when I was in Eikenberry’s QM course and we derived the uncertainty relation from the FQR, I wondered what the big deal was. I was aware that the uncertainty principle was considered rather earthshaking but didn’t see why it should be. What I missed is what Heisenberg’s paper really did. The relation I’ve written above is pure theory. Heisenberg considered the question, “What if we try to do experiments that actually measure the position and momentum? How does this theory work? What is the physics? Could experiments actually disprove the theory?” Among other experimental set-ups Heisenberg imagined a microscope that used electromagnetic rays of increasingly short wavelengths. It was well known classically by the mid-nineteenth century that the resolution of a microscope depends on the wavelength of the light it uses. Light is an electromagnetic (em) wave so one can imagine em radiation of such a short wavelength that a microscope using it could view a particle, regardless of how small, reducing Δq to as small a value as one wished. However, by 1927 it was also well known because of the Compton effect that I talked about in the last post, that such em radiation, called x-rays or gamma rays, consisted of high energy photons which would collide with the electron giving it a recoil momentum whose uncertainty, Δp, turns out to satisfy ΔqΔp ≈ ħ. Heisenberg thus considered known physical processes which failed to overturn the theory. The sort of reasoning Heisenberg used is called a “thought” experiment because he didn’t actually try to construct an apparatus or carry out a “real” experiment. Before dismissing thought experiments as being hopelessly hypothetical, one must realize that any real experiment in physics or in any science for that matter, begins as a thought experiment. One imagines the experiment and then figures out how to build an apparatus (if appropriate) and collect data. 
In fact, as a science progresses, many experiments formerly expressed only in thought, turn real as the state of the art improves.
Although the uncertainty principle is earthshaking enough that it helped confirm the skepticism of two of the main architects of QM, namely, Einstein and Schrödinger, one should note that, in practice, because of the small size of ħ, the garden variety uncertainties which arise from the “apparatus” measuring position or momentum are much larger than the intrinsic quantum uncertainties. Furthermore, the principle does not apply to c-numbers such as e, the fundamental electron or proton charge, c, the speed of light in vacuum, and h, Planck’s constant. There is an interesting story here about a recent (Fall, 2018) redefinition of physical units which one can read about online. Perhaps I’ll have more to say about this subject in a later post. For now, I’ll just note that starting on May 20, 2019, Planck’s constant will be (or has been) defined as having an exact value of 6.62607015×10⁻³⁴ Joule seconds. There is zero uncertainty in this new definition, which may be used to define the kilogram and to measure mass to higher accuracy and precision than possible in the past using the old standard, a platinum-iridium cylinder, kept closely guarded near Paris. In fact, there is nothing muddy or imprecise about the value of many quantities whose measurement intimately involves QM.
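To get a feel for how small ħ is, a few lines of arithmetic help. The sketch below starts from the defined value of h, computes ħ = h/2π, counts the zeros after the decimal point, and then applies ΔqΔp ≈ ħ to an everyday object. The object is a made-up assumption chosen only for scale: a 1 gram bead whose position is pinned down to within a micrometer.

```python
import math

H = 6.62607015e-34        # Planck's constant in joule seconds (exact by definition since 2019)
HBAR = H / (2 * math.pi)  # reduced Planck constant, "h bar"

# Number of zeros between the decimal point and the first significant digit of HBAR.
zeros_after_point = -math.floor(math.log10(HBAR)) - 1

# Assumption (made-up example): a 1 gram bead located to within 1 micrometer.
mass_kg = 1e-3
delta_q_m = 1e-6
delta_p = HBAR / delta_q_m   # momentum uncertainty, taking delta_q * delta_p to be about hbar
delta_v = delta_p / mass_kg  # the corresponding uncertainty in velocity

print("hbar =", HBAR, "joule seconds")
print("zeros after the decimal point:", zeros_after_point)
print("intrinsic velocity uncertainty of the bead:", delta_v, "meters per second")
```

The velocity uncertainty comes out around 10⁻²⁵ meters per second, hopelessly far below anything a laboratory instrument could register, which is why the ordinary errors of the apparatus swamp the quantum ones for objects of everyday size.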
During the years after 1925 there was at least one more area which in QM was puzzling to say the least; namely, what has been called “the collapse of the wave function.” Involved in the intense discussions over this phenomenon and how to deal with it was another genius I’ve scarcely mentioned so far; namely Wolfgang Pauli. Pauli, a year older than Heisenberg, was a year ahead of him in Munich studying under Sommerfeld, then moved to Göttingen, leaving just before Heisenberg arrived. Pauli was responsible for the Pauli Exclusion Principle based on the concept of particle spin which he also explicated. (see Wikipedia) He was in the thick of things during the 1925 – 1927 time period. Pauli ended up as a professor in Zurich, but spent time in Copenhagen with Bohr and Heisenberg (and many others) formulating what became known as the Copenhagen interpretation of QM. Pauli was a bon vivant and had a witty sarcastic tongue, accusing Heisenberg at one point of “treason” for an idea that he (Pauli) disliked. In another anecdote Pauli was at a physics meeting during the reading of a muddy paper by another physicist. He stormed to his feet and loudly said, “This paper is outrageous. It is not even wrong!” Whether the meeting occurred at a late enough date for Pauli to have read Popper, he obviously understood that being wrong could be productive, while being meaningless could not.
Over the next few years after 1927 Bohr, Heisenberg, and Pauli explicated what came to be called “the Copenhagen interpretation of Quantum Mechanics”. It is well worth reading the superb article in Wikipedia about “The Copenhagen Interpretation.” One point the article makes is that there is no definitive statement of this interpretation. Bohr, Heisenberg, and Pauli each had slightly different ideas about exactly what the interpretation was or how it worked. However, in my opinion, things are clear enough in practice. The problem QM seems to have has been called the “collapse of the wave function.” It is most clearly seen in a double slit interference experiment with electrons or other quantum particles such as photons or even entire atoms. The experiment consists of a plate with two slits, closely enough spaced that the wave function of an approaching particle covers both slits. The spacing is also close enough that the wavelength of the particle, as determined by its energy or momentum, is such that the waves passing through the slits will visibly interfere on the far side of the slits. This interference is in the form of a pattern consisting of stripes on a screen or photographic plate. These stripes show up, zebra-like, on a screen or as dark and light areas on a developed photographic plate. On a photographic plate there is a black dot where a particle has shown up. The striped pattern consists of all the dots made by the individual particles when a large number of particles have passed through the apparatus. What has happened is that the wave function has “collapsed” from an area encompassing all of the stripes, to a tiny area of a single dot. One might ask at this point, “So what?” After all, for the idea of a probability distribution to have any meaning, the event for which there is a probability distribution has to actually occur. The wave function must “collapse” or the probability interpretation itself is meaningless. 
The problem is that QM has no theory whatever for the collapse.
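The way stripes emerge from individual dots in the double slit experiment can be imitated with a small simulation. This is a sketch resting on several assumptions: the screen is one-dimensional, the two slits are idealized so that ψ*ψ on the screen is a simple cosine-squared pattern, the numbers are dimensionless and chosen for convenience, and each "collapse" is just a random draw from that probability distribution, which is exactly the step QM itself gives no mechanism for. The names FRINGE, HALF_WIDTH, land_one_particle and build_pattern are made up.

```python
import math
import random

# Assumptions (illustrative, dimensionless): fringe spacing 1, screen running from -3 to 3.
FRINGE = 1.0
HALF_WIDTH = 3.0


def intensity(x):
    """Idealized psi* psi on the screen: brightest at whole multiples of FRINGE."""
    return math.cos(math.pi * x / FRINGE) ** 2


def land_one_particle(rng):
    """Choose one dot position with probability proportional to intensity (rejection sampling)."""
    while True:
        x = rng.uniform(-HALF_WIDTH, HALF_WIDTH)
        if rng.random() < intensity(x):
            return x


def build_pattern(n_particles, n_bins=24, seed=1):
    """Send particles through one at a time and count the dots in each strip of the screen."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    width = 2 * HALF_WIDTH / n_bins
    for _ in range(n_particles):
        x = land_one_particle(rng)
        counts[min(int((x + HALF_WIDTH) / width), n_bins - 1)] += 1
    return counts


for n in (10, 100, 5000):
    counts = build_pattern(n)
    peak = max(counts)
    print(n, "particles")
    for c in counts:
        print("  " + "#" * round(40 * c / peak))
```

With 10 particles the dots look random; by 5000 the zebra stripes are unmistakable. No single dot carries the pattern, yet each one is placed by it.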
One can easily try to make a quantum theory of what happens in the collapse because QM can deal with multi-particle systems such as molecules. One obtains a many particle version of QM simply by adding the coordinates of the new particles, which are to be considered, to a multi-particle version of the Schrödinger equation. In particular, one can add to the description of a particle which approaches a photographic plate, all the molecules in the first few relevant molecular layers of the plate. When one does this, however, one does not get a collapse. Instead the new multi-particle wave function simply includes the molecules of the plate, which are spread out as much as the original wave function of the approaching particle. In fact, the structure of QM guarantees that as one adds new particles, these new particles themselves continue to make an increasingly spread out multi-particle wave function. This result was shown in great detail in 1929 by John von Neumann. However, the idea of von Neumann’s result was already generally realized and accepted during the years of the late 1920’s when our three heroes and many others were grappling with finding a mechanism to explain the experimental collapse. Bohr’s version of the interpretation is simplicity itself. Bohr posits two separate realms, a realm of classical physics governing large scale phenomena, and a realm of quantum physics. In a double slit experiment the photographic plate is classical; the approaching particle is quantum. When the quantum encounters the classical, the collapse occurs.
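The failure of the multi-particle description to produce a collapse can be seen in the smallest possible toy model. The sketch below is in the spirit of the argument above but is not von Neumann's actual treatment, and every ingredient is an assumption made for illustration: the particle can arrive at only two spots, A or B, a single plate molecule is either unexposed or exposed, and their interaction is modeled by a simple rule (the molecule becomes exposed only if the particle is at B).

```python
import math

# Basis order for the pair (particle, molecule): A-unexposed, A-exposed, B-unexposed, B-exposed.
# The labels A, B, "unexposed" and "exposed" are hypothetical stand-ins for a real plate.


def tensor(u, v):
    """Combine two one-particle states into a two-particle state."""
    return [a * b for a in u for b in v]


def apply(matrix, state):
    """Apply a linear rule (a matrix) to a state vector."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]


# Assumed interaction: flip the molecule to "exposed" only when the particle is at B.
INTERACTION = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

particle = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # spread equally over spots A and B
molecule = [1.0, 0.0]                            # definitely unexposed

before = tensor(particle, molecule)
after = apply(INTERACTION, before)

print("before the encounter:", before)
print("after the encounter: ", after)
```

After the encounter the state is an equal mixture of "particle at A, molecule unexposed" and "particle at B, molecule exposed". The molecule has been dragged into the superposition rather than settling on one outcome, and adding more molecules only enlarges it.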
The Copenhagen interpretation explains the results of a double slit experiment and many others, and is sufficient for the practical development of atomic, molecular, solid state, nuclear and particle physics, which has occurred since the late 1920’s. However, there has been an enormous history of objections, refinements, rejections and alternate interpretations of the Copenhagen interpretation as one might well imagine. My own first reaction could be expressed as the statement, “I thought that ‘magic’ had been banned from science back in the 17th century. Now it seems to have crept back in.” (At present I take a less intemperate view.) However, one can make many obvious objections to the Copenhagen interpretation as I’ve baldly stated it above. Where, exactly, does the quantum realm become the classic realm? Is this division sharp or is there an interval of increasing complexity that slowly changes from quantum to classical? Surely, QM, like the theory of relativity, actually applies to the classical realm. Or does it?
During the 1930’s Schrödinger used the difficulties with the Copenhagen interpretation to make up the now famous thought experiment called “Schrödinger’s Cat.” Back in the early 1970’s when I became interested in the puzzle of “collapse” and first heard the phrase “Schrödinger’s Cat”, it was far from famous so, curious, I looked it up and read the original short article, puzzling out the German. In his thought experiment Schrödinger uses the theory of alpha decay. An alpha particle confined in a radioactive nucleus is forever trapped according to classical physics. QM allows the escape because the alpha particle’s wave function can actually penetrate the barrier which classically keeps it confined. Schrödinger imagines a cat imprisoned in a cage containing an infernal apparatus (Höllenmaschine) which will kill the cat if triggered by an alpha decay. When one applies a multi-particle Schrödinger equation to the alpha’s creeping wave function as it encounters the trigger of the “Maschine”, its internals, and the cat, the multi-particle wave function then contains a “superposition” (i.e. a linear combination) of a dead and a live cat. Schrödinger makes no further comment, leaving it to the reader to realize how ridiculous this all is. Actually, it is even worse. According to QM theory, when a person looks in the cage, the superposition spreads to the person, leaving two versions, one looking at a dead cat and one looking at a live cat. But a person is connected to an environment which also splits and keeps splitting until the entire universe is involved.
What I’ve presented here is an actual alternative to the Copenhagen Interpretation called “the Many-worlds interpretation”. To quote from Wikipedia: “The many-worlds interpretation is an interpretation of quantum mechanics that asserts the objective reality of the universal wavefunction and denies the actuality of wavefunction collapse. Many-worlds implies that all possible alternate histories and futures are real, each representing an actual ‘world’ (or ‘universe’).” The many-worlds interpretation arose in 1957 in the Princeton University Ph.D. dissertation of Hugh Everett working under the direction of the late John Archibald Wheeler, whom I mentioned in the last post. Although I am a tremendous admirer of Wheeler, I am skeptical of the many-worlds interpretation. It seems unnecessarily complicated, especially in light of ideas that have developed since I noticed them in 1972. There is no experimental evidence for the interpretation. Such evidence might involve interference effects between the two versions of the universe as the splitting occurs. Finally, if I exist in a superposition, how come I’m only conscious of the one side? Bringing in “consciousness”, however, leads to all kinds of muddy nonsense about consciousness effects in wave function splitting or collapse. I’m all for consciousness studies and possibly such will be relevant for physics after another revolution in neurology or physics. At present we can understand quantum mechanics without explicitly bringing in consciousness.
In the next post I’ll go into what I noticed in 1971-72 and how this idea subsequently became developed in the greater physics community. The next post will necessarily be somewhat more mathematically specific than so far, possibly including a few gory details. I hope that the math won’t obscure the story. In subsequent posts I’ll revert to talking about physics theory without actually doing any math.