Teaching Myself Quantum Mechanics, Part One
This series is about me learning something in my spare time. More specifically, it’s about me learning quantum mechanics, as a way of getting back into physics, and as a way of doing good science.
I’ll explain: I have recently started a PhD in machine learning at the University of Geneva. The main project I’ll be working on is about applying machine learning and statistical methods to physics, mainly high energy physics (HEP) and solar astronomy. I am part of a very strong multidisciplinary team of physicists and AI people, so I could probably do without a strong physics background. Still, I feel like having one can’t hurt, right?
So there you have it: I want to study and understand physics in order to do better research.
But that’s not the whole story. I also love physics, and I have tortured myself over and over about whether I should have majored in physics rather than CS during my university career. This is not to say that I am any good at it. I only had one big physics exam during my bachelor’s (the topics were kinematics, some classical mechanics, and electromagnetism), and all the rest of my knowledge is made of bits and pieces from random YouTube lectures and non-technical books I have read over the years. So yeah, it’s finally time to get serious.
Why start with Quantum Mechanics?
Because I find it exciting, and that’s probably enough. Starting with classical mechanics would likely bore me to death. I know QM is not easy or intuitive, and that the math can get hard. However, I don’t feel completely ill-equipped. Thanks to my engineering and ML background, I know a thing or two about probability, linear algebra, Fourier series and complex numbers, which I have been told make up a substantial part of the required background.
What is your study method?
I usually try to be extremely deliberate about my study method. I won’t go into the details here, but I use active recall and spaced repetition tools such as Anki, so that remembering what I study is no longer a random event but a purposeful act. There are many interesting resources about such techniques, such as this, and this. Michael Nielsen has even developed an introduction to quantum computing delivered in a “new mnemonic medium which makes it almost effortless to remember what you read” (hint: it’s active recall and spaced repetition).
As for the learning resources, I have chosen two introductory books. The first is from the Theoretical Minimum series by the great Leonard Susskind. This is supposed to be a very simple introduction, starting from first principles. I have already flipped through the pages, and there actually seems to be quite a lot of content, or at least more than you would expect from a book of this kind.
The second book is more technical, and it comes up over and over as a suggestion for learning QM: Griffiths’ Introduction to Quantum Mechanics. Honestly, here I have no idea what to expect.
So here’s the plan: I’ll go through both books simultaneously and try to solve most of the exercises. I’ll also create flashcards and diagrams of the concepts, and let Anki do the rest. Additionally, I’ll write this blog series, which will hopefully help me understand things better. So if you are here to learn, don’t just count on my writing.
Requirements
The requirements are basic linear algebra and basic probability. And complex numbers. Some notions of classical physics can also help.
Spins and Qubits
The fact that we cannot fully grasp some things about the inner workings of our universe should come as no surprise. After all, we evolved to survive, not to understand. Ok yes, we have a strong intuition for many classical-mechanical things, but that’s because things that behave classically are closer to our experience than things that behave relativistically or quantum mechanically. So what we can do here is simple: let’s just go with the math and with the experiments. I don’t expect any of the concepts I encounter to be “relatable”. I am going to trust the results of the experiments, and link them to the math which hopefully explains them. And I’ll start with the spin.
Sooo, the spin is an extra degree of freedom that some particles have (all of them? I don’t know).
This spin is a very quantum mechanical concept, which is why trying to visualize it classically would miss the point.
Now, the spin of a particle is a quantum system in its own right. In fact, it’s the “simplest and the most quantum of systems”, according to Susskind. Just by going through a very simple experiment involving the spin, we can already start to notice some important differences between classical physics and quantum mechanics.
The experiment involves a measuring apparatus, which we call \(\mathcal{A}\), that records the state of the spin of our system (we’ll call it \(\sigma\)). We can orient the measuring apparatus in order to measure this “something” along a specified axis. As a test, we orient the apparatus along the \(z\)-axis, and notice that our measurement takes the value \(+1\) or \(-1\).
Now, if we reset the apparatus to measure the spin along the \(z\)-axis again, and assume the simple evolution law
\[\sigma(n+1) = \sigma(n)\]which tells us that the state of the system remains unchanged through time, then the previous measurement should be confirmed, again and again. We say that the measurement of a state prepares the system.
In quantum experiments, performing the same measurement repeatedly gives the same answer until the system is prepared differently. We will see this in more detail now.
After we have prepared the system in the state \(\sigma_z = 1\), let’s now rotate the apparatus by, say, 90 degrees, and measure the spin along the \(x\)-axis. Something weird happens. Sometimes we get \(+1\), sometimes we get \(-1\). It looks like the result is no longer deterministic, but random in a peculiar way. In fact, if we repeat the experiment multiple times, we find that the average value of the measurement along the \(x\)-axis is \(0\).
In fact, this is generalizable to any direction. We can pick any direction \(\hat{m}\) and prepare a spin so that the apparatus measures a \(+1\) in that direction. Then, we rotate the apparatus to any other direction \(\hat{n}\). Again, the result of the measurement will still be either \(+1\) or \(-1\), but the expected value will be equal to the projection of \(\hat{m}\) onto \(\hat{n}\), that is, \(\cos{\theta}\), with \(\theta\) being the angle between the vectors.
Writing \(\langle{\sigma}\rangle\) for the expected value of the measurement (we’ll talk about this notation in more detail later):
\[\langle{\sigma}\rangle = \hat{m} \cdot\hat{n} = \cos{\theta}\]Notice that this is the same result that we would expect in classical physics for some vector quantity that we measure. However, in the classical sense, the result would be deterministic. In QM, the result is statistically random, but the average converges to the classical result.
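To make this concrete, here’s a tiny simulation. Since the outcomes are always \(\pm1\) and their average is \(\cos{\theta}\), the probability of getting \(+1\) must be \((1+\cos{\theta})/2\). The numpy sketch below (the function name is my own invention) draws many measurements with that probability and checks that the average indeed approaches \(\cos{\theta}\):

```python
import numpy as np

rng = np.random.default_rng(0)

def measure_spin_average(theta, n_shots=100_000):
    """Simulate n_shots measurements along an axis tilted by theta from the
    preparation axis. Each shot is +1 with probability (1 + cos(theta)) / 2,
    otherwise -1; the average should approach cos(theta)."""
    p_up = (1 + np.cos(theta)) / 2
    outcomes = np.where(rng.random(n_shots) < p_up, 1, -1)
    return outcomes.mean()

for theta in [0.0, np.pi / 3, np.pi / 2, np.pi]:
    print(f"theta = {theta:.2f}, average = {measure_spin_average(theta):+.3f}, "
          f"cos(theta) = {np.cos(theta):+.3f}")
```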
Vector spaces, inner products, bases.
In quantum mechanics the space of states is a vector space. We call these vectors kets. This is a ket: \(\vert a\rangle\). Some simple axioms are defined in this space: the sum of two kets is a ket, ket addition is commutative and associative, and there are some other properties like the existence of a zero vector, an additive inverse, and linearity. Up to this point, everything looks normal, and the vector space we have defined is practically the same as the space of 3-vectors in Euclidean space. However, the space of our kets is a complex vector space: its components are complex numbers, and multiplication by a scalar also extends to complex numbers! This is a very abstract concept, but it’s absolutely needed to make the theory work mathematically.
Now, if you know something about complex vectors, you will likely remember the complex conjugate. Briefly, given a complex number \(z\), there exists a complex conjugate \(z^*\) that we obtain by reversing the sign of the imaginary part. So if \(z=x+iy = re^{i\theta}\), then \(z^*=x-iy = re^{-i\theta}\). Note that the product of a complex number and its complex conjugate is always a non-negative real number, yielding \(r^2\) in our case.
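Just to make the \(r^2\) claim concrete, here is a quick check in plain Python (nothing quantum here, just complex arithmetic):

```python
z = 3 + 4j                 # here r = |z| = 5
z_conj = z.conjugate()     # 3 - 4j: the sign of the imaginary part is flipped
print(z * z_conj)          # (25+0j): a non-negative real number, equal to r**2
print(abs(z) ** 2)         # 25.0: the same value, computed from the modulus
```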
In the same way, a complex vector space has a dual space that is its complex conjugate vector space. In our case, for every ket \(\vert A\rangle\), we can define a bra vector in that dual space: \(\langle A \vert\), which is essentially its complex conjugate.
Now, if you got up to this point, I am pretty sure you remember dot-products in Euclidean space, right? Well, that dot-product is a specific instance of the so-called inner product, which we can also define in our complex space. In our quantum-mechanical magical world, the inner product is defined between a bra and a ket, and the notation is
\[\langle B|A \rangle\]and, of course, the result is a complex number. These products are linear, and interchanging bras and kets corresponds to complex conjugation
\[\langle B|A \rangle = \langle A|B \rangle ^*\]Concretely, the product is performed in the exact same way as for the dot-product: sum of the products of the components.
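Numerically, this is just “conjugate one vector, then take a dot product”. A small numpy sketch with two arbitrary vectors, just for illustration (`np.vdot` conjugates its first argument, which plays the role of the bra):

```python
import numpy as np

# Two kets in a 2-dimensional complex vector space, as numpy arrays.
A = np.array([1 + 2j, 3 - 1j])
B = np.array([0.5j, 2 + 1j])

# <B|A>: conjugate the components of |B> (that's the bra <B|),
# then sum the component-wise products. np.vdot does exactly this.
print(np.vdot(B, A))            # a complex number
print(np.conj(np.vdot(A, B)))   # the same number: <B|A> = <A|B>*
```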
We can also bring with us some familiar concepts from Euclidean spaces:
- A vector is normalized if its inner product with itself is 1.
- Two vectors are orthogonal if their inner product is 0.
In our vector space, we can also define a basis, which is simply a set of vectors that can be used to derive any other vector in the space through linear combinations. It’s often useful, and especially so in QM, to talk about orthonormal bases, which are bases where the vectors are normalized and mutually orthogonal. Note that the number of vectors in a basis is equal to the dimension of the space.
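As a sanity check, here is a made-up orthonormal basis of a two-dimensional complex space, together with a verification that an arbitrary vector can be rebuilt from its components along the basis (again, just an illustrative numpy sketch):

```python
import numpy as np

# A candidate orthonormal basis for a 2-dimensional complex vector space.
e1 = np.array([1, 1j]) / np.sqrt(2)
e2 = np.array([1, -1j]) / np.sqrt(2)

print(np.vdot(e1, e1))   # (1+0j): e1 is normalized
print(np.vdot(e2, e2))   # (1+0j): e2 is normalized
print(np.vdot(e1, e2))   # 0j:     e1 and e2 are orthogonal

# Two vectors, two dimensions: any other vector is a linear combination of them.
v = np.array([2 - 1j, 0.5j])
c1, c2 = np.vdot(e1, v), np.vdot(e2, v)     # components of v along e1 and e2
print(np.allclose(c1 * e1 + c2 * e2, v))    # True: the basis spans the space
```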
Quantum States
With some math tools under our belt, let’s try to formalize the notion of a spin state from earlier. For this task, we will use vectors, and create a representation that captures what we know about how spins behave.
Let’s start by labelling all the possible spin states along the three axes. When the apparatus \(\mathcal{A}\) is oriented along the \(z\)-axis, the two possible states are \(\sigma_z = \pm1\). We can call them up and down states and assign them the ket vectors \(\vert u\rangle\) and \(\vert d\rangle\). So, when \(\mathcal{A}\) is oriented along the \(z\)-axis and registers \(+1\), the system is in state \(\vert u\rangle\). We also define the states along the \(x\)-axis and call them \(\vert r\rangle\) and \(\vert l\rangle\) (for right and left), and finally along the \(y\)-axis, which we call \(\vert i\rangle\) and \(\vert o\rangle\) (in and out).
Now, a consequence of this formalization is that the space of states for a single spin has only two dimensions. This means that all possible states of a spin can be represented in a two-dimensional vector space. If we choose \(\vert u\rangle\) and \(\vert d\rangle\) as the basis vectors for this space, then, by the definition of a basis, we can write any other state as a linear combination (or superposition) of these two vectors.
Thus, a generic state \(\vert A\rangle\) comes in the form
\[|A\rangle = \alpha_u|u\rangle + \alpha_d|d\rangle\]where \(\alpha_u\) and \(\alpha_d\) are the components of \(\vert A\rangle\) along the two basis vectors.
We can retrieve these components through an inner product (which is essentially a projection) of the state vector on the respective basis vectors:
\[\alpha_u = \langle u|A\rangle \\ \alpha_d = \langle d|A\rangle \\\]It is important to remember that these components are complex numbers, and carry no physical meaning by themselves. However, their magnitude does. In fact, given that the spin has been prepared in state \(\vert A\rangle\), and that the apparatus is oriented along \(z\), the quantity \(\alpha_u^* \alpha_u\) is the probability that the spin would be measured as \(\sigma_z = +1\) (an up spin along the \(z\)-axis).
Obviously, the same holds for \(\alpha_d^*\alpha_d\), which is the probability of measuring \(\sigma_z = -1\).
To be more precise, the quantities \(\alpha_u\) and \(\alpha_d\) are called probability amplitudes and are not actual probabilities. To get actual probabilities, we take their squared magnitudes, so that
\[P_u = \langle A|u\rangle\langle u|A\rangle \\ P_d = \langle A|d\rangle\langle d|A\rangle\]Something important to notice is that the state \(\vert A\rangle\) is not what we measure. \(\vert A\rangle\) is something we know before the measurement. It is our knowledge of the state of the system, or how the system was prepared. It represents the potential outcomes of our measurements. Once we measure the spin along a certain direction, however, we can only get a \(\vert u\rangle\) or a \(\vert d\rangle\).
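Here is what this looks like numerically. I’ll represent \(\vert u\rangle\) and \(\vert d\rangle\) as the standard basis vectors of a two-dimensional complex space (the post hasn’t fixed explicit column vectors for them yet, so treat this as an illustrative convention), and pick some arbitrary amplitudes:

```python
import numpy as np

# |u> and |d> as the standard basis of a 2-dimensional complex space.
u = np.array([1, 0], dtype=complex)
d = np.array([0, 1], dtype=complex)

# A state |A> = alpha_u |u> + alpha_d |d>, with arbitrary (normalized) amplitudes.
alpha_u, alpha_d = (1 + 1j) / 2, 1j / np.sqrt(2)
A = alpha_u * u + alpha_d * d

# Recover the amplitudes by projection, then square their magnitudes.
P_u = abs(np.vdot(u, A)) ** 2    # <A|u><u|A> = alpha_u* alpha_u
P_d = abs(np.vdot(d, A)) ** 2    # <A|d><d|A> = alpha_d* alpha_d
print(P_u, P_d)                  # both ~0.5: each outcome is equally likely here
```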
Two additional points are important. First, \(\vert u\rangle\) and \(\vert d\rangle\) have to be mutually orthogonal, meaning that
\[\langle u|d\rangle = 0 \\ \langle d|u\rangle = 0 \\\]But what does this mean? It means that the up and down states are physically distinct and mutually exclusive. If the spin is prepared in the up state, then it can’t be detected in the down state, and vice versa. This is true not just for the spin, but for any quantum system.
Another point is that, since up and down are the only possible results of the measurement, their respective probabilities need to sum to 1. Mathematically,
\[\alpha_u^*\alpha_u + \alpha_d^*\alpha_d =1\]which is the same as saying that \(\vert A\rangle\) is normalized:
\[\langle A| A\rangle = 1\]And this is the last thing I’ll state for today, but it’s a very important principle: the state of a system is represented by a unit vector in a vector space of states. Moreover, the squared magnitudes of the components of the state-vector, along particular basis vectors, represent the probabilities of the various experimental outcomes. Prof. Susskind said so, so it must be true.
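In code, normalization is just a rescaling of the state vector so that \(\langle A \vert A\rangle = 1\); after that, the squared magnitudes of the components automatically sum to 1. A small numpy sketch with made-up amplitudes:

```python
import numpy as np

# Arbitrary (un-normalized) amplitudes [alpha_u, alpha_d].
alpha = np.array([2 - 1j, 3j])

# Rescale the state so that <A|A> = 1.
A = alpha / np.sqrt(np.vdot(alpha, alpha).real)

print(np.vdot(A, A).real)              # 1.0 (up to rounding): |A> is a unit vector
print(abs(A[0])**2 + abs(A[1])**2)     # 1.0 (up to rounding): probabilities sum to 1
```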
What about next time?
To recap, we ran some spin experiments, looked at the weird results we were getting, and developed a small part of a mathematical framework to represent these results, work with them, and one day make predictions.
I think this is enough for today, as we went through quite a lot of stuff. In the next episode, we’ll derive state vectors for the spin in any direction, and state the principles of quantum mechanics!
See ya!