A central tenet of quantum information theory is that by taking an operational approach to quantum mechanics (for which computational, cryptographic or information-theoretic tasks does the use of quantum mechanics present an advantage?) one can not only achieve novel tasks, but one also ends up developping important insights on the theory itself — insights which, in the case of quantum mechanics, are sorely needed! This credo is best exemplified by the use of entanglement in quantum information: instead of the infamous “headache” discovered by Einstein, Podolsky and Rosen, it plays the role of a resource using which many fantastic tasks can be accomplished; superdense coding and teleportation are classic examples, but we will see a few more in this post. This paradigm shift — from headache to resource — is described in a short survey by Popescu and Rohrlich published in 1998 and appropriately entitled “The Joy of Entanglement”, to which I happily refer the interested reader.
This semester (or what remains of it) at MIT I am organizing a small reading group whose focus will be the study of entanglement as a resource for cryptography. We hope to explore both aspects of the credo: insights on entanglement gained through its use in cryptography, and cryptographic tasks that are made possible by the unique properties of entanglement. This post is meant as a small introduction to the reading group. I will first describe Bell’s theorem, to which any serious use of entanglement must naturally refer; I will then briefly survey a few consequences for cryptography of the “violation of Bell inequalities” by entanglement.
(This post is probably too long to qualify as a honest blog post…be warned! Although I have tried to provide abundant references, this should by no means be thought of as a comprehensive “history” of the subject, and I am sure to have left out many key contributions — please point them out, or any further suggestions for readings, in the comments; they will certainly be very valuable to the reading group. For a much more thorough treatment I refer to the recent survey by Brunner et al.)
1.1. The EPR paradox
Einstein, Podolsky and Rosen in their famous 1935 paper were arguably the first to put forward the counter-intuitive consequences of the laws of quantum mechanics, and in particular the superposition principle. EPR observed that by applying this principle to a pair of particles, instead of a single “cat”, one can create a bipartite state with the property that any observation made on one particle automatically “projects” the other particle to a state that is consistent with the outcome of the first measurement that was performed. This “spooky action at a distance” displeased Einstein, and the EPR paper tries to concretize this displeasure by arguing that it contradicts the uncertainty principle: in short, the outcomes of two incompatible measurements could be simultaneously determined by performing one on the first particle and the other on the second. Hence, EPR concluded, there must be something wrong with the description of physical states as posited by the laws of quantum mechanics: the uncertainty principle must be an artifice of some more fundamental flaw, or incompleteness, in the theory. (See this article in the Stanford Encyclopedia of Philosophy for a much more thorough discussion of the EPR argument.)
The word entanglement itself does not appear in the EPR paper, but was first used by Schrödinger in a letter (in German) written in response to that paper. Shortly afterwards, Schrödinger wrote a paper in which he introduces entanglement as “not one but rather the characteristic trait of quantum mechanics” (emphasis his). I won’t reproduce Schrödinger’s definition here, though it is interesting to watch how he struggles in delimiting what entanglement is and what it is not; it was (and still is not) clearly not a simple concept to grasp.
In his paper Schrödinger pursued the EPR line of thought in deriving further “paradoxical” consequences from the existence of entanglement. Many attempts were subsequently made to “repair” quantum mechanics in a way that avoided these apparent paradoxes. Suggestions included the possibility that the laws of quantum mechanics would break down at large distances (such as those that seemed to be required between the two particles in the EPR experiment), that quantum mechanical particles could, in some cases, have an instantaneous action on one another (for instance through the existence of an underlying “quantum field”), or that quantum mechanics was an incomplete theory and required the introduction of additional “hidden variables” to accurately describe the state of physical systems.
This flurry of more-or-less plausible attempts reflects a certain state of confusion; in retrospect it seems clear that the consequences of the experiment described by EPR were poorly understood. One obvious difficulty is that it was really a thought experiment, with all the uncertainty that this implies. Since no clear experimental validation was foreseeable the floor was open to all kinds of suggestions, including of course that the experiment would simply not work out as predicted by quantum mechanics. It seems like the first to describe an experiment that could (in)validate the EPR predictions were Bohm and Aharonov, who suggested the use of quantum mechanical degrees of freedom of photons (angular momentum) instead of atoms (position). Their reformulation inspired the work of John Bell, who in 1964 finally put the EPR “paradox” on firm ground by stating three basic assumptions about the world that the paradox violated, and proposing an experiment that could conclusively demonstrate this violation.
1.2. Bell’s theorem
Bell identified precisely which of our implicit beliefs we had to abandon in order to remove the “paradox” from “EPR paradox”. He stated three natural axioms (reformulated as below) that apply to the kinds of correlations that can be observed by performing measurements on distant physical systems. Bell went on to observe, as EPR had, that quantum mechanics posits the existence of experiments which would violate these three axioms. Bell’s conclusion, however, was not that the theory is wrong — rather, one of the axioms must be false, and the “intuition” that led to this axiom misplaced.
Let me describe the context for Bell’s experiment. Suppose two experimentalists, call them Alice and Bob, perform measurements on physical systems situated in their respective labs. Alice has a choice of measurements she can make, each with a set of possible outcomes; similarly for Bob. Let’s index Alice’s (resp. Bob’s) measurements by (resp. ) and their outcomes by (resp. ). From the point of view of an outside observer the whole experiment can be described by a set of probability distributions
where for every it holds that . Let’s introduce one more variable, , which describes the state of the joint system held by Alice and Bob before they perform their measurements. Using Bayes’ rule,
Bell identified the following three reasonable assumptions, that I’ll refer to as in the following:
- Measurement independence (“free will”): the state is independent of (since Alice and Bob can make their choice of measurement settings independently and freely), .
- No-signaling (“no violation of special relativity”): since Alice and Bob perform their measurements in locations that are arbitrarily far apart, the outcome of Alice’s experiment cannot contain any information about Bob’s own choice of measurement. Formally, for any choice at Alice’s side and any two choices at Bob’s side, the marginal distribution on outcomes that Alice observes must be the same whether Bob decides to make measurement or :
- Outcome independence (“local realism”): since the experiments are performed locally, the outcome (resp. ) can be directly computed as a function of the state of the system, , and the choice of measurement (resp. ). Formally,
Most of us are familiar with distributions of the form (1): they are exactly those distributions that can be generated with the help of both shared () and private randomness. This clearly conforms to our intuition: if Alice and Bob are in separate locations and cannot communicate, then this is exactly the set of distributions they can generate.
What Einstein, Podolsky and Rosen observed, and Bell proved, is that quantum mechanics predicts there exists specific experiments that generate distributions that do not satisfy~(1). The experiment considered by Bell, although conceptually simple, would have been hard to verify directly as it involves a continuous choice of possible measurements. An important simplification was made by Clauser, Horne, Shimony and Holt, who built upon the work of Bell and proved the following inequality (see this previous blog post for an interpretation of the inequality as a two-player game, and this one for a simple proof):
Exercise 2. Show that all of the form (1) with satisfy the following inequality
In addition, Clauser et al. calculated (as was already well known) that quantum mechanics predicts that if Alice and Bob each measure their respective halves of an EPR pair using arbitrary orthonormal bases and respectively, the probability that they obtain the pair of outcomes is
(See here for an introduction to the quantum measurement formalism in this context.) Bell’s theorem is that there exists families of distributions of the form~(3) which cannot be put in the form~(1). Although the two forms do look different, how to convince oneself that this is the case? Bell in his original paper gave an elegant proof of this fact. The following exercise asks you to prove it by using inequality (2).
Exercise 3. Find two orthonormal bases , for , and two orthonormal bases , for , such that by replacing in the left-hand side of~(2) with the corresponding expression obtained through~(3), you obtain a value of . (As proved by Tsirelson, is the optimal value achievable by quantum mechanics: see this blog post for a proof). One can go further: find a family of distributions satisfying both assumptions 1. and 2. above (but not necessarily 3.) and that achieves a value of in the left-hand side of (2). This shows that quantum mechanics is not “maximally nonlocal”…Can you find reasons why?
There we are: quantum mechanics violates the “Bell-CHSH inequality” (2). More importantly, not only does quantum mechanics predict a violation, but this prediction has been observed! The first experiments are due to Aspect and collaborators in the 80s. Aspect initially verified the violation of the CHSH inequality using polarization measurements on pairs of entangled photons. The observed value in (2) was calculated by estimating the distribution generated by making all possible pairs of measurements on the photons, in locations sufficiently far apart that the no-signaling condition was guaranteed by relativity. Unfortunately, the specific equipment used required the choice of measurements to be fixed long in advance — hence, in principle, both entangled photons could have “known” in which basis they were going to be measured and decide on their state accordingly, so that assumption 1. of measurement independence was not guaranteed. Shortly afterwards, Aspect et al. reported a second experiment in which they closed this loophole by using “time-varying analyzers”, which enabled them to make the basis choice on the fly.
There still remained one potential loophole however, which stems from the very low efficiency of photon detectors (in the case of Aspect’s experiment, between and of emitted photon pairs resulted in a detection). In order to conclude to a violation of (2) one needs to make a “fair sampling assumption”: undetected photon pairs are sampled uniformly among all emitted pairs and estimating statistics conditioned on detection does not induce any significant bias. This, of course, is an assumption that we would rather get rid of, and experiments since Aspect’s have thrived to remove it. Detector efficiencies for photons now reach the ; it is known that the detection loophole can be closed by having an efficiency higher than . (Direct comparison of numbers can be tricky: for instance, it is already possible to achieve detection efficiencies close to , for systems that e.g. do not easily allow for large separations, leaving the “locality loophole” open, or for which making an independent choice of measurements is not feasible, re-opening the “free will loophole”…) According to a few experimentalists I talked to, many groups around the world are now racing to be the first to simultaneously close all loopholes, and this is expected to happen within the next 3-5 years: stay tuned!
2. Using entanglement for cryptography
In the remainder of this post I will point to a few concrete consequences of the experimentally observable violation of Bell inequalities (and in particular of (2)) by certain physical setups. Some of the applications will depend on just this — the statistically verifiable generation of specific correlations between isolated systems. In those cases no prior knowledge of (or belief in) quantum mechanics will be required. Other applications will rely on more precise modeling, in which case we may have to assume correctness of quantum mechanics. Throughout the reading group we will explore some of these consequences further, attempting to get a good grasp of their technical underpinnings — in terms of the unique properties of general nonlocal distributions, and of entanglement itself, that make the applications possible — and of course explore further consequences!
Although it was explicitly made only relatively recently (see Chapter 5 in Roger Colbeck’s 2009 Ph.D. thesis), the sole violation of any Bell inequality already has one striking consequence: any physical process whose input/output behavior generates said violation cannot by definition satisfy all three basic assumptions . Provided that we believe in free will (inputs in the experiment are chosen independently of the system’s internal state) and e.g. special relativity (as a way to enforce the no-signaling condition between the system’s two parts), then I claim that the physical process — whatever it is, quantum mechanical or not — must generate randomness on the fly. Indeed, any deterministic family of distributions (i.e. for all ) that is also no-signaling automatically satisfies assumption 3. of local realism, and hence cannot violate any Bell inequality.
Note how we have discovered a very simple test which if verified all but proves that the universe is intrinsically random, based on only very weak physical assumptions. (We do need to assume something, as in principle nothing prevents the whole universe from being fully deterministic.) I think this conclusion deserves to be much more widely known — as Avi Wigderson (almost) suggested recently, it should even be taught in high school! Of course we know (well, maybe not yet in high school) that the laws of quantum mechanics predict that the world is random: they claim that measuring the state in the basis results in each of the two possible outcomes being obtained with probability . But how can we test this prediction? Repeating the experiment as many times as desired and collecting statistics will only build confidence to a limited extent: if the outcomes are indeed random then any sequence has the same probability of being produced, and we do not have any a priori reason of being suspicious of whatever sequence we actually observe. Bell’s theorem goes much further by providing an efficient test according to which one can prove that the world is intrinsically random (within error bars…).
A sequence of recent works build upon Colbeck’s observation in different directions. Pironio et al. designed a protocol using which one could certifiably generate uniform bits, starting from only uniform bits (used to make a pseudorandom choice of inputs in successive experiments, as well as to “amplify” the quality of the randomness produced by applying a classical procedure called an extractor). They report an implementation of their protocol in which uniformly random bits were generated over the course of a month (with high probability :-)) — the first documented certified generation of fresh randomness in our universe! Pironio and Massar and Gelles et al. later expanded upon this work to demonstrate the possibility for an exponential expansion of randomness by performing measurements on two independent, un-entangled pairs of systems. The assumption that there is no entanglement in-between the two pairs seems hard to justify, and Vazirani and myself independently gave a protocol achieving exponential expansion with a single pair of devices. Our analysis provides the additional guarantee that the random bits generated are “secure against quantum adversaries” (more on that below). Recently, the task of “free randomness expansion” was introduced by Colbeck and Renner. The goal is to produce uniformly random bits starting only from weakly random ones, e.g. a so-called Santha-Vazirani source, from which it is known that it is impossible to extract uniform bits classically. Colbeck and Renner show that Bell’s theorem can also lead to the generation of a single certified uniformly random bit, starting from a large supply of Santha-Vazirani bits but without necessitating access to any prior uniform randomness. For further work on this topic see also a recent paper by Gallego et al.
2.2. Key distribution
Ekert in 1991 was the first to put forward the use of entanglement for a cryptographic task, key distribution. The goal in key distribution is for two trusted but distant users to establish a common shared string of bits, the key. For this they can communicate over a public channel, on which an adversary might be actively eavesdropping. Note the close similarity between this and randomness generation: in key distribution, the distant users wish to generate random bits that are correlated (both should have the same key) but independent from the eavesdropper — the so-called “quantum adversary” trying to guess information about the final key from the public information that she has access to.
The intuition behind Ekert’s protocol is the following. Since the discovery by Clauser et al. of their inequality~(2), researchers had been actively investigating which entangled states and measurements led to the largest possible violations of the inequality. Though it had not been rigorously proven, the “consensus” was that the optimal strategy seemed to be unique (up to unimportant local rotations), consisting in measuring an EPR pair using the specific measurements determined in Exercise 3. Ekert then suggested using the following protocol: Alice prepares EPR pairs in her lab, and sends half of each pair to Bob. Both users select a random subset of their EPR pairs on which they perform optimal measurements for the violation of (2). Provided they observe a value close to , they are able to guarantee that no adversary has tampered with the half-EPR pairs as they were flying from Alice to Bob: if this was the case then the violation would be further reduced. Once this has been established the users can measure their respective halves of the remaining EPR pairs using the same basis and thereby generate perfectly correlated random bits that will constitute their key.
The main idea is thus that by verifying the presence of certain strong correlations between their systems the honest users can verify that these systems are close enough to being an EPR pair, which, being a pure state, is un-correlated with any third system, and in particular with the adversary’s. Making this intuition precise is another matter, and required large amounts of work. In the course of establishing security it was realized that Ekert’s protocol could be thought of as a “purified” version of the Bennett-Brassard ’84 protocol, so that security of either implies security of the other. Many proofs of security — by Mayers, Shor and Preskill, Renner, Koashi among others — are now known. I am not too familiar with all of them, but my understanding is that they bypass the necessity of making the above-mentioned statement (maximal violation of (2) implies EPR pair) precise and rather establish directly the privacy of the generated bits. For this each proof approach required the development of new information-theoretic tools — uncertainty relations, CSS codes, quantum conditional min-entropy — which went on to play an important role of their own in quantum information theory.
The application of these tools, however, systematically relies on at least some amount of a priori characterization of the quantum systems generated by Alice, e.g. that they have local dimension , as should be the case of an EPR pair. While this may seem like a reasonable assumption (after all, Alice prepares the EPR pairs herself, so she should know what she is doing; it is only when the half-pairs go to Bob that the adversary can potentially interfere with them), it is not always easy to justify in practice: how does Alice know that her equipment indeed generates EPR pairs, and not some other system with additional degrees of freedom? Doing away with that assumption is at the heart of the challenge of device independence that we turn to next.
2.3. Device-independent key distribution
The intuition expressed in Ekert’s paper — that near-optimal violation of the CHSH inequality (2) could serve as a test for the presence of an EPR pair — was first made precise by Mayers and Yao in 1998. Mayers and Yao devised a simple test that Alice could perform within her own laboratory and which guaranteed that, not only whatever quantum state she produces, but also whatever the measurements she actually performs on that state, provided the correct statistics are observed then the state must be equivalent to an EPR pair. Note that the strength of the statement is that it does not rely on knowledge of either the state or the measurements themselves; if the measurements are trusted then basic tomography can be performed. The only assumptions required for the Mayers-Yao test to be sound are that quantum mechanics is correct and that Alice has a way of enforcing the no-signaling condition between the two halves of her system for the duration of the test.
Unfortunately the precise technical statement proved by Mayers and Yao is not robust: if the observed statistics do not exactly reproduce the expected ones then there are no longer any guarantees. More recent work successfully devised increasingly robust variants of the Mayers-Yao test, and we will explore them further below. These, however, came only much later, and for a few years the possibility of achieving device independence — key distribution protocols whose security can be established without any a priori assumptions on the quality or even trustworthiness of the quantum devices used — was left in a limbo; it seemed like the heavy technical machinery required was just not available yet.
The breakthrough came in work of Barret, Hardy and Kent (BHK) in 2005. The first observation made by BHK is that at its most basic level privacy of the users’ bits in an Ekert-like protocol can be stated as a very simple constraint on the kinds of tripartite distributions that are in principle allowed by quantum mechanics (or even, in their case, that are solely constrained by the no-signaling condition). Indeed, and putting aside the issue of any public communication between the users for a moment, privacy is simply the requirement that, for any set of distributions that could model the users’ interaction with their systems in the protocol — here represent Alice and Bob’s respective choice of inputs (measurement bases), the corresponding outcomes, and any outcome produced by an arbitrary third system, held by the adversary — it should be the case that is uncorrelated with or , whenever these bits are used as part of the key. That is, conditioned on any fixed (having non-negligible probability of being produced) it should still be the case that both and are close to uniformly distributed, and in particular cannot be predicted from .
BHK’s main insight is that privacy can be enforced automatically as a consequence of, (i) the observation of a sufficiently large violation of a specific Bell inequality between Alice and Bob’s systems, and (ii) the respect of the no-signaling constraint between any of the three systems. Note how unique this is: if we ignore condition (i), then nothing prevents that any correlation between Alice and Bob’s outputs (correlations which are required for them to obtain identical keys) would be solely due to a source of shared randomness that is part of the a priori description of their joint system; but if this is the case then the eavesdropper may as well have access to her own copy of , enabling her to reproduce the same correlations. In other words, in a classical world, nothing prevents the eavesdropper Eve from having a perfect copy of Bob’s system, in which case if Bob can share a key with Alice, so can Eve!
The fact that the additional assumption (i) can serve as a guarantee of the independence of or from is an instance of a more general phenomenon called monogamy. For the case of quantum states, monogamy can be understood as follows (see also this entertaining early overview by Terhal): there exists bipartite states that do not have a symmetric extension with respect to the system . In other words, such that there does not exist a state such that both reduced densities : system in cannot be copied. Monogamy in the general setting of no-signaling distributions was first put forward in a paper by Barrett et al. which appeared right before BHK; the latter built upon that paper by exhibiting a specific Bell inequality for which reasonable security guarantees could be obtained from the combination of (i) and (ii). The result is a protocol for key distribution whose security rests on a single assumption: that no signaling occurs between Alice, Bob, and the adversary’s respective laboratories — arguably the weakest possible condition for key distribution to be possible.
The BHK result opened the way for a flurry of works achieving improvements in terms of efficiency and security guarantees. A single major drawback of the BHK security proof nevertheless remained open for some time: security was only proven under the additional assumption that each use of their devices by Alice and Bob is independent, i.e. there is no signaling back and forth between successive uses (and in particular the devices are memoryless). At a basic level this assumption justifies the estimation of the Bell inequality violation by the devices through sequentially repeated measurements, but it is also crucial in reducing “general” to “collective” attacks of the adversary. This past year there were three independent proofs of security of device-independent key distribution protocols that removed this assumption, by Reichardt et al., Barrett et al., and Vazirani and myself. All three are based on very different approaches and I hope we can look at some of them over the course of the reading group.
2.4. Testing quantum mechanics
Can one test quantum mechanics? This fascinating question deserves (at least) a blog post by itself, and I will not delve on it here (see this paper by Aharonov and Vazirani for a stimulating introduction). Rather, let me simply indicate how another unique feature of entanglement, the rigidity to which I already alluded to above, can be used to control quantum systems by purely classical means (and thus presumably verify that they are accomplishing a prescribed quantum mechanical task).
We already saw that a bipartite system producing correlations yielding a value close to in (2) all but behaves like an EPR pair, in the sense that it could be used to generate identical random bits, guaranteed to be independent from any output generated by an arbitrary third system. Assume further that quantum mechanics is correct: can we make the strong conclusion that the system is close to an EPR pair? The answer is yes, and it was first proven by Mayers and Yao in the exact case and by Magniez et al. for the case when some error is allowed (technically, both results require a slightly extended variant of the CHSH inequality (2)). Very recently, Reichardt et al. succeeded in establishing an even stronger result: suppose the users sequentially repeat the same experiment with their devices a large number of times. Then among the interactions, there is at least a polynomial (e.g. ) fraction in which the devices must have been measuring independent EPR pairs. That is, at the start of the experiment, the devices’ state necessarily included a tensor product of a large number of EPR pairs.
That this is such a strong statement can be readily seen from the range of its possible applications. In particular, Reichardt et al. devise a scheme by which a classical polynomial-time “verifier” can coerce two arbitrarily malicious (but non-communicating) quantum “provers” into executing a quantum circuit of the users’ choice; and this is so even if the provers are infinitely powerful (but still bound by the laws of quantum mechanics). I will not go into the details, but the main idea is simple: by executing many tests for the violation of (2) by the provers, the verifier effectively checks that they share a large number of EPR pairs, and further perform measurements on their respective halves that are equivalent to the measurements prescribed for an optimal violation. Once such a handle is gained on the provers’ actions, it can be bootstrapped to force one prover to “check” the other prover’s actions by performing tomography. For this the verifier is leveraging the fact that either prover never knows whether it is being tested as part of a CHSH test, or whether it is actually executing some useful step of the verifier’s circuit: the tests force the provers to always stay on their guards.
An interesting question here is that of the minimal assumptions required in order to achieve such “delegated quantum computing”. Reichard et al. achieve it with two non-communicating provers. Previous work by Aharonov et al. showed how to do it with a single quantum prover, provided the verifier had the ability to store a constant number of qubits. Broadbent et al. improved this to a verifier whose only quantum capability is the ability to prepare one of a constant number of possible single-qubit states, and send them to the prover (in particular, only a single qubit needs to be stored at any time). What are the minimal assumptions required for the delegation, or verification, of specific quantum tasks executed by a capable but possibly malicious prover?