Foundations of randomness, Day 3

Last day! Wednesday started with a bang. Yevgeniy Dodis spent the whole Tuesday evening (our workshop dinner at Lanzerac!) complaining that we weren’t leaving him enough time to prepare for his talk…only to spend an equal amount of time during his talk complaining that he wasn’t going to be able to cover half of the material he had prepared! (To view things positively, this says something about the level of interaction during the talks — Yevgeniy was flooded with questions; quite a show of force for a post-workshop dinner morning.)

Dodis – Randomness in cryptography

I regret not having saved a picture of the table Yevgeniy drew on the whiteboard for his talk. There were four columns: no randomness, weak randomness, local or public randomness, and perfect randomness. For each column Yevgeniy listed four cryptographic scenarios: soundness (for which classes of computational problems are there sound interactive proof systems), authentication (secure implementation of primitives such as message authentication or digital signatures), privacy (tasks such as bit commitment, secret sharing, or even public- or private-key encryption) and extraction (is deterministic extraction of uniform randomness possible). The scenarios are listed in increasing order of difficulty, and the question is how far each kind of randomness lets us go down the list.

The first and last columns are straightforward to fill in. In the first, corresponding to a world with no randomness, there is very little one can do. In terms of soundness it is possible to verify languages in NP, but no more; cryptographic tasks are clearly impossible (determinism does not allow for uncertainty — at least for this talk!), and certainly extraction does not make much sense. If, on the other hand, perfect randomness is available many feats can be accomplished. It is a milestone of complexity theory that all languages in PSPACE have interactive proof systems; the hard work of cryptographers over the past couple decades demonstrates an impressive range of cryptographic tasks that can be implemented with or without computational assumptions. And of course extraction is not a problem.

Things become interesting with the second column. Suppose we are given access to a source of randomness taken from a family of sources which have a guaranteed min-entropy rate, but from which it is not possible to deterministically extract uniformly random bits (such as Santha-Vazirani sources). Can such a source still be used to implement non-trivial cryptographic tasks?
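To make the impossibility of deterministic extraction from Santha-Vazirani sources concrete, here is a minimal sketch of the standard textbook argument (not something shown in the talk). It assumes the convention that a {\delta}-SV source is one in which every bit, conditioned on any prefix, equals 0 with probability in {[1/2-\delta, 1/2+\delta]}. Given any candidate one-bit extractor {f}, the hypothetical helper `biased_sv_source` puts slightly more weight on a set of {2^{n-1}} inputs that {f} maps to its more popular output; the result is still a valid SV source, yet {f} is biased by at least {\delta} on it.

```python
from fractions import Fraction
from itertools import product


def biased_sv_source(f, n, delta):
    """Distribution on {0,1}^n (dict: tuple -> probability) that is a
    delta-SV source on which the output of f is biased by at least delta."""
    points = list(product((0, 1), repeat=n))
    zeros = [x for x in points if f(x) == 0]
    ones = [x for x in points if f(x) == 1]
    # The more popular output value has at least 2^(n-1) preimages;
    # keep exactly 2^(n-1) of them as the "heavy" set.
    majority = zeros if len(zeros) >= len(ones) else ones
    heavy = set(majority[: 2 ** (n - 1)])
    hi = (1 + 2 * delta) / 2 ** n
    lo = (1 - 2 * delta) / 2 ** n
    return {x: (hi if x in heavy else lo) for x in points}


def is_sv_source(dist, n, delta):
    """Brute-force check that each conditional next-bit probability
    lies in [1/2 - delta, 1/2 + delta]."""
    half = Fraction(1, 2)
    for i in range(n):
        for prefix in product((0, 1), repeat=i):
            mass = [
                sum(p for x, p in dist.items() if x[: i + 1] == prefix + (b,))
                for b in (0, 1)
            ]
            cond = mass[0] / (mass[0] + mass[1])
            if not (half - delta <= cond <= half + delta):
                return False
    return True


def output_bias(f, dist):
    """Distance of Pr[f(X) = 0] from 1/2 when X is drawn from dist."""
    p0 = sum(p for x, p in dist.items() if f(x) == 0)
    return abs(p0 - Fraction(1, 2))


if __name__ == "__main__":
    delta = Fraction(1, 10)
    parity = lambda x: sum(x) % 2
    source = biased_sv_source(parity, 4, delta)
    print(is_sv_source(source, 4, delta), output_bias(parity, source))
```

Exact rational arithmetic is used so that the SV condition can be checked without floating-point slack. The brute-force check is exponential in {n} and only meant for toy sizes.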

Here Yevgeniy drew our attention to an interesting distinction between authentication and privacy. Indeed, the latter is much more demanding in terms of the “quality” of the randomness used. For an adversary to break an authentication task, such as a message authentication code (MAC), it is required that the adversary be able to forge a certain object — e.g. a fake signature for an unknown message for which another valid signature is known. In contrast, for the adversary to break a privacy task, such as encryption, it is sufficient for the adversary to distinguish between two distributions — encryptions of {0} and encryptions of {1}. Thus at a high level it is not too surprising that it should be possible to implement authentication tasks in a way that is secure as long as the source of randomness used to implement the task has a guaranteed min-entropy rate. Privacy tasks, on the other hand, seem to require much more structure from the random source. In fact, if one thinks about it the task of secure encryption is very closely related to that of extraction, and one may readily expect that privacy can only be achieved if the family of sources allows for deterministic extraction. This connection is made formal in a paper by Bosley and Dodis.

Returning to his second column, Yevgeniy introduced the notion of a {(t,\delta)}-expressive family of sources, which informally is a class of sources such that for any two efficiently computable functions {f} and {g}, if {f} and {g} differ on a significant (at least {2^{-t}}) fraction of inputs (measured under the uniform distribution) then there is a source in the family such that the expected statistical distance, when evaluated on that source, between {f} and {g} is at least {\delta}. That is, a family is expressive if it “separates” pairs of functions that differ significantly. Now the main results are:

  1. Many common families of sources are expressive — including all families from which deterministic extraction is possible, but also e.g. Santha-Vazirani sources and various generalizations thereof.
  2. No privacy task can be achieved if the only available randomness is taken from an (a priori unknown) source from an expressive family. Proof sketch: consider an encryption scheme. Of course encryptions of {0} and encryptions of {1} should be different most of the time (otherwise decryption will fail). Therefore by the definition of a {(t,\delta)}-expressive family, for say {t=0} or {t=1}, there should be a source in the family under which both types of encryption can be efficiently distinguished — contradicting security.
  3. Some authentication tasks, including MACs and digital signature schemes, can be implemented from some specific expressive families, such as Santha-Vazirani sources or their generalization block min-entropy sources.

The proof of the second point above is trivial, but the result is quite interesting, highlighting the importance of making the right definitions! I am not familiar at all with this area, but it seems that the notion of expressive family of sources is a meaningful one to have in mind. Now I only wish Yevgeniy had time to say something about the third column (public vs private randomness)…unfortunately it will have to be for another time.

Fawzi – Exams

The following talk was given by Omar Fawzi, soberly entitled “Exams” (last day…it had to come!). Omar considers the following scenario. Suppose an exam consists of {n} possible questions {X_i}, and a student has some information, or knowledge, {B} based on which she attempts to answer the questions that are put to her. Suppose the student achieves a certain score when asked about a random subset of {k} of the {n} possible questions — technically, given indices {\{i_1,\ldots,i_k\}} the student produces guesses {A_{i_1}(B),\ldots,A_{i_k}(B)} such that {A_{i_j}(B) = X_{i_j}} for a fraction {\Delta} of indices {j\in\{1,\ldots,k\}}, on average over the (uniformly random) choice of the {k} indices and the student’s randomness (including the random variable {B} and the map {A}).

Can we deduce that the student simultaneously “knows about” a fraction {\Delta} of all possible questions in the exam — if she was asked about all {n} possible questions at once, would she typically achieve the same score as when asked only {k} of them? To make this a little more precise and see why it is not obvious, consider the following simple scenario. The correct answers {X_i} are either all {0}, or all {1}, each with probability {1/2}. The student’s information {B} consists of two bits; the first bit {B_1 = X_i} (for all {i}), and the second bit {B_2} is an independent coin flip. The student’s strategy is to always answer {B_1\oplus B_2}. Note that this student is not “optimal”, as she has “good days” ({B_2=0}) on which she gets everything right, and “bad days” on which she scores {0}. Our task is not to extract an optimal student, but to reproduce this particular student’s behavior at the scale of the whole exam: we want a strategy that gets the whole exam right half the days, and the whole exam wrong the remaining time — not a strategy that gets half the exam right on every day! So you see that the most natural “rounding” procedure, which would decompose the set of {n} questions into {(n/k)} groups of {k} and look up the student’s answers to each group, will not work. One needs to take into account the randomness in {B} and in the student’s strategy to extract a global strategy that is consistent with the local behavior.
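The gap between the student and the naive rounding procedure is easy to see in a small simulation. The sketch below is only my reading of the example: the function names are made up, and I model "looking up the student's answers to each group" as drawing a fresh coin {B_2} for each group of {k} questions while the answer key stays fixed for the day.

```python
import random


def true_student_score(n, rng):
    """Score of the actual student when asked all n questions on one day."""
    x = rng.randrange(2)        # all correct answers are equal to x
    b1 = x                      # first bit of B: the correct answer
    b2 = rng.randrange(2)       # second bit of B: good day (0) or bad day (1)
    answers = [b1 ^ b2] * n     # the student always answers B1 xor B2
    return sum(a == x for a in answers) / n


def naive_rounding_score(n, k, rng):
    """Score of the naive global strategy: split the n questions into n/k
    groups and answer each group with an independent run of the student
    (assumption: a fresh B2 per group, same answer key x)."""
    x = rng.randrange(2)
    correct = 0
    for _ in range(n // k):
        b2 = rng.randrange(2)
        correct += k * ((x ^ b2) == x)
    return correct / n


if __name__ == "__main__":
    rng = random.Random(2015)
    days = 1000
    student = [true_student_score(1000, rng) for _ in range(days)]
    rounded = [naive_rounding_score(1000, 10, rng) for _ in range(days)]
    # Both strategies have average score close to 1/2 ...
    print(sum(student) / days, sum(rounded) / days)
    # ... but the student's daily score is always 0 or 1, while the
    # rounded strategy hovers around 1/2 on every single day.
    print(sorted(set(student)), min(rounded), max(rounded))
```

Both strategies reproduce the local average score, but only the first reproduces the student's all-or-nothing behavior on the whole exam, which is the property a good global strategy has to preserve.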

The scenario is reminiscent of what are known as “direct product tests” in the PCP literature, and there is a long sequence of works on the topic, see e.g. Dinur and Reingold, Impagliazzo et al. and Dinur and Steurer. As far as I could tell there are differences that make the problems incomparable: for instance, direct product tests attempt to match two consistent local views to a global object, whereas here the exam is guaranteed to pre-exist; direct product tests consider complete agreement instead of fractional agreement {\Delta} here; etc. But the problems are similar enough that you’d expect techniques used to approach the one could be useful for the other as well. For anyone interested, see Omar’s slides for a more precise formulation of the problem he considers.

The situation is even more interesting in the quantum setting. Suppose the exam remains classical (though one could consider quantum exams as well…), but the student’s information {B} is a quantum state. Then all we know is that for every set of {k} questions the student has a measurement on {B} that produces answers with a certain success rate. Can these measurements be combined in a way to provide answers to all questions in the exam simultaneously? Since any measurement will necessarily affect the student’s state we can’t simply repeatedly measure {B}, and monogamy prevents us from taking many copies of the state (which couldn’t all simultaneously be correlated with {X} in the appropriate way). The problem has a very similar flavor to the one faced when proving security of randomness extractors against quantum side information. Again there are subtle differences, but one may expect some of the techniques developed in that area (such as the work of Koenig and Renner on sampling-based extractors) to be useful here.


The day ended with two afternoon talks each highlighting a number of open questions in device-independent randomness certification, and some headway that the speakers have recently made. This left us with a solid set of take-home problems to think about!

Pironio – Some open questions on DI randomness generation

The first talk was given by Stefano Pironio, who picked three of his favorite open questions in device-independent randomness, and had time to tell us about two of them in some detail.

Stefano’s first question asks whether it is possible to construct deterministic extractors for quantum sources. Although it is unclear whether this would have any application, it is a very natural question. The main observation is that the “raw” random bits that are produced in a device-independent randomness extraction procedure (say, the outputs of Alice’s device in a protocol which sequentially repeats the CHSH game) not only have a guaranteed min-entropy rate, but they also have additional structure that comes from the way they are produced. What this structure is, and whether it is sufficient to guarantee deterministic extraction, is precisely the content of the problem. Stefano showed us a specific inequality that he could show always holds, and hoped would suffice: if {A_i\in\{-1,1\}} is the {i}-th bit output by the source, then the average of all correlators satisfies

\displaystyle \frac{1}{2^n}\Big| \,\sum_{S\subseteq [n]}\, \big|\langle \Pi_{i\in S} A_i \rangle\big|\, \Big| \,\leq\, 2^{-\delta n},

for some small {\delta} depending on the CHSH violation observed. In fact Stefano claimed the inequality is already enough to show, via the probabilistic method, that a deterministic extractor should exist; thus it remains to find one…This is an appealing question, and I wouldn’t be surprised if it can be solved by the pseudorandomness experts out there!
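To get a feel for the quantity on the left-hand side, here is a small brute-force sketch that evaluates it for explicit classical distributions on {\{-1,1\}^n}. It says nothing about quantum devices or CHSH violations; the helper names and the toy sources are my own. For {n} independent bits with {\langle A_i\rangle = \varepsilon} each correlator equals {\varepsilon^{|S|}}, so the average is {((1+\varepsilon)/2)^n}, which decays exponentially, whereas a deterministic source gives the maximal value {1}.

```python
from itertools import product
from math import prod


def average_abs_correlator(dist, n):
    """2^-n times the sum over subsets S of [n] of |E[prod_{i in S} A_i]|,
    for a distribution given as a dict from tuples in {-1,+1}^n to
    probabilities."""
    total = 0.0
    for mask in product((0, 1), repeat=n):      # mask[i] == 1 iff i is in S
        corr = sum(
            p * prod(a for a, m in zip(x, mask) if m)
            for x, p in dist.items()
        )
        total += abs(corr)
    return total / 2 ** n


def product_source(eps, n):
    """n independent bits, each with expectation eps."""
    return {
        x: prod((1 + eps * a) / 2 for a in x)
        for x in product((-1, 1), repeat=n)
    }


if __name__ == "__main__":
    n = 3
    deterministic = {(1, -1, 1): 1.0}
    print(average_abs_correlator(product_source(0.0, n), n))   # uniform bits
    print(average_abs_correlator(product_source(0.2, n), n))   # slightly biased
    print(average_abs_correlator(deterministic, n))            # no randomness
```

The enumeration over all {2^n} subsets is of course only feasible for very small {n}.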

The second problem Stefano presented has to do with the relation between randomness and entanglement. Suppose that the devices used in a randomness expansion procedure share {n} “ebits”, or entangled bits. How many bits of randomness can be generated? This is both of theoretical and practical interest: entanglement is hard to come by, and of course we want to get the most out of whatever we have. All known protocols generate {O(n)} random bits from {n} ebits, but there is no reason for this to be a hard limit; intuitively entanglement is required for testing the devices, but less directly so for generating randomness (see also the next talk by Acin). And indeed, Stefano showed us that it is possible to go further: he has a protocol which generates {n^{1+c}} random bits, for some small {c}, using {n} ebits.

The last question, which we unfortunately did not have time to go into, is that of obtaining guarantees on the randomness that hold against “post-quantum” adversaries, limited only by the no-signaling principle. This is a fascinating problem on which very little is known. There are no-go results for privacy amplification in this setting, but they only apply in rather stringent scenarios; moreover they pose no immediate obstacle to guaranteeing entropy, rather than uniform randomness. The problem is challenging because the general non-signaling framework gives us very little to work with: a family of conditional distributions, but no such thing as a state, a measurement, a post-measurement state, etc.; very few tools are available. On the other hand it makes the problem mathematically very clean, almost “classical” (no more Hilbert spaces, no more quantum!). I only wish I had more intuition for it!

Acin – How much randomness can be certified in quantum and general non-signalling theories?

In the last talk of the workshop Antonio Acin asked the question — “How much randomness can be certified in quantum and general non-signaling theories?” — and gave us some rather counter-intuitive answers. As already mentioned in Stefano’s talk, the prevailing intuition is that the amount of randomness that can be extracted is directly related to the amount of “quantumness” present in the devices, where “quantumness” would be directly related to entanglement, since entanglement is clearly the “non-classical” feature exploited here. But this need not be so: even though entanglement is necessary for certifying the device, it is not a priori required that it serve as a resource that gets depleted as more and more randomness gets produced.

As a first result Toni showed how more randomness could be extracted from a single measurement, on a single entangled EPR pair, by considering general POVM measurements instead of projective ones. It is known (and not hard to show) that making projective measurements on an EPR pair can only yield at most one bit of randomness per site. Toni introduced a four-outcome Bell inequality such that a maximal violation of the inequality requires the use of a non-projective four-outcome measurement on one half of the EPR pair, and a state such that each of the four outcomes has equal probability {1/4} when the measurement is performed — thus yielding two uniformly random bits from a single qubit!
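For readers who have not seen a four-outcome qubit measurement before, the sketch below builds one and checks the outcome statistics. I do not know which measurement Toni's inequality singles out; the tetrahedral POVM used here is only an assumed, standard example of a non-projective four-outcome qubit measurement. Note also that the calculation only shows that the four outcomes are equally likely on half of an EPR pair; certifying that they are unpredictable to an adversary is exactly what the Bell inequality is needed for.

```python
from math import log2, sqrt


def povm_element(bloch):
    """(1/4) * (I + n . sigma) for a unit Bloch vector n = (nx, ny, nz)."""
    nx, ny, nz = bloch
    return [
        [(1 + nz) / 4, (nx - 1j * ny) / 4],
        [(nx + 1j * ny) / 4, (1 - nz) / 4],
    ]


def trace_product(a, b):
    """Tr(a b) for 2x2 matrices given as nested lists."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))


S = 1 / sqrt(3)
# Four Bloch vectors pointing to the vertices of a regular tetrahedron.
TETRAHEDRON = [(S, S, S), (S, -S, -S), (-S, S, -S), (-S, -S, S)]
POVM = [povm_element(n) for n in TETRAHEDRON]

# Reduced state of one half of an EPR pair: the maximally mixed qubit.
RHO = [[0.5, 0.0], [0.0, 0.5]]

# Completeness: the four elements sum to the identity.
TOTAL = [[sum(e[i][j] for e in POVM) for j in range(2)] for i in range(2)]

PROBS = [trace_product(e, RHO).real for e in POVM]
ENTROPY = -sum(p * log2(p) for p in PROBS)

if __name__ == "__main__":
    print(PROBS)     # four outcomes, each with probability (close to) 1/4
    print(ENTROPY)   # (close to) 2 bits
```

Each element has trace {1/2}, so none of them is a projector, which is what allows four outcomes on a two-dimensional system.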

This is unexpected, but perhaps not groundbreaking: one bit, two bits, ok. But then Toni dramatically scaled things up. His second result is that by performing repeated measurements on a single, constant-size entangled state (I think in his scheme this is again just a pair of qubits) one could generate an infinite sequence of random bits, containing an amount of entropy which certifiably goes to infinity — provided the requisite Bell violation is guaranteed.

If you’re wondering about the relationship between this result and the one presented in the talk by Henry Yuen on infinite randomness expansion, there are some important differences. First of all Toni’s result applies under the assumption that the device maximally violates a certain Bell inequality, but there are no bounds on the accuracy required: thus in principle testing for a sufficiently-close-to-optimal violation could require an arbitrarily large number of runs; in Henry’s scheme the testing is fully accounted for. On the other hand Henry’s scheme requires the devices to have an unbounded number of EPR pairs available, whereas the main point of Toni’s scheme is that a single pair can be re-used over and over again.

The result is a proof of concept (another limitation is that the Bell inequality requires performing a measurement on one of the devices that has an exponential, in the number of random bits produced, number of possible settings — the other device being measured sequentially using measurements with two settings), but I again find it rather unexpected. The whole area of device independence is turning out to be a great testbed for many of our (bad) intuitions about quantum mechanics and nonlocality!

Toni ended his talk with a third intriguing result, relating algorithmic randomness to information-theoretic randomness as we usually understand it. The main observation is the following. Suppose a classical Turing machine is used to prepare a sequence of qubits where each qubit is either initialized in one of the two possible computational basis states, or each qubit is initialized in one of the two possible basis states in the conjugate basis, the Hadamard basis. Then there is an algorithmic procedure that always succeeds in distinguishing which is the case! Note that the Turing machine may very well be producing a sequence that for all practical matters appears fully random, so that one can for instance show that no procedure which makes a single choice of basis and measures all qubits in that basis will be able to tell if it got the right basis or not. Instead the trick is to use both bases: measure half the qubits in the computational basis, and half in the Hadamard basis. In this case, one of the sequences obtained will be uniformly random for sure, and the other will be the sequence produced by the Turing machine — and as it turns out, it is possible to distinguish the two algorithmically. See the paper for more details, and further results along those lines.


The day ended with a very brief “open problems session”: Toni wrote down some of the questions that were raised during the workshop and that we all committed to solving by the next iteration. I’ll leave you with a picture of the board. There are certainly many other questions we could ask, some of which are interspersed in these blog posts; feel free to suggest more!


Foundations of randomness, Day 2

Back to the fundamentals of randomness and our workshop in Stellenbosch – I want to discuss the remaining two days of talks!

The second day started off with a whirlwind tour of the TCS approach to pseudo-randomness by David Zuckerman. David surveyed known constructions of pseudo-random generators and extractors; he also gave a very comprehensive overview of the different types of sources that have been considered in this area and what kind of extraction (deterministic, seeded, two-source) is known to be possible (or not) for each type of source.

I’ll refer you to David’s slides for details. Two particular sources caught my attention. First David mentioned small-space sources, for which there are good deterministic extractors but so far only in the regime of high (polynomial) min-entropy. This type of source seems quite relevant in practice, and it is natural to ask whether it is possible to do better using device-independent quantum extraction. How could we leverage the fact that the source is generated using bounded space? A second interesting category of sources are non-oblivious bit-fixing sources, which are sources such that some of the bits are uniformly random, and others deterministic, but such that which bits are of which type can be chosen adversarially by the source based on the values taken by previous bits. Together with Santha-Vazirani sources these are the two examples David gave of natural sources from which deterministic extraction is not possible. But we do know that it is possible to extract from a single Santha-Vazirani source, of bias arbitrarily close to {1/2}, using a device-independent quantum procedure. So what about non-oblivious bit-fixing sources — can it be done there as well?

Work on device-independent randomness extraction from the quantum information community has so far focused on a small set of sources — uniform seeds of limited length, arbitrary Santha-Vazirani sources, and more recently general min-entropy sources — but I see no reason why these are the only cases of interest; the restricted focus seems to be mostly for historical reasons. In pseudo-randomness different types of sources are often motivated by applications to derandomization, lower bounds, or combinatorial constructions. In quantum information, rather than “beating the classical” we should focus on the scenarios that are best motivated by the relevant applications, which so far come mostly from cryptography. Here device-independent randomness certification procedures seem most relevant in their role as “decouplers” (see the discussion on “random to whom” from Valerio’s talk). What kind of structure are we willing to assume on the kind of information an adversary may have kept on a particular source? This question seems particularly relevant in light of recent works on device-independent two-source extraction.

David’s talk was followed by a talk by one of his students, Eshan Chattopadhyay, on their recent breakthrough construction of a two-source extractor for poly-logarithmic min-entropy. It is hard to do any justice to such an intricate (and beautiful!) result, and instead I will point you to Eshan’s slides or the great talk that David gave on the same topic on TCS+ just a few weeks ago.

For the last talk of the morning session Henry Yuen treated us to an (arguably :-)) even more formidable treat: infinite randomness expansion! Henry presented the main steps that led him and Matthew Coudron to their beautiful result showing how, starting from just a few uniformly random bits, one could bootstrap a quantum device-independent expansion procedure and generate as many uniformly random bits as one desires — yes, that’s an {\infty}! (I realize a bit late that I never properly explained what “device-independent” even means… if I have any “classical” readers left, for context for Henry’s talk see a blog post of Henry’s for all the required background; for the quantum crypto motivation see the viewpoint by Roger Colbeck; for the computer scientist perhaps the introduction to this paper is readable.)

In Henry and Matt’s protocol the length of the initial seed only needs to be at least some universal constant. The number of random bits available to start with will govern the security parameter (the distance from uniform) of the bits produced, but aside from that any small initial seed will do. The protocol uses two pairs of devices (those can be arbitrarily correlated, e.g. share randomness or quantum entanglement, but are not allowed to communicate directly) which take turns in expanding the current string of bits by an exponential factor. The key observation required to argue that the protocol works is that each expansion step not only increases the number of bits available, but simultaneously “purifies” them, in the sense that the bits produced will have high entropy even conditioned on any information, classical or quantum, in the possession of the other pair of devices. Given that this other pair generated the input bits used for the expansion in the first place this is not a trivial observation, and is essentially the content of the powerful “equivalence lemma” of Chung, Shi and Wu discussed in the post describing Valerio Scarani’s talk. The lemma guarantees that the bits produced by a device-independent procedure are random even from the point of view of an adversary who shares arbitrary correlations with the seed (including knowing it perfectly).

The result is quite impressive: if a few uniformly random bits are available, then it is automatically guaranteed that infinitely many just-as-good (in some respects even better) random bits can be generated! So much for our derandomization problems… The next step is to ask if uniformly random bits are even needed: how about weaker sources of randomness, such as Santha-Vazirani sources or general min-entropy sources? The task of device-independent extraction from the former was taken on by Colbeck and Renner (with much follow-up work); for the latter there is the work of Chung, Shi and Wu, which unfortunately still requires a number of devices that scales with the length of the source as well as the security parameter. More to be done!

The afternoon brought two more talks, with a very different focus but equally stimulating. The first was delivered by Ruediger Schack, who presented “The QBist perspective on randomness and laws of nature”. Ruediger’s talk followed the one by Carl Hoefer on the previous day in challenging our attitudes towards randomness and the meaning of probabilities, but Ruediger took us in quite a different direction. One of the points he made is that Bell’s theorem can be interpreted as giving us the following mutually exclusive alternatives about the world: either locality does not hold, and we must accept that far-away objects can have nonlocal (instantaneous) influences on one another, or we insist on keeping locality — as Einstein did — but ascribe a different “meaning” to probabilistic statements, or predictions, that arise from the Born rule. (Note that in the first option in “nonlocal influence” I am including the kind of non-signaling “influence” which arises from measuring half of an entangled pair.)

This is where QBism — for “Quantum Bayesianism” — comes in. According to Ruediger, the solution it offers to this conundrum goes through the assertion that, contrary to e.g. the Copenhagen interpretation, “probabilities are not determined by real properties”. This allows one to “restore locality” while not having recourse to local hidden variables (as Bohmian mechanics would) either. Thus in QBism “there is no Born rule”, in the sense that the rule does not describe an intrinsic probabilistic fact about the world; rather, probabilities are seen as the reflection of an agent’s subjective belief and have no objective existence. The fact that, if Alice measures her half of an EPR pair in the computational basis and obtains a {|0\rangle}, then Bob will with certainty observe a {|0\rangle} as well, if he measures in the same basis, is not a “fact” about their systems; rather it is a prescription for how a rational Bob should update his belief as to the outcomes of an experiment he could perform, were he to be informed of the situation at Alice’s side.

The main conceptual tool used to formulate this interpretation of probabilities is the Dutch book, used as a guide to how an agent (observer) should update its probabilities. I admit I find this interpretation a bit hard to digest — I cannot help but think of betting as a highly irrational procedure, and in general I would be hard-pressed to bet on any precise number or probability; the Born rule makes very precise predictions and ascribing them to an agent’s subjective belief feels like a bit of a stretch. Quite possibly it is also a problem of language, and I am not an expert on QBism (even the “usual” Bayesianism is far from obvious to me). Ruediger’s slides are very clear and give a good introduction for anyone interested in digging further.

The last talk of the day was given by Carl Miller, on “The extremes of quantum random number generation”. Carl explained how two very different principles enabled him and Yaoyun Shi to derive two independent lower bounds, each of interest in a separate regime, on the amount of randomness that can be generated from a pair of quantum devices in the sequential scenario: the devices are used repeatedly in sequence and one is interested in the min-entropy rate of one of the devices’ outputs, as a function of the average observed violation of (in this case) the CHSH inequality. As the violation increases to the maximum the rate will improve to 1, and the question is how high a rate can be maintained if the violation is bounded away from the optimum.

Carl’s first principle of measurement disturbance leads to a good rate in the high error regime, i.e. when the observed violation deviates significantly from optimum, say it is around {0.8} (for an optimum of {\approx .85}). The general idea is that a violation of the CHSH inequality necessarily implies the use of “non-classical” measurements, which do not commute (in fact a maximal violation requires anti-commuting measurements), by Alice’s device. Two non-commuting measurements do not have a common eigenstate, hence whatever the state of the device at least one of Alice’s measurements will perturb it — the outcome of that measurement on the state cannot be deterministic, and randomness is produced.
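The numbers quoted above can be checked directly. The following sketch evaluates the CHSH game for the usual textbook strategy (an EPR pair, Alice measuring {Z} or {X}, Bob measuring {(Z\pm X)/\sqrt{2}}); this is the standard optimal strategy, not anything specific to Carl and Yaoyun's analysis, and the helper names are mine. It also confirms that Alice's two observables anti-commute and that no deterministic classical strategy wins more than {3/4} of the time.

```python
from itertools import product
from math import sqrt


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a))
        for k in range(len(b))
    ]


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def expectation(op, state):
    """<state| op |state> for a real vector and a real matrix."""
    dim = len(state)
    return sum(state[r] * op[r][c] * state[c] for r in range(dim) for c in range(dim))


R = 1 / sqrt(2)
Z = [[1, 0], [0, -1]]
X = [[0, 1], [1, 0]]
ALICE = [Z, X]                      # Alice's observables for x = 0, 1
BOB = [[[R, R], [R, -R]],           # (Z + X)/sqrt(2) for y = 0
       [[R, -R], [-R, -R]]]         # (Z - X)/sqrt(2) for y = 1
PHI_PLUS = [R, 0.0, 0.0, R]         # (|00> + |11>)/sqrt(2)


def quantum_win_probability():
    total = 0.0
    for x, y in product((0, 1), repeat=2):
        corr = expectation(kron(ALICE[x], BOB[y]), PHI_PLUS)
        # The players win when a XOR b = x AND y, which happens with
        # probability (1 + (-1)^(x*y) * corr) / 2.
        total += (1 + (-1) ** (x * y) * corr) / 2
    return total / 4


def classical_win_probability():
    best = 0.0
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        a, b = (a0, a1), (b0, b1)
        wins = sum((a[x] ^ b[y]) == (x & y) for x, y in product((0, 1), repeat=2))
        best = max(best, wins / 4)
    return best


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    print(quantum_win_probability())    # cos^2(pi/8), roughly 0.85
    print(classical_win_probability())  # best deterministic strategy
    print(anticommutator(Z, X))         # zero matrix: Z and X anti-commute
```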

Of course this is just the intuition, and making it quantitative is more challenging. In their paper Carl and Yaoyun formalize it by introducing a new uncertainty relation (see this very recent survey on the topic for much more) expressed in terms of Renyi {\alpha}-entropy for {\alpha > 1}. The proof relies on strong convexity of the underlying Schatten-normed space of matrices, and Carl gave some geometric intuition for the uncertainty relation. The use of {\alpha}-Renyi entropy is important for the inductive step; it allows one to bypass the lack of good chain-rule like arguments for the min-entropy.

The second principle introduced in Carl’s talk is self-testing. This is the idea that a device demonstrating a close-to-optimal violation of the CHSH inequality must have a certain rigid structure: not only does it generate random bits, but the device’s outcomes must be produced in a very specific way. Specifically the device’s state and measurements must be equivalent, up to local isometries, to those used in the “canonical” optimal strategy for the CHSH game. This principle is well-known (see e.g. the paper by McKague for a proof), but usually one only expects it to be of use in the “tiny error” regime, i.e. when the observed violation is of order {\mathrm{opt} - \varepsilon} for small {\varepsilon}. Carl manages to squeeze the techniques further, and by carefully fine-tuning the argument is able to get good bounds on the randomness produced when the violation is large but still a constant away from optimum, say {0.83} or so. In that regime the bound is better than the one obtained from the first principle; I’ll refer you to Carl’s slides for the precise curves.

After Carl’s talk, the mind full of new ideas, we left for a little walk to the Lanzerac wine estate, where we were shown around the wine cellar — including the mandatory wine tasting of course, with the highlight being the local pinotage — and finished off the day with a nice dinner on the terrace: it’s late October in South Africa, summer is coming!


Nonlocal games and operator spaces: a survey

Over the past few months, together with Carlos Palazuelos I wrote a survey article on “Nonlocal Games and Operator Space Theory”, and we just uploaded it to the arXiv. The survey is an invited contribution to a special issue of JMP on Operator Algebras and Quantum Information Theory which will hopefully appear soon.

Exquisite(ly painful) memories

We had a lot of fun writing the survey. Teaming up with an expert in functional analysis behind probably more than half of the impressive range of results in quantum information theory that the “operator space approach” has so far led to gave me a great opportunity to solidify some of the mathematical material I have labored through in recent years.

It all started during a memorable summer of 2009 spent visiting Harry Buhrman’s group at CWI in Amsterdam. At the time the breakthrough by Perez-Garcia, Wolf, Palazuelos (yes, the same Palazuelos!), Villanueva and Junge demonstrating the existence of tripartite Bell inequalities with unbounded violation was all fresh. The result generated a lot of attention, first because it is a great and unexpected result (contrast with the fact that the maximum violation of bipartite inequalities is uniformly bounded, by Grothendieck’s constant), and second because the proof technique was completely novel, and seemed to rely crucially on “deep arguments from operator space theory”, a relatively young theory (barely younger than quantum information, but then time flows more slowly in mathematics) developed by operator algebraists starting with the work of Ruan in the late 1980s.

I spent probably a full month banging my head against the paper. After a month, I had gained a good grasp of just one thing: when they say deep arguments, they mean deep. Note the comment added for v2 of the paper on arXiv: “Substantial changes in the presentation to make the paper more accessible for a non-specialized reader”. You bet: that comment was already there in summer 2009, and I guarantee it didn’t make me feel any better! I printed the paper, borrowed all referenced books from the library, and tried. Then put it down, out of despair. Then tried again. Then put it down. Picked it up again. To hell with it, put it down, tried to find my own proof. Failed. Back to the paper. It was a wild chase…I kept thinking I was starting to see how to go about things, only to fail, deeply.

Cutting a long story short, we did get a couple of things out of that summer. The paper by Perez-Garcia et al. [PGWPVJ] covers a lot of ground. The main result is an existential proof of a family of tripartite Bell correlation inequalities, or equivalently a family of three-player XOR games, such that the maximum advantage of quantum players sharing entanglement over classical players using only shared randomness, as measured by the ratio of the corresponding maximum biases (bias = deviation of acceptance probability from {1/2}), is unbounded: it grows with the size of the games. A second result proved in [PGWPVJ] is that achieving any unbounded violation requires the players to share a relatively complex entangled state: GHZ states {|\psi\rangle= \frac{1}{\sqrt{d}} \sum_i |i\rangle|i\rangle|i\rangle} (of any dimension {d}), arguably a natural generalization of bipartite maximally entangled states, can only lead to a violation bounded by a universal constant.

After much suffering we were able to obtain generalizations and simplifications of both results — stripping out virtually all the operator space theory! First, with Jop Briet, Harry Buhrman and Troy Lee we extended the second result and showed that the maximum violation remains bounded for a larger class of states, including what we called “clique-wise entanglement” — combinations of EPR pairs, GHZ states, etc., as well as “Schmidt states”, states of the form {|\psi\rangle = \sum_i \alpha_i |i\rangle|i\rangle|i\rangle} for arbitrary coefficients {\alpha_i}. While this may seem innocuous, it turns out the extension to Schmidt states solves (via a reduction proved in [PGWPVJ]) an open question in operator algebras from the 1970s! The observation won us a fancy-sounding publication (“All Schatten spaces endowed with the Schur product are Q-algebras”) in a fancy-sounding journal (“Journal of Functional Analysis”). The result also has an interesting, completely classical consequence for the direct sum problem: it implies that there exists a three-player XOR game {G} whose bias is bounded away from {1} by a constant, but the bias of the XOR game obtained by taking the direct sum of {k} copies of {G} (the game is played {k} times in parallel and the parity of all answers is checked against the correct parity) remains bounded away from {0}, for all {k} — a strong counterexample to parallel repetition (for the direct sum) of three-player games.

Second (and with a couple years’ more effort), with Jop Briet we managed to re-derive the main result of [PGWPVJ], again with some improvements: while still probabilistic our construction is explicit (we give a simple distribution on games such that the large violation holds with high probability), and the dependence of the violation on the game size as well as on the entanglement dimension is greatly improved. Arguably our construction borrows its key ideas from whatever we could squeeze out of [PGWPVJ], but there are no more operator spaces! The fanciest tool used in the construction is a tail bound for second-order Gaussian chaos due to Latala.

The operator space connection

So, does operator space theory really provide a natural framework for the study of Bell inequalities, or does the heavy machinery systematically hide observations that could be obtained by easier means? While the range of results obtained by Perez-Garcia et al. using their techniques (including large violations for bipartite (non-XOR) inequalities and applications to quantum Shannon theory) seems to indicate that the tools are useful, it does not preclude that similar results could be obtained without having to resort to such high levels of obfuscation.

Since then I have grown more and more fond of the approach, to the point where I would in some cases agree that it provides the “right” point of view on a question about Bell inequalities (or nonlocal games) — a conclusion I would never have been able to reach without the invaluable help of David, Carlos, Marius, and my co-authors in these endeavors, Jop and Oded, whom I gratefully acknowledge! An example where I would say that the operator space viewpoint is undeniably the “right” one arises in work with Oded Regev on “Quantum XOR games” that grew out of our exploration of non-commutative extensions of Grothendieck’s inequality. For me a highlight of this work has been a derivation of the operator space Grothendieck inequality based on elementary techniques strongly inspired by the use of embezzlement states in quantum nonlocality.

These results, and many more, are described in the survey. While the JMP special issue is targeted at mathematicians, we tried to make the survey as self-contained as possible, and my hope is that it provides an easily approachable introduction to the delights of tensor norms, completely bounded maps and the associated “o.s.s.” for the interested quantum information theorist with little or no background in {C^*}-algebras or operator space theory.

An example

I’d like to end with a bird’s-eye view of the operator space point of view on entangled games by taking an example from the survey, an upper bound on the largest possible ratio between the maximum success probabilities achievable by quantum entangled and classical players respectively in a two-player game, as a function of the number of answers per player in the game. For a formal treatment I refer to Proposition 4.5 in the survey; here I will stay at the level of symbols, hoping to carry some of the flavor across, but please keep in mind that the exposition below is not fully accurate.

Games as tensors and values as norms

The functional analytic approach views a game as a tensor {G} on the space {({\mathbb C}^N\otimes {\mathbb C}^K)\otimes ({\mathbb C}^N\otimes {\mathbb C}^K)}. Here {N} and {K} are respectively the number of questions and answers to and from each player, and each coefficient {G_{x,y}^{a,b}} of the tensor corresponds to the associated payoff to the players in the game. The main thrust of the approach is that the different values we are interested in (the classical value, or maximum success probability of classical players in the game, and the entangled value, or maximum success probability of quantum players sharing entanglement) are directly related to the norm of {G} when the underlying spaces are normed in the appropriate way, as Banach spaces (for the classical value) or operator spaces (for the entangled value).

Let’s consider the classical value first. To capture it we’ll turn the vector space {({\mathbb C}^N\otimes {\mathbb C}^K)\otimes ({\mathbb C}^N\otimes {\mathbb C}^K)} into the Banach space {\ell_1^{N}(\ell_\infty^{K})\otimes_\epsilon \ell_1^{N}(\ell_\infty^{K})}. What does this mean? Let’s first focus on one of the factors, {\ell_1^{N}(\ell_\infty^{K})}. We’re using the {\ell_\infty} norm for the space {{\mathbb C}^K} associated to answers, and the {\ell_1} norm for the space {{\mathbb C}^N} associated to questions. This choice is consistent with the interpretation of the game tensor as an object dual to the players’ strategies: glossing over considerations of non-negativity, a (classical) strategy is a collection of numbers {\{f_x^a\}} such that

\displaystyle \big\|\{f_x^a\}\big\|_{\ell_\infty^N(\ell_1^K)} = \max_x\,\sum_a |f_x^a| \leq 1.

Once a norm has been defined on the spaces associated with each player’s strategy, it remains to put them together into a norm on the full game tensor, which lives on the tensor product of the two spaces. In general (including here) there will not be a unique way to norm the algebraic tensor product of two Banach spaces in a way that is consistent with the two Banach norms. (Indeed, the study of all possible such norms was pioneered by Grothendieck.) For our purposes it turns out that the smallest such norm, the injective norm {\|\cdot\|_\epsilon}, is the right choice: the norm {\|G\|_{\ell_1^{N}(\ell_\infty^{K})\otimes_\epsilon \ell_1^{N}(\ell_\infty^{K})}} exactly coincides with the maximum success probability of classical players in game {G}. This is an easy calculation from the definition of the injective norm, but I’ll skip it here.
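To make the classical side concrete, here is a minimal Python sketch (not taken from the survey) that computes the classical value of a small game by brute force over deterministic strategies, which is where the maximum is attained when all payoffs are non-negative. The game chosen, CHSH with uniform questions, and the names `chsh_tensor` and `classical_value` are assumptions made for illustration.

```python
from itertools import product

def chsh_tensor():
    """Game tensor G[x][y][a][b] = pi(x, y) * V(a, b | x, y) for CHSH.

    Questions x, y and answers a, b are bits; the players win when
    a XOR b == x AND y, and questions are uniform (pi = 1/4).
    """
    return [[[[0.25 if (a ^ b) == (x & y) else 0.0
               for b in range(2)] for a in range(2)]
             for y in range(2)] for x in range(2)]

def classical_value(G):
    """Maximum winning probability over deterministic strategies.

    A deterministic strategy for a player is a map question -> answer,
    i.e. an extreme point of the unit ball described in the text
    (exactly one f_x^a equal to 1 for each x).
    """
    N, K = len(G), len(G[0][0])
    best = 0.0
    for f in product(range(K), repeat=N):      # first player's answers
        for g in product(range(K), repeat=N):  # second player's answers
            value = sum(G[x][y][f[x]][g[y]]
                        for x in range(N) for y in range(N))
            best = max(best, value)
    return best

print(classical_value(chsh_tensor()))  # 0.75
```

The search visits all {K^{2N}} pairs of deterministic strategies, so it is only meant for toy sizes.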

So that’s for classical strategies. Quantum strategies are a bit more complicated: instead of sequences of numbers {\{f_x^a\}} they are represented by sequences of (positive semidefinite) matrices {\{A_x^a\}}, of any dimension, such that {\sum_a A_x^a = \mathbb{I}} for all {x}. This is where operator spaces come in. An operator space is a structure that can be built on top of a Banach space by defining not just one norm but a sequence of matrix norms, on matrices of arbitrary dimension {d=1,2,\ldots,} with entries in the Banach space. Here the most natural operator space structure on {\ell_\infty^N(\ell_1^K)} replaces {\ell_\infty^N} by {M_N} ({N\times N} matrices with the operator norm) and {\ell_1^K} by {S_1^K}, whose operator space structure is a bit hard to define directly but can be obtained naturally via duality with {M_K} (as a Banach space {S_1^K} does correspond to the Schatten-{1} norm, but note that here we need an operator space, which provides much more structure than a Banach space). Once again I’ll skip the details: you’ll have to believe me that the resulting operator space structure, which I’ll continue writing as {\ell_1^N(\ell_\infty^K)}, precisely captures the notion of quantum strategy, i.e. a family of POVMs. (The “precisely” here is somewhat blunt; in particular there are subtleties involved with e.g. dealing with the requirement that POVM elements be positive semidefinite, but it is possible to work around those.)

Next we need to norm the tensor product, again as an operator space. As it turns out, the injective norm that worked so well for classical strategies has a direct operator space analogue, the minimal norm, which perfectly captures the quantum value of the game tensor! (I won’t show why, but just as for the injective norm it’s a simple calculation from the definition.) I’ll write {\|G\|_{\ell_1^{N}(\ell_\infty^{K})\otimes_{min} \ell_1^{N}(\ell_\infty^{K})}} for this norm, and from now on it’ll serve as a proxy for the entangled value of the game {G}.

That’s a lot of “coincidences”, isn’t it? A norm on the tensor product of Banach spaces matches the classical value of a game tensor, while the natural operator space analogue of that norm matches the quantum value. Now you may start to see why someone well-versed in the theory of such norms might see them as the “right” way to look at classical, and quantum, strategies. More details on this point in Section 2 of the survey.

A bound on the largest violation

But let’s move on. With the basic normed spaces in place the task of relating the classical and entangled values of a game can be formulated as follows: it amounts to estimating the norm of the identity map (as an aside, almost all results in functional analysis, or at least those that connect to the study of nonlocal games, that I have encountered seem to reduce to studying the “norm of the identity”… a fact that still makes me smile every time I stumble upon it: why isn’t the identity trivial?)

\displaystyle id:\,\ell_1^{N}(\ell_\infty^{K})\otimes_{\epsilon} \ell_1^{N}(\ell_\infty^{K})\rightarrow \ell_1^{N}(\ell_\infty^{K})\otimes_{min} \ell_1^{N}(\ell_\infty^{K}).

Indeed, as we have seen the norm of the game tensor when seen as an object on the first space corresponds to its classical value, while the norm of the same tensor considered as an object on the second space matches its entangled value. Any upper bound on the norm of the identity is an upper bound on the largest (multiplicative) advantage that can be gained by quantum players, while any lower bound will imply the existence of a game for which the advantage can be just as large.

Our strategy for bounding the norm of {id} is to decompose it as a sequence of three simpler maps, on which it is easy to obtain norm estimates. We’ll start from the right-hand side, and consider first the arrow

\displaystyle \ell_1^{NK}\otimes_{min} \ell_1^{NK}\rightarrow \ell_1^{N}(\ell_\infty^{K})\otimes_{min} \ell_1^{N}(\ell_\infty^{K}).

If the map is taken to be the identity, the condition {\max_{x} \|\sum_a A_x^a\|\leq 1} only gives us {\max_{x,a} \|A_x^a\|\leq 1}, thus the norm of the identity is {1}. We can do better: consider the Fourier transform

\displaystyle \mathcal{F}: \,\{A_{x}^a\}_{x,a}\mapsto \Big\{ E_{x,k} = \frac{1}{\sqrt{K}}\sum_a e^{\frac{-2i\pi ka}{K}} A_{x}^a\Big\}_k, x=1,\ldots,N.

It is not hard to verify that if for every {x}, {\{A_x^a\}_a} is a POVM then for every {x,k} it will hold that {\| E_{x,k} \|\leq K^{-1/2}}: accounting for both players we saved a factor {K^{-1}}.
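For completeness, here is one way to see this bound (a short sketch, assuming only that each {A_x^a} is positive semidefinite and that {\sum_a A_x^a=\mathbb{I}}). Write {\omega_a = e^{\frac{-2i\pi ka}{K}}}, so that {|\omega_a|=1}. For any unit vectors {u,v}, applying the Cauchy-Schwarz inequality twice gives

\displaystyle \Big|\sum_a \omega_a \langle u|A_x^a|v\rangle\Big| \leq \sum_a \langle u|A_x^a|u\rangle^{1/2}\langle v|A_x^a|v\rangle^{1/2} \leq \Big(\sum_a \langle u|A_x^a|u\rangle\Big)^{1/2}\Big(\sum_a \langle v|A_x^a|v\rangle\Big)^{1/2} = 1.

Taking the supremum over {u,v} shows {\|\sum_a \omega_a A_x^a\|\leq 1}, and the normalization {K^{-1/2}} in the definition of {\mathcal{F}} then gives {\| E_{x,k} \|\leq K^{-1/2}}.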

The next arrow we consider is

\displaystyle \ell_1^{NK}\otimes_{\epsilon} \ell_1^{NK}\rightarrow \ell_1^{NK}\otimes_{min} \ell_1^{NK}.

This is a key step: we are switching from the {\epsilon} to the {min} norm, from classical to quantum. But note how now the spaces are just {\ell_1}: only questions, no answers — but there are signs, and what we are looking at is precisely the setting of XOR games, i.e. Grothendieck’s inequality! Therefore for this map we can simply take the identity, and its norm will be at most Grothendieck’s constant {K_G}.
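As a sanity check on this step, the illustrative Python sketch below (not part of the survey) works out the smallest interesting XOR game, CHSH. It compares the best classical bias, found by brute force over sign assignments, with the bias achieved by a standard choice of unit vectors, which by Tsirelson’s characterization corresponds to an entangled strategy. The function names, the choice of vectors and the normalization of the bias (winning probability equal to {1/2} plus half the bias) are assumptions for illustration; only the ratio matters here, and it stays below Krivine’s upper bound {\pi/(2\ln(1+\sqrt{2}))\approx 1.782} on the real Grothendieck constant.

```python
from itertools import product
from math import sqrt, log, pi

# CHSH as an XOR game: sign matrix M[x][y] = (-1)^(x*y), uniform questions.
M = [[1, 1], [1, -1]]

def classical_bias(M):
    """Max over signs a_x, b_y in {-1, +1} of |sum_xy M[x][y] a_x b_y| / n^2."""
    n = len(M)
    best = 0.0
    for a in product((-1, 1), repeat=n):
        for b in product((-1, 1), repeat=n):
            s = sum(M[x][y] * a[x] * b[y] for x in range(n) for y in range(n))
            best = max(best, abs(s) / n**2)
    return best

def vector_bias(M, U, V):
    """Bias sum_xy M[x][y] <U[x], V[y]> / n^2 for given unit vectors."""
    n = len(M)
    dot = lambda u, v: sum(p * q for p, q in zip(u, v))
    return sum(M[x][y] * dot(U[x], V[y])
               for x in range(n) for y in range(n)) / n**2

r = 1 / sqrt(2)
U = [(1.0, 0.0), (0.0, 1.0)]   # first player's unit vectors
V = [(r, r), (r, -r)]          # second player's unit vectors

c = classical_bias(M)          # 0.5
q = vector_bias(M, U, V)       # 1/sqrt(2), about 0.7071
krivine = pi / (2 * log(1 + sqrt(2)))
print(c, q, q / c, krivine)
```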

Finally, for the last arrow we consider

\displaystyle \ell_1^{N}(\ell_\infty^{K})\otimes_{\epsilon} \ell_1^{N}(\ell_\infty^{K})\rightarrow \ell_1^{NK}\otimes_{\epsilon} \ell_1^{NK}.

In order for the three arrows to compose to {id} we should take this map to be the inverse of the Fourier transform {\mathcal{F}} used in the first step. But note that now we are working with the {\epsilon} norm, or in other words classical strategies. Thus it suffices to bound

\displaystyle {\max_x\sum_a K^{-1/2}|\sum_{k} e^{\frac{2i\pi ak}{K}} t_{x,k}|}

for {t_{x,k}} such that {\max_{x,k} |t_{x,k}|\leq 1}, and by Cauchy-Schwarz followed by Parseval this is easily seen to be at most {K}. Accounting for both players, this map (the first in the composition) has norm at most {K^2}.
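The bound {K} for a single player can be checked numerically. The Python sketch below (an illustration, with hypothetical names such as `inverse_fourier_cost`) evaluates the quantity displayed above for a fixed question {x}, first on random inputs with {|t_{x,k}|\leq 1} and then on a quadratic “chirp” sequence, whose flat Fourier spectrum shows that the bound cannot be improved in general.

```python
import cmath
import random
from math import sqrt, pi

def inverse_fourier_cost(t):
    """Return sum_a K^{-1/2} |sum_k exp(2*pi*i*a*k/K) t[k]| for one question x."""
    K = len(t)
    total = 0.0
    for a in range(K):
        coeff = sum(cmath.exp(2j * pi * a * k / K) * t[k] for k in range(K))
        total += abs(coeff) / sqrt(K)
    return total

K = 4
rng = random.Random(0)
# Random points of the unit disc: |t_k| <= 1 as required in the text.
samples = [[cmath.rect(rng.random(), 2 * pi * rng.random()) for _ in range(K)]
           for _ in range(1000)]
worst_random = max(inverse_fourier_cost(t) for t in samples)

# A quadratic "chirp" has a flat Fourier spectrum and saturates the bound.
chirp = [cmath.exp(1j * pi * k * k / K) for k in range(K)]

print(worst_random <= K + 1e-9)     # True
print(inverse_fourier_cost(chirp))  # 4.0 up to rounding error
```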

There we are! We managed to write the identity {id}, which maps the game tensor from an object acting on classical strategies to one acting on quantum strategies, as the composition of three maps with norms at most {K^2}, {K_G} and {K^{-1}} respectively. As a direct consequence, we’ve proven that the largest violation achievable by quantum strategies is at most {K_G\cdot K}, i.e. it is at most a constant times the number of possible answers in the game (warning: some constant factors missing, from e.g. glossing over complex vs real vs non-negative issues in the above).

What I like about this proof is that, from the point of view of operator spaces the above sketch is very natural (at least if I am to believe Carlos…), almost an exercise: as already mentioned, estimating the norm of the identity from one space to the other is the bread and butter of functional analysts, and they certainly have more than one trick up their sleeve. Unwinding the argument to frame it as a rounding procedure that transforms quantum entangled strategies into classical ones reveals some surprising tricks (e.g. the Fourier transform, which is used here to perform the actual quantum-to-classical rounding at the optimal step, the middle arrow) that one might not have thought of based solely on the usual “quantum toolbox”.

I’ll refer you to the survey for much more! All comments are very welcome and much appreciated; even typos: there is still plenty of time to fix those before the survey makes its way to the JMP presses.

Posted in Publications, Quantum, Science

Foundations of randomness, remainder of Day 1

Today I will cover the remaining talks from the first day of the workshop. Valerio’s opening talk was followed by a stimulating blackboard talk by Renato Renner, who asked the question “Is the existence of randomness an axiom of quantum physics?” Randomness is usually viewed as entering quantum mechanics through the Born rule, which is stated as an axiom describing the outcome distribution of any given measurement on a quantum state; there is no standard derivation of the rule from other axioms. This is deeply unsatisfying, as the rule appears completely arbitrary, and indeed many attempts have been made at deriving it from more basic structural constraints such as the Hilbert space formalism underlying Gleason’s theorem, or assumptions based on decision theory (Deutsch) or logic (Finkelstein).

Renato and his student Daniela Frauchiger have a proposal whose main thrust consists in demarcating what could be called the “physical substrate” of a given theory (such as quantum mechanics) from its “information-theoretic substrate”. In short, the physical substrate of a theory governs which experiments are “physical”, or possible; it is a completely deterministic set of rules. The information-theoretic substrate provides an additional layer which considers assignments of probabilities to different experiments allowed by the physical substrate, and specifies rules that these probabilities should obey.

Let me try to give a little more detail, as much as I can recall from the talk. Let’s call the framework suggested by Renato the “FR proposal”. The first component in the proposal is a definition of the physical substrate of a theory. In the FR proposal a theory bears on objects called stories: a story is any finite statement describing an experiment, where “experiment” should be interpreted very loosely: essentially, any sequence of events with a set-up, evolution, and outcome. For instance, a story could be that a pendulum is initialized in a certain position, left to swing freely, and after time {t} is observed and found to be in a certain position. Or it could specify that a quantum particle is initialized in a certain state, say {\ket{+}=\frac{1}{\sqrt{2}}(\ket{0}+\ket{1})}, then measured in the computational basis {\{\ket{0},\ket{1}\}}, and that the outcome {\ket{0}} is observed. Thus a story refers to a physical system and makes claims about observable quantities of the system.

Thus the physical substrate of the theory is a set of rules that completely determine which stories are “physical”, i.e. allowed by the theory, and which are not. For instance, the second story described above is a valid story of quantum mechanics: it is certainly possible that the outcome {\ket{0}} is obtained when the measurement described is performed. On the other hand quantum mechanics would rule out a story such as “the state {\ket{0}} is measured in the computational basis and the outcome {\ket{1}} is obtained”: this is simply not allowed. Note that up to this point the theory makes no claim at all on the “probability” of a story occurring; it just states whether the story is possible or not.

So much for the physical substrate. How do probabilities come in? This is the goal of the information-theoretic substrate of the theory. To introduce probabilities we consider mappings from stories to {[0,1]}. Any such mapping can be thought of as a measure of an observer’s state of knowledge, or uncertainty, about any particular story (much more on this and other interpretations of probabilities will be discussed in later talks, such as Carl Hoefer’s coming up next). A priori the mapping could be arbitrary, and the role of the information-theoretic substrate of the theory is to specify which mappings are allowed, or “reasonable”. For instance, in the measurement example from earlier most observers would assign that particular story a probability of {50\%}, which is what the Born rule predicts. But without taking the rule for granted, on what basis could we assert that {50\%} is the only “reasonable” probability that can be assigned to the story — why not {10\%}, or even {100\%}?
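For reference, the {50\%} figure is what the Born rule assigns when it is taken for granted. Here is a minimal Python sketch of that calculation; the helper name `born_probability` is an assumption for illustration, and nothing in it is specific to the FR proposal.

```python
from math import sqrt

def born_probability(state, basis_vector):
    """Squared modulus of the inner product <basis_vector|state>."""
    amplitude = sum(e.conjugate() * s for e, s in zip(basis_vector, state))
    return abs(amplitude) ** 2

plus = (1 / sqrt(2), 1 / sqrt(2))   # |+> = (|0> + |1>)/sqrt(2)
zero = (1.0, 0.0)                   # |0>
one = (0.0, 1.0)                    # |1>

print(born_probability(plus, zero))  # about 0.5
print(born_probability(zero, one))   # 0.0: the story ruled out by the physical substrate
```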

The main claim that Renato put forward is that the Born rule can be derived as a consequence of the following two broad postulates: (I apologize for the description of the postulates being rather vague; there are points about the formulation that I am unclear about myself!)

  1. The repetition postulate: experiments can be repeated, and if so lead to the same outcomes.
  2. The symmetry postulate: certain experiments have symmetries under which the outcome of the experiment is invariant. For example, an experiment which involves flipping three different coins in sequence and reporting the majority outcome is not affected by the order in which the coins are flipped.

The idea is that these postulates should be much easier to accept than the Born rule: certainly, if we want to say anything meaningful about certain probabilities being reasonable or not we ought to enforce some form of compatibility rules, and the two above sound rather reasonable themselves. The first seems a prerequisite to even make sense of probabilities, and the second follows the tradition started by de Finetti of basing the emergence of probability on indistinguishability of the outcomes of an experiment under certain symmetries of the experiment.

Unfortunately there does not yet seem to be a preprint available that describes these ideas in detail; hopefully it is coming soon. I’m quite curious to see how the proposal, that I find rather appealing, can be formalized.


After Renato’s talk — and an excellent lunch — we were treated to an intriguing talk on “Chance laws and quantum randomness” by Carl Hoefer. The talk addressed questions that we computer scientists or physicists rarely discuss openly (and indeed many of us might reserve for dreamy Sunday nights or inebriated Friday evenings). In Carl’s own words: “When we ascribe a primitive (irreducible) chance propensity, or postulate intrinsically random/chancy fundamental laws of nature, do we know what we’re saying?”. Or, less poetically put (but still in Carl’s words — Carl, thanks for making it accessible to us!): What does a claim such as “{\Pr(\text{spin-z-up} | \text{spin-x-up-earlier}) = 0.5}” mean, and how is it different from “{\Pr(\text{spin-z-up} | \text{spin-x-up-earlier}) = 0.7}”?

Give the question a chance to sink in: a good answer is not that obvious! It already arose in Renato’s talk, where probabilities were introduced as the reflection of an agent’s subjective belief. Well, this is the first of a number of possible answers to the question that Carl took head on and debunked, one slide at a time. (I am not sure how much I agree with Carl’s arguments here, but at least he showed us that there was certainly room for debate!) Other possible answers considered by Carl include probability as the reflection of an objective frequency for the occurrence of a certain event, or simply a “primitive fact” that would require no explanation. Needless to say, he found none of these interpretations satisfactory. It’s a good exercise to find arguments against each of them; see Carl’s slides for some elements of an answer (see also the article on Interpretations of probability in the Stanford Encyclopedia of Philosophy for an extensive discussion).


A Galton board

So what is the speaker’s suggestion? I doubt my summary will do it much justice, but here goes: according to Carl, probabilities, or rather “chance laws”, arise as a form of statistical averaging; in particular probabilistic statements can be meaningful irrespective of whether nature is intrinsically deterministic or irreducibly probabilistic. Specifically, Carl observes that many physical processes tend to generate a distribution on their observable outcomes that is robust to initial conditions. For instance, a Galton board will lead to the same Gaussian distribution for the location of the ball, irrespective of the way it is thrown in (except perhaps for an exceptional set of initial conditions of measure zero). Whether the initial conditions are chosen deterministically or probabilistically we can meaningfully state that “there is {x\%} chance that the ball will exit the board at position {z}”.
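As a toy illustration of this robustness (a simplified model, not Carl’s own example), the Python sketch below drives a Galton board with a pseudo-random generator. The generator is fully deterministic once its seed, which plays the role of the initial conditions, is fixed; yet the exit histogram stays close to the same binomial (approximately Gaussian) profile for every seed tried. The 10 rows, the 20000 balls and the function names are assumptions for illustration.

```python
import random
from math import comb

def galton_histogram(seed, rows=10, balls=20000):
    """Exit frequencies of a board whose deflections come from a seeded PRNG."""
    rng = random.Random(seed)            # deterministic given the seed
    counts = [0] * (rows + 1)
    for _ in range(balls):
        # Each peg sends the ball left (0) or right (1); the exit slot
        # is the number of rightward deflections.
        slot = sum(rng.getrandbits(1) for _ in range(rows))
        counts[slot] += 1
    return [c / balls for c in counts]

def binomial_pmf(rows):
    """Reference profile: Binomial(rows, 1/2)."""
    return [comb(rows, k) / 2 ** rows for k in range(rows + 1)]

def total_variation(p, q):
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

target = binomial_pmf(10)
for seed in (0, 1, 2014):
    distance = total_variation(galton_histogram(seed), target)
    print(f"seed {seed}: total variation distance {distance:.4f}")
```

The printed distances depend on the seed but should all be small, which is the point: different deterministic histories, essentially the same statistics.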

Carl concluded his talk by tying it to the Bohmian interpretation of quantum mechanics — almost, though perhaps not quite, going as far as to frame deterministic Bohmian mechanics as the most reasonable way to interpret probability in quantum mechanics. I found the discussion fascinating: I had never considered Bohmian mechanics (or, for that matter, the question of the origin of probabilities in quantum mechanics) seriously, and Carl made it sound, if not natural, at least not wholly unreasonable! In particular he made a convincing case for the possibility of ascribing meaning to “probability” in a fully deterministic world, without having to resort to the need for “subjective beliefs” or related rabbits such as high sensitivity to initial conditions — indeed, here it is quite the opposite.


The last talk of the day was delivered on the whiteboard, by Marcin Pawlowski. Marcin gave an intuitive derivation of the quantum bound on device-independent randomness generation from the CHSH inequality, the famous (at least for those of us versed in these things… see Appendix A.1 here for a derivation)

\displaystyle H_\infty(A|X)\geq 1-\log_2\Big(1+\sqrt{2-\frac{I^2}{4}}\Big) ,

based solely on a monogamy bound due to Toner. Toner introduced a “two-out-of-three” variant of the CHSH game in which there are three players Alice, Bob and Charlie, but the game is played either between Alice and Bob, or Alice and Charlie, chosen uniformly at random by the referee. Although in principle the maximum success probability of the players in this game could be as high as that of the two-player CHSH (since in the end only one instance of the game is played, the third player being ignored), Toner showed that the fact that Alice does not know a priori which of Bob and Charlie she is going to play with places strong constraints on her ability to generate non-local correlations with at least one of them. Quantitatively, he showed that the players’ maximum success probability in this game is precisely {3/4}, whether the players are allowed entanglement or not: the presence of the third player completely eliminates the quantum advantage that can be gained in the usual two-player CHSH game!

The derivation presented by Marcin is simple but enlightening (good exercise!). It could be considered the “proof from the book” for this problem, as it makes the intuition fully explicit: the reason Alice’s outcomes are unpredictable, from the point of view of Eve (=Charlie), is that they are strongly correlated with Bob’s outcomes (up to the Tsirelson bound); monogamy implies that correlations with Eve’s outcomes could not simultaneously be as strong.
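To get a feel for the numbers in the bound quoted above, here is a small Python sketch that evaluates its right-hand side. It assumes that {I} denotes the CHSH value normalized so that the classical bound is {2} and the Tsirelson bound is {2\sqrt{2}}; the function name `min_entropy_bound` is hypothetical.

```python
from math import log2, sqrt

def min_entropy_bound(chsh_value):
    """Lower bound 1 - log2(1 + sqrt(2 - I^2/4)) on H_min(A|X), for 2 <= I <= 2*sqrt(2)."""
    if not 2.0 <= chsh_value <= 2 * sqrt(2) + 1e-12:
        raise ValueError("expected a CHSH value between 2 and 2*sqrt(2)")
    # Clamp at 0 to guard against rounding when I is numerically 2*sqrt(2).
    inner = max(0.0, 2.0 - chsh_value ** 2 / 4.0)
    return 1.0 - log2(1.0 + sqrt(inner))

for I in (2.0, 2.4, 2.8, 2 * sqrt(2)):
    print(f"I = {I:.4f}  ->  at least {min_entropy_bound(I):.4f} bits")
```

At the classical bound {I=2} no randomness is certified, while at the Tsirelson bound {I=2\sqrt{2}} the bound reaches one full bit.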


Typical view from a Stellenbosch winery

That’s it for Day 1 – time to relax, enjoy a good bottle of local wine, and get ready for the second day!

Posted in Conferences, device independence, Quantum, Science, Talks

Foundations of randomness, Day 1: Scarani

As mentioned in the previous post, a highlight of the project I participated in over the past five weeks was the three-day workshop that we organized at STIAS towards the end of October.

The purpose of the workshop was to bring together a small group of computer scientists, physicists, mathematicians and philosophers around the broad theme of “the nature of randomness”. Each of our respective fields brings its own angle on the problem, with a specific language for formulating associated questions, judging which are most important, and a set of techniques to approach them. In computer science we use entropy, complexity theory and pseudo-randomness; physicists have access to randomness through the fundamental laws of nature; philosophers introduce subtle distinctions between probability, chance, and intrinsic or subjective randomness. The presence of any such framework is a prerequisite to progress: it enables us to formulate research problems, develop appropriate methodologies, and obtain results. But any framework also sets implicit boundaries and runs a risk of turning into a dogma, according to which “valid” or “important” questions and results are judged. From my perspective, if there is any respect in which I would dare deem the workshop a success it is precisely in the barriers it helped bring down. The wide-ranging talks and participants forced me to take a broad perspective on randomness and question some of the most deeply ingrained “certainties” that underlie my research (such as a firm, though baseless, belief that the Copenhagen interpretation is the only reasonable interpretation of quantum mechanics worthy of a practicing scientist). This questioning was only made possible thanks to the outstanding effort put in by all participants to deliver both accessible and stimulating talks as well as to actively engage with each other; I cannot thank them enough for making the workshop such an enlightening and interactive experience.

It seems impossible to do any justice to the ideas developed in each of the talks by discussing them here, and in doing so I am bound to misrepresent many of the speakers’ thoughts. For my own benefit I will still make an attempt at listing a few “take-away” messages I wish to keep; obviously all idiocies are mine, while any remaining insight should be attributed to the proper speaker.

The opening talk was delivered by Valerio Scarani, who started us off with a personal take on the history, and future, of “Quantum randomness and the device-independent claim”. The “irreducible presence” (much more on this “irreducibility” will be discussed in later talks!) of randomness in quantum mechanics is usually attributed to the Born rule, which states that the probability of obtaining outcome {x} when measuring state {|\psi\rangle} in a basis containing a unit vector {|e_x\rangle} labeled by {x} is given by the squared modulus of the inner product, {|\langle e_x |\psi\rangle|^2}. Valerio pointed out the (well-known, but not to me) fact that the introduction of this precise quantitative formulation of the rule came almost as an afterthought to Born; indeed the use of the square appears as a footnote to the main text in Born’s paper.


Extract from Born’s paper

(Incidentally, see this paper by Aaronson for interesting consequences of instead using a {p}-th norm for {p\neq 2} in defining the rule.) I was somewhat puzzled by this: Born’s paper is dated 1926, a time when quantum mechanics was already a well-established theory, on which scores of physicists relied to do what physicists do: make calculations, formulate predictions, perform experiments, and repeat! But how can one make predictions if there is no rule governing which outcomes are to be obtained? My confusion likely stems from a basic misrepresentation of how physicists work (as may already be clear from the preceding description!); after clarifications I can offer two explanations. First, quantum mechanics was mostly used to compute properties of eigenstates of a given Hamiltonian, for which properties that commute with the Hamiltonian always take on determined values, so that no probabilities are needed. Second, it is quite rarely the case (or at least was in the 1930s) that the outcome distribution of a fine-grained measurement on a single quantum mechanical system is accessible to experiment; instead only statistical properties of larger systems (such as the energy) are accessible, and the law giving the expectation value taken by a particular observable, {\langle\psi|O|\psi\rangle}, can be stated and used without bothering as to what it would mean to “measure in the eigenbasis of {O}” and obtain this or that measurement outcome: only the average has any experimental relevance. In any case, to close this parenthesis it is interesting to note that even as we now think of probabilities, and the “squared-modulus rule”, as what makes the specificity of quantum mechanics, the success of the theory initially had (and probably still has) little to do with it. In fact, it may have had quite the opposite effect, and induced a serious amount of confusion…

Indeed Einstein’s immediate reaction to Born’s paper is immortalized in his famous letter to Born from November 1926:

Quantum mechanics calls for a great deal of respect. But some inner voice tells me that this is not the right track. The theory offers a lot, but it hardly brings us closer to the Old One’s secret. For my part at least, I am convinced He doesn’t throw dice.

And thus started one of the most obscure periods in human thought… A famous anecdote along the route (though again, this was news to me) is given by von Neumann’s valiant attempt at establishing a rigorous “no-go” theorem for “hidden variables”, thereby proving the unavoidable resort to some form of uncertainty. Unfortunately it turned out that von Neumann’s theorem placed unreasonable assumptions on the hidden variable models to which it applied, making it all but inapplicable. Pressing ahead, skipping over Einstein, Podolsky and Rosen’s unfortunate account of (non-)locality in quantum mechanics, it is only with Bell’s 1964 paper demonstrating that the existence of local hidden variable theories for quantum mechanics could be experimentally tested that the debate was finally placed on firm — or, at least, in principle decidable — grounds. Indeed the major contribution of Bell’s work was to establish observable consequences for the existence of any hidden variable model for quantum mechanics — under the important assumption, to be further discussed in the talk by Ruediger Schack, that the model be local. Bell’s result had an immediate impact, prompting one of the most interesting experimental challenges in the history of physics. This is because there was no consensus as to what the outcome of the experiment “should” be, and proponents of a breakdown of quantum mechanics (at least in the regime concerned in Bell’s test and its subsequent refinement by Clauser et al.) stood on an equal footing with advocates of fundamental laws of nature that are non-local and probabilistic (of course either camp was dwarfed by the much larger number of physicists who simply considered the question irrelevant for the practice of their work… luckily for the progress of science, which would not have benefited much from stalling for two decades). 
And indeed early experiments went both ways, with “conclusive” results pointing in either direction obtained on the West and later East coasts of the US. The quest was brought to a climactic conclusion with Aspect’s experiments in Orsay in 1981-82, which conclusively handed the podium over to the Bell camp (how conclusively, however, you can judge from the amount of — justified — attention generated by the most recent experiments in Delft).
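
For readers who have not seen the test itself, here is a small Python sketch of the CHSH form of Bell's inequality (the refinement by Clauser et al. mentioned above). It is my own illustration under stated assumptions: two parties, two settings each, outcomes +1/-1, a maximally entangled two-qubit state, and a standard textbook choice of measurement angles. It enumerates all deterministic local strategies (by convexity, mixtures of them cannot do better) and compares their best score with the quantum one.

```python
import itertools
import math

def chsh(E):
    """CHSH combination E(0,0) + E(0,1) + E(1,0) - E(1,1) of correlators E(x, y)."""
    return E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)

def local_chsh_bound():
    """Largest CHSH value over deterministic local strategies with outputs +1/-1."""
    best = -4
    for a0, a1, b0, b1 in itertools.product([-1, 1], repeat=4):
        A, B = (a0, a1), (b0, b1)
        best = max(best, chsh(lambda x, y: A[x] * B[y]))
    return best

def observable(theta):
    """Real qubit observable cos(2 theta) Z + sin(2 theta) X, with eigenvalues +1/-1."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [[c, s], [s, -c]]

def correlator(A, B, psi):
    """<psi| A (x) B |psi> for a real two-qubit state, indexed as psi[2*i + j]."""
    return sum(psi[2 * i + j] * A[i][k] * B[j][l] * psi[2 * k + l]
               for i, j, k, l in itertools.product(range(2), repeat=4))

# Assumed setup: maximally entangled state and the usual optimal angles.
r = 1 / math.sqrt(2)
phi_plus = [r, 0.0, 0.0, r]
alice = [observable(0.0), observable(math.pi / 4)]
bob = [observable(math.pi / 8), observable(-math.pi / 8)]

quantum_value = chsh(lambda x, y: correlator(alice[x], bob[y], phi_plus))
print(local_chsh_bound(), quantum_value, 2 * math.sqrt(2))
```

The local strategies top out at 2, while the quantum strategy reaches {2\sqrt{2}}; the gap between the two is what the experiments set out to observe.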

The history is fascinating (can anyone suggest a good, entertaining, informed and opinionated account?), but we should move on. Following the speaker’s lead, let me scoot ahead to modern times and the birth of device independence. This is usually attributed to the 1998 “UFO” by Mayers and Yao, who were the first to make a fundamental observation: that a complete quantum mechanical description of a black-box device, including its state and the measurements it performs on the state, can, in some cases, be inferred from the classical input-output measurement statistics it provides — provided one is willing to make one structural assumption on the device: that it has a certain bipartite structure, i.e. is composed of two systems on which certain (unknown) measurements can be independently performed (in their paper Mayers and Yao call this a “conjugate coding source”). The reason I call this result a “UFO” is that, although the motivation behind it is clear (obtain a proof of security for BB’84 that would be robust to imperfections in Alice’s photon generation device), the technique has no precedent. It is certainly natural to attempt to certify Alice’s device by performing some kind of tomography, but it is a long way from there to seeing how a bipartite structure could be leveraged to obtain the result in such strong terms! (Note that, without a priori information on either the state or the measurements being performed, “tomography” cannot lead to any meaningful conclusion.) Certainly the authors must have been influenced by their intense ongoing work in proving security of QKD, and the intuition obtained from the security proofs. Still — quite a feat!

A second respect in which the Mayers and Yao paper can be called a “UFO” is that, in spite of the conceptual breakthrough we now recognize it for, it was all but ignored in the years following its publication. One reason may be that it was published in the most obscure of conferences (proceedings of FOCS, anyone?). A more reasonable explanation is that the math in the paper is hard to follow: the notation is rather heavy and little intuition is given. From a QKD practitioner’s point of view, this is completely foreign territory; the techniques used bear very little resemblance to the techniques used at the time to prove security of QKD protocols and analyze their performance. Finally the result does suffer from one important drawback, which is that it is non-robust: Mayers and Yao were only able to characterize devices which perfectly reproduce the required correlations, but did not give any statement that tolerates {\varepsilon}’s.

The second birth (in terms of explosion of attention and interest) of device independence can be attributed to the 2005 paper by Barrett, Hardy and Kent. Although I cannot speak for them my impression is that the main motivation for their work was foundational; indeed the central question they ask is the extent to which the correlations that are checked by the users in all QKD protocols are sufficient by themselves: that is, can quantum mechanics be thrown out, and security derived only from these correlations? After all, from the point of view of the users the protocol is completely classical, and its security can also be defined classically, as the maximum probability with which an eavesdropper can “guess” the key that is eventually produced. Thus even if an implementation of the protocol may rely on quantum mechanical devices, its mode of operation, and security, can be formulated purely using statements about classical random variables. Barrett et al. showed that security could indeed be established in a model that is broader than quantum mechanics, assuming only the existence of joint probability distributions describing the input-output behavior of the devices and a potential adversary, and the assumption that these distributions satisfy the no-signaling principle: outputs of any of the three systems do not depend on inputs chosen by any other party. Their work builds upon a strong line of research in non-locality (e.g. the introduction of the famous “PR boxes” by Popescu and Rohrlich a decade earlier), and it had one immense merit, which helps explain its impact: by completely “stripping out” all the messy details that obscured existing security proofs for QKD they managed to restore the intuition behind the security of QKD (see Ekert’s paper for a beautifully succinct formulation of that intuition), showing how that intuition could be formalized and lead to an actual proof of security — and this, under arguably stronger terms than the existing ones! 
Thus the link between Bell violation and security was finally expressed in its purest form. The beautiful paper was immediately followed by an intense line of works exploring its consequences, deriving device-independent proofs of security for QKD which achieved better and better rates under weaker and weaker assumptions (for instance on noise tolerance, or on the types of attacks allowed of the adversary). This is a very long story, with much more to say (check out the fourth slide from Valerio’s talk!), but once again it is time to move on.
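
As a concrete illustration of what the no-signaling principle constrains, here is a short Python sketch (mine, and deliberately simplified to two parties with no adversary). It encodes the PR box as a conditional distribution {p(a,b|x,y)}, checks that each party's marginal is independent of the other party's input, and evaluates the CHSH combination. The `signaling_box` is a hypothetical counterexample I made up to show the check failing.

```python
import itertools

def pr_box(a, b, x, y):
    """PR box: p(a, b | x, y) = 1/2 if a XOR b equals x AND y, else 0."""
    return 0.5 if (a ^ b) == (x & y) else 0.0

def signaling_box(a, b, x, y):
    """Hypothetical signaling device: Bob's output simply copies Alice's input."""
    return 1.0 if (a == 0 and b == x) else 0.0

def is_no_signaling(p):
    """Check that each party's marginal does not depend on the other party's input."""
    for a, x in itertools.product(range(2), repeat=2):
        alice = [sum(p(a, b, x, y) for b in range(2)) for y in range(2)]
        if abs(alice[0] - alice[1]) > 1e-12:
            return False
    for b, y in itertools.product(range(2), repeat=2):
        bob = [sum(p(a, b, x, y) for a in range(2)) for x in range(2)]
        if abs(bob[0] - bob[1]) > 1e-12:
            return False
    return True

def chsh_value(p):
    """E(0,0) + E(0,1) + E(1,0) - E(1,1), with outcomes mapped to +1/-1."""
    total = 0.0
    for x, y in itertools.product(range(2), repeat=2):
        E = sum((-1) ** (a ^ b) * p(a, b, x, y)
                for a, b in itertools.product(range(2), repeat=2))
        total += -E if (x, y) == (1, 1) else E
    return total

print(is_no_signaling(pr_box), chsh_value(pr_box), is_no_signaling(signaling_box))
```

The PR box passes the no-signaling check while reaching the algebraic maximum of 4 for CHSH, which is why the no-signaling model is strictly broader than quantum mechanics.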

In the second part of his talk Valerio gave us his four main “concerns” regarding device independence. In Valerio’s opinion these “concerns” are important assumptions that underlie device-independent cryptography — assumptions that are reasonable as they are recognized as such, but could undermine the field if not made explicit and discussed upfront.

The four concerns are the following:

  1. No-signaling (the physicist’s concern): It is usually claimed that device-independent protocols are secure “provided the no-signaling condition is satisfied”. But what does this mean? That no information can travel in-between the two devices within the amount of time that elapses between the moment at which the choice of basis is made, and the moment at which the device produces an outcome. There is a fundamental ambiguity here: how is the time of “basis choice” defined? Does it correspond to the moment when the experimenter first sets her mind on a particular setting (whether and how much “liberty” she really has in this choice is a different issue, the so-called “free will loophole”)? Or does it correspond to the moment at which the device is “informed” of that choice, as the experimenter presses the appropriate button? Or is it really only when the information reaches the photon polarizer located inside the device? The same questions can be asked for the time of “outcome production”: when exactly is the outcome determined? When the photon hits the receptor? When the outcome is displayed on the device? When the experimenter stares at it? Once more this is a subtle issue, related to the measurement problem in quantum mechanics and which components of the system are modeled as quantum and potentially in coherent superposition, and which are classical and “decohered”. Going from the one to the other requires an application of the Born rule, and it is not clear at what level this should take place.
  2. Input randomness (the information-theorist’s concern): This is an important point, which I think is now well-understood but was certainly not when procedures for device-independent randomness certification were first being discussed. The question is how the randomness present in the input to the devices should be quantified — succinctly, “random to whom”? The crucial distinction is as follows: the input should be random-to-device, whereas the output should be random-to-adversary. That it is possible to transform the one to the other is, to a great extent, the miracle of device-independent randomness certification: even if the inputs are completely known to a third party, and even if that party is quantum and shares entanglement with the devices, as long as the inputs are chosen independently from the devices (very concretely, their joint state, when all other parties are traced out, is in tensor product form), the outputs produced are independent from the adversary (same formalization, but now tracing out the devices). For those interested in this issue I highly recommend having a look at the “equivalence lemma” of Chung, Shi and Wu (Lemmas 2.2 and 5.1 here).
  3. Detection loophole (the hacker’s concern): arguably this is the most openly discussed of Valerio’s four concerns, and I won’t get into it too much. The problem is that current state-of-the-art photon receptors have low efficiency, and more often than not fail to detect an incoming photon. Thus in order to claim that the statistics observed in an experiment demonstrate the violation of a Bell inequality one has to resort to the “fair sampling assumption”, which states that no-detection events occur independently from the basis choice provided to the device. For most experiments this assumption seems reasonable, but in the context of device independence, where one often claims that the devices “have been prepared by the adversary”, it is certainly not so. Nevertheless we may hope that continued experimental progress (or different measurement techniques, such as the use of diamonds in the Delft experiment) will eventually eliminate this loophole.
  4. (In)determinism (the philosopher’s concern): what if randomness simply doesn’t exist? Two of the three most widely debated attempts at making sense of probabilities in quantum mechanics, the many-worlds interpretation and Bohmian mechanics (the third being the Copenhagen interpretation), posit determinism at some level: in many-worlds, there is determinism for a being who can observe all universes; in Bohmian mechanics there is determinism for a “nonlocal” being who has access to all hidden variables. As Valerio pointed out, however, these need not be serious issues, as what we really care about is that the randomness in the outputs produced, or the security of the key, are measured with respect to a being in our universe: indeed it should not be of any concern that the random bits are “determined” by some hidden variables if access to these hidden variables requires prior knowledge about the whole state of the universe.

There are a lot of valuable take-home messages from the talk. Valerio ended his talk by making the observation that while the violation of Bell inequalities implies the presence of randomness the converse is not necessarily true… This in turn prompts me to ask what really are the fundamental assumptions that are necessary and sufficient to guarantee randomness. Device independence, as it is now understood, relies on locality: if it can be assumed that certain events take place at positions that are isolated in space-time then certified fountains of randomness follow. But what if we are unsatisfied with the assumption — how necessary is it really? An interesting variation is to certify randomness based on the contextuality of quantum mechanics; this is a challenging setting from both an experimental and a theoretical point of view and I expect it to receive more attention in the future. But there are possibilities that take us in completely different directions; as an obvious example pseudo-randomness offers the possibility of certifying randomness based on computational assumptions. Are we on the right track — what if we were to take the irreducible presence of randomness as an assumption, what consequences would follow? (See Yevgeniy Dodis’ talk, to be discussed later, for some consequences in cryptography.)


The nature of randomness and fundamental physical limits of secrecy


The entrance to STIAS

I spent the past five weeks at the Stellenbosch Institute for Advanced Study (STIAS), a research institute located in beautiful Stellenbosch, 40km east of Cape Town. I was participating in a program (“program” may be an ambitious term for a five-participant endeavour — let’s call it a “project” instead), a project then, organized around the theme of The nature of randomness and fundamental physical limits of secrecy. The project was initiated by Artur Ekert, who formulated its theme and secured the support of STIAS. It was a great experience, and I’d like to write a few posts about it. To set the stage I’ll get us started by writing a few things about Stellenbosch and STIAS. The following posts will be more scientifically focused; in particular I’ll try to cover some of the stimulating talks that we were lucky to hear as part of a three-day workshop that was held a couple weeks ago and formed the high point of the program — no, the project.


Stellenbosch university

STIAS is a very interesting place. It is a small research institute nested within the large and airy campus of Stellenbosch University, known locally as “Maties” (don’t ask), one of the best universities in Africa. The spirit of STIAS is comparable to that of the IAS in Princeton, although it is far smaller: there is a single two-floor building in which I counted 22 offices; as far as I know aside from the director and administrative staff there is no permanent member. The scientific thrust of the institute is provided by its “fellows”, who typically spend a ~4-month period there (our project was thus shorter than average). Fellows are simply individuals who spend some research time at the institute, either on leave from their own institute or as part of a research project carried out with local collaborators. Concretely the institute provides office space, administrative support, and a stipend to cover local expenses. Much beyond this however STIAS provides a wonderful, relaxed and supportive environment in which to carry out research; a place which could easily top anyone’s list for best location to carry out a sabbatical leave.


Favorite study spot

Concretely, STIAS has two main attractions. The first is the local fauna — the STIAS fellows. During my stay among us could be found two philosophers, two biologists, a writer, two sociologists and three historians; not to mention the small group of eccentric physicists who insisted on dragging a whiteboard back and forth between the coffee machine, the sofas and the patio. The second attraction is the lunches. The institute staff cultivates its own little garden and prepares wonderful quiches, pies, salads, and, obviously, cakes, panna cotta, mousses and other culinary treats daily. Bring the two together and you end up with a unique epicurean experience, with the group of fellows gathering every day around a leisurely but stimulating lunch fuelled as much by the delicious food as by the wide-ranging conversation. (Note how the recipe closely parallels, for all I can tell, that of the IAS. Admittedly STIAS doesn’t have the cookies — but it does offer an extensive wine & cheese reception following the “fellow’s seminar” held every


Lunch at STIAS

To set a broader stage, it may be worth mentioning that, as I am sure its director Hendrik Dreyer would not contradict, the culture of STIAS is not unaffected by Stellenbosch’s status as the heart of South African wineland. Indeed the surrounding hills are planted with the second oldest (1679!) vines of South Africa, right after the nearby Constantia and home of the famous pinotage variety of grapes, a cross-product of Pinot and Hermitage which brings out wines of a spirit that could be compared to the best from Burgundy. The city itself is rather quaint, populated by a mix of locals, tourists and students. The latter ensure the presence of a healthy coffee culture — and here I cannot help but mention the blue crane and butterfly, purveyor of my daily morning cortado and a highlight of my stay here. The surrounding landscape, dominated by beautiful mountains culminating in wide plateaus, is filled with wineries competing for the best view, lunch spot, and of course, wine tasting. Having been wine-educated in the best snobbish French tradition I was truly impressed by the quality of local wines; most of those I tried were quite balanced and subtle. Moreover, they are enjoyed as wine should be: without pretension, and with good food and company! A pleasure that is not limited by the very affordable prices, with the good bottles being priced similarly to the daily liter of red in France, Italy or Spain, well under 10 euro. (I do realize that the notion of “affordable” is a complicated one here, South Africa being one of the countries in the world with the highest income inequality. At a political level our stay was dominated by intense student protests across South African universities in reaction to a sudden hike in registration fees decided by the government, which eventually had to back down.)

But enough tourism, let’s get to the science. In the next post we’ll start exploring “The nature of randomness and fundamental physical limits of secrecy”!


Hypercontractivity in quantum information theory

In this post I’d like to briefly mention some connections between log-Sobolev inequalities, and more specifically hypercontractivity, with quantum mechanics and quantum information. To anyone interested I recommend the video from Christopher King’s excellent overview talk at the BIRS workshop. The talk includes a nice high-level discussion of the origins of QFT (roughly 25 minutes in) that I won’t get into here.

Existence and uniqueness of ground states in quantum field theories. The very introduction of (classical!) hypercontractivity, before the name was even coined, was motivated by stability questions in quantum field theory. This approach apparently dates back to a paper by Nelson from 1966, “A quartic interaction in two dimensions” (not available online! If anyone has a copy… My source on this is these notes by Barry Simon; see section 4). Nelson’s motivation was to establish existence and uniqueness of the ground state of an interacting boson field theory that arose as the natural quantization of a certain popular classical field theory. Let {H} be the resulting Hamiltonian. {H} acts on the Hilbert space associated with the state space of a bosonic field with {n} degrees of freedom, the symmetric Fock space {\mathcal{F}_n}.

The main technical reason that makes bosonic systems much easier to study than fermionic ones is that bosonic creation and annihilation operators commute. In the present context this commutativity manifests itself through the existence of an isomorphism, first pointed out by Bargmann, between {\mathcal{F}_n} and the space {L^2({\mathbb R}^n,\gamma^n)}, where {\gamma} is the Gaussian measure on {{\mathbb R}}. The connection arises by mapping states in {\mathcal{F}_n} to functions on local observables that lie in the algebra generated by the bosonic elementary creation and annihilation operators. Since these operators commute (when they act on distinct particles), the space of such functions can be identified with the space of functions on the joint spectra of the operators; doing this properly (as Bargmann did) leads to {L^2({\mathbb R}^n,\gamma^n)}, with the creation and annihilation operators {c_i} and {c_i^*} mapped to {\frac{\partial}{\partial x_i}} and its conjugate {(2x_i-\frac{\partial}{\partial x_i})} respectively.

The simplest “free field” bosonic Hamiltonian is given by the number operator

\displaystyle H_0 \,=\, \sum_i c_i^* c_i, \ \ \ \ \ (1)

which under Bargmann’s isomorphism is mapped to {\sum_i 2x_i\frac{\partial}{\partial x_i} - \Delta}. Up to some mild scaling I’m not sure I understand, this is precisely the Liouvillian associated with the Ornstein-Uhlenbeck noise operator that we saw in the previous post!
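
As a sanity check on the operator above, here is a small Python sketch (my own, for a single variable) that applies {2x\frac{d}{dx} - \frac{d^2}{dx^2}} to the physicists' Hermite polynomials {H_k}, represented as integer coefficient lists. The helper names are hypothetical. Under this convention the polynomials come out as eigenfunctions with eigenvalue {2k} rather than {k}, which is consistent with a factor-of-two rescaling relative to a number operator; whether this is the whole story behind the "mild scaling" is my guess, not something I have checked against the references.

```python
def trim(p):
    """Drop trailing zero coefficients (keeping at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def derivative(p):
    """Derivative of a polynomial given as coefficients [c_0, c_1, ...]."""
    return trim([k * c for k, c in enumerate(p)][1:] or [0])

def times_x(p):
    """Multiply a polynomial by x."""
    return trim([0] + list(p))

def combine(p, q, a=1, b=1):
    """Return the polynomial a*p + b*q."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a * u + b * v for u, v in zip(p, q)])

def hermite(k):
    """Physicists' Hermite polynomial H_k via H_{j+1} = 2x H_j - 2j H_{j-1}."""
    prev, cur = [1], [0, 2]  # H_0 = 1, H_1 = 2x
    if k == 0:
        return prev
    for j in range(1, k):
        prev, cur = cur, combine(times_x(cur), prev, 2, -2 * j)
    return cur

def mapped_number_operator(p):
    """Apply 2x d/dx - d^2/dx^2 to the polynomial p."""
    dp = derivative(p)
    return combine(times_x(dp), derivative(dp), 2, -1)

for k in range(6):
    h = hermite(k)
    print(k, mapped_number_operator(h) == combine(h, [0], 2 * k, 0))
```

Each line of output reports whether the eigenvalue equation holds for that degree {k}.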

Here is how hypercontractivity comes in. {H_0} is Hermitian, and using Bargmann’s isomorphism we can identify {e^{-t H_0}} with a positive operator on {L^2({\mathbb R}^n,\gamma^n)}. Performing this first step can already provide interesting results as, in the more general case of a Hamiltonian {H} acting on a field with infinitely many degrees of freedom, it lets one argue for the existence of a ground state of {H} by applying the (infinite-dimensional extension of the) Perron-Frobenius theorem to the positive operator {e^{-tH}}. But now suppose we already know e.g. {H\geq 0}, so that {H} does have a ground state (there is a lower bound on its spectrum), and we are interested in the existence of a ground state for {H+V}, where {V} is a small perturbation. This scenario arises for instance when building an interacting field theory as the perturbation of a non-interacting theory, the situation that Nelson was originally interested in. The question of existence of a ground state for {H+V} reduces to showing that the operator {e^{-t(H+V)}} is bounded in norm, for some {t>0}. Using {\|e^{A+B}\|\leq \|e^A e^B\|} it suffices to bound, for any {|\varphi\rangle},

\displaystyle \|e^{-tV}e^{-tH}|\varphi\rangle\|_2 \leq \|e^{-t V}\|_4 \|e^{-tH}|\varphi\rangle\|_4,

where the inequality follows from Hölder. Assuming {\|e^{-t V}\|_4} is bounded (let’s say this is how we measure “small perturbation” — in the particular case of the quantum field Nelson was interested in this is precisely the natural normalization on the perturbation), proving a bound on {\|e^{-t(H+V)}\|} reduces to showing that there is a constant {C} such that

\displaystyle \|e^{-tH}|\varphi\rangle\|_4\leq C \||\varphi\rangle\|_2 \ \ \ \ \ (2)

for any {|\varphi\rangle}, i.e. {e^{-tH}} is hypercontractive (or at least hyperbounded) in the {2\rightarrow 4} operator norm. This is precisely what Nelson did, for {H} equal to the number operator (1), i.e. he proved an estimate of the form (2) for the Ornstein-Uhlenbeck process, later obtaining optimal bounds: hypercontractivity was born! (As Barry Simon recalls in his notes the term “hypercontractive” was only coined a little later, in a paper of his and Høegh-Krohn.)
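
To get a feel for an estimate of the form (2), here is a toy numerical check in Python. It is a sketch under simplifying assumptions of my own: one variable, the standard Gaussian normalization (mean 0, variance 1, so that the semigroup multiplies the degree-{k} Hermite component by {e^{-tk}}), the constant {C=1}, and only affine test functions {f(x)=a+bx}. For such {f} both norms have closed forms, using {E[g^2]=1} and {E[g^4]=3} for a standard Gaussian {g}. With this normalization the known threshold for the {2\rightarrow 4} norm to be at most 1 is {e^{-2t}\leq 1/3}.

```python
import math

def norm2(a, b):
    """L^2 norm of f(x) = a + b*x under the standard Gaussian: sqrt(a^2 + b^2)."""
    return math.sqrt(a * a + b * b)

def norm4_after_semigroup(a, b, t):
    """L^4 norm of a + b*e^{-t}*x, from the Gaussian moments E[g^2]=1, E[g^4]=3."""
    rho = math.exp(-t)
    fourth_moment = a ** 4 + 6 * a * a * (b * rho) ** 2 + 3 * (b * rho) ** 4
    return fourth_moment ** 0.25

def worst_ratio(t, steps=200):
    """Largest 4-norm / 2-norm ratio over a grid of affine f = cos(u) + sin(u)*x."""
    worst = 0.0
    for i in range(steps + 1):
        u = math.pi * i / steps
        a, b = math.cos(u), math.sin(u)
        worst = max(worst, norm4_after_semigroup(a, b, t) / norm2(a, b))
    return worst

t_critical = 0.5 * math.log(3)  # the time at which e^{-2t} = 1/3
print(worst_ratio(t_critical), worst_ratio(0.3))
```

At the critical time the ratio never exceeds 1 on this family, while for the shorter time {t=0.3} some affine functions already violate the bound. This only probes degree-one functions, so it illustrates the threshold rather than proving anything.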

Fermionic Hamiltonians and non-commutative {L^p} spaces. This gives us motivation for studying hypercontractivity of operators on {L^2({\mathbb R}^n,\gamma^n)}: to establish stability estimates for the existence of ground states of bosonic Hamiltonians. But things start to get even more interesting when one considers fermionic Hamiltonians. Fermions are represented on the antisymmetric Fock space, and for that space there is no isomorphism similar to Bargmann’s. His isomorphism is made possible by thinking of states as functions acting on observables; if the observables commute we obtain a well-defined space of functions acting on the joint spectrum of the observables, leading to the identification with a space of functions on {{\mathbb R}^n}, where {n} is the number of degrees of freedom of the system.

But the fermionic creation and annihilation operators anti-commute: they can’t be simultaneously diagonalized. So states are naturally functions on the non-commutative algebra {\mathcal{A}} generated by these operators. Apparently Segal was the first to explore this path explicitly, and he used it as motivation to introduce non-commutative integration spaces {L^p(\mathcal{A})}. (As an aside, I find it interesting how the physics suggested the creation of such a beautiful mathematical theory. I used to believe that the opposite happened more often — first the mathematicians develop structures, then the physicists find uses for them — but I’m increasingly realizing that historically it’s quite the opposite that tends to happen, and this is especially true in all things quantum mechanical!) For the case of fermions the canonical anti-commutation relations make the algebra generated by the creation and annihilation operators into a Clifford algebra {\mathcal{C}}, and the question of existence and uniqueness of ground states now suggests exploring the hypercontractivity properties of the associated semigroup (third example in the previous post) in {L^p(\mathcal{C})}. This approach was carried out in beautiful work of Gross; the hypercontractivity estimate used by Gross was generalized, with optimal bounds, by Carlen and Lieb. (I recommend the introduction to their paper for more details on how hypercontractivity comes in, and Gross’ paper for more details on the correspondence between the original fermionic Hamiltonian and the semigroup for which hypercontractivity is needed.)

Quantum information theory. As Christopher King discussed in his talk at BIRS, aside from their use in quantum field theory non-commutative spaces and hypercontractivity are playing an increasingly important role in quantum information theory. This direction was pioneered in work of Kastoryano and Temme, who showed that, just as classical hypercontractivity and its connection with log-Sobolev inequalities has proven extremely beneficial for the fine study of convergence properties of Markov semi-groups in the “classical” commutative setting (cf. my previous post), hypercontractivity estimates and log-Sobolev inequalities for quantum channels could lead to much improved bounds (compared with estimates based on the spectral gap) on the mixing time of the associated semi-groups. In particular Kastoryano and Temme analyzed the depolarizing channel and obtained the exact value for the associated log-Sobolev constant, extending the classical results on the Bonami-Beckner noise operator that I described in the previous post. The depolarizing channel is already very important for applications, including the analysis of mixing time of dissipative systems or certain quantum algorithms such as the quantum Metropolis algorithm.

For more applications and recent developments (including a few outliers…), see the workshop videos!
