One week later…

…and 736 words of expectation for the class:


Note the top contender: let’s see if we live up to their expectations!

It’s been a fun first week. We released “Week 0” of the material a month ahead of the official start date, so that those students not yet familiar with the basics of quantum information (what is a qubit?) would have enough time to digest the fundamental notions we’ll be using throughout. (Out of the ~5500 registered students, ~1250 are currently marked my EdX as “active” and 560 have watched at least one video. Keep it coming!)

An unexpected benefit of opening up the platform ahead of time is that it is giving us the time to experiment with (read: debug) some of the tools we plan to use throughout. A first is EdX’s own possibilities for interaction with the students, an example of which is pictured above (“Use the space below to enter a few words that best characterize your expectations for this class”).But we’re also using a couple add-ons:

The first is a system replacing EdX’s discussion forums,called Askalot. Cute name – how can you resist. The main benefit of Askalot is that it provides more structure to the discussions, which can be characterized as question/bug report/social discussion/etc, can be up/down-voted, marked as correct/invalidated by the instructors, etc. The students are having fun already, introducing themselves, complaining about bugs in the quizzes, and, of course, about Askalot itself! (Thanks go to Ivan Srba, one of the creators of Askalot, for being extremely responsive and fixing a host of minor bugs overnight – not to mention throwing in the extra feature requested by the students.)

A second is called DALITE. The idea of DALITE is to encourage students to provide an explanation justifying their answer to a question. Indeed, one of the drawbacks of the online platform is the need for automatic grading of assignments, which greatly limits how the student can be engaged in the problem-solving exercise, mostly limited to multiple-choice or simple numeric answers. DALITE (which grew out of, and is still, a serious research project in online learning) introduces a fun twist: the student is asked to type in a “rationale” for her choice. Of course there is no way we could grade such rationales. But here is the idea: once the student has entered her explanation, she is shown the rationale provided by another student (or the instructor) for a different answer, and asked whether she would like to reconsider her decision. The student can choose to change her mind, or stick with her initial choice; she is asked to explain why. It’s really fun to watch the answers provided (“What would happen if quantum mechanics allowed cloning of arbitrary states?”), the change of minds that take place, and the rationale that incentivized said change of mind. (Thanks to Sameer Bhatagnar for helping us set up the tool and explaining its many possibilities, and to Joseph Williams for suggesting its use in the first place.)

We’ll see how these pan out in the longer run. I’m definitely eager to experiment with ways to make the MOOC experience a better learning experience for the students. I’ll let you know how it goes. Suggestions welcome!

PS: 47% students 25 and under, 78% from outside US, is good, but 16% female is not…come on!


Posted in QCrypto, teaching, Uncategorized | Tagged , , , | Leave a comment

Coming to a theater near you

qcryptoThis Fall together with long-time friend and collaborator Stephanie Wehner I will be teaching a Caltech/TUDelft/EdX course on Quantum Cryptography. As I am writing this I almost feel like saying I have taught an EdX course…indeed, I just finished recording the last video for the course! Let me tell you right away that, when you’ll eventually get to see this fool absurdly gesticulating in front of at tablet, well – it was no small effort! (Not only mine: I’ll have a chance to return to this, but a serious hat tip to the Caltech production team is already in order.) This is the first post of a few in which I plan to share my experience preparing for, and then running, the course (which is officially starting on October 10th).


More often than not the first question I get asked is – why? Why did you decide to do this?? As a teacher you’re used to torturing a couple dozen students each semester… but now, not only do you want to ramp it up and torture thousands at a time, but you also want to torture yourself???

Let’s see. I can find at least two reasons. The first is rather selfish: I’m curious. I want to give it a try. My interest in the possibilities of online education, as is the case for many of us, started around 2011, during the “big boom” that followed the success of Sebastian Thrun and Peter Norvig’s Artificial Intelligence MOOC at Stanford. It is through discussing the experience of my Ph.D. advisor, Umesh Vazirani, who taught one of the early MOOCs (and still the only one on quantum computing), that I got more seriously interested in the medium. As everything he does, Umesh took the course deep at heart, and taught it passionately, striving to pack his signature “nuggets” of insight into the rigid format of a 10-minute video meant to be accessed “massively online”. Umesh had great hopes for the medium, and his course was wildly successful. It has certainly been cited as their main source of education in quantum information by many a Caltech Ph.D. applicant!


Recognize anyone?

I have the impression that the initial explosion of interest in MOOCs that took place in 2012-1014 has subsided somewhat, and the medium may be going through a phase of soul-searching: given that it is now clear MOOCs will not cure all the world’s ills, can they at least be useful for something? The “online education” medium certainly has its own challenges. As a friend working in the psychology of learning put it somewhat bluntly, “students don’t learn anything from videos – don’t waste your time” (too late, Joseph – I had started recording already). There is some truth to this (though as long as we don’t forget the 2x button there’s still a chance the students might condescend to skimming through some fraction of the videos in a quick hunt for hints towards the pset solutions).

The point though is that broadly interpreted “online education” should have much more potential for engaging the students than simply dumping video clips on them . This is what I’m interested in exploring: the extent to which it is possible to set up a stimulating, useful interactive process for the students, with some necessary “downtime” spent reading notes and working problems on their own, and some “uptime” spent watching videos but also participating in forums, thinking through stimulating questions, checking out other student’s answers, etc.

The second reason is perhaps less self-centered. I think we picked the right topic. A two-word title, each of which bound to associate with a whole range of exciting, if nebulous, concepts in the mind of any young apprentice-scientist, what is not to like? Quantum cryptography has recently (and somewhat justifiably) been drawing a lot of attention, from the expanding range of start-ups to the NSA through Chinese rockets. But surprisingly few resources are available to whomever seeks a serious introduction to the subject. While great books on quantum computing and quantum information are popping up, none gives much attention to quantum cryptography. is no book on quantum cryptography, few if any lecture notes (I searched, and the only consistent notes I could find are from a course taught by Unruh in 2014), few surveys (see this recent one by Broadbent and Schaffner, which however remains at a very high level), and a scientific literature that is not easy to navigate.


So this is what Stephanie & myself set to do, share our enthusiasm for quantum cryptography with you, young enthusiasts! What will the course be about? If you care you should watch the promo video, sign up, and take the course!

We hope to use the ten weeks we have to take the absolute beginner in quantum qkdinformation and cryptography to the level where she has a solid conceptual and mathematical understanding of quantum cryptography. A large chunk of the course will be devoted to the description and security analysis of protocols for quantum key distribution (QKD). This includes the thorny notions associated with measuring security with respect to quantum adversaries, and the recent paradigm of device independence. Beyond QKD, we also aim to give a broad overview of other feats of quantum cryptography, including primitives in two-party cryptography such as bit commitment or coin flipping, the noisy storage model, position-based cryptography, and delegated computation. Unfortunately we will not have enough time to cover quantum attacks on classical cryptosystems or post-quantum cryptography, leaving space for a companion “quantum cryptanalysis” course – volunteers?.

A word of caution: a solid background in linear algebra will be required for the course. Or at least, a will to put in the serious effort that will be required to follow the material. Making the course accessible does not mean dumbing it down, and the less mathematically inclined might find it challenging. The upshot, though, is that whomever sticks around will find it an intellectually rewarding experience, as the course will bring you to a stage where you’re basically ready to start doing research in the area. Not to scare everyone away: the course will start slow, with a “Week 0” of background material rolling out now, and the first couple weeks devoted to an introduction to the important notions of quantum information. My a priori assessment is that any 3rd or 4th year Caltech undergrad should have no problem taking the course. We’ll see how it plays out, but you should certainly try!

I will be teaching the course “inverted classroom” style at Caltech (as will Stephanie in Delft). The students will be asked to watch the videos, and more generally go through the online content, ahead of regular class meetings. In class, depending on the level of the students and their understanding of the material we will provide additional explanations or – as I hope will be possible – go further and develop our own projects extending the online material in directions of interest to the students.

This is a new experience for me, and I already stumbled through some of the beginners’ mistakes. But it’s fun! And, I hope, worth it. I’ll keep you updated. If anyone has experience (as I’m sure some of you do), unsolicited advice is more than welcome!


Posted in QCrypto, Quantum, teaching | Tagged , | Leave a comment

flour + water + salt + time =

And there goes my last worry about the possibility of life in California…check! Would never have imagined it’d be so easy.

Posted in random | Tagged | 2 Comments

Foundations of randomness, Day 3

Last day! Wednesday started with a bang. Yevgeniy Dodis spent the whole Tuesday evening (our workshop dinner at Lanzerac!) complaining that we weren’t leaving him enough time to prepare for his talk…only to spend an equal amount of time during his talk complaining that he wasn’t going to be able to cover half of the material he had prepared! (To view things positively, this says something about the level of interaction during the talks — Yevgeniy was flooded with questions; quite a show of force for a post-workshop dinner morning.)

Dodis – Randomness in cryptography

I regret not having saved a picture of the table Yevgeniy drew on the whiteboard for his talk. There were four columns: no randomness, weak randomness, local or public randomness, and perfect randomness. For each column Yevgeniy listed four cryptographic scenario: soundness (for which classes of computational problems are there sound interactive proof systems), authentication (secure implementation of primitives such as message authentication or digital signatures), privacy (tasks such as bit commitment, secret sharing, or even public- or private-key encryption) and extraction (is deterministic extraction of uniform randomness possible). The scenario are listed in increasing order of difficulty, and the question is how far each kind of randomness lets us go down the list.

The first and last columns are straightforward to fill in. In the first, corresponding to a world with no randomness, there is very little one can do. In terms of soundness it is possible to verify languages in NP, but no more; cryptographic tasks are clearly impossible (determinism does not allow for uncertainty — at least for this talk!), and certainly extraction does not make much sense. If, on the other hand, perfect randomness is available many feats can be accomplished. It is a milestone of complexity theory that all languages in PSPACE have interactive proof systems; the hard work of cryptographers over the past couple decades demonstrates an impressive range of cryptographic tasks that can be implemented with or without computational assumptions. And of course extraction is not a problem.

Things become interesting with the second column. Suppose given access to a source of randomness taken from a family of sources which have a guaranteed min-entropy rate, but from which it is not possible to deterministically extract uniformly random bits (such as Santha-Vazirani sources). Can such a source still be used to implement non-trivial cryptographic tasks?

Here Yevgeniy drew our attention to an interesting distinction between authentication and privacy. Indeed, the latter is much more demanding in terms of the “quality” of the randomness used. For an adversary to break an authentication task, such as a message authentication code (MAC), it is required that the adversary be able to forge a certain object — e.g. a fake signature for an unknown message for which another valid signature is known. In contrast, for the adversary to break a privacy task, such as encryption, it is sufficient for the adversary to distinguish between two distributions — encryptions of {0} and encryptions of {1}. Thus at a high level it is not too surprising that it should be possible to implement authentication tasks in a way that is secure as long as the sources of randomness used to implement the task has a guaranteed min-entropy rate. Privacy tasks, on the other hand, seem to require much more structure from the random source. In fact, if one thinks about it the task of secure encryption is very closely related to that of extraction, and one may readily expect that privacy can only be achieved if the family of sources allows for deterministic extraction. This connection is made formal in a paper by Bosley and Dodis.

Returning to his second column, Yevgeniy introduced the notion of an {(t,\delta)}-expressive source, which informally is a class of sources such that for any two efficiently computable functions {f} and {g}, if {f} and {g} differ on a significant (at least {2^{-t}}) fraction of inputs (measured under the uniform distribution) then there is a source in the family such that the expected statistical distance, when evaluated on that source, between {f} and {g} is at least {\delta}. That is, a family is expressive if it “separates” pairs of functions that differ significantly. Now the main results are:

  1. Many common families of sources are expressive — including all families from which deterministic extraction is possible, but also e.g. Santha-Vazirani sources and various generalizations thereof.
  2. No privacy task can be achieved if the only available randomness is taken from an (a priori unknown) source from an expressive family. Proof sketch: consider an encryption scheme. Of course encryptions of {0} and encryptions of {1} should be different most of the time (otherwise decryption will fail). Therefore by the definition of an {(t,\delta)}-expressive family, for say {t=0} or {t=1}, there should be a source in the family under which both types of encryption can be efficiently distinguished — contradicting security.
  3. Some authentication tasks, including MACs and digital signature schemes, can be implemented from some specific expressive families, such as Santha-Vazirani sources or their generalization block min-entropy sources.

The proof of the second point above is trivial, but the result is quite interesting, highlighting the importance of making the right definitions! I am not familiar at all with this area, but it seems that the notion of expressive family of sources is a meaningful one to have in mind. Now I only wish Yevgeniy had time to say something about the third column (public vs private randomness)…unfortunately it will have to be for another time.

Fawzi – Exams

The following talk was given by Omar Fawzi, soberly entitled “Exams” (last day…it had to come!). Omar considers the following scenario. Suppose an exam consists of {n} possible questions {X_i}, and a student has some information, or knowledge, {B} based on which she attempts to answer the questions that are asked to her. Suppose the student achieves a certain score when asked about a random subset of {k} of the {n} possible questions — technically, given indices {\{i_1,\ldots,i_k\}} the student produces guesses {A_{i_1}(B),\ldots,A_{i_k}(B)} such that {A_{i_j}(B) = X_j} for a fraction {\Delta} of indices {j\in\{1,\ldots,k\}}, on average over the (uniformly random) choice of the {k} indices and the student’s randomness (including the random variable {B} and the map {A}).

Can we deduce that the student simultaneously “knows about” a fraction {\Delta} of all possible questions in the exam — if she was asked about all {n} possible questions at once, would she typically achieve the same score as when asked only {k} of them? To make this a little more precise and see why it is not obvious, consider the following simple scenario. The correct answers {X_i} are either all {0}, or all {1}, each with probability {1/2}. The student’s information {B} consists of two bits; the first bit {B_1 = X_i} (for all {i}), and the second bit {B_2} is an independent coin flip. The student’s strategy is to always answer {B_1\oplus B_2}. Note that this student is not “optimal”, as she has “good days” ({B_2=0}) on which she gets everything right, and “bad days” on which she scores {0}. Our task is not to extract an optimal student, but to reproduce this particular student’s behavior at the scale of the whole exam: we want a strategy that gets the whole exam right half the days, and the whole exam wrong the remaining time — not a strategy that gets half the exam right on every day! So you see that the most natural “rounding” procedure, which would decompose the set of {n} questions into {(n/k)} groups of {k} and look up the student’s answers to each group, will not work. One needs to take into account the randomness in {B} and in the student’s strategy to extract a global strategy that is consistent with the local behavior.

The scenario is reminiscent of what are known as “direct product tests” in the PCP literature, and there is a long sequence of works on the topic, see e.g. Dinur and Reingold, Impagliazzo et al. and Dinur and Steurer. As far as I could tell there are differences that make the problems incomparable: for instance, direct product tests attempt to match two consistent local views to a global object, whereas here the exam is guaranteed to pre-exist; direct product tests consider complete agreement instead of fractional agreement {\Delta} here; etc. But the problems are similar enough that you’d expect techniques used to approach the one could be useful for the other as well. For anyone interested, see Omar’s slides for a more precise formulation of the problem he considers.

The situation is even more interesting in the quantum setting. Suppose the exam remains classical (though one could consider quantum exams as well…), but the student’s information {B} is a quantum state. Then all we know is that for every set of {k} questions the student has a measurement on {B} that produces answers with a certain success rate. Can these measurements be combined in a way to provide answers to all questions in the exam simultaneously? Since any measurement will necessarily affect the student’s state we can’t simply repeatedly measure {B}, and monogamy prevents us from taking many copies of the state (which couldn’t all simultaneously be correlated with {X} in the appropriate way). The problem has a very similar flavor to the one faced when proving security of randomness extractors against quantum side information. Again there are subtle differences, but one may expect some of the techniques developed in that area (such as the work of Koenig and Renner on sampling-based extractors) to be useful here.


The day ended with two afternoon talks each highlighting a number of open questions in device-independent randomness certification, and some leeway that the speakers have recently achieved. This left us with a solid set of take-home problems to think about!

Pironio – Some open questions on DI randomness generation

The first talk was given by Stefano Pironio, who picked three of his favorite open questions in device-independent randomness, and had time to tell us about two of them in some detail.

Stefano’s first question asks whether it is possible to construct deterministic extractors for quantum sources. Although it is unclear whether this would have any application, it is a very natural question. The main observation is that the “raw” random bits that are produced in a device-independent randomness extraction procedure (say, the outputs of Alice’s device in a protocol which sequentially repeats the CHSH game) not only have a guaranteed min-entropy rate, but they also have additional structure that comes from the way they are produced. What this structure is, and whether it is sufficient to guarantee deterministic extraction, is precisely the content of the problem. Stefano showed us a specific inequality that he could show always holds, and hoped would suffice: if {A_i\in\{-1,1\}} is the {i}-th bit output by the source, then the average of all correlators

\displaystyle \frac{1}{2^n}\Big| \,\sum_{S\subseteq [n]}\, \big|\langle \Pi_{i\in S} A_i \rangle\big|\, \Big| \,\leq\, 2^{-\delta n},

for some small {\delta} depending on the CHSH violation observed. In fact Stefano claimed the inequality is already enough to show, via the probabilistic method, that a deterministic extractor should exist; thus it remains to find one…This is an appealing question, and I wouldn’t be surprised if it can be solved by the pseudorandomness experts out there!

The second problem Stefano presented has to do with the relation between randomness and entanglement. Suppose that the devices used in a randomness expansion procedure share {n} “ebits”, or entangled bits. How many bits of randomness can be generated? This is both of theoretical and practical interest: entanglement is hard to come by, and of course we want to get the most out of whatever we have. All known protocols generate {O(n)} random bits from {n} ebits, but there is no reason for this to be a hard limit; intuitively entanglement is required for testing the devices, but less directly so for generating randomness (see also the next talk by Acin). And indeed, Stefano showed us that it is possible to go further: he has a protocol which generates {n^{1+c}} randomn bits, for some small {c}, using {n} ebits.

The last question, on which we unfortunately did not have time to go into, is of obtaining guarantees on the randomness that hold against “post-quantum” adversaries, limited only by the no-signaling principle. This is a fascinating problem on which very little is known. There are no-go results for privacy amplification in this setting, but they only apply in rather stringent scenario; moreover they pose no immediate obstacle to guaranteeing entropy, rather than uniform randomness. The problem is challenging because the general non-signaling framework gives us very little to work with: a family of conditional distributions, but no such thing as a state, a measurement, a post-measurement state, etc.; very few tools are available. On the other hand it makes the problem mathematically very clean, almost “classical” (no more Hilbert spaces, no more quantum!). I only wish I had more intuition for it!

Acin –  How much randomness can be certified in quantum and general non-signalling theories?

In the last talk of the workshop Antonio Acin asked the question — “How much randomness can be certified in quantum and general non-signaling theories?” — and gave us some rather counter-intuitive answers. As already mentioned in Stefano’s talk, the prevailing intuition is that the amount of randomness that can be extracted is directly related to the amount of “quantumness” present in the devices, where “quantumness” would be directly related to entanglement, since entanglement is clearly the “non-classical” feature exploited here. But this need not be so: even though entanglement is necessary for certifying the device, it is not a priori required that it serve as a resource that gets depleted as more and more randomness gets produced.

As a first result Toni showed how more randomness could be extracted from a single measurement, on a single entangled EPR pair, by considering general POVM measurements instead of projective ones. It is known (and not hard to show) that making POVM measurements on an EPR pair can only yield at most one bit of randomness per site. Toni introduced a four-outcome Bell inequality such that a maximal violation of the inequality requires the use of a non-projective four-outcome measurement on one half of the EPR pair, and a state such that each of the four outcomes has equal probability {1/4} when the measurement is performed — thus yielding two uniformly random bits from a single qubit!

This is unexpected, but perhaps not groundbreaking: one bit, two bits, ok. But then Toni dramatically scaled things up. His second result is that by performing repeated measurements on a single, constant-size entangled state (I think in his scheme this is again just a pair of qubits) one could generate an infinite sequence of random bits, containing an amount of entropy which certifiably goes to infinity — provided the requisite Bell violation is guaranteed.

If you’re wondering about the relationship between this result and the one presented in the talk by Henry Yuen on infinite randomness expansion, there are some important differences. First of all Toni’s result applies under the assumption that the device maximally violates a certain Bell inequality, but there are no bounds on the accuracy required: thus in principle testing for a sufficiently-close-to-optimal violation could require an arbitrarily large number of runs; in Henry’s scheme the testing is fully accounted for. On the other hand Henry’s scheme requires the devices to have an unbounded number of EPR pairs available, whereas the main point of Toni’s scheme is that a single pair can be re-used over and over again.

The result is a proof of concept (another limitation is that the Bell inequality requires to perform a measurement on one of the devices that has an exponential, in the number of random bits produced, number of possible settings — the other device being measured sequentially using measurements with two settings), but I again find it rather unexpected. The whole area of device independence is turning out to be a great testbed for many of our (bad) intuitions about quantum mechanics and nonlocality!

Toni ended his talk with a third intriguing result, relating algorithmic randomness to information-theoretic randomness as we usually understand it. The main observation is the following. Suppose a classical Turing machine is used to prepare a sequence of qubits where each qubit is either initialized in one of the two possible computational basis states, or each qubit is initialized in one of the two possible basis states in the conjugate basis, the Hadamard basis. Then there is an algorithmic procedure that always succeeds in distinguishing which is the case! Note that the Turing machine may very well be producing a sequence that for all practical matters appears fully random, so that one can for instance show that no procedure which makes a single choice of basis and measures all qubits in that basis will be able to tell if it got the right basis or not. Instead the trick is to use both bases: measure half the qubits in the computational basis, and half in the Hadamard basis. In this case, one of the sequences obtained will be uniformly random for sure, and the other will be the sequence produced by the Turing machine — and as it turns out, it is possible to distinguish the two algorithmically. See the paper for more details, and further results along those lines.


The day ended with a very brief “open problems session”: Toni wrote down some of the questions that were raised during the workshop and we all commit to solve by the next iteration. I’ll leave you with a picture of the board. There are certainly many other questions we could ask, some of which are interspersed in these blog posts; feel free to suggest more!

Posted in Conferences, device independence, Quantum, Talks | Tagged , , , , , | Leave a comment

Foundations of randomness, Day 2

Back to the fundamentals of randomness and our workshop in Stellenbosch – I want to discuss the remaining two days of talks!

The second day started off with a whirlwind tour of the TCS approach to pseudo-randomness by David Zuckerman. David surveyed known constructions of pseudo-random generators and extractors; he also gave a very comprehensive overview of the different types of sources that have been considered in this area and what kind of extraction (deterministic, seeded, two-source) is known to be possible (or not) for each type of source.

I’ll refer you to David’s slides for details. Two particular sources caught my attention. First David mentioned small-space sources, for which there are good deterministic extractors but so far only in the regime of high (polynomial) min-entropy. This type of source seems quite relevant in practice, and it is natural to ask whether it is possible to do better using device-independent quantum extraction. How could we leverage the fact that the source is generated using bounded space? A second interesting category of sources are non-oblivious bit-fixing sources, which are sources such that some of the bits are uniformly random, and others deterministic, but such that which bits are of which type can be chosen adversarially by the source based on the values taken by previous bits. Together with Santha-Vazirani sources these are the two examples David gave of natural sources from which deterministic extraction is not possible. But we do know that it is possible to extract from a single Santha-Vazirani source, of bias arbitrarily close to {1/2}, using a device-independent quantum procedure. So what about non-oblivious bit-fixing sources — can it be done there as well?

Work on device-independent randomness extraction from the quantum information community has so far focused on a small set of sources — uniform seeds of limited length, arbitrary Santha-Vazirani sources, and more recently general min-entropy sources — but I see no reason why these are the only cases of interest; the restricted focus seems to be mostly for historical reasons. In pseudo-randomness different types of sources are often motivated by applications to derandomization, lower bounds, or combinatorial constructions. In quantum information, rather than “beating the classical” we should focus on the scenario that are best motivated by the relevant applications, which so far come mostly from cryptography. Here device-independent randomness certification procedures seem most relevant in their role as “decouplers” (see the discussion on “random to whom” from Valerio’s talk). What kind of structure are we willing to assume on the kind of information an adversary may have kept on a particular source? This question seems particularly relevant in light of recent works on device-independent two-source extraction.

David’s talk was followed with a talk by one of his students, Eshan Chattopadhyay, on their recent breakthrough construction of a two-source extractor for poly-logarithmic min-entropy. It is hard to do any justice to such an intricate (and beautiful!) result, and instead I will point you to Eshan’s slides or the great talk that David gave on the same topic on TCS+ just a few weeks ago.

For the last talk of the morning session Henry Yuen treated us to an (arguably :-)) even more formidable treat: infinite randomness expansion! Henry presented the main steps that led him and Matthew Coudron to their beautiful result showing how, starting from just a few uniformly random bits, one could bootstrap a quantum device-independent expansion procedure and generate as many uniformly random bits as one desires — yes, that’s an {\infty}! (I realize a bit late that I never properly explained what “device-independent” even means… if I have any “classical” readers left, for context for Henry’s talk see a blog post of Henry’s for all the required background; for the quantum crypto motivation see the viewpoint by Roger Colbeck; for the computer scientist perhaps the introduction to this paper is readable.)

In Henry and Matt’s protocol the length of the initial seed only needs to be at least some universal constant. The number of random bits available to start with will govern the security parameter (the distance from uniform) of the bits produced, but aside from that any small initial seed will do. The protocol uses two pairs of devices (those can be arbitrarily correlated, e.g. share randomness or quantum entanglement, but are not allowed to communicate directly) which take turns in expanding the current string of bits by an exponential factor. The key observation required to argue that the protocol works is that each expansion step, not only increases the number of bits available, but simultaneously “purifies” them, in the sense that the bits produced will have high entropy even conditioned on any information, classical or quantum, in the possession of the other pair of devices. Given that this other pair generated the input bits used for the expansion in the first place this is not a trivial observation, and is  essentially the content of the powerful “equivalence lemma” of Chung, Shi and Wu discussed in the post describing Valerio Scarani’s talk. The lemma guarantees that the bits produced by a device-independent procedure are random even from the point of view of an adversary who shares arbitrary correlations with the seed (including knowing it perfectly).

The result is quite impressive: if a few uniformly random bits are available, then it is automatically guaranteed that infinitely many just-as-good (in some respects even better) random bits can be generated! So much for our derandomization problems… The next step is to ask if uniformly random bits are even needed: how about weaker sources of randomness, such as Santha-Vazirani sources or general min-entropy sources? The task of device-independent extraction from the former was taken on by Colbeck and Renner (with much follow-up work); for the latter there is the work of Chung, Shi and Wu, which unfortunately still requires a number of devices that scales with the length of the source as well as the security parameter. More to be done!

The afternoon brought two more talks, with a very different focus but equally stimulating. The first was delivered by Ruediger Schack, who presented “The QBist perspective on randomness and laws of nature”. Ruediger’s talk followed the one by Carl Hoefer on the previous day in challenging our attitudes towards randomness and the meaning of probabilities, but Ruediger took us in quite a different direction. One of the points he made is that Bell’s theorem can be interpreted as giving us the following mutually exclusive alternatives about the world: either locality does not hold, and we must accept that far-away objects can have a nonlocal (instantaneous) influences on one another, or we insist on keeping locality — as Einstein did — but ascribe a different “meaning” to probabilistic statements, or predictions, that arise from the Born rule. (Note that in the first option in “nonlocal influence” I am including the kind of non-signaling “influence” which arises from measuring half of an entangled pair.)

This is where QBism — for “Quantum Bayenism” — comes in. According to Ruediger, the solution it offers to this conundrum goes through the assertion that, contrary to e.g. the Copenhagen interpretation, “probabilities are not determined by real properties”. This allows one to “restore locality” while not having recourse to local hidden variables (as Bohmian mechanics would) either. Thus in QBism “there is no Born rule”, in the sense that the rule does not describe an intrinsic probabilistic fact about the world; rather, probabilities are seen as the reflection of an agent’s subjective belief and have no objective existence. The fact that, if Alice measures her half of an EPR pair in the computational basis and obtains a {|0\rangle}, then Bob will with certainty observe a {|0\rangle} as well, if he measures in the same basis, is not a “fact” about their systems; rather it is a prescription for how a rational Bob should update his belief as to the outcomes of an experiment he could perform, were he to be informed of the situation at Alice’s side.

The main conceptual tool used to formulate this interpretation of probabilities is the Dutch book used as a guide to how an agent (observer) should update its probabilities. I admit I find this interpretation a bit hard to digest — I cannot help but think of betting as a highly irrational procedure, and in general I would be hard-pressed to bet on any precise number or probability; the Born rule makes very precise predictions and ascribing them to an agent’s subjective belief feels like a bit of a stretch. Quite possibly it is also a problem of language, and I am not an expert on QBism (even the “usual” Bayesianism is far from obvious to me). Ruediger’s slides are very clear and give a good introduction for anyone interested in digging further.

The last talk of the day was given by Carl Miller, on “The extremes of quantum random number generation”. Carl explained how two very different principles enabled him and Yaoyun Shi to derive two independent lower bounds, each of interest in a separate regime, on the amount of randomness that can be generated from a pair of quantum devices in the sequential scenario: the devices are used repeatedly in sequence and one is interested in the min-entropy rate of one of the device’s outputs, as a function of the average observed violation of (in this case) the CHSH inequality: as the violation increases to the maximum the rate will improve to 1, and the question is how high a rate can be maintained is the violation is be bounded away from the optimum.

Carl’s first principle of measurement disturbance leads to a good rate in the high error regime, i.e. when the observed violation deviates significantly from optimum, say it is around {0.8} (for an optimum of {\approx .85}). The general idea is that a violation of the CHSH inequality necessarily implies the use of “non-classical” measurements, which do not commute (in fact a maximal violation requires anti-commuting measurements), by Alice’s device. Two non-commuting measurements do not have a common eigenstate, hence whatever the state of the device at least one of Alice’s measurements will perturb it — the outcome of that measurement on the state cannot be deterministic, and randomness is produced.

Of course this is just the intuition, and making it quantitative is more challenging. In their paper Carl and Yaoyun formalize it by introducing a new uncertainty relation (see this very recent survey on the topic for much more) expressed in terms of Renyi {\alpha}-entropy for {\alpha > 1}. The proof relies on strong convexity of the underlying Schatten-normed space of matrices, and Carl gave some geometric intuition for the uncertainty relation. The use of {\alpha}-Renyi entropy is important for the inductive step; it allows to bypass the lack of good chain-rule like arguments for the min-entropy.

The second principle introduced in Carl’s talk is self-testing. This is the idea that a device demonstrating a close-to-optimal violation of the CHSH inequality must have a certain rigid structure: not only does it generate random bits, but the device’s outcomes must be produced in a very specific way. Specifically the device’s state and measurements must be equivalent, up to local isometries, to those used in the “canonical” optimal strategy for the CHSH game. This principle is well-known (see e.g. the paper by McKague for a proof), but usually one only expects it to be of use in the “tiny error” regime, i.e. when the observed violation is of order {opt - \varepsilon} for small epsilon. Carl manages to squeeze the techniques further, and by carefully fine-tuning the argument is able to get good bounds on the randomness produced when the violation is large but still a constant away from optimum, say {0.83} or so. In that regime the bound is better than the one obtained from the first principle; I’ll refer you to Carl’s slides for the precise curves.

After Carl’s talk, the mind full of new ideasIMG_0018-1024x683 we left for a little walk to the Lanzerac wine estate, where we were shown around the wine cellar — including the mandatory wine
tasting of course, with the highlight being the local pinotage — and finished off the day with a nice dinner on the terrasse: it’s late October in South Africa, summer is coming!

Posted in Conferences, device independence, Quantum, Talks | Tagged , , | Leave a comment

Nonlocal games and operator spaces: a survey

Over the past few months, together with Carlos Palazuelos I wrote a survey article on “Nonlocal Games and Operator Space Theory”, and we just uploaded it to the arXiv. The survey is an invited contribution to a special issue of JMP on Operator Algebras and Quantum Information Theory which will hopefully appear soon.

Exquisite(ly painful) memories

We had a lot of fun writing the survey. Teaming up with an expert in functional analysis behind probably more than half of the impressive range of results in quantum information theory that the “operator space approach” has so far led to gave me a great opportunity to solidify some of the mathematical material I have labored through in recent years.

It all started during a memorable summer of 2009 spent visiting Harry Buhrman’s group at CWI in Amsterdam. At the time the breakthrough by Perez-Garcia, Wolf, Palazuelos (yes, the same Palazuelos!), Villanueva and Junge demonstrating the existence of tripartite Bell inequalities with unbounded violation was all fresh. The result generated a lot of attention, first because it is a great and unexpected result (contrast with the fact that the maximum violation of bipartite inequalities is uniformly bounded, by Grothendieck’s constant), and second because the proof technique was completely novel, and seemed to rely crucially on “deep arguments from operator space theory”, a relatively young theory (barely younger than quantum information, but then time flows more slowly in mathematics) developed by operator algebraists starting with the work of Ruan in the late 1980s.

I spent probably a full month banging my head against the paper. After a month, I had gained a good grasp of just one thing: when they say deep arguments, they mean deep. Note the comment added for v2 of the paper on arXiv: “Substantial changes in the presentation to make the paper more accessible for a non-specialized reader”. You bet: that comment was already there in summer 2009, and I guarantee it didn’t make me feel any better! I printed the paper, borrowed all referenced books from the library, and tried. Then put it down, out of despair. Then tried again. Then put it down. Picked it up again. To the hell with it, put it down, tried to find my own proof. Failed. Back to the paper. It was a wild chase…I kept thinking I was starting to see how to go about things, only to fail — deeply.

Cutting a long story short, we did get a couple of things out of that summer. The paper by Perez-Garcia et al. [PGWPVJ] covers a lot of ground. The main result is an existential proof of a family of tripartite Bell correlation inequalities, or equivalently a family of three-player XOR games, such that the maximum advantage of quantum players sharing entanglement over classical players using only shared randomness, as measured by the ratio of the corresponding maximum biases (bias = deviation of acceptance probability from {1/2}), is unbounded: it grows with the size of the games. A second result proved in [PGWPVJ] is that achieving any unbounded violation requires the players to share a relatively complex entangled state: GHZ states {|\psi\rangle= \frac{1}{\sqrt{d}} \sum_i |i\rangle|i\rangle|i\rangle} (of any dimension {d}), arguably a natural generalization of bipartite maximally entangled states, can only lead to a violation bounded by a universal constant.

After much suffering we were able to obtain generalizations and simplifications of both results — stripping out virtually all the operator space theory! First, with Jop Briet, Harry Buhrman and Troy Lee we extended the second result and showed that the maximum violation remains bounded for a larger class of states, including what we called “clique-wise entanglement” — combinations of EPR pairs, GHZ states, etc., as well as “Schmidt states”, states of the form {|\psi\rangle = \sum_i \alpha_i |i\rangle|i\rangle|i\rangle} for arbitrary coefficients {\alpha_i}. While this may seem innocuous, it turns out the extension to Schmidt states solves (via a reduction proved in [PGWPVJ]) an open question in operator algebras from the 1970s! The observation won us a fancy-sounding publication (“All Schatten spaces endowed with the Schur product are Q-algebras”) in a fancy-sounding journal (“Journal of Functional Analysis”). The result also has an interesting, completely classical consequence for the direct sum problem: it implies that there exists a three-player XOR game {G} whose bias is bounded away from {1} by a constant, but the bias of the XOR game obtained by taking the direct sum of {k} copies of {G} (the game is played {k} times in parallel and the parity of all answers is checked against the correct parity) remains bounded away from {0}, for all {k} — a strong counterexample to parallel repetition (for the direct sum) of three-player games.

Second (and with a couple years’ more effort), with Jop Briet we managed to re-derive the main result of [PGWPVJ], again with some improvements: while still probabilistic our construction is explicit (we give a simple distribution on games such that the large violation holds with high probability), and the dependence of the violation on the game size as well as the entanglement dimension are greatly improved. Arguably our construction borrows its key ideas from whatever we could squeeze out of [PGWPVJ], but there are no more operator spaces! The most fancy tool used in the construction is a tail bound for second-order Gaussian chaos due to Latala.

The operator space connection

So, does operator space theory really provide a natural framework for the study of Bell inequalities, or does the heavy machinery systematically hide observations that could be obtained by easier means? While the range of results obtained by Perez-Garcia et al. using their techniques (including large violations for bipartite  (non-XOR) inequalities and applications to quantum Shannon theory) seems to indicate that the tools are useful, it does not preclude that similar results could be obtained without having to resort to such high levels of obfuscation.

Since then I have grown more and more fond of the approach, to the point where I would in some cases agree that it provides the “right” point of view on a question about Bell inequalities (or nonlocal games) — a conclusion I would never have been able to reach without the invaluable help of David, Carlos, Marius, and my co-authors in these endeavors, Jop and Oded, whom I gratefully acknowledge! An example where I would say that the operator space viewpoint is undeniably the “right” one arises in work with Oded Regev on “Quantum XOR games” that grew out of our exploration of non-commutative extensions of Grothendieck’s inequality. For me a highlight of this work has been a derivation of the operator space Grothendieck inequality based on elementary techniques strongly inspired by the use of embezzlement states in quantum nonlocality.

These results, and many more, are described in the survey. While the JMP special issue is targeted at mathematicians, we tried to make the survey as self-contained as possible, and my hope is that it provides an easily approachable introduction to the delights of tensor norms, completely bounded maps and the associated “o.s.s.” for the interested quantum information theorist with little or no background in {C^*}-algebras or operator space theory.

An example

I’d like to end with a bird’s-eye view of the operator space point of view on entangled games by taking an example from the survey, an upper bound on the largest possible ratio between the maximum success probabilities achievable by quantum entangled and classical players respectively in a two-player game, as a function of the number of answers per player in the game. For a formal treatment I refer to Proposition 4.5 in the survey; here I will stay at the level of symbols, hoping to carry some of the flavor across, but please keep in mind that the exposition below is not fully accurate.

Games as tensors and values as norms

The functional analytic approach views a game as a tensor {G} on the space {({\mathbb C}^N\otimes {\mathbb C}^K)\otimes ({\mathbb C}^N\otimes {\mathbb C}^K)}. Here {N} and {K} are respectively the number of questions and answers to and from each player, and each coefficient {G_{x,y}^{a,b}} of the tensor corresponds to the associated payoff to the players in the game. The main thrust of the approach is that different value we are interested in — the classical value, or maximum success probability of classical players in the game, and the entangled value, or maximum success probability of quantum players sharing entanglement — are directly related to the norm of {G} when the underlying spaces are normed in the appropriate way, as Banach spaces (for the classical value) or operator spaces (for the entangled value).

Let’s consider the classical value first. To capture it we’ll turn the vector space {({\mathbb C}^N\otimes {\mathbb C}^K)\otimes ({\mathbb C}^N\otimes {\mathbb C}^K)} into the Banach space {\ell_1^{N}(\ell_\infty^{K})\otimes_\epsilon \ell_1^{N}(\ell_\infty^{K})}. What does this mean? Let’s first focus on one of the factors, {\ell_1^{N}(\ell_\infty^{K})}. We’re using the {\ell_\infty} norm for the space {{\mathbb C}^K} associated to answers, and the {\ell_1} norm for the space {{\mathbb C}^N} associated to questions. This choice is consistent with the interpretation of the game tensor as an object dual to the player’s strategies: glossing over considerations of non-negativity, a (classical) strategy is a collection of numbers {\{f_x^a\}} such that

\displaystyle \big\|\{f_x^a\}\big\|_{\ell_\infty^N(\ell_1^K)} = \max_x\,\sum_a |f_x^a| \leq 1.

Once a norm has been defined on the spaces associated with each player’s strategy, it remains to put them together into a norm on the full game tensor, which lives on the tensor product of the two spaces. In general (including here) there will not be a unique way to norm the algebraic tensor product of two Banach spaces in a way that is consistent with the two Banach norms. (Indeed, the study of all possible such norms was pioneered by Grothendieck.) For our purposes it turns out that the smallest such norm, the injective norm {\|\cdot\|_\epsilon}, is the right choice: the norm {\|G\|_{\ell_1^{N}(\ell_\infty^{K})\otimes_\epsilon \ell_1^{N}(\ell_\infty^{K})}} exactly coincides with the maximum success probability of classical players in game {G}. This is an easy calculation from the definition of the injective norm, but I’ll skip it here.

So that’s for classical strategies. Quantum strategies are a bit more complicated: instead of sequences of numbers {\{f_x^a\}} they are represented by sequences of (positive semidefinite) matrices {\{A_x^a\}}, of any dimension, such that {\sum_a A_x^a = \mathbb{I}} for all {x}. This is where operator spaces come in. An operator space is a structure that can be built on top of a Banach space by defining not just one norm but a sequence of matrix norms, on matrices of arbitrary dimension {d=1,2,\ldots,} with entries in the Banach space. Here the most natural operator space structure on {\ell_\infty^N(\ell_1^K)} replaces {\ell_\infty^N} by {M_N} ({N\times N} matrices with the operator norm) and {\ell_1^K} by {S_1^K}, whose operator space structure is a bit hard to define directly but can be obtained naturally via duality with {M_K} ({S_1^K} does not correspond to the Schatten-{1} norm; note that here we need an operator space, which provides much more structure than a Banach space). Once again I’ll skip the details: you’ll have to believe me that the resulting operator space structure, that I’ll continue writing as {\ell_1^N(\ell_\infty^K)}, precisely captures the notion of quantum strategy, i.e. a family of POVMs. (The “precisely” here is somewhat blunt; in particular there are subtleties involved with e.g. dealing with the requirement that POVM elements be positive semidefinite, but it is possible to work around those.)

Next we need to norm the tensor product, again as an operator space. As it turns out, the injective norm that worked so well for classical strategies has a direct operator space analogue, the minimal norm, which perfectly captures the quantum value of the game tensor! (I won’t show why, but just as for the injective norm it’s a simple calculation from the definition.) I’ll write {\|G\|_{\ell_1^{N}(\ell_\infty^{K})\otimes_{min} \ell_1^{N}(\ell_\infty^{K})}} for this norm, and from now on it’ll serve as a proxy for the entangled value of the game {G}.

That’s a lot of “coincidences”, isn’t it? A norm on the tensor product of Banach spaces matches the classical value of a game tensor, while the natural operator space analogue of that norm matches the quantum value. Now you may start to see why someone well-versed in the theory of such norms might see them as the “right” way to look at classical, and quantum, strategies. More details on this point in Section 2 of the survey.

A bound on the largest violation

But let’s move on. With the basic normed spaces in place the task of relating the classical and entangled values of a game can be formulated as follows: it amounts to estimating the norm of the identity map (as an aside, almost all results in functional analysis, or at least those that connect to the study of nonlocal games, that I have encountered seem to reduce to studying the “norm of the identity”… a fact that still makes me smile every time I stumble upon it: why isn’t the identity trivial?)

\displaystyle id:\,\ell_1^{N}(\ell_\infty^{K})\otimes_{\epsilon} \ell_1^{N}(\ell_\infty^{K})\rightarrow \ell_1^{N}(\ell_\infty^{K})\otimes_{min} \ell_1^{N}(\ell_\infty^{K}).

Indeed, as we have seen the norm of the game tensor when seen as an object on the first space corresponds to its classical value, while the norm of the same tensor considered as an object on the second space matches its entangled value. Any upper bound on the norm on the identity is an upper bound on the largest (multiplicative) advantage that can be gained by quantum players, while any lower bound will imply the existence of a game for which the advantage can be just as large.

Our strategy for bounding the norm of {id} is to decompose it as a sequence of three simpler maps, on which it is easy to obtain norm estimates. We’ll start from the right-hand side, and consider first the arrow

\displaystyle \ell_1^{NK}\otimes_{min} \ell_1^{NK}\rightarrow \ell_1^{N}(\ell_\infty^{K})\otimes_{min} \ell_1^{N}(\ell_\infty^{K}).

If the map is taken to be the identity, the condition {\max_{x} \|\sum_a A_x^a\|\leq 1} only gives us {\max_{x,a} \|\sum_a A_x^a\|\leq 1}, thus the norm of the identity is {1}. We can do better: consider the Fourier transform

\displaystyle \mathcal{F}: \,\{A_{x}^a\}_{x,a}\mapsto \Big\{ E_{x,k} = \frac{1}{\sqrt{K}}\sum_a e^{\frac{-2i\pi ka}{K}} A_{x}^a\Big\}_k, x=1,\ldots,N.

It is not hard to verify that if for every {x}, {\{A_x^a\}_a} is a POVM then for every {x,k} it will hold that {\| E_{x,k} \|\leq K^{-1/2}}: accounting for both players we saved a factor {K^{-1}}.

The next arrow we consider is

\displaystyle \ell_1^{NK}\otimes_{\epsilon} \ell_1^{NK}\rightarrow \ell_1^{NK}\otimes_{min} \ell_1^{NK}.

This is a key step: we are switching from the {\epsilon} to the {min} norm, from classical to quantum. But note how now the spaces are just {\ell_1}: only questions, no answers — but there are signs, and what we are looking at is precisely the setting of XOR games, i.e. Grothendieck’s inequality! Therefore for this map we can simply take the identity, and its norm will be at most Grothendieck’s constant {K_G}.

Finally, for the last arrow we consider

\displaystyle \ell_1^{N}(\ell_\infty^{K})\otimes_{\epsilon} \ell_1^{N}(\ell_\infty^{K})\rightarrow \ell_1^{NK}\otimes_{\epsilon} \ell_1^{NK}.

In order for the three arrows to compose to {id} we should take this map to be the inverse of the Fourier transform {\mathcal{F}} used in the first step. But note that now we are working with the {\epsilon} norm, or in other words classical strategies. Thus it suffices to bound

\displaystyle {\max_x\sum_a K^{-1/2}|\sum_{k} e^{\frac{2i\pi ak}{K}} t_{x,k}|}

for {t_{x,k}} such that {\max_{x,k} |t_{x,k}|\leq 1}, and by Parseval this is easily seen to be at most {K}. Hence our first map has norm {K^2}.

There we are! We managed to write the identity {id}, which maps the game tensor from an object acting on classical strategies to one acting on quantum strategies, as the composition of three maps with norms at most {K^2}, {K_G} and {K^{-1}} respectively. As a direct consequence, we’ve proven that the largest violation achievable by quantum strategies is at most {K_G\cdot K}, i.e. it is at most a constant times the number of possible answers in the game (warning: some constant factors missing, from e.g. glossing over complex vs real vs non-negative issues in the above).

What I like about this proof is that, from the point of view of operator spaces the above sketch is very natural (at least if I am to believe Carlos…), almost an exercise: as already mentioned, estimating the norm of the identity from one space to the other is the bread and butter of functional analysts, and they certainly have more than one trick up their sleeve. Unwinding the argument to frame it as a rounding procedure that transforms quantum entangled strategies into classical ones reveals some surprising tricks (e.g. the Fourier transform, which is used here to perform the actual quantum-to-classical rounding at the optimal step, the middle arrow) that one might not have thought of based solely on the usual “quantum toolbox”.

I’ll refer you to the survey for much more! All comments are very welcome and much appreciated; even typos: there is still plenty of time to fix those before the survey makes its way to the JMP presses.

Posted in Publications, Quantum, Science | Tagged , , | 3 Comments

Foundations of randomness, remainder of Day 1

Today I will cover the remaining talks from the first day of the workshop. Valerio’s opening talk was followed by a stimulating blackboard talk by Renato Renner, who asked the question “Is the existence of randomness an axiom of quantum physics”? Randomness is usually viewed as entering quantum mechanics through the Born rule, which is stated as an axiom describing the outcome distribution of any given measurement on a quantum state; there is no standard derivation of the rule from other axioms. This is deeply unsatisfying, as the rule appears completely arbitrary, and indeed many attempts have been made at deriving it from more basic structural constraints such as the Hilbert space formalism underlying Gleason’s theorem, or assumptions based on decision theory (Deutsch) or logic (Finkelstein).

Renato and his student Daniela Frauchiger have a proposal whose main thrust consists in demarcating what could be called the “physical substrate” of a given theory (such as quantum mechanics) from its “information-theoretic substrate”. In short, the physical substrate of a theory governs which experiments are “physical”, or possible; it is a completely deterministic set of rules. The information-theoretic substrate provides an additional layer which considers assignments of probabilities to different experiments allowed by the physical substrate, and specifies rules that these probabilities should obey.

Let me try to give a little more detail, as much as I can recall from the talk. Let’s call the framework suggested by Renato the “FR proposal”. The first component in the proposal is a definition of the physical substrate of a theory. In the FR proposal a theory bears on objects called stories: a story is any finite statement describing an experiment, where “experiment” should be interpreted very loosely: essentially, any sequence of events with a set-up, evolution, and outcome. For instance, a story could be that a pendulum is initialized in a certain position, left to swing freely, and after time {t} is observed and found to be in a certain position. Or it could specify that a quantum particle is initialized in a certain state, say {\ket{+}=\frac{1}{\sqrt{2}}(\ket{0}+\ket{1})}, then measured in the computational basis {\{\ket{0},\ket{1}\}}, and that the outcome {\ket{0}} is observed. Thus a story refers to a physical system and makes claims about observable quantities of the system.

Thus the physical substrate of the theory is a set of rules that completely determine which stories are “physical”, i.e. allowed by the theory, and which are not. For instance, the second story described above is a valid story of quantum mechanics: it is certainly possible that the outcome {\ket{0}} is obtained when the measurement described is performed. On the other hand quantum mechanics would rule out a story such as “the state {\ket{0}} is measured in the computational basis and the outcome {\ket{1}} is obtained”: this is simply not allowed. Note that up to this point the theory makes no claim at all on the “probability” of a story occurring; it just states whether the story is possible or not.

So much for the physical substrate. How do probabilities come in? This is the goal of the information-theoretic substrate of the theory. To introduce probabilities we consider mappings from stories to {[0,1]}. Any such mapping can be thought of as a measure of an observer’s state of knowledge, or uncertainty, about any particular story (much more on this and other interpretations of probabilities will be discussed in later talks, such as Carl Hoefer’s coming up next). A priori the mapping could be arbitrary, and the role of the information-theoretic substrate of the theory is to specify which mappings are allowed, or “reasonable”. For instance, in the measurement example from earlier most observers would assign that particular story a probability of {50\%}, which is what the Born rule predicts. But without taking the rule for granted, on what basis could we assert that {50\%} is the only “reasonable” probability that can be assigned to the story — why not {10\%}, or even {100\%}?

The main claim that Renato put forward is that the Born rule can be derived as a consequence of the following two broad postulates: (I apologize for the description of the postulates being rather vague; there are points about the formulation that I am unclear about myself!)

  1. The repetition postulate: experiments can be repeated, and if so lead to the same outcomes.
  2. The symmetry postulate: certain experiments have symmetries under which the outcome of the experiment is invariant. For example, an experiment which involves flipping three different coins in sequence and reporting the majority outcome is not affected by the order in which the coins are flipped.

The idea is that these postulates should be much easier to accept than the Born rule: certainly, if we want to say anything meaningful about certain probabilities being reasonable or not we ought to enforce some form of compatibility rules, and the two above sound rather reasonable themselves. The first seems a prerequisite to even make sense of probabilities, and the second follows the tradition started by de Finetti of basing the emergence of probability on indistinguishability of the outcomes of an experiment under certain symmetries of the experiment.

Unfortunately there does not yet seem to be a preprint available that describes these ideas in detail; hopefully it is coming soon. I’m quite curious to see how the proposal, that I find rather appealing, can be formalized.


After Renato’s talk — and an excellent lunch — we were treated to an intriguing talk on “Chance laws and quantum randomness” by Carl Hoefer. The talk addressed questions that we computer scientists or physicists rarely discuss openly (and indeed many of us might reserve for dreamy Sunday nights or inebriated Friday evenings). In Carl’s own words: “When we ascribe a primitive (irreducible) chance propensity, or postulate intrinsically random/chancy fundamental laws of nature, do we know what we’re saying?”. Or, less poetically put (but still in Carl’s words — Carl, thanks for making it accessible to us!): What does a claim such as “{\Pr(\text{spin-z-up} | \text{spin-x-up-earlier}) = 0.5}” mean, and how is it different from “{\Pr(\text{spin-z-up} | \text{spin-x-up-earlier}) = 0.7}”?

Let the question a chance to sink in — a good answer is not that obvious! It already arose in Renato’s talk, where probabilities were introduced as the reflection of an agent’s subjective belief. Well, this is the first of a number of possible answers to the question that Carl took head on and debunked, one slide at a time. (I am not sure how much I agree with Carl’s arguments here, but at least he showed us that there was certainly room for debate!) Other possible answers considered by Carl include probability as the reflection of an objective frequency for the occurrence of a certain event, or simply a “primitive fact” that would require no explanation. Needless to say, he found none of these interpretations satisfactory. It’s a good exercise to find arguments against each of them; see Carl’s slides for elements of answer (see also the article on Interpretations of probability on the Stanford Encyclopaedia of Philosophy for an extensive discussion).


A Galton board

So what is the speaker’s suggestion? I doubt my summary will make it much justice, but here goes: according to Carl probabilities, or rather “chance laws”, arise as a form of statistical averaging; in particular probabilistic statements can be meaningful irrespective of whether nature is intrinsically deterministic or irreducibly probabilistic. Specifically, Carl observes that many physical processes tend to generate a distribution on their observable outcomes that is robust to initial conditions. For instance, a Galton board will lead to the same Gaussian distribution for the location of the ball, irrespective of the way it is thrown in (except perhaps for an exceptional set of initial conditions of measure zero). Whether the initial conditions are chosen deterministically or probabilistically we can meaningfully state that “there is {x\%} chance that the ball will exit the board at position {z}”.

Carl concluded his talk by tying it to the Bohmian interpretation of quantum mechanics — almost, though perhaps not quite, going as far as to frame deterministic Bohmian mechanics as the most reasonable way to interpret probability in quantum mechanics. I found the discussion fascinating: I had never considered Bohmian mechanics (or, for that matter, the question of the origin of probabilities in quantum mechanics) seriously, and Carl made it sound, if not natural, at least not wholly unreasonable! In particular he made a convincing case for the possibility of ascribing meaning to “probability” in a fully deterministic world, without having to resort to the need for “subjective beliefs” or related rabbits such as high sensitivity to initial conditions — indeed, here it is quite the opposite.


The last talk of the day was delivered on the whiteboard, by Marcin Pawlowski. Marcin gave an intuitive derivation of the quantum bound on device-independent randomness generation from the CHSH inequality, the famous (at least for those of us versed in these things… see Appendix A.1 here for a derivation)

\displaystyle H_\infty(A|X)\geq 1-\log_2\Big(1+\sqrt{2-\frac{I^2}{4}}\Big) ,

based solely on a monogamy bound due to Toner. Toner introduced a “two-out-of-three” variant of the CHSH game in which there are three players Alice, Bob and Charlie, but the game is played either between Alice and Bob, or Alice and Charlie, chosen uniformly at random by the referee. Although in principle the maximum success probability of the players in this game could be as high as that of the two-player CHSH (since in the end only one instance of the game is played, the third player being ignored), Toner showed that the fact that Alice did not know a priori whom of Bob and Charlie she was going to play with places strong constraints on her ability to generate non-local correlations with at least one of them. Quantitatively, he showed that the players’ maximum success probability in this game is precisely {3/4}, whether the players are allowed entanglement or not: the presence of the third player completely eliminates the quantum advantage that can be gained in the usual two-player CHSH game!

The derivation presented by Marcin is simple but enlightening (good exercise!). It could be considered the “proof from the book” for this problem, as it makes the intuition fully explicit: the reason Alice’s outcomes are unpredictable, from the point of view of Eve (=Charlie), is that they are strongly correlated with Bob’s outcomes (up to the Tsirelson bound); monogamy implies that correlations with Eve’s outcomes could not simultaneously be as strong.


Typical view from a Stellenbosch winery

That’s it for Day 1 – time to relax, enjoy a good bottle of local wine, and get ready for the second day!

Posted in Conferences, device independence, Quantum, Science, Talks | Tagged , | 1 Comment