Quid qPCP?

This blog has already seen three posts on the quantum PCP conjecture: in February 2013 to highlight several talks at the Simons Institute in Berkeley, in October 2013 to promote a survey on the topic I wrote with Dorit Aharonov and Itai Arad, and in October 2014 to introduce the two main variants “constraint satisfaction” (CSP) and “multiplayer games” (MIP), of the quantum QCP (qPCP) conjecture. Such a high rate of posting (compared to the average frequency of posts on this blog) might indicate a slight obsession. But you may also notice it’s been…two years! Has no result worthy of note been established since? Certainly not, and although the conjecture still stands strong, there have been a few interesting developments on both variants of the conjecture. In this post I’ll discuss a couple results on the CSP-qPCP. In a follow-up post I’ll describe progress on the MIP-qPCP.

When we wrote the survey three summers ago, the latest word on the CSP-qPCP (see Conjecture 1.3 here for a precise formulation) had been given in a paper by Brandao and Harrow. BH showed, using information-theoretic arguments, that the constraint graphs associated with constant-gap QMA-hard instances of the local Hamiltonian problem had to satisfy “non-expansion” requirements seemingly at odds with the expansion properties of graphs associated with what are often considered the hardest instances of classical CSPs. Intuitively, their argument uses the monogamy of quantum correlations to argue that highly expanding constraint graphs place such strong demands on entanglement that there is always a product state whose energy is not far from the minimum. Although not strictly a no-go result, their theorem indicates that QMA-hard instances must be based on constraint graphs with markedly different spectral properties than those associated with the hardest instances of classical CSP.

For the time being it seems like any proof, or disproof, of the conjecture remains out of reach. Instead of focusing directly on qPCP, it may be more fruitful to develop the objects that are expected to play an important role in the proof, such as (quantum) low-density parity check codes (qLDPC) and (quantum) locally testable codes (qLTC). Two recent works make progress on this front.

The NLETS conjecture

The no low-energy trivial states (NLTS) conjecture was proposed by Freedman and Hastings as a “complexity-free” analogue of CSP-qPCP. The NLTS conjecture states that there exist local Hamiltonians such that all low-energy (within an additive constant, times the norm of the Hamiltonian, from the minimum) states are “non-trivial”, in the sense that they cannot be generated by a constant-depth quantum circuit applied on a product state. Equivalently, all states that are the output of a constant-depth quantum circuit must have energy a constant above the minimum. NLTS Hamiltonian are good candidates for qPCP as they provide local Hamiltonian for which many obvious classical certificates for the minimal energy of the Hamiltonian (such as the description of a small circuit which generates a low-energy state) are essentially ruled out.

An earlier version of the Eldar-Harrow manuscript claimed a construction of NLTS Hamiltonian, but the paper was recently updated, and the claim retracted. The current manuscript establishes a moderately weaker (though strictly incomparable) result, that the authors call NLETS, for “no low-error trivial states”. The main result of EH is a relatively simple, explicit construction of a family of local Hamiltonians that have no non-trivial “ground state ${\varepsilon}$-impostor”. An ${\varepsilon}$-impostor is a state that has the same reduced density matrix as a ground state on a fraction ${(1-\varepsilon)}$ of the qubits, but may differ arbitrarily on the remaining ${\varepsilon}$ fraction. Using that the Hamiltonian is local, impostors necessarily have low energy, but the converse is not true, so that NLETS rules out non-triviality for a more restricted class of states than NLTS. For that restricted class of states, however, the non-triviality established by EH is sronger than required by NLTS: they show that no ${\varepsilon}$-impostor can even be well-approximated (within inverse-polynomial trace distance) by logarithmic-depth, instead of just constant-depth, quantum circuits.

Let’s see if I can give some basic intuition on their construction; for anything substantial see the paper, which gives many angles on the result. Consider first first a classical repetition code encoding ${1}$ bit into ${n}$ bits. This can be made into a locally testable code by enforcing pairwise equality of bits along the edges of a constant-degree expanding graph on vertex set ${\{1,\ldots,n\}}$. Now allow me a little leap of faith: imagine there existed a magic quantum analogue of this classical repetition code, where equality between pairs of qubits is enforced not only in the ${Z}$ (computational) basis, but also in the ${X}$ (Hadamard) basis. Of course such a thing does not exist: the constraints would force any pair of qubits (linked by the expander) to form an EPR pair, a requirement that strongly violates monogamy. But let’s imagine. Then I claim that we would essentially be done. Why? We need two more observations.

The first key observation made by EH is that any ground state of this imaginary code would have the following property: if you measure all qubits of the state in the same basis, either ${X}$ or ${Z}$, then for at least one of the two possible choices the measurement outcomes will be distributed according to a distribution on ${n}$-bit strings that places a large (constant) weight on at least two well-isolated (separated by at least the minimum distance) subsets of the Hamming cube. Note that this does not hold of the classical repetition code: the distribution which all-${0}$ codeword is, well, concentrated. But if we were to measure the associated quantum state ${|0\cdots 0 \rangle \simeq \frac{1}{\sqrt{2}}( |+\cdots+\rangle+|-\cdots-\rangle)}$ in the Hadamard basis, we would get a very spread distribution, with constant mass on two sets that are at distance ${n}$ apart (I realize the equation I wrote is not quite correct! Don’t think too hard about it; obviously my “magical quantum repetition code” does not exist). The reason the distribution obtained in at least one of the two bases must be spread out is due to the uncertainty principle: if the distribution is localized in the ${X}$ basis it must be delocalized in the ${Z}$ basis, and vice-versa. And the reason it should be concentrated on isolated clumps is that we are measuring a codeword, which, for our magic example, can only lead to outcomes that are supported on the set ${\{|0\rangle^{\otimes n},|1\rangle^{\otimes n},|+\rangle^{\otimes n},|-\rangle^{\otimes n}\}}$.

To conclude we need the second observation, which is that trivial states do not have this property: measuring a trivial state in any product basis will always lead to a highly expanding distribution, which in particular cannot have large mass on well-isolated subsets. This is obviously true for product states, and requires a bit of work to be carried through logarithmically many layers of a quantum circuit; indeed this is where the main technical work of the paper lies.

So the argument is complete…except for the fact that the required magic quantum repetition code does not exist! Instead, HE find a good make-do by employing a beautiful construction of quantum LDPC codes due to Tillich and Zemor, the “hypergraph product”. The hypergraph product takes as input any pair of classical linear codes and returns a quantum “product” CSS code whose locality, distance and rate properties can be related to those of the original codes. The toric code can be case as an example of a hypergraph product code; see Section 3 in the paper for explanations. Unfortunately, the way the distance of the product code scales with other parameters prevents TZ from obtaining good enough qLDPC for the CSP-QPCP; they can “only” obtain codes with constant weight and constant rate, but distance ${O(\sqrt{n})}$.

In the context of NL(E)TS, and even more so qPCP, however, distance may not be the most relevant parameter. EH’s main construction is obtained as the hypergraph product of two expander-based repetition codes, which as a code only has logarithmic distance; nevertheless they are able to show that the robustness derived from the repetition code, together with the logarithmic distance, are enough to separate ${\varepsilon}$-impostors from logarithmic-depth trivial states.

Quantum LDPC & LTC

Quantum low-density parity-check codes (qLDPC) already made a showing in the previous sections. These families of codes are of much broader interest than their possible role in a forthcoming proof of qPCP, and constructions are being actively pursued. For classical codes the situation is largely satisfactory, and there are constructions that simultaneously achieve constant rate and linear distance with constant-weight parity checks. For quantum codes less is known. If we insist on constant-weight stabilizers then the best distance is ${\Omega(\sqrt{n}\log^{1/4} n)}$ (e.g. Freedman et al.), a notch above the TZ construction mentioned earlier. The most local construction that achieves linear distance requires stabilizers of weight ${O(\sqrt{n})}$ (e.g. Bravyi and Hastings).

A recent paper by Hastings makes progress on constructions of qLDPC – assuming a geometrical conjecture on the volume of certain surfaces defined from lattices in ${{\mathbb R}^n}$. Assuming the conjecture, Hastings shows the existence of qLDPC with ${n^{1-\varepsilon}}$ distance and logarithmic-weight stabilizers, a marked improvement over the state of the art. Although as discussed earlier even linear-distance, constant-weight, qLDPC would not imply the CSP-qPCP nor NLTS (the resulting Hamiltonian may still have low-energy eigenstates that are not at a small distance from codewords), by analogy with the classical case (and basic intuition!), constructions of such objects should certainly facilitate any attempt at a proof of the conjectures. Moreover, qLDPC suffice for the weaker NLETS introduced by EH, as the latter only makes a statement about ${\varepsilon}$-impostors, i.e. states that are at a constant distance from codewords. To obtain the stronger implication to NLTS, the proper notion is that of local testability: errors should be detected by a fraction of parity checks proportional to the distance of the error from the closest codeword (and not just some parity check).

Hastings’ construction follows the topological approach to quantum error correcting codes pioneered by Freedman and Kitaev. Although the latter introduced codes whose properties depend on the surface they are embedded in, at best I could tell the formal connection between homology and error correction is made in a comprehensive paper by Bombin and Martin-Delgado. The advantage of this approach is that properties of the code, including rate and distance, can be tied to geometric properties of the underlying homology, reducing the construction of good codes to that of manifolds with the right properties.

In addition to the (conjectural) construction of good qLDPC, almost as an afterthought Hastings provides an unconditional construction of a quantum locally testable code (qLTC), albeit one which encodes two qubits only. Let’s try to visualize this, starting from the helpful warm-up provided by Hastings, a high-dimensional, entangled, locally-testable code…which encodes zero qubit (the code space is one-dimensional). Of course this is trivial, but it’s a warm-up!

The simplest instance to visualize has six physical qubits. To follow the forthcoming paragraphs, take a piece of paper and draw a large tetrahedron. If you didn’t mess up your tetrahedron should have six edges: these are your qubits. Now the parity checks are as follows. Each of the four faces specifies an ${X}$-stabilizer which acts on the three edges forming the face. Each of the four vertices specifies a ${Z}$-stabilizer which acts on the three edges that touch the vertex. The resulting eight operators pairwise commute, and they specify a unique (entangled) state in the ${2^6}$-dimensional physical space.

Next we’d like to understand “local” testability. This means that if we fix a set ${O}$ of edges, and act on each of them using an ${X}$ error, then the resulting operator should violate (anti-commute) with a fraction of ${Z}$-stabilizers that is proportional to the reduced weight of the error, i.e. its distance to the closest operator which commutes with all ${Z}$-stabilizers. To see which stabilizers “detect” the error ${O}$, we recall that ${Z}$ and ${X}$ which overlap at an even number of locations commute. Therefore a ${Z}$ stabilizer will detect ${O}$ if and only if it lies in its boundary ${\partial O}$: the set of vertices which touch an odd number of edges in ${O}$. This is our syndrome; it has a certain cardinality. To conclude we need to argue that ${O}$ can be modified into a set ${O+P}$ with no boundary, ${\partial(O+P)=\emptyset}$, and such that ${P}$ is as small as possible – ideally, it should involve at most as many edges as the size of the boundary ${|\partial O|}$. Here is how Hastings does it: for each vertex in the boundary, introduce an edge that links it to some fixed vertex – say the top-most one in your tetrahedron. Let ${P}$ be the resulting set of edges. Then you can check (on the picture!) that ${O+P}$ is boundary-less. Since we added at most as many edges as vertices in the boundary (if the top-most vertex was part of the boundary it doesn’t contribute any edge), we have proven local testability with respect to ${X}$ errors; ${Z}$ errors are similar.

This was all in three dimensions. The wonderful thing is that the construction generalizes in a “straightforward” way to ${n}$ dimensions. Consider an ${(n+1)}$-element universe ${U=\{1,\ldots,n+1\}}$. Qubits are all subsets of ${U}$ of size ${q=(n+1)/2}$; there are exponentially many of these. ${Z}$-stabilizers are defined for each ${(q-1)}$-element subset; each acts on all ${(q+1)}$ of its ${q}$-element supersets. Symmetrically, ${X}$-stabilizers are defined for each ${(q+1)}$-element set; each acts on all ${(q+1)}$ of its ${q}$-element subsets. Thus the code is local: each stabilizer has weight ${(q+1)}$, which is logarithmic in the number of qubits. It remains to check local testability; this follows using precisely the same argument as above (minus the picture…).

This first construction encodes zero qubits. How about getting a couple? Hastings gives a construction achieving this, and remains (poly-logarithmically) locally testable. The idea, very roughly, is to make a toric code by combining together two copies of the code described above. The number of encoded qubits will become non-trivial and local testability will remain. Unfortunately, just as for the toric code, the distance of the result code only scales as ${\sqrt{n}}$. To construct his code Hastings uses a slightly different cellulation than the above-described one. I am not sure precisely why the change is needed, and I defer to the paper for more details. (Leverrier, Tillich and Zemor had earlier provided a construction, based on the TZ hypergraph product, with linear rate, square root distance, and local testability up to the minimal distance, i.e. for all errors of reduced weight at most ${O(\sqrt{n})}$.)

Although the geometric picture takes some effort to grasp, I find these constructions fascinating. Given the Brandao-Harrow objections to using the most “straighforward” expander constructions to achieve CSP-qPCP, or even NLTS, it seems logical to start looking for combinatorial structures that have more subtle properties and lie at the delicate boundary where both robustness (in terms of testability) and entanglement (non-triviality of ground states) can co-exist without challenging monogamy.

Posted in QPCP, Quantum, Science, Uncategorized | | 2 Comments

QCryptoX: post-mortem

Our EdX/Delft/Caltech course on quantum cryptography officially ended on Dec. 20th, and is now archived (all material should remain available in the foreseeable future; we will also prepare a self-contained version of the lecture notes). At Caltech, the “flipped classroom” ended a couple weeks earlier; by now all grades are in and the students may even be beginning to recover. How did it go?

1. In the classroom

Fifteen Caltech students, with a roughly equal mix of physics/CS/EE backgrounds, followed the course till the end (we started at ~20). We had a great time, but integration with the online course proved more challenging than I anticipated. Let me say why, in the hope that my experience could be useful to others (including myself, if I repeat the course).

The EdX content was released in 10 weekly updates, on Tuesdays. Since on-campus classes took place Tuesdays and Thursdays, I asked Caltech students to review the material (videos+lecture notes+quizzes) made available online on a given Tuesday by the following Tuesday’s class. I would then be able to structure the class under the assumption that the students had at least some minimal familiarity with the weeks’ concepts. This would allow for a more relaxed, “conversational” mode: I would be able to react to difficulties encountered by the students, and engage them in the exploration of more advanced topics. That was the theory. Some of it worked out, but not quite as well as I had hoped, and this for a variety of reasons:

1. There was a large discrepancy in the students’ level of preparation. Some had gone through lecture notes in detail, watched all videos, and completed all quizzes. Although some aspects of the week’s material might still puzzle them, they had a good understanding of the basics. But other students had barely pulled up the website, so that they didn’t even really know what topics were covered in a given week. This meant that, if I worked under the assumption that students already had a reasonable grasp of the material, I would loose half the class; whereas if I assumed they had not seen it at all I would put half the class to sleep. As an attempted remedy I enforced some minimal familiarity with the online content by requiring that weekly EdX quizzes be turned in each Tuesday before class. But these quizzes were not hard, and the students could (and did) get away with a very quick scan through the material.
2. As all students, but, I hear, even more so here, Caltech undergraduates generally (i) do not show up in class, and (ii) if per chance they happen to land in the right classroom, they certainly won’t participate. In an attempt to encourage attendance  I made homeworks due right before the Tuesday 10:30am class, the idea being that students would probably turn in homeworks at the last minute, but then they would at least be ready for class. Bad idea: as a result, students ended up spending the night on the homework, dropping it off at 10:29:59… only to skip class so as to catch up on sleep!Slightly over half of the registered students attended any given class, a small group of 8-10 on average. This made it harder to keep participation up. On the whole it still went pretty well, and with a little patience, and insistence, I think I eventually managed to instore a reasonably relaxed atmosphere, where students would naturally raise questions, submit suggestions, etc. But we did not reach the stage of all-out participation I had envisioned.

Given these observations on what went wrong (or at least sub-optimally), here are a few thoughts on how the course could be improved, mostly for my own benefit (I hope to put some of these to good practice in a year or two!). This should be obvious, but: the main hurdle in designing a “flipped classroom” is to figure out how to work with the online content:

• First there is a scheduling difficulty. Some students complained that by having to go through the weekly videos and lecture notes prior to the discussion of the material in class they simultaneously had to face two weeks’ worth of content at any given time. Scheduling of online material was decided based on other constraints, and turned out to be highly sub-optimal: each week was released on a Tuesday, which was also the first day of class, so that it was unreasonable to ask the students to review the material before that week’s classes….pushing it to the next week, and resulting in the aforementioned overlap. A much better schedule would have been to e.g. release material online on Friday, and then have class on Tuesdays and Thursdays. This would have led to a larger overlap and less schizophrenia.
• Then comes the problem of “complementarity”. What can be done in class that does not replicate, but instead enriches, then online material? This is made all the more difficult by the inevitable heterogeneity in the student’s preparation. An effort has to be made to limit this by finding ways to enforce the student’s learning of the material. For instance, each class could be kick-started by a small presentation by one of the students, based on one of the online problems, or even by re-explaining (or, explaining better!) one of the week’s more challenging videos. This should be made in a way that the students find it valuable, both for the presenter and the listeners; I don’t want the outcome to be that no one shows up for class.
• Student-led discussions usually work best. They love to expose their ideas to each other, and improve upon them. This forces them to be active, and creative. The best moments in the class where when the discussion really picked up and the students bounced suggestions off each other. The existence of the online material should facilitate this, by giving a common basis on which to base the discussion. My approach this time wasn’t optimal, but based on the experience I think it is possible to do something truly engaging. But it won’t work by itself; one really has to design incentive-based schemes to get the process going.

2. Online

Success of the online course is rather hard to judge. At the end of the course there were about 8000 officially registered students. Of these, EdX identified ~500 as “active learners” over the last few weeks (dropping from ~1500 over the first few weeks, as is to be expected). I think an active learner is roughly someone who has at least watched some parts of a video, answered a quizz or problem, participated in a forum, etc.
About 100 students pursued an official certificate, which means that they paid ~50\$ to have their success in the course officially registered. I couldn’t figure out how many students have actually “passed” the class, but I expect the number to be around 200: most of the certified students plus a few others who didn’t want to pay for the certificate but still turned in most homeworks. This is a fair number for a challenging specialized course, I am pretty happy with it. The high initial enrollment numbers, together with anecdotal evidence from people who got in touch directly, indicate that there certainly is demand for the topic.  The most active students in the course definitely “wanted in”, and we had lots of good questions on the forum. And, many, many typos were fixed!

How satisfied were the students with the course? We ran an “exit survey”, but I don’t have the results yet; I can write about them later (hoping that a significant enough number of students bother to fill in the survey). We also had  pre- and mid-course survey. Some of the more interesting questions had to do with how students learn. In my opinion this is the main challenge in designing a MOOC: how to make it useful? Will the students learn anything by watching videos? Anecdotal evidence (but also serious research, I hear) suggests not. Reading the lecture notes? Maybe, but that requires time and dedication – basically, to be an assiduous learner already. Just as “in-the-classroom” learning, it is the problem-solving that students are brought to engage in that can make a difference. Students like to be challenged; they need to be given an active role. In the mid-course survey many of the positive comments had to do with “Julia lab” assignments that were part of the course, and for which the students had to do some simple coding that let them  experiment with properties of qubits, incompatible measurements, etc. In the pre-course survey students also indicated a marked preference for learning via solving problems rather than by watching videos.

So a good online MOOC should be one that actively engages the student’s problem-solving skills. But this is not easy! Much harder than recording a video in front of a tablet & webcam. Even though I was repeatedly told about it before-hand, I learned the lesson the hard way: homework questions have to be vetted very thoroughly. There is no end to a student’s creativity in misinterpreting a statement – let alone 1000 students’. Multiple-choice questions may sound straightforward, but they’re not: one has to be very careful that there is exactly one straight correct answer, while at the same time not making it too obvious which is that answer; when one has a solution in mind it is easy not to realize that other proposed supposedly wrong solutions could in fact be interpreted as correct. The topic of cryptography makes this particularly tricky: we want the students to reason, be creative, devise attacks, but the multiple-choice limits us in this ability. Luckily we had a very dedicated, and creative, team of TAs, both in Delft and In Caltech, and by working together they compiled quite a nice set of problems; I hope they get used and re-used.

3. Conclusions

It’s too early (or too late) for conclusions. This was a first, and I hope there’ll be a second. The medium is a challenge, but it’s worth reaching out: we teach elite topics to elite students at elite institutions, but so many more have the drive, interest, and ability to learn the material that it would be irresponsible to leave them out. MOOCs may not be the best way to expand the reach of our work, but it is one way…to be improved!

It was certainly a lot of fun. I owe a huge thank you to all the students, in the classroom and online, who suffered through the course. I hope you learned a lot! Second in line were the TAs, at Caltech as well as Delft, who did impressive work, coping simultaneously with the heavy online and offline duties. They came up with a great set of resources. Last but not least, behind the scenes, the video production and online learning teams, from Delt and Caltech, without whose support none of this would have been made possible. Thanks!

Two weeks in

Our course on quantum cryptography will soon enter its third week of instruction (out of ten weeks planned). My parallel “in-class” course at Caltech has already gone through four weeks of lectures. How is it going so far?

There are many possible measures against which to evaluate the experience. An easy one is raw numbers. Online there are a bit over 7,200 students enrolled. But how many are “active”? The statistics tools provided by EdX report 1995 “active” students last week – under what definition of “active”? EdX also reports that 1003 students “watched a video”, and 861 “tried a problem”. What is an active student who neither watched a video nor tried a problem – they clicked on a link? In any case, the proportion seems high; from what I heard a typical experience is that about 2-5% of registered students will complete any given online course. Out of 7,000, this would bring the number of active students by the end of the course at at a couple hundred, a number I would certainly consider a marked success, given the specialized topic and technically challenging material.

At Caltech there are 20 students enrolled in CS/Phys 120. Given the size of our undergraduate population I also consider this to be a rather high number (but the hard drop deadline has not passed yet!). It’s always a pleasure to see our undergraduate’s eagerness to dive into any exciting topic of research that is brought to their attention. I don’t know the enrollment for TU Delft, but they have a large program in quantum information so the numbers are probably at least twice as high.

Numbers are numbers. How about enthusiasm? You saw the word cloud we collected in Week 0. Here is one from Week 2 (“What does “entanglement” evoke in you right now?”; spot the “love” and “love story”; unfortunately only 1% of responses for either!). Some of the students speak up when prompted for simple feedback such as this, but the vast majority remain otherwise silent, so that involvement is hard to measure. We do have a few rather active participants in the discussion forums, and it’s been a pleasure to read and try to answer their creative questions each day – dear online learners, if you read this, thanks for your extremely valuable comments and feedback, which help make the course better for everyone! It’s amazing how even as we were learning qubits some rather insightful questions, and objections, were raised. It’s clear that people are coming to the course from a huge range of backgrounds, prompting the most unexpected reactions.

A similar challenge arises in the classroom. Students range from the freshmen with no background in quantum information (obviously), nor in quantum mechanics or computer science, to more advanced seniors (who form the bulk of the class) to graduate students in Caltech’s Institute for Quantum Information and Matter (IQIM). How to capture everyone’s attention, interest, imagination? The topic of cryptography helps -there is so much to be fascinated with. I started the course by discussing the problem of quantum money, which has the advantage of easily capturing one’s imagination, and for which there is a simple quantum scheme with a clear advantage over classical (cherry on top, the scheme is based on the famous “BB84 states” that will play a major role in the class via their use for quantum key distribution). So newcomers to quantum information could learn about qubits, kets and bras, while others could fend off their impatience by imagining new schemes for public-coin quantum money.

This is not an easy line to thread however, and given the composition of the class I eventually decided to err on the side of caution. Don’t repeat it, but this is my first time even teaching a full class on quantum information, and the basic concepts, not to mention the formalism, can be quite tricky to pick up. So we’re going to take it slow, and we’ll see how far we get. My hope is that the “flipped classroom” format should help needy but motivated students keep afloat by making all the online material available before it is discussed in class. Since the online course has only been going on for a couple weeks I can’t yet report on how well this will work out; my initial impression is that it is not given that the in-class students actually do spend enough time with the online material. I am yet to find the proper way to incentivize this: quizzes? rewards? The best reward should be that they manage to follow the course 😉

In the coming weeks we’ll start making our way towards quantum key distribution and its analysis. Entanglement, measuring uncertainty, privacy amplification, BB84 and Eckert, and device independence. Quite a program, and it’s certainly nice to attempt it in such pleasant company!

One week later…

…and 736 words of expectation for the class:

Note the top contender: let’s see if we live up to their expectations!

It’s been a fun first week. We released “Week 0” of the material a month ahead of the official start date, so that those students not yet familiar with the basics of quantum information (what is a qubit?) would have enough time to digest the fundamental notions we’ll be using throughout. (Out of the ~5500 registered students, ~1250 are currently marked my EdX as “active” and 560 have watched at least one video. Keep it coming!)

An unexpected benefit of opening up the platform ahead of time is that it is giving us the time to experiment with (read: debug) some of the tools we plan to use throughout. A first is EdX’s own possibilities for interaction with the students, an example of which is pictured above (“Use the space below to enter a few words that best characterize your expectations for this class”).But we’re also using a couple add-ons:

The first is a system replacing EdX’s discussion forums,called Askalot. Cute name – how can you resist. The main benefit of Askalot is that it provides more structure to the discussions, which can be characterized as question/bug report/social discussion/etc, can be up/down-voted, marked as correct/invalidated by the instructors, etc. The students are having fun already, introducing themselves, complaining about bugs in the quizzes, and, of course, about Askalot itself! (Thanks go to Ivan Srba, one of the creators of Askalot, for being extremely responsive and fixing a host of minor bugs overnight – not to mention throwing in the extra feature requested by the students.)

A second is called DALITE. The idea of DALITE is to encourage students to provide an explanation justifying their answer to a question. Indeed, one of the drawbacks of the online platform is the need for automatic grading of assignments, which greatly limits how the student can be engaged in the problem-solving exercise, mostly limited to multiple-choice or simple numeric answers. DALITE (which grew out of, and is still, a serious research project in online learning) introduces a fun twist: the student is asked to type in a “rationale” for her choice. Of course there is no way we could grade such rationales. But here is the idea: once the student has entered her explanation, she is shown the rationale provided by another student (or the instructor) for a different answer, and asked whether she would like to reconsider her decision. The student can choose to change her mind, or stick with her initial choice; she is asked to explain why. It’s really fun to watch the answers provided (“What would happen if quantum mechanics allowed cloning of arbitrary states?”), the change of minds that take place, and the rationale that incentivized said change of mind. (Thanks to Sameer Bhatagnar for helping us set up the tool and explaining its many possibilities, and to Joseph Williams for suggesting its use in the first place.)

We’ll see how these pan out in the longer run. I’m definitely eager to experiment with ways to make the MOOC experience a better learning experience for the students. I’ll let you know how it goes. Suggestions welcome!

PS: 47% students 25 and under, 78% from outside US, is good, but 16% female is not…come on!

Coming to a theater near you

This Fall together with long-time friend and collaborator Stephanie Wehner I will be teaching a Caltech/TUDelft/EdX course on Quantum Cryptography. As I am writing this I almost feel like saying I have taught an EdX course…indeed, I just finished recording the last video for the course! Let me tell you right away that, when you’ll eventually get to see this fool absurdly gesticulating in front of at tablet, well – it was no small effort! (Not only mine: I’ll have a chance to return to this, but a serious hat tip to the Caltech production team is already in order.) This is the first post of a few in which I plan to share my experience preparing for, and then running, the course (which is officially starting on October 10th).

Why?

More often than not the first question I get asked is – why? Why did you decide to do this?? As a teacher you’re used to torturing a couple dozen students each semester… but now, not only do you want to ramp it up and torture thousands at a time, but you also want to torture yourself???

Let’s see. I can find at least two reasons. The first is rather selfish: I’m curious. I want to give it a try. My interest in the possibilities of online education, as is the case for many of us, started around 2011, during the “big boom” that followed the success of Sebastian Thrun and Peter Norvig’s Artificial Intelligence MOOC at Stanford. It is through discussing the experience of my Ph.D. advisor, Umesh Vazirani, who taught one of the early MOOCs (and still the only one on quantum computing), that I got more seriously interested in the medium. As everything he does, Umesh took the course deep at heart, and taught it passionately, striving to pack his signature “nuggets” of insight into the rigid format of a 10-minute video meant to be accessed “massively online”. Umesh had great hopes for the medium, and his course was wildly successful. It has certainly been cited as their main source of education in quantum information by many a Caltech Ph.D. applicant!

Recognize anyone?

I have the impression that the initial explosion of interest in MOOCs that took place in 2012-1014 has subsided somewhat, and the medium may be going through a phase of soul-searching: given that it is now clear MOOCs will not cure all the world’s ills, can they at least be useful for something? The “online education” medium certainly has its own challenges. As a friend working in the psychology of learning put it somewhat bluntly, “students don’t learn anything from videos – don’t waste your time” (too late, Joseph – I had started recording already). There is some truth to this (though as long as we don’t forget the 2x button there’s still a chance the students might condescend to skimming through some fraction of the videos in a quick hunt for hints towards the pset solutions).

The point though is that broadly interpreted “online education” should have much more potential for engaging the students than simply dumping video clips on them . This is what I’m interested in exploring: the extent to which it is possible to set up a stimulating, useful interactive process for the students, with some necessary “downtime” spent reading notes and working problems on their own, and some “uptime” spent watching videos but also participating in forums, thinking through stimulating questions, checking out other student’s answers, etc.

The second reason is perhaps less self-centered. I think we picked the right topic. A two-word title, each of which bound to associate with a whole range of exciting, if nebulous, concepts in the mind of any young apprentice-scientist, what is not to like? Quantum cryptography has recently (and somewhat justifiably) been drawing a lot of attention, from the expanding range of start-ups to the NSA through Chinese rockets. But surprisingly few resources are available to whomever seeks a serious introduction to the subject. While great books on quantum computing and quantum information are popping up, none gives much attention to quantum cryptography. is no book on quantum cryptography, few if any lecture notes (I searched, and the only consistent notes I could find are from a course taught by Unruh in 2014), few surveys (see this recent one by Broadbent and Schaffner, which however remains at a very high level), and a scientific literature that is not easy to navigate.

What?

So this is what Stephanie & myself set to do, share our enthusiasm for quantum cryptography with you, young enthusiasts! What will the course be about? If you care you should watch the promo video, sign up, and take the course!

We hope to use the ten weeks we have to take the absolute beginner in quantum information and cryptography to the level where she has a solid conceptual and mathematical understanding of quantum cryptography. A large chunk of the course will be devoted to the description and security analysis of protocols for quantum key distribution (QKD). This includes the thorny notions associated with measuring security with respect to quantum adversaries, and the recent paradigm of device independence. Beyond QKD, we also aim to give a broad overview of other feats of quantum cryptography, including primitives in two-party cryptography such as bit commitment or coin flipping, the noisy storage model, position-based cryptography, and delegated computation. Unfortunately we will not have enough time to cover quantum attacks on classical cryptosystems or post-quantum cryptography, leaving space for a companion “quantum cryptanalysis” course – volunteers?.

A word of caution: a solid background in linear algebra will be required for the course. Or at least, a will to put in the serious effort that will be required to follow the material. Making the course accessible does not mean dumbing it down, and the less mathematically inclined might find it challenging. The upshot, though, is that whomever sticks around will find it an intellectually rewarding experience, as the course will bring you to a stage where you’re basically ready to start doing research in the area. Not to scare everyone away: the course will start slow, with a “Week 0” of background material rolling out now, and the first couple weeks devoted to an introduction to the important notions of quantum information. My a priori assessment is that any 3rd or 4th year Caltech undergrad should have no problem taking the course. We’ll see how it plays out, but you should certainly try!

I will be teaching the course “inverted classroom” style at Caltech (as will Stephanie in Delft). The students will be asked to watch the videos, and more generally go through the online content, ahead of regular class meetings. In class, depending on the level of the students and their understanding of the material we will provide additional explanations or – as I hope will be possible – go further and develop our own projects extending the online material in directions of interest to the students.

This is a new experience for me, and I already stumbled through some of the beginners’ mistakes. But it’s fun! And, I hope, worth it. I’ll keep you updated. If anyone has experience (as I’m sure some of you do), unsolicited advice is more than welcome!

flour + water + salt + time =

And there goes my last worry about the possibility of life in California…check! Would never have imagined it’d be so easy.

Posted in random | Tagged | 2 Comments

Foundations of randomness, Day 3

Last day! Wednesday started with a bang. Yevgeniy Dodis spent the whole Tuesday evening (our workshop dinner at Lanzerac!) complaining that we weren’t leaving him enough time to prepare for his talk…only to spend an equal amount of time during his talk complaining that he wasn’t going to be able to cover half of the material he had prepared! (To view things positively, this says something about the level of interaction during the talks — Yevgeniy was flooded with questions; quite a show of force for a post-workshop dinner morning.)

Dodis – Randomness in cryptography

I regret not having saved a picture of the table Yevgeniy drew on the whiteboard for his talk. There were four columns: no randomness, weak randomness, local or public randomness, and perfect randomness. For each column Yevgeniy listed four cryptographic scenario: soundness (for which classes of computational problems are there sound interactive proof systems), authentication (secure implementation of primitives such as message authentication or digital signatures), privacy (tasks such as bit commitment, secret sharing, or even public- or private-key encryption) and extraction (is deterministic extraction of uniform randomness possible). The scenario are listed in increasing order of difficulty, and the question is how far each kind of randomness lets us go down the list.

The first and last columns are straightforward to fill in. In the first, corresponding to a world with no randomness, there is very little one can do. In terms of soundness it is possible to verify languages in NP, but no more; cryptographic tasks are clearly impossible (determinism does not allow for uncertainty — at least for this talk!), and certainly extraction does not make much sense. If, on the other hand, perfect randomness is available many feats can be accomplished. It is a milestone of complexity theory that all languages in PSPACE have interactive proof systems; the hard work of cryptographers over the past couple decades demonstrates an impressive range of cryptographic tasks that can be implemented with or without computational assumptions. And of course extraction is not a problem.

Things become interesting with the second column. Suppose given access to a source of randomness taken from a family of sources which have a guaranteed min-entropy rate, but from which it is not possible to deterministically extract uniformly random bits (such as Santha-Vazirani sources). Can such a source still be used to implement non-trivial cryptographic tasks?

Here Yevgeniy drew our attention to an interesting distinction between authentication and privacy. Indeed, the latter is much more demanding in terms of the “quality” of the randomness used. For an adversary to break an authentication task, such as a message authentication code (MAC), it is required that the adversary be able to forge a certain object — e.g. a fake signature for an unknown message for which another valid signature is known. In contrast, for the adversary to break a privacy task, such as encryption, it is sufficient for the adversary to distinguish between two distributions — encryptions of ${0}$ and encryptions of ${1}$. Thus at a high level it is not too surprising that it should be possible to implement authentication tasks in a way that is secure as long as the sources of randomness used to implement the task has a guaranteed min-entropy rate. Privacy tasks, on the other hand, seem to require much more structure from the random source. In fact, if one thinks about it the task of secure encryption is very closely related to that of extraction, and one may readily expect that privacy can only be achieved if the family of sources allows for deterministic extraction. This connection is made formal in a paper by Bosley and Dodis.

Returning to his second column, Yevgeniy introduced the notion of an ${(t,\delta)}$-expressive source, which informally is a class of sources such that for any two efficiently computable functions ${f}$ and ${g}$, if ${f}$ and ${g}$ differ on a significant (at least ${2^{-t}}$) fraction of inputs (measured under the uniform distribution) then there is a source in the family such that the expected statistical distance, when evaluated on that source, between ${f}$ and ${g}$ is at least ${\delta}$. That is, a family is expressive if it “separates” pairs of functions that differ significantly. Now the main results are:

1. Many common families of sources are expressive — including all families from which deterministic extraction is possible, but also e.g. Santha-Vazirani sources and various generalizations thereof.
2. No privacy task can be achieved if the only available randomness is taken from an (a priori unknown) source from an expressive family. Proof sketch: consider an encryption scheme. Of course encryptions of ${0}$ and encryptions of ${1}$ should be different most of the time (otherwise decryption will fail). Therefore by the definition of an ${(t,\delta)}$-expressive family, for say ${t=0}$ or ${t=1}$, there should be a source in the family under which both types of encryption can be efficiently distinguished — contradicting security.
3. Some authentication tasks, including MACs and digital signature schemes, can be implemented from some specific expressive families, such as Santha-Vazirani sources or their generalization block min-entropy sources.

The proof of the second point above is trivial, but the result is quite interesting, highlighting the importance of making the right definitions! I am not familiar at all with this area, but it seems that the notion of expressive family of sources is a meaningful one to have in mind. Now I only wish Yevgeniy had time to say something about the third column (public vs private randomness)…unfortunately it will have to be for another time.

Fawzi – Exams

The following talk was given by Omar Fawzi, soberly entitled “Exams” (last day…it had to come!). Omar considers the following scenario. Suppose an exam consists of ${n}$ possible questions ${X_i}$, and a student has some information, or knowledge, ${B}$ based on which she attempts to answer the questions that are asked to her. Suppose the student achieves a certain score when asked about a random subset of ${k}$ of the ${n}$ possible questions — technically, given indices ${\{i_1,\ldots,i_k\}}$ the student produces guesses ${A_{i_1}(B),\ldots,A_{i_k}(B)}$ such that ${A_{i_j}(B) = X_j}$ for a fraction ${\Delta}$ of indices ${j\in\{1,\ldots,k\}}$, on average over the (uniformly random) choice of the ${k}$ indices and the student’s randomness (including the random variable ${B}$ and the map ${A}$).

Can we deduce that the student simultaneously “knows about” a fraction ${\Delta}$ of all possible questions in the exam — if she was asked about all ${n}$ possible questions at once, would she typically achieve the same score as when asked only ${k}$ of them? To make this a little more precise and see why it is not obvious, consider the following simple scenario. The correct answers ${X_i}$ are either all ${0}$, or all ${1}$, each with probability ${1/2}$. The student’s information ${B}$ consists of two bits; the first bit ${B_1 = X_i}$ (for all ${i}$), and the second bit ${B_2}$ is an independent coin flip. The student’s strategy is to always answer ${B_1\oplus B_2}$. Note that this student is not “optimal”, as she has “good days” (${B_2=0}$) on which she gets everything right, and “bad days” on which she scores ${0}$. Our task is not to extract an optimal student, but to reproduce this particular student’s behavior at the scale of the whole exam: we want a strategy that gets the whole exam right half the days, and the whole exam wrong the remaining time — not a strategy that gets half the exam right on every day! So you see that the most natural “rounding” procedure, which would decompose the set of ${n}$ questions into ${(n/k)}$ groups of ${k}$ and look up the student’s answers to each group, will not work. One needs to take into account the randomness in ${B}$ and in the student’s strategy to extract a global strategy that is consistent with the local behavior.

The scenario is reminiscent of what are known as “direct product tests” in the PCP literature, and there is a long sequence of works on the topic, see e.g. Dinur and Reingold, Impagliazzo et al. and Dinur and Steurer. As far as I could tell there are differences that make the problems incomparable: for instance, direct product tests attempt to match two consistent local views to a global object, whereas here the exam is guaranteed to pre-exist; direct product tests consider complete agreement instead of fractional agreement ${\Delta}$ here; etc. But the problems are similar enough that you’d expect techniques used to approach the one could be useful for the other as well. For anyone interested, see Omar’s slides for a more precise formulation of the problem he considers.

The situation is even more interesting in the quantum setting. Suppose the exam remains classical (though one could consider quantum exams as well…), but the student’s information ${B}$ is a quantum state. Then all we know is that for every set of ${k}$ questions the student has a measurement on ${B}$ that produces answers with a certain success rate. Can these measurements be combined in a way to provide answers to all questions in the exam simultaneously? Since any measurement will necessarily affect the student’s state we can’t simply repeatedly measure ${B}$, and monogamy prevents us from taking many copies of the state (which couldn’t all simultaneously be correlated with ${X}$ in the appropriate way). The problem has a very similar flavor to the one faced when proving security of randomness extractors against quantum side information. Again there are subtle differences, but one may expect some of the techniques developed in that area (such as the work of Koenig and Renner on sampling-based extractors) to be useful here.

The day ended with two afternoon talks each highlighting a number of open questions in device-independent randomness certification, and some leeway that the speakers have recently achieved. This left us with a solid set of take-home problems to think about!

Pironio – Some open questions on DI randomness generation

The first talk was given by Stefano Pironio, who picked three of his favorite open questions in device-independent randomness, and had time to tell us about two of them in some detail.

Stefano’s first question asks whether it is possible to construct deterministic extractors for quantum sources. Although it is unclear whether this would have any application, it is a very natural question. The main observation is that the “raw” random bits that are produced in a device-independent randomness extraction procedure (say, the outputs of Alice’s device in a protocol which sequentially repeats the CHSH game) not only have a guaranteed min-entropy rate, but they also have additional structure that comes from the way they are produced. What this structure is, and whether it is sufficient to guarantee deterministic extraction, is precisely the content of the problem. Stefano showed us a specific inequality that he could show always holds, and hoped would suffice: if ${A_i\in\{-1,1\}}$ is the ${i}$-th bit output by the source, then the average of all correlators

$\displaystyle \frac{1}{2^n}\Big| \,\sum_{S\subseteq [n]}\, \big|\langle \Pi_{i\in S} A_i \rangle\big|\, \Big| \,\leq\, 2^{-\delta n},$

for some small ${\delta}$ depending on the CHSH violation observed. In fact Stefano claimed the inequality is already enough to show, via the probabilistic method, that a deterministic extractor should exist; thus it remains to find one…This is an appealing question, and I wouldn’t be surprised if it can be solved by the pseudorandomness experts out there!

The second problem Stefano presented has to do with the relation between randomness and entanglement. Suppose that the devices used in a randomness expansion procedure share ${n}$ “ebits”, or entangled bits. How many bits of randomness can be generated? This is both of theoretical and practical interest: entanglement is hard to come by, and of course we want to get the most out of whatever we have. All known protocols generate ${O(n)}$ random bits from ${n}$ ebits, but there is no reason for this to be a hard limit; intuitively entanglement is required for testing the devices, but less directly so for generating randomness (see also the next talk by Acin). And indeed, Stefano showed us that it is possible to go further: he has a protocol which generates ${n^{1+c}}$ randomn bits, for some small ${c}$, using ${n}$ ebits.

The last question, on which we unfortunately did not have time to go into, is of obtaining guarantees on the randomness that hold against “post-quantum” adversaries, limited only by the no-signaling principle. This is a fascinating problem on which very little is known. There are no-go results for privacy amplification in this setting, but they only apply in rather stringent scenario; moreover they pose no immediate obstacle to guaranteeing entropy, rather than uniform randomness. The problem is challenging because the general non-signaling framework gives us very little to work with: a family of conditional distributions, but no such thing as a state, a measurement, a post-measurement state, etc.; very few tools are available. On the other hand it makes the problem mathematically very clean, almost “classical” (no more Hilbert spaces, no more quantum!). I only wish I had more intuition for it!

Acin –  How much randomness can be certified in quantum and general non-signalling theories?

In the last talk of the workshop Antonio Acin asked the question — “How much randomness can be certified in quantum and general non-signaling theories?” — and gave us some rather counter-intuitive answers. As already mentioned in Stefano’s talk, the prevailing intuition is that the amount of randomness that can be extracted is directly related to the amount of “quantumness” present in the devices, where “quantumness” would be directly related to entanglement, since entanglement is clearly the “non-classical” feature exploited here. But this need not be so: even though entanglement is necessary for certifying the device, it is not a priori required that it serve as a resource that gets depleted as more and more randomness gets produced.

As a first result Toni showed how more randomness could be extracted from a single measurement, on a single entangled EPR pair, by considering general POVM measurements instead of projective ones. It is known (and not hard to show) that making POVM measurements on an EPR pair can only yield at most one bit of randomness per site. Toni introduced a four-outcome Bell inequality such that a maximal violation of the inequality requires the use of a non-projective four-outcome measurement on one half of the EPR pair, and a state such that each of the four outcomes has equal probability ${1/4}$ when the measurement is performed — thus yielding two uniformly random bits from a single qubit!

This is unexpected, but perhaps not groundbreaking: one bit, two bits, ok. But then Toni dramatically scaled things up. His second result is that by performing repeated measurements on a single, constant-size entangled state (I think in his scheme this is again just a pair of qubits) one could generate an infinite sequence of random bits, containing an amount of entropy which certifiably goes to infinity — provided the requisite Bell violation is guaranteed.

If you’re wondering about the relationship between this result and the one presented in the talk by Henry Yuen on infinite randomness expansion, there are some important differences. First of all Toni’s result applies under the assumption that the device maximally violates a certain Bell inequality, but there are no bounds on the accuracy required: thus in principle testing for a sufficiently-close-to-optimal violation could require an arbitrarily large number of runs; in Henry’s scheme the testing is fully accounted for. On the other hand Henry’s scheme requires the devices to have an unbounded number of EPR pairs available, whereas the main point of Toni’s scheme is that a single pair can be re-used over and over again.

The result is a proof of concept (another limitation is that the Bell inequality requires to perform a measurement on one of the devices that has an exponential, in the number of random bits produced, number of possible settings — the other device being measured sequentially using measurements with two settings), but I again find it rather unexpected. The whole area of device independence is turning out to be a great testbed for many of our (bad) intuitions about quantum mechanics and nonlocality!

Toni ended his talk with a third intriguing result, relating algorithmic randomness to information-theoretic randomness as we usually understand it. The main observation is the following. Suppose a classical Turing machine is used to prepare a sequence of qubits where each qubit is either initialized in one of the two possible computational basis states, or each qubit is initialized in one of the two possible basis states in the conjugate basis, the Hadamard basis. Then there is an algorithmic procedure that always succeeds in distinguishing which is the case! Note that the Turing machine may very well be producing a sequence that for all practical matters appears fully random, so that one can for instance show that no procedure which makes a single choice of basis and measures all qubits in that basis will be able to tell if it got the right basis or not. Instead the trick is to use both bases: measure half the qubits in the computational basis, and half in the Hadamard basis. In this case, one of the sequences obtained will be uniformly random for sure, and the other will be the sequence produced by the Turing machine — and as it turns out, it is possible to distinguish the two algorithmically. See the paper for more details, and further results along those lines.

The day ended with a very brief “open problems session”: Toni wrote down some of the questions that were raised during the workshop and we all commit to solve by the next iteration. I’ll leave you with a picture of the board. There are certainly many other questions we could ask, some of which are interspersed in these blog posts; feel free to suggest more!