Science funding


Hard at work in Puerto Rico

Barely back from a workshop organized in Puerto Rico by Ryan O’Donnell, Krzysztof Oleszkiewicz and Elchanan Mossel, and lavishly funded by the Simons foundation, I am greeted on my BART ride home (thanks wifirail) by a NYTimes piece Billionaires With Big Ideas Are Privatizing American Science, featuring Jim Simons among many others. I had planned to blog about the excellent workshop, but this task has already been pre-empted by Li-Yang Tan’s comprehensive posts over at Ryan O’Donnel’s blog. So instead I will pick up on the NYTimes piece, which although lacking in verifiable facts and hard data raises some interesting issues. Being entirely funded by the Simons foundation through the Simons Institute in Berkeley (though I should say I don’t know how much of my salary comes directly from the foundation, and how much is contributed by UC Berkeley or other sources), whether I like it or not I am an integral part of the debate and find it worthwhile to give it some thought.

I fact, I do take to heart the question of where my funding comes from. Perhaps this is due to the somewhat unusual circumstances of the first funding I ever received, from the French government. I was fortunate to undertake my undergraduate studies at École Normale Supérieure in Paris, one of France’s (in)famous “Grandes Écoles”. One of the perks of ENS is that all students receive a modest monthly salary, the result of us being made “civil servants” at our entrance into the school. As a foreign national (I am Belgian) I found this “unreserved confidence” granted me by the French state (a job for life solely based on my results in some math exam!) somewhat humbling. Paid to study! Did they really think I would actually produce something useful, or was the French government just living through its characteristic largesse? (Obviously, the latter :-)) The funding did come with some conditions: in exchange for it I promised 10 years of work to the government (this includes working as a teacher, or researcher, for any of the French public organizations). I still owe them six years (the four years undergraduate count towards the total), but, shhhhht, so far they haven’t bothered to come after me…

Since those years I have been continuously supported by public funds, mostly from the French, Canadian, Singaporean and American governments, under what I can only qualify as wildly un-restrictive conditions. What justifies such unreserved trust? Behind the governments are taxpayers, some of whom voted for, others against, devoting a small percentage of their income to supporting researchers like myself. Although it may sound a little naïve I am proud of being awarded such confidence, and believe it is important to understand why and for what purpose it was given, and to do my best to deliver on this purpose. (Note here I am taking “purpose” in a strongly ideal sense, setting aside the cynical politics of science funding and instead focusing on what I perceive as the “right” motivations for funding science.) I owe a lot to public funding, and aside from the Simons foundation this semester I have not yet received much private support, although this may change in the future.

Back to the NYTimes piece. Are big private donors’ efforts to “leave their mark” by directing huge amounts of money into research on one particular area with large potential  impact — curing a specific disease, saving a species, flying civilians into space, etc — a threat to the “purity” and “independence” of scientific research? Are such efforts anything more than a spoiled child’s whim? As a scientist on which relatively small but nevertheless impressive amounts of one such donor’s money have recently been expended with virtually no direct oversight (witness the Puerto Rican resort for last week’s workshop, that I won’t dare link to) it would perhaps be disingenuous to be dismissive of the enterprise. But still… what do they know about fundamental research? What makes them qualified to influence, the future of science? The billionaires cited in the NYTimes article have made their fortunes in a variety of industries, software, oil, housing, pharmaceuticals, investment. Aren’t they doing more harm than good in throwing their checks around, paying others to achieve feats they will boast about at cocktail parties? Should we, scientists armed with the purest ideals, fall prey to the billionaire’s fancy?

I don’t think it’s that simple. The NYTimes piece presents some of the drawbacks of private funding, such as skewing research toward trendy areas or diminishing public support for federal science funding. These are real risks and identifying them is the first step in limiting their effect. Potential benefits, however, may far outweigh these risks. I am going to take a very idealistic (who said provocative?) perspective: private philanthropy at its best is the way science (and more generally the arts) should be funded. What is the purpose of science? It is a means to advance human knowledge. (There are others, such as philosophy or the arts, that are just as important or successful, but here I will focus just on science.) Given such a grand quest facing humanity’s finite resources, the important questions are: which directions, which challenges, should be privileged, and who should be elected to tackle them? Perhaps just as important, who should be empowered to decide on the answer to those questions: an all-powerful government (and its bureaucrats) or the scientists themselves?

Jim Simons

Neither. The decision-making process of any big government is unfathomable; there is no reason to think most bureaucrats are particularly qualified to make the funding decisions their appointment entails them to. Paradoxically their job might even make them less qualified than average at making those decisions (here I am thinking about high-level decisions on research directions or paradigms to pursue, rather than low-level decisions as to which particular grant applications should be funded). Indeed it is hard to imagine that even the best-intentioned employee of any big governmental agency would not eventually fall prey to a strong bias in favor of a very specific kind of proposal — say, one that emphasizes “impact”, or “public outreach”, or whatever is the agency’s current doctrine. Scientists suffer from a similar bias, though it is generated by an antipodal situation: they are too qualified to be the ones making decisions as to the future of science. Being right at the heart of the scientific process itself, and in spite of our grand claims to the opposite, our perspective is inevitably near-sighted. We are strongly biased in favor of directions and ideas that, although they may appear very bold, are rooted in the current techniques and scientific understanding and less prone to the kind of bold, wild projects that only the more “ignorant” could make.

Instead, I would like to argue that the best-placed to make paradigmatic decisions about fundamental research are strong-willed individuals, outsiders, who have a high but non-vital stake in its outcome. Note how this brings us close to, but not quite at, the claim that science funding should be directly decided by popular vote. I don’t place much trust in democratic voting; it is too subject to manipulation and places too important a stake in the hands of the many who simply do not care, or are easily led to care for the wrong reasons. Hence the insistence on “individuals” — decisions should be made by a few responsible persons rather than an amorphous group. I further strongly believe decisions should be taken by people who care about them; there is no reason to make those who don’t care a part of the process. I insist on “outsiders” simply because they are the only ones really free to make the kind of non-obvious choices necessary for scientific innovation. Finally, by “having a stake in the outcome” I mean that self-appointed decision-makers should be allowed only if they have a strong personal incentive in taking the matter seriously; no “strong-willed dilettantes” please!

So, private donors anyone? There are some clear risks, many of them exposed in the NYTimes article (and a Nature editorial linked to from the article), and I am certainly not advocating for a take-over of the whole decision process by private individuals. Private donors at first sound particularly undesirable as they may have a very specific kind of stake in the process, emphasizing “return on investment”, as well as a particularly directive approach more appropriate in a business-like environment. It is nevertheless interesting that the NYTimes piece makes it clear private donors often have very personal motivation for investing in a particular kind of science, either because they (or their close relatives) suffer from a certain rare disease, or they have a lifelong dream of achieving a particularly challenging technical feat (e.g. flying into space), or maybe had a life-changing personal encounter with this or that scientist, etc. These are all great reasons to promote a particular kind of science. In general I disagree with the image of scientific progress as an objective path towards a greater good. Science has always been rooted in a particular social and political context; there is no point in uprooting it, quite the opposite. Giving some amount of decision-making power back to individuals who care may be one of the best ways to ensure science accomplishes what it’s best at, helping human beings discover themselves and their world.

I think some amount of private decision-making (rather than just funding) may also help us, scientists, stay on our toes and remember why we are here, why we are given such trust; trust should never be unconditional. I don’t think it is doing us a favor to be awarded the comfort of an almost-guaranteed support in the form of “small grants” for almost whatever research proposal we may come up with. Instead I believe we should be ready to fight for, and defend, our ideas and research proposals. Comfort is never helpful. If this means having to pitch our ideas to individuals, then so be it. At least they actually take the issues at heart; much better than a double-blind interaction with some anonymous committee. As for scientific freedom and integrity, I trust us scientists to continue making it clear that we ultimately are the ones in charge; as long as we keep our reputation up — admittedly a perpetual fight — I don’t see the balance tipping in the wrong direction.

Alright, so perhaps this was a little provocative :-) What do you think?

Posted in Science, Simons | Tagged , , , | 6 Comments

Workshop on “Quantum games and Protocols” at the Simons

** Update: videos are now up! **

This past week saw the first of three week-long workshop that will rhythm the semester on quantum Hamiltonian complexity at the Simons institute (I’ll count the boot camp was the zeroth-such workshop). Dorit Aharonov, John Watrous and myself co-organized the workshop around the theme of “quantum games and protocols”. The general idea was to investigate the strengths and limitations of quantum machines — depending on the context, quantum computers, quantum provers, quantum adversaries, etc. — in interactive scenario, both quantum and classical. This included questions such as the possibility for delegating quantum computations through classical means, the problem of verifying quantum mechanics, the power of multi-prover interactive proofs with entangled provers, the quantum PCP theorem, two-party cryptography, device-independence, and more!


The Simons auditorium (courtesy D. Nagaj)

We were extremely fortunate to have an absolutely excellent line-up of speakers for the workshop. Each of them put a huge amount of effort into preparing talks that would be accessible to the relatively broad audience — complexity theorists, experimentalists, mathematicians, cryptographers — while still managing to distill and communicate the “expert’s nugget”. As all talks given in the Simons’ main auditorium they were expertly recorded, and I can only highly recommend watching the videos. Here is a partial set of highlights from the workshop: (Right now the links only point to each talk’s abstract, where the recorded videos will appear in about a week; in the meantime the talks can be viewed by navigating to the correct date and time, inferred from the workshop schedule. on the Simons’ ustream channel.)

Umesh Vazirani gave a very nice, focused introduction to the problem of “testing for “quantumness””. Many explanations are often put forward to justify the power of quantum computing: exponential-sized vector spaces, complex amplitudes, the measurement rule, etc. Umesh argued that none of these ingredients by itself is sufficient to distinguish quantum states from probability distributions; rather he pointed to the subtle combination of entanglement, the observers’ freedom in choosing a measurement basis, and the quadratic nature of the measurement rule — {\sin^2\theta} vs {\theta} — as the “magic combination” that underlies the irreducible “quantumness” of our world. Umesh then broadened up significantly by outlining some of the far-reaching implications of this simple difference: by allowing for the rigid testing of quantum states and measurements through the CHSH game this particular form of “quantumness” leads to a host of applications, from randomness certification to device-independent key distribution and delegated computation.

Daniel Nagaj delivered a fast-paced and beautifully illustrated talk on the diverse clock constructions used in proofs of QMA-hardness results for restricted classes of local Hamiltonians. Daniel argued that all known (and foreseeable) constructions of clocks necessarily have a spectral gap that closes as {O(L^{-2})}, where {L} is the number of gates in the encoded computation. Although the exact scaling could possibly be fine-tuned for each specific construction, it places a strong limitation on any attempt at constructing a local Hamiltonian that would satisfy the quantum PCP desiderata. I found Daniel’s talk very helpful in clarifying why known QMA-hard instances have a closing promise gap. The current reductions all go through the analysis of the low-lying eigenspaces of sums of local Hamiltonians, where we often think of nagajone of the Hamiltonians as a small perturbation of the other. The resolution to which this analysis can be done depends on the spectral gap of the larger Hamiltonian; the smaller this spectral gap the smaller the promise gap one will be able to maintain. Since the Hamiltonians that are manipulated, such as the clock Hamiltonian, tend to have vanishing spectral gaps, this is reflected in the final promise gap (watch Daniel’s talk — in slow motion — for more details!). As we know, large spectral gaps tend to impose strong structure, and this appears to be a significant difficulty on the way to quantum PCP.

Fernando Brandao described a striking implication of his recent de Finetti theorem with Aram Harrow. Their result essentially implies that “naïve gap amplification” is impossible in the quantum setting, in the following sense: any transformation of a local Hamiltonian that simultaneously at least (i) squares the degree of the constraint graph, and (ii) squares the local dimension of the qudits, must necessarily reduce the (relative) promise gap. This is highly counter-intuitive because classically, the operation which takes a constraint satisfaction problem and maps it to its “square” through parallel repetition achieves (i) and (ii) and furthermore has the effect of increasing the promise gap by a factor roughly {2}. Indeed, this is exactly the goal of parallel reptition; the Brandao-Harrow result shows that this most direct method of amplifying the gap will not work in the quantum setting.

On Tuesday morning were two talks exploring the hardness of multi-prover interactive proof systems. The first talk, by Dana Moshkovitz, presented recent progress on the “sliding scale conjecture”, which aims to devise PCPs with optimal trade-off between the verifier’s randomness, the alphabet size, and the soundness. Dana hasn’t resolved the conjecture (yet), but she presented strong progress by describing a low-degree test with very good soundness parameters. The test is obtained through the parallel repetition of existing low-degree tests, but the analysis goes through a new way to think about the resulting test, as a “curve vs curve” (instead of “line vs line”) test, where the curves intersect in multiple points. The second talk was given by by Zhengfeng Ji. Zhengfeng presented a comprehensive overview of the few hardness results that are known for the case of entangled provers, including a nice description of the key techniques used in the proofs. These include the design and analysis of various small “sub-games”, such as the consistency game, the confusion game, etc., which I hope will find further applications (see also a more technical whiteboard talk I gave on this topic at the boot camp). Zhengfeng’s talk was followed by a lively discussion on the relationship between inequivalent quantum versions of the classical PCP theorem (video link to come?).

Carl Miller and Henry Yuen presented exciting recent works on device-independent randomness certification and randomness amplification. Carl described how a very recent result joint with Yaoyun Shi could be understood as a way to constrain a device to perform anti-commuting measurements, thereby generating randomness irrespective of the underlying state. Henry showed that existing randomness expansion and certification protocols could be composed with themselves in a way that allows for unbounded randomness expansion — the procedure can run indefinitely, with the same constant number of devices, while producing infinitely increasing quantities of trusted random bits!

Ran Raz gave a talk on recent join work with Yael Kalai and Ron Rothblum that I really like. They reduce the task of devising a computationally secure one-round delegated computation scheme for languages in time {t(n)} to that of creating a multi-prover interactive proof system for the same class of languages that has the strong property of being sound against non-signaling provers. This reduction is based on previous work, and the meat of their new result lies in devising such a MIP for all languages in EXPTIME, such that the MIP involes polynomially many provers running in exponential time, while the verifier itself runs in polynomial time. Their construction is very involved and combines a number of new ideas to control the soundness error in the protocol; perhaps some of these ideas could be helpful to analyze entangled-prover interactive proof systems.

David Perez-Garcia gave a very clear talk on entanglement in many-body systems; more specifically on the problem of classifying phases of matter and how the discovery of phases davidwith topological order provides new motivation, as well as new difficulties, for this task. David’s talk gave the best explanation I had ever heard of the problem (of course I am quite new to this), and I highly recommend watching it to everyone interested in getting a smooth introduction to one of the hottest topics in many-body physics.

Thursday morning saw a wonderful trio of talks. The first, by Pablo Parrilo, presented semidefinite hierarchies and their use in quantum information (more specifically, separability testing and the quantum moment problem). Pablo gave a very clear picture of both primal (through sums of squares) and dual (moments) views of these hierarchies. Next Aram Harrow talked about ways to quantify, and use, the monogamy of general non-signalling correlations. He presented a few variants of a recent de Finetti result for such correlations that he obtained with Fernando Brandao. Aram also gave a range of applications for the de Finetti results (for instance, to proving hardness results for free games with entangled players) together with a host of open questions. In the last talk of the morning Patrick Hayden gave us an enjoyable tour of the “fundamental laws of quantum entanglement”. One such law is that entanglement can never be created out of thin air. However, in Patrick’s words, it can sometimes be surreptitiously “stolen”, by using embezzling quantum states. Those are special “catalyst” entangled states from which any other entangled state can be generated by local operations alone (up to precision), while leaving the catalyst essentially unchanged…a fascinating topic!

Iordanis Kerenidis gave an hour-long marathon reporting on four years of (team)work (now available on arXiv!) in dissecting, understanding and improving Carlos Mochon’s breakthrough showing the existence of quantum protocols for weak coin flipping with arbitrarily small bias. I found Iordanis’ talk very inspiring — as I told him, the first half of the talk really made me want to work on coin-flipping. It is quite a basic primitive, but besides QKD one of the rare fundamental such primitives for which there is a strong quantum advantage. And its analysis gives rise to such a beautiful, and apparently powerful, formalism for the study of quantum interactions (point games anyone?) that one feels one ought to lay one’s hand on it. The second half of Iordanis’ talk, however, involuntarily served as strong caution from someone who has invested years of effort on the topic: although big rewards may be awaiting discovery, do not expect them to come easily! Indeed, Iordanis described some of the more technical (and subtle) aspects of the SDP–point game correspondence, and it is not for the faint-hearted…check out the talk!

These are only a few of the many enjoyable talks that rhythmed our week — others included two presentations by Ben Reichardt on his powerful rigidity results for the CHSH game and their use in delegated computation; Joe Fitzimon’s talk on measurement-based delegated quantum computation; talks by Philip Walther and Eleni Diamanti reporting on challenges in experimental implementation of quantum cryptographic protocols; a survey of quantum zero-knowledge proofs and quantum proofs of knowledge by Dominique Unruh, who emphasized the key role played by quantum rewinding techniques; a beautiful talk by Amit Sahai on his breakthrough result on program obfuscation; a follow-up talk by Paul Christiano describing a tentative protocol for quantum obfuscation of classical circuits; a playful but dense introduction to a very fruitful connection between the theory of quantum games and tensor norms in operator spaces by Marius Junge; a survey on the problem ofparallel repetition and direct product testing by Irit Dinur, including her recent work with David Steurer proposing a new, algebraic (they call it analytical) framework for proving such results; two back-to-back presentations on the first comprehensive entangled-prover parallel repetition results by André Chailloux and Attila Pereszlényi; a talk by Jamie Sikora presenting an improved understanding of a certain class of coin-flipping protocols; a beautiful talk by Zeph Landau giving a a tensor-network view of coin-flipping protocols (directly inspired from the Gutoski-Watrous framework for the description of two-player quantum games) that leads to a simple proof of Kitaev’s lower bound, and, last but certainly not least, a presentation by Scott Aaronson on his work with Alex Arkhipov promoting Boson Sampling as an experimentally realizable, and verifiable, demonstration of “quantum supremacy”.

All in all this turned out to be a very intense, but also very gratifying, week. As an organizer I feel like we were extremely lucky to have not only an excellent line-up of speakers but also a formidable bunch of enthusiastic participants. The whole workshop was very interactive, with lots of interruptions, questions, and small-group discussions taking place during and after the talks. The building in which the Simons institute i hosted made this possible by providing us with a beautiful auditorium, including windows letting the natural light in and a fairly flat structure (only five rows) that encourages participation, as well as ample (and pleasant) discussion space in the upper floors. If anything I wish we had managed to keep a slightly lighter schedule, to allow even more time for discussion. Aall the talks were attractive and well-attended, and by Friday afternoon we were all pretty tired…I was very impressed that many participants still had the energy for the lively open problems session that ended the week on a high note. Once again, a very heartfelt thanks to all the participants; I am looking forward to the next two workshops of what has so far been quite an exciting semester.

Posted in Conferences, Simons | Tagged , , , , | 2 Comments

Simons bootcamps

I guess it wasn’t that bad…

Each semester the newly-founded Simons Institute, hosted on the UC Berkeley campus, simultaneously hosts two thematic programs. The first two programs, which took place in the Fall, were onTheoretical Foundations of Big Data Analysis and Real Analysis in Computer Science. This Spring two new programs just got started, on Quantum Hamiltonian Complexity and Evolutionary Biology and the Theory of Computing (I’ll let you guess which one I’m participating in!)

You’ll notice that both programs are strictly in line with the Institute’s stated goal of “expand[ing] the horizons of [computer science] by exploring other scientific disciplines through a computational lens”. But there is a subtle difference in their titles, which was partially reflected in their respective “boot camps”, or introductory workshops, held over the past couple weeks. While the quantum program has a self-assuming title, the other program’s resonates closer to an explicitly forced conjunction, “biology and computing”. Hamiltonian complexity is a fast maturing field, initiated in great part by Alexei Kitaev in the early 2000s. “Computational evolutionary theory”, in contrast, is a field in wanting. One of the goals of the program on evolution will likely be to set the foundations, and perhaps more importantly the expectations, for this field; laying out its ground questions. (Or, as Christos put it in answer to a question about which open problems he would conceivably think could be solved during the semester, perhaps the program’s most ambitious goal is simply to “forge life-long ties”.)

If one is to judge by the talks given in the boot camp this is likely to be an exciting but challenging task. The idea for these boot camps is to bring all participants “on the same page”. I won’t comment much on the QHC bootcamp, although I’d like to come back to it in a separate post in which I will elaborate a little on a talk I gave about hardness results for quantum multiplayer games. For now I just want to point to the talk given by Christos Papadimitriou in the Evolutionary Biology boot camp. I highly recommend watching both hour-long talks, including (especially) the Q&A that took place throughout, and at the end, of each of the talks. I always enjoy the density of Christos’ talks: on the surface they may appear quite light, and the material covered bounded. But Christos always manages to raise an impressive number of key conceptual (and often controversial!) issues using few but well-chosen words.

About 27 minutes into his first talk Christos raises the following question: if protein folding is an NP-hard problem (as it is in general), then how come proteins fold by themselves? Apparently this is known as Levinthal’s paradox and has been discussed since the late 60s. The reason I am bringing it up here is the question sounds oddly familiar. It is the same question that was discussed at length in the QHC bootcamp, and also on this blog: if finding ground states of local Hamiltonians is a QMA-hard problem, then how can physical systems ever reach equilibrium? Is nature solving a QMA-hard problem? If so, how?

It was interesting to see how the same fundamental question on the physical bearing of results in computational complexity arose in both programs. The audience in Christos’ talk immediately raised some of the same objections that are often discussed regarding the QMA-hardness of the local Hamiltonian problem (and the possible physical bearings of the quantum PCP conjecture), and more:

  • What if nature doesn’t find the true optimum? Do we know that the actual protein foldings that occur are optimal? How could we if it’s a hard problem? (Quantum version: do complex systems really thermalize?)
  • NP-hardness is a theory of the worst-case; maybe the hard instances are not physical. In Christos’ words:“Life is hard but natural selection can favor easy inputs”. (Quantum version: hard instances of LH may not be physical; for instance the interaction graph may not be geometrically local.)
  • Maybe the proteins that occur in nature have been selected precisely for their ability to fold easily? Could evolution be such that the only surviving proteins are those that can find their optimum folding configuration by some weird indirect method? (Quantum version: what if the only quantum systems stable enough that we can study them are precisely those for which finding the ground state is an “easy” problem?)
  • NP-hardness is an asymptotic theory. It doesn’t forbid the existence of exponential-time algorithms with very good constants. Maybe Nature is a skilled optimizer.
  • Perhaps the needed advice was planted at Creation?

There are nuch more, and once again I encourage you to watch & enjoy the talk.

Posted in Conferences, Simons | Tagged , , , | Leave a comment

Quid ITCS?

Earlier this week I attended the fifth edition of the ITCS (“Innovations in Theoretical Computer Science”) conference, which took place over the course of three days in a beautiful, stately auditorium inside the Field center on the Princeton University campus. This was my first time attending the conference and, given the noise that surrounded the series’ launch just a few years ago, I was quite eager to finally experience it first-hand.

ITCS was created with great fanfare just a few years ago: initially named ICS, the first two installments, in 2010 and 2011, were held at Beijing’s famous Tsinghua university; one of their noted features was that an author of each accepted paper got a free ticket to the conference, including airfare, hotel, etc. The first ICS had an ambitious call for papers, emphasizing “research that carries a strong conceptual message (e.g., introducing a new concept or model, opening a new line of inquiry within traditional or crossdisciplinary areas, or introducing novel techniques or novel applications of known techniques)”, together with a substantial steering committee to back it up. Much was made at the time of ICS’s purposefully distinctive focus (as opposed to STOC, FOCS, or other TCS conferences sometimes decried as “too technical” or “narrow-minded”), and I remember heated debates on the blogosphere (see e.g. herehere or here for a sample and further links). Given such high expectations, I was curious: had the conference lived up to its promise?


A digression on conflicting publication venues

The reason for my attendance this year could be taken as a positive sign that it had. While the two papers I presented at the conference fall squarely within the realm of TCS, my co-authors and I strongly felt that the results should be publicized to a wider audience as well (in our case to theoretical physicists). ITCS, as opposed to STOC or FOCS, allowed for this possibility. Before explaining this, let me briefly describe the papers in a way that I hope makes it clear why we think it worthwhile to make the results accessible, not only to the TCS community, but also to researchers in physics:

The first paper provides the first complete proof of security for a cryptographic protocol for key distribution having the distinctive property that, although the protocol relies on quantum mechanics to achieve unconditional security (this would be impossible classically), the proof of security holds even in a setting where all the quantum devices used in the protocol are untrusted (they could even have been designed by the adversary!). The proof is based on ideas from computer science (such as randomness extractors or techniques borrowed from the analysis of parallel repetition of multiplayer games), but the end result has a strong bearing on our understanding of a phenomenon known as monogamy entanglement. The latter has received a lot of interest recently, and plays a role in areas ranging from the study of phase transitions in condensed-matter physics to the theory of black holes.

The second paper introduces a polynomial-time algorithm for a problem of interest in many-body physics: computing the ground state of a 1D gapped local Hamiltonian. The algorithm builds upon widely used techniques in computer science (dynamic programming; convex optimization and semidefinite programming; matrix-valued concentration bounds) but the end result, beyond its (probably impractical) use as an algorithm for the simulation of physical systems, demonstrates important properties of gapped systems that go beyond the celebrated “area law” that they have been shown to exhibit.

My co-authors and I thus strongly felt that, although the results should be published in a TCS venue (we are computer scientists, and we — though I should really only speak for myself — view these results as results in, made possible by, and of interest to computer science), they should also be publicized to the appropriate audience in physics. How could this be done? In computer science we are used to publishing an initial version of our results in conference proceedings, with the infamous “full version” eventually submitted to a journal (I will not get into the debate of whether this traditional distinction is still relevant…see this post by Sanjeev Arora for a recent discussion). In physics, papers are published directly in a journal; conferences are reserved for short, often invited, presentations, and do not have published proceedings. These differences lead to differences in copyright and priority handling: TCS conferences strongly insist that the submitted material has not appeared (nor even been submitted) elsewhere, while TCS journals are willing to publish minor updates on conference papers. In contrast, physics journals tend to have extremely restrictive copyright policies. To make a long story short (see this post by Umesh Vazirani for more), we were stuck: publication in a TCS conference is all but incompatible with publication in a Physics journal (such as Science or Nature), irrespective of how you do it (believe me, we tried!). Unless…one would treat the TCS conference as what it is, a conference, and not a publication, by submitting (and presenting) the full paper, but only publishing a very short one-page abstract in the proceedings (pointing to a full version of the paper from the arXiv). While STOC and FOCS do not currently allow this, ITCS has been explicitly permitting the publication of very short abstracts since last year. So we submitted to ITCS…and, even though the process of unentangling ACM’s copyright policies was by no means straightforward, we eventually found a workable solution — and I got to attend the conference!


Impressions from this year’s ITCS

1. The format

The first thing that struck me, as a presenter, is that the program seemed quite dense: there were four 90-minute sessions per day, each consisting of four 20-minute talks together with a 10-minute introduction by the session chair. This meant, first, that the schedule went on from 8:30am till 5:30pm each of the three days (including a welcome 2-hour lunch break), and second, that each talk was allotted 20 minutes, including questions and switching of speakers. In other words, I better prepare for a 15-minute talk — and still somehow manage to carry the “conceptual message” across!

Why such a crammed schedule? The CFP for the first ICS explicitly stated that “the total number of papers will be around 30, so as to allow for ample time for open and one-on-one discussions and exchanges of ideas.” In fact, the first PC barely kept in line with this requirement, as 39 papers were accepted to the conference. Since then the sentence has disappeared from the CFP, and this year there were 48 accepted papers (roughly the same count as last year). Although this only represents an additional 10 papers, it’s 20 more than the initial CFP called for. (Admittedly, the original ICS CFP also called for a conference spread over two and a half days, a requirement which seems never to have been implemented.) For comparison, STOC or FOCS typically accept around 70-80 papers, and have two tracks; ITCS has a single track. So the average load is similar — but STOC or FOCS don’t have an explicitly stated goal of “allowing for ample time for open and one-on-one discussions and exchanges of ideas”.

The space inside the Fields institute

In spite of the overloaded schedule, the fact that there was a single track did give ITCS a more relaxed feel than multiple-track conferences. It’s hard to pin down why exactly, but the difference is significant. Maybe it’s simply that it’s so much more pleasant to be able to simply walk into the designated conference room trusting (hoping) that the PC has done a good job and one is about to hear interesting work. No worrying about what is going on in the other room, and should I switch, and risk being caught up in one of the incessant hallway discussions — discussions which are certainly an integral part of the conference, but maybe there can be time for both! Even though total participation was fairly low (around 80 registrants or so), the talks were well-attended. The long coffee and lunch breaks allowed for ample discussion time; it helped they were held in a very nice 3-4 loosely divided rooms (together with a smallish “library”) with chairs, couches, and tables organized in various configurations which made it possible to comfortably accommodate all kinds of conversations and group sizes. Compared to the generic STOC ballroom-, or worse, corridor-style break area this was a big plus of the conference.

Another distinctive feature of ITCS are the 10-minute introductions given by session chairs at the beginning of each four-paper session. The chairs allocated those 10 minutes in more or less creative ways: while some simply distributed their minutes to the speakers, allowing a little extra (much needed) time for questions, others attempted a more in-depth introduction, including slides, to place the session’s papers in context. Some of these presentations were good, but not all were successful. It is not always easy to warm up the audience to yet another topic in such short time; being in the audience I also felt some impatience at hearing the results themselves, and found it difficult to not let my mind wander during the introduction. Maybe those could be kept shorter, to just 2-3 minutes for presenting the topic of the session; instead of a lengthy introduction it might be more beneficial if the chair placed his or her efforts into coming up with good questions to stimulate discussion after the presentations. (I’m not blaming the session chairs…some of the introductions were quite helpful; it’s not an easy task.)

2. The talks

Now, this is all good, but what about the talks themselves? ITCS’s main claim to fame lies in the following two sentences of its CFP:

“ITCS (previously known as ICS) seeks to promote research that carries a strong conceptual message (e.g., introducing a new concept or model, opening a new line of inquiry within traditional or cross-interdisciplinary areas, or introducing new techniques or new applications of known techniques). ITCS welcomes all submissions, whether aligned with current theory of computation research directions or deviating from them.”

The above statement is the only part of the CFP that relates to the content of submissions (as opposed to formatting guidelines). Since the second sentence has essentially zero content (either {x} or {\neg x}…), we are left hanging on the first. I wonder if the statement is purposefully elliptic; note that it does not even state what is desired of submissions, but only what the conference “seeks” to achieve. If this happened to be the first time I encountered ITCS I could be forgiven for being puzzled at the conference’s scope, and for being hesitant as to whether I should submit my results there. In contrast, the CFP for this year’s STOC clearly states “submissions presenting original research on the theory of computation are sought. Topics of interest include but are not limited to:”, followed by a long list of topics.

Now for the real question: did the program of this year’s ITCS achieve its stated goal of “promoting research that carries a strong conceptual message”? I’ll let you have a look at the program and judge for yourself. I have mixed feelings. On the positive side, a solid fraction of the results did have a twist to them that is less often seen in STOC/FOCS papers. Some papers, such as “Cryptogenography” by Brody et al. or “Building one-time memories from isolated qubits” by Liu, introduced a new model (the setting for the former paper is that one among a group of cooperating players holds a secret bit, and the goal is for the group to openly communicate in a way that any eavesdropper can learn the secret bit, but not the identity of its holder; the latter paper introduces a model of security in which the eavesdropper is not allowed to make entangling operations — a requirement presumably enforced physically by choosing an implementation for the OTM scheme that makes such operations highly impractical). Others, such as “Non-commutative arithmetic circuits with division” by Hrubes and Wigderson, obtained interesting results by studying an unconventional extension of a well-studied problem; such “outside-the-box” approaches can help put current efforts in perspective. Yet others suggested new approaches to hard problems; for instance “Tree Codes and a Conjecture on Exponential Sums” by Moore and Schulman suggests a path towards solving an important open question in interactive coding, but provides no “concrete” results. There was even a session devoted to results with application to, or motivated by, the physical sciences, including “Attribute-Efficient Evolvability of Sparse Linear Functions” by Angelino and Kanade, or “Rate-independent computation in mass-action chemical reaction networks” by Chen et al.

These are all results that I believe would easily fit in STOC or FOCS in terms of quality, but may not eventually make it because of the lack of, either concrete technical progress on a well-known problem deemed important by the community (such as some of the papers introducing novel frameworks whose long-lasting impact is hard to measure), or direct relevance to a well-circumscribed area (such as the papers inspired by the physical sciences). These are the papers that, if one is to follow the CFP, should make the meat of ITCS. The corresponding talks were among the best; it was sometimes (though not always) apparent the speaker knew very well he or she had to “defend her turf”, i.e. find the correct words for promoting the conceptual message of the paper. This led to interesting, fresh talks, and lively discussion sessions.

Those only accounted for about half the papers (by my count…) presented at the conference though. The other half struck me as consisting of much more conventional results, which would more naturally find their place at STOC, FOCS or SODA. I don’t want to point to any specific papers, and in any case I have nothing against them; most were very good results anyways. They would easily fit in any top TCS conference…any but ITCS? I was not on the PC, and unfortunately there was no (publicly announced) business meeting, so I don’t know how the requirement for a “conceptual message” was taken into account when making acceptance decisions; it would be interesting to have some inside information about this. Was there a target number of accepted papers, and if so what was it — did it get exceeded? How much weight was conceptual novelty given in the decisions? If it was taken into account, what was the rationale behind accepting this “second batch” of papers? Do they have some novelty that I simply am not seeing? The point here is not to criticize the work of the PC or the papers themselves, but to pose to question of “how far” ITCS is willing to go in sticking with its stated goals. Was there a compromise, and if so why? Could the program not be made much shorter, while still keeping a high-quality conference, that many would willingly attend (and perhaps even more so than are induced by accepting yet more papers)?


Why do we go to conferences?

August 23, 2009

Thinking through my experience with ITCS, the question I am brought back to is a most basic one: why do I go to conferences? What do I expect out of them? After all, it takes quite some effort to attend: the travel, the costs, etc. Here are some seemingly obvious answers: 1. to hear about exciting new research in TCS that I may not have heard about otherwise; 2. to have the ideas driving this research explained to me in a clear and insightful way by select speakers; 3. to catch up with the remainder of the community over pleasant social events such as a dinner, coffee breaks, rump session, etc.; and 4. as a result of 1.-3., to emerge from the conference with i) the refreshing feeling of belonging to a dynamic research group that is constantly re-inventing itself, and ii) research ideas!

What is unfortunately much less obvious is whether any of these goals are being kept in mind when shaping our conferences; for that matter it is unclear, at least to me, what their actual goals are (if any!). This is at least one merit of the ITCS CFP: it states an explicit goal for the conference, “ITCS seeks to…”. I’ll end this post by describing some thoughts, or suggestions, as to how one could make progress towards achieving the four goals stated above. For the fun of it I’ll express my suggestions in the present tense, as if I was describing an already existing “dream” conference which I’ll call “XTCS”.

1. Single track.

One of the distinctive features of XTCS is its relatively small program, organized around a single track of talks. The CFP makes it clear that this small number of accepted talks is the result of a design decision, rather than out of necessity (e.g. too few good submitted papers). There are many benefits to having a small program. One of the main such benefits is the indirect pressure it puts on the PC: less accepted talks means a higher strain on “getting it right”. As an attendee, it means I can trust that the SC and PC have put together a strong, pointed program, and divided it into coherent sessions; the speakers, assured of a broad audience, have carefully prepared their talks; their public is curious and focused. Even if I can’t immediately parse the title of the talks I still attend most of them. At XTCS the chance that I will discover a new topic, a new conceptual idea, a new bridge between areas, is much higher — and this is precisely because there are fewer, better-selected talks.

In addition, XTCS has a small number of specially selected (either taken from the best submissions or by invitation) longer talks, perhaps 45-minute or an hour-long. These talks are one of the main reasons I attend the conference: they are guaranteed to be of high quality and present novel ideas; they are the kind of talks that no paper-reading can replace. Similar to some of the “prize talks” at STOC or FOCS (indeed I was surprised to discover that ITCS did not have a single such broader talk, apart from the after-dinner talk by Peter Winkler), these can be given by reputable speakers from our field, as well as by speakers from neighboring fields who have something to tell us. Selecting such talks is the business of the SC; in fact it is one of their most important tasks in putting a successful conference together: identify novel ideas, promising directions, that will shape the future of the field.

2. Computation as a lens

XTCS has a purposefully broader scope than other conferences in theoretical computer science. This is in keeping with the “computation as a lens on the sciences” motto that is increasingly adopted, in various guises, by TCS communities (see e.g. Berkeley’s Simons institute mission statement). If anything, this motto best embodies the future of our field; XTCS is its conference. Which doesn’t mean that “purely TCS” papers should be excluded; only that papers accepted to the conference will be required to have a bearing outside of themselves. For instance, a paper whose main result is “we prove that the permanent does not have polynomial-size circuits” might not be accepted, whereas a paper claiming “we show that not all efficiently verifiable assertions can be efficiently decided” might be. The example is a a bit extreme and I don’t mean it to be taken literally, but I chose it for two reasons: first, to emphasize that an “application to the physical sciences” is by no means necessary, and second, that even though the difference is only one of formulation, formulation matters (after all, a conference is all about presentation!). Here the second formulation emphasizes the significance of the result; the first focuses only on the technique that is used to obtain it.

Another way XTCS encourages inter-disciplinarity and avoids being overly self-centered is by remaining as international as possible. XTCS strives to avoid the kind of “inner circle” complaints that are sometimes made about STOC or FOCS (be they justified or not), and keeping its audience as geographically diverse as possible helps. The reason I have most often heard for locating a conference in the US are the travel costs, but I don’t think this is really justified: accommodation can be cheaper in Europe (depending on location), and the extra cost in terms of airfare remains reasonable; compared to the total cost of attending it represents perhaps a 20 or 30% increase. Financial support for students is a priority: registration costs can be increased for faculty, and decreased for students and postdocs.

3. Community building

In addition to the importance of the wording used in the CFP, which is given more prominence than at “traditional” conferences, XTCS takes a few small but concrete steps towards reinforcing the sensation of a “TCS community”. One such step is the organization of the sessions, in which the chair is given a slightly more important role than usual: she is responsible for saying a few words, either about the importance of each work, or about the session’s topic as a whole, and for asking pointed questions that will encourage discussion. There may even be a “rump session” in which students are encouraged to present their latest results, and throughout which one is generally encouraged to discuss half-baked ideas. The rump session can be paired with a poster session around which more serious discussions can be pursued.

Ample time is given to coffee breaks and lunches. The conference location is chosen carefully so as to allow for interactions ranging from casual discussion to focused research discussions. Large hotels in city centers are avoided (to avoid the crowd dispersing too easily); university halls are preferred (this can also help keep the costs down).

4. Submission format and proceedings

Submissions to XTCS are short; perhaps three to five pages. This forces the authors to present the main point of their research in an accessible way. The five-page abstract should be accompanied by a more technical version of the paper, which can be a pointer to the arXiv. Accepted papers are required to have their full version made available online by the time of the conference. XTCS has no published proceedings. Instead, it has a well-maintained website which lists each year’s accepted papers, together with their five-page abstracts and pointers to the online version.

Will you come to XTCS?

Posted in Conferences | Tagged , | 2 Comments

Where on earth?

Tricky one!


Posted in Tourism, Where on earth | 4 Comments

Prospective students & postdocs: apply to Caltech!

It’s job application season again! While I happily suffered  through the application process in Fall 2006, Fall 2010 and Fall 2012, this year I will be participating in the festivities from the other side of the divide. My message is simple: all of you brilliant (aspiring to brilliance is also good), young (of the mind) computer scientists, mathematicians, physicists or otherwise, interested in theoretical computer science, discrete math, quantum information and related topics, apply to Caltech — you couldn’t dream a better place in where to set up and develop your research! (Not to mention the beach, mountains, movies, and weather of course.)

I will be extremely fortunate to be among two joining the faculty of the world no.1 university starting June 2014 as an assistant professor of computer science in the Department of Computing + Mathematical Sciences (CMS) and am looking forward to some great applicants this year, both at the Ph.D. and postdoc level. I encourage anyone interested to navigate through the websites of Caltech’s CMS, Center for the Mathematics of Information (CMI), Institute for Quantum Information and Matter (IQIM), and Physics department. (You should also check out IQIM’s excellent blog for a lively snapshot of what its members are up to.) You’ll find that, in spite of its small size, the interests of Caltech faculty span an impressive range, and I hope that, just like I did, you’ll suddenly realize that you need to get over here right away and start talking to these people!

Concretely, here is what you can do:

Undergraduate students applying for Ph.D. Apply directly on the Caltech admissions website. The deadline for applications is Dec. 15th.

Ph.D. students or postdocs applying for postdoctoral positions. Both CMI and IQIM, institutes in which CMS faculty take part, have openings for postdoc positions next year. For CMI, applications are due Dec. 20th here. For IQIM, applications are due Dec. 6th here.

I hope you’ll consider us, and look forward to welcoming some of you next Fall!


Posted in Caltech, Jobs | Tagged , | Leave a comment

Quantum marginals

As I mentioned in the last post, last week Mathias Christandl, Michael Walter and Andreas Winter were organizing a workshop entitled “Quantum Marginals” as part of the semester-long programme Mathematical Challenges in Quantum Information at the Newton Institute. In his introductory talk Michael emphasized that the quantum marginal problem can be approached from a wide array of areas of mathematics and computer science, ranging from information theory to complexity theory through representation theory, algebraic geometry, random matrix theory, convex analysis and more. In this post I will first attempt to explain what the quantum marginal problem is, and then describe two or three of these connections, following some of the talks held during the workshop.

The classical marginal problem

The marginal problem can already be stated as an interesting problem on classical distributions. Broadly speaking, the question is the following: which families of distributions, defined on possibly intersecting sets of variables, can be realized as the marginals of a joint underlying distribution? More precisely, for integers {n} and {d} suppose given a family of distributions {\{p_{S_i}\}} defined on {\{1,\ldots,d\}^{S_i}} respectively, where {S_i \subset\{1,\ldots, n\}}. When does there exist a distribution {p} on {\{1,\ldots,d\}^n} such that for every {i}, {p_{S_i}} is obtained as the marginal of {p} on the variables in {S_i}?

While the problem is trivial if {|S_i|=1} for every {i}, this seems to be essentially the only easy case. Consider for instance the problem when each {|S_i|\leq 2}. Clearly, a necessary condition for the existence of {p} is that marginals sharing a same variable are consistent. But this is not sufficient: for instance, one can derive additional constraints by expanding out an expression such as {\text{Var}(x_1+x_2+x_3)\geq 0} in terms of pairwise correlations only, placing further constraints on the marginals. In fact, it is easy to show that the decision problem for {|S_i|\leq 2} is NP-hard; I’ll leave the simple reduction from any two-local constraint satisfaction problem as an exercise.

The marginal problem has a long history (Klyachko traces it back to work of Hoefling in the 40s), but it seems that very few general statements are known. I found it hard to find modern references on the topic; maybe this is because, as the foreword to the proceedings of a (somewhat) recent conference on the topic put it,

“The subject matter of these conference proceedings comes in many guises. Some view it as the study of probability distributions with fixed marginals; those coming to the subject from probabilistic geometry see it as the study of copulas; experts in real analysis think of it as the study of doubly stochastic measures; functional analysts think of it as the study of Markov operators; and statisticians say it is the study of possible dependence relations between pairs of random variables. All are right since all these topics are isomorphic.”

Among all these points of view the closest to us is probably that taken by statisticians, who think more generally of the marginal problem as one of causal inference: given empirical observations of the correlations between certain subsets of a collection of random variables, can one reconstruct links of causal effect between the random variables (for instance by constructing a graphical model that reproduces the observed correlations)? The latter is a topic of very active research in machine learning research, where one tries to identify the most relevant variables based only on partial information about their joint distribution (motivating scenario for this abound in e.g. biology).

The quantum marginal problem

The quantum marginal problem, in turn, asks about the following: given a set of density matrices {\rho_{S_i}} on {\otimes_{j\in S_i} {\mathbb C}^d} (i.e. {d^{|S_i|}\times d^{|S_i|}} positive semidefinite matrices with trace {1}), where as before each {S_i\subset\{1,\ldots,n\}}, when does there exist a joint density matrix {\rho} on {({\mathbb C}^d)^{\otimes n}} having all the {\rho_{S_i}} as its reduced densities (i.e. the partial trace {{\mbox{\rm Tr}}_{\overline{S_i}}(\rho) = \rho_{S_i}})? While in classical scenario the assumption that only certain marginal distributions are observed may seem like an artificial constraint imposed by so-called “experimental conditions”, in quantum theory it arises very naturally, as we are used to the fact that certain observables can be jointly measured (e.g. measurements on isolated particles) while others cannot (e.g. position and momentum of the same electron).

A good deal of motivation for studying the quantum marginal problem comes from a particular variant called the {N}-representability problem. The latter can be traced back to some observations made by quantum chemists in the 40s, though its importance was first properly understood in the 60s, when it was first explicitly stated as a major open problem by Coulson (see Chapter 1 of the book Reduced Density Matrices for more historical background). In modern terms, the {N}-representability problem asks the following: given a single two-body reduced density matrix {\rho_{12}} on {{\mathbb C}^d\otimes {\mathbb C}^d}, when does there exist an {N}-body density {\rho} on {({\mathbb C}^d)^{\otimes N}} that is supported on the antisymmetric subspace of {({\mathbb C}^2)^{\otimes N}} and has {\rho_{12}} as reduced density on any two systems? (Note that the condition that {\rho} is antisymmetric is important to make the problem nontrivial, as otherwise of course any single reduced density can be realized as the partial trace of a larger density matrix.)

The relevance of the {N}-representability problem for quantum chemistry relates to the fact that many elementary chemical systems can be characterized as a system of {N} identical particles subject to permutation-invariant two-body interactions. The particles will often be fermions, which are subject to the Pauli exclusion principle and thus follow the fermionic exchange rule: the state of a system obtained by exchanging two fermions picks up a minus sign (hence the state must be {0} as soon as here are two identical particles — apparently, the idea of relating the exclusion principle to antisymmetry dates back to Heisenberg and Dirac). This explains the condition of antisymmetry placed on {\rho}. Next, the assumption that the {N} particles are subject to permutation-invariant two-body interactions implies that the Hamiltonian of the system can be expressed as {H = \sum_{i,j} H_{i,j}}, where here all {H_{i,j}} terms are identical (except they act on different pairs of particles {i,j}). This permutation-invariance then implies that the energy of the system can be computed as {\min_{\rho_{12}} {\mbox{\rm Tr}}(H_{12}\rho_{12})}, where particles {1} and {2} are arbitrarily chosen representatives. This is a huge bounty for chemistry: one can apparently compute the minimal energy of a complex system of {N} interacting particles by solving a constant-size problem!

Of course there is a hitch. Looking at it more closely, we only shifted the difficulty, from optimizing over a simple but exponentially large set ({N}-particle antisymmetric density matrices) to that of a small but highly complex set ({2}-particle density matrices that can be realized as the reduced density of an antisymmetric {N}-particle density matrix). As it turns out, there is not much hope for getting around the difficulty: just a few years ago the {N}-representability problem was shown QMA-hardby Yi-Kai Liu, this in spite of encouraging numerical evidence from the chemists, who have successfully developed a range of practical algorithms for specific instances of the problem.

The quantum marginal problem itself gained prominence in the quantum information literature in the early 2000s, stimulated by questions about the type of entanglement that can arise in bipartite, and more generally multipartite, quantum states. (The discovery that entanglement becomes more and more constrained as the number of particles, or sub-systems, increases, nowadays identified as themonogamy of entanglement, has had — and continues to have — major repercussions in all areas in which quantum mechanics plays a role, from condensed matter physics to quantum computing through cryptography and the physics of black holes.) Soon after Higuchi and Sudbery and Linden et al. discovered explicit examples of incompatible sets of reduced density matrices for three-qubit systems, Higuchi et al. and Bravyi were apparently the first to give a complete characterization of the set of single-qubit reduced density matrices that can arise from a single pure {n}-qubit state: the condition is that the smallest eigenvalue of each reduced density is at most the sum of the smallest eigenvalues of all other reduced densities. Note how there already arises a crucial distinction with the classical setting, where a “pure” (deterministic) distribution on {n} bits always gives rise to “pure” marginals, and vice-versa. In contrast, in the quantum case “maximal knowledge of the whole does not necessarily include the maximal knowledge of its parts.” (Schrödinger; quote stolen from a nice review on the topic by Klyachko), and the problem of determining even which single-particle densities can be obtained from a global pure state is highly non-trivial (of course if one allows the global state to be mixed as well then here as in the classical case one has to include at least some two-particle reduced densities for the problem to become non-trivial).

Physical motivations aside, the quantum marginal problem is fundamentally one of convex geometry: to each marginal density one can associate a convex set of global densities that reduce to it, and the question is to characterize the cases when the intersection of those sets is non-empty. It is, however, such a complex and rich question that, depending on the type of settings one wishes to focus on (two systems, large dimension? qubits, many systems? generic scenario or highly specific constraints? constrained or unconstrained global states? etc.) the problem is best approached by (and of interest to) different areas of mathematics. I will give three examples taken from talks given during the workshop.

Constraints on the spectrum: representation theory

Strong connections between the quantum marginal problem and algebraic geometry and representation theory already arise when one considers the following variant of the two-body scenario: suppose {\rho_{AB}} is a density matrix on {{\mathbb C}^d\otimes {\mathbb C}^d} with marginals {\rho_A} on the first copy of {{\mathbb C}^d} and {\rho_B} on the second. Which vectors of eigenvalues {\vec{\lambda}_A} for {\rho_A}, {\vec{\lambda}_B} for {\rho_B}, and {\vec{\lambda}_{AB}} for the global density on {\rho_{AB}}, are compatible? As mentioned above, this is fundamentally a question of geometry: by considering all possible “lifts” of the {\vec{\lambda}_A}, {\vec{\lambda}_B} respectively we can view all three sets as convex subsets of (the unit simplex in) {{\mathbb R}^{d^2}}, and the question is when these sets intersect.

In his talk Aram Harrow sketched how one could approach this question using the representation theory of the symmetric group and Schur-Weyl duality (see also this paper by Klyachko). Because the spectrum of the reduced densities is invariant under the action of the unitary group on either subsystem, we can decompose the space {{\mathbb C}^d\otimes {\mathbb C}^d} on which {\rho_{AB}} lives as a direct sum of the tensor product of irreducible representations corresponding to this action: {{\mathbb C}^d\otimes {\mathbb C}^d=\oplus_{\lambda,\mu} H_{\lambda} \otimes H_{\mu} \otimes {\mathbb C}^{n_{\lambda}n_\mu}}, where {\lambda} (resp. {\mu}) ranges over labels for irreducible representations of the unitary group on {H_A} (resp. {H_B}), {n_\lambda} (resp. {n_\mu}) is the corresponding multiplicity, and {\rho_{AB}} is block-diagonal in this decomposition. Since the tensor product of the unitary groups acting on the two copies of {{\mathbb C}^d} is a subgroup of the full unitary group, each factor {H_{\lambda} \otimes H_{\mu}} itself decomposes into a direct sum of irreducible representations {H_\nu} for the full unitary group. The multiplicity with which {H_\nu} appears in the decomposition of {H_{\lambda} \otimes H_{\mu}} is called the Kronecker coefficient {g(\lambda,\mu,\nu)}. And now comes the magic: it turns out — although unfortunately Aram didn’t have time to explain this — that compatible sets of eigenvalues for {\rho_{AB}}, {\rho_A} and {\rho_B} bear a tight connection to those triples {(\lambda,\mu,\nu)} for which the Kronecker coefficient is non-zero; somehow there is a way to read the latter as precisely the eigenvalues themselves (up to some normalization), though I admit how this works goes beyond my understanding of the situation (Klyachko refers to this paper of his for the resolution of a closely related problem on the spectrum of sums of Hermitian matrices, and to Heckman’s theorem for the present problem, but I am not familiar with either).

Aram also showed how one could throw a little Schur-Weyl duality into the picture: assuming the density {\rho_{AB}} is invariant under permutation of the two subsystems (hence {\rho_A=\rho_B}), we can consider not only the action of the unitary group on {H_A} and {H_B} but also that of the group of permutations of the subsystems ({\mathcal{S}_2} in the case of two subsystems). Since this action commutes with that of the unitary group on individual factors one gets a more refined decomposition of {{\mathbb C}^d\otimes {\mathbb C}^d} into irreducible representations, in which for instance all the multiplicities are {0} or {1}. Again, studying which representations occur leads to further constraints on the possible marginal spectra.

The generic case: random matrix theory

What can we say about the marginal problem for the case of a “generic” state? Here by generic we simply mean a state generated according to a natural random distribution. Guillaume Aubrun and Stanislaw Szarek gave talks presenting results, obtained with Deping Ye (see also the short exposition), in which they investigated the following scenario: choose a pure state on {{\mathbb C}^d \otimes {\mathbb C}^d \otimes {\mathbb C}^s} at random (according to the Haar measure); how entangled is its reduced state on {{\mathbb C}^d \otimes {\mathbb C}^d}? They can show the existence of a very sharp threshold: there is a constant {c>0} such that, for any {\epsilon>0}, if {s<(c-\epsilon)d^{3/2}} then with high probability (as {d\rightarrow\infty}) the state is entangled; if {s>(1+\epsilon)cd^{3/2}} then it is separable. Note that the extremes make sense: when {s} is small we do expect there to be entanglement between the two copies of {{\mathbb C}^d} (e.g. if {s=0} the state is a random pure state, which will be close to maximally entangled), whereas if {s} becomes very large we expect the reduced density on {{\mathbb C}^d\otimes {\mathbb C}^d} to be close to the totally mixed state, and the state is separable.

How to prove such a result? Guillaume and Stanislaw took the time to give us a very detailed sketch of the proof, and I highly recommend watching their respective talks. In the first talk Guillaume proved the “easy” part, which is the large {s} regime, and in the second talk Stanislaw showed how Guillaume’s arguments could be refined to obtain a sharp threshold.

The main idea is the following. Let {\mathcal{S}} be the set of separable states in {{\mathbb C}^n = {\mathbb C}^d\otimes {\mathbb C}^d}, and {\rho\mapsto\|\rho\|_{\mathcal{S}}} denote the gauge of {\mathcal{S}}, the largest {t>0} such that {t\rho\in\mathcal{S}}. Our goal can be reformulated as a two-step process: 1) show that {\|\rho\|_{\mathcal{S}}} exhibits a sharp threshold: there is a value {s_0} such that, if {s\geq (1+\epsilon)s_0} (resp. {s\leq (1+\epsilon)s_0}) then {\|\rho\|_{\mathcal{S}}\ll 1}, i.e. the state is separable (resp. {\|\rho\|_{\mathcal{S}}\gg 1}, i.e. the state is entangled), with high probability, and 2) compute the expectation {\textsc{E}[\|\rho\|_{\mathcal{S}}]}, as a function of {s}, and determine as precisely as possible the {s_0} for which it is close to {1} (by point 1) we already know there will be a sharp transition).

Let’s focus on 1), as similar ideas are used for 2). Concentration of the function {\rho\mapsto\|\rho\|_{\mathcal{S}}} around its median follows from “standard” arguments in concentration of measure, so the interesting part consists in showing the existence of an {s_0} such that, for {s} larger (resp. smaller) than {s_0} the median is much smaller (resp. much larger) than {1}. The starting point for this consists in, first, arguing that the distribution of {\rho - I/n} is very well approximated by that of a random matrix {A} taken from the Gaussian Unitary Ensemble (GUE) (entries are i.i.d complex Gaussians, with, in this case, standard deviation {1/(n\sqrt{s})}), and second, relating {\textsc{E}[\|A\|_{\mathcal{S}}]} to the mean width {w(\mathcal{S}_0)}, where {\mathcal{S}_0 = \mathcal{S}-\{I/n\}}. The mean width intuitively measures precisely what its name indicates: the average maximum distance (measured using the distance derived from the Hilbert-Schmidt inner product used to define the polar {\mathcal{S}^\circ_0} of {\mathcal{S}_0}, which plays an important role in the remainder of the argument) of elements of {\mathcal{S}_0} from its centroid. In order to estimate this mean width some fairly involved arguments from convex geometry, involving the use of the polar {\mathcal{S}^\circ_0}, kick in, but I shall refer you to the talks for more detail.

I found the structure of the argument very elegant because it starts with what, at first, looks (at least to me) like a completely intractable question — when is a random state separable?; uses concentration of measure to reformulate it as what now looks like a straightforward calculation of expectation — compute {\textsc{E}[\|\rho\|_{\mathcal{S}}]}; relates this expectation to a geometric parameter of {\mathcal{S}}, its mean width, which is fairly intuitive; and finally, uses a sequence of tools from the analysis of convex bodies to estimate the width (note this last step is the only one in which the precise nature of the set of separable states comes into place, the others applying to any “nice enough” convex body). And the result is surprisingly sharp: the fact that the threshold {s_0} for {s} is tight at least up to multiplicative {1+o(1)} factors implies that simply giving one more qubit — just one, out of {s}, which can be arbitrarily large — to the environment marks the transition from states that are entangled with high probability, to states that are separable with high probability.

Causal inference: information theory

In his talk David Gross drew a larger picture around the quantum marginal problem by framing it in the context of causal inference. I already mentioned this classical problem: given a restricted set of observations (= some marginal distributions), can one reconstruct links of causation between the variables? Here we may as well assume that the marginals are a priori consistent with a global distribution (in fact they could be consistent with many global distributions); rather the goal is to infer properties of any matching global distribution just from the marginals themselves. Typically, one may desire to construct a directed acyclic graph whose nodes are the variables and edges indicate causation: the distribution of a given node should be independent of its ancestors when one conditions on the parents.

Unfortunately conditional independence imposes complicated (quadratic) constraints on the marginals, leading to a computationally hard inference problem. So instead of thinking of the random variables and their distributions as the unknowns of the problem, David advocated the use of the mutual information between subsets of variables as the fundamental unknown. A major advantage of this approach is that it turns conditional independence into a straightforward equality: {I(X:Y|Z)=0}. Moreover, it is easy to relax the condition to {I(X:Y|Z)\leq \epsilon}, adding robustness to the scenario while preserving linearity. Note however that, although the inequalities themselves are simple to write down, deciding whether a given set of inequalities is satisfiable or not remains a potentially hard problem, as in order to combine different inequalities to check compatibility one may need to expand each of the mutual information terms to take into account more and more variables, eventually requiring the introduction of all {2^n} subset entropies. This approach is also a relaxation of the original problem, in the sense that not all information from the marginal distributions is captured in the entropies. Nevertheless, David demonstrated that in some interesting cases the marginal entropies suffice to distinguish whether a matching global distribution with some interesting property exists or not (his example was the question of whether a distribution on three variables requires the existence of a common (hidden) “ancestor”, or whether the three variables could have been pairwise generated from three independent ancestors).

Putting issues of causality aside, this approach suggests a similar relaxation of the marginal problem: before asking whether a set of marginals is consistent, one could start by asking the (in general weaker) question of whether the induced set of marginal entropies is consistent. Interestingly, one can apply this idea to the study of entropic versions of the Bell inequalities. Such study was apparently pioneered by Braunstein and Caves, who obtained the following “entropic CHSH inequality” :

\displaystyle H(A_0B_0)+H(A_0B_1)+H(A_1B_0)-H(A_1B_1) \geq -2.

(Note the mysterious mirror symmetry with the CHSH inequality itself… see the explanations in David Gross’s paper for explanations on its origin.) An advantage of this approach, explored in David’s paper in the setting of a different Bell inequality, is that it naturally allows one to introduce additional restrictions on the correlations between the random variables; he and co-authors consider e.g. the condition that they are generated using a certain limited amount of shared randomness.

Unfortunately David did not get to discussing the possibility of deriving entropic analogues of Tsirelson inequalities. Presumably these would still quantify the Shannon entropy of the outcome distribution of the result of applying arbitrary observables on an arbitrary state, with the additional restriction that some of the observables must be made on different subsystems. Or one could imagine inequalities involving the von Neumann entropy of the state itself, although I don’t see how the possibility for using different settings would be taken into account (and how a dependence on the dimension of the state would be avoided). Deriving such inequalities may not be an easy task… Indeed, I believe the marginal problem for quantum von Neumann entropies (finding a complete set of inequalities that characterize the cone of {2^n}-dimensional vectors of marginal entropies that are mutually consistent) is even less well understood than its classical counterpart.

I only discussed four of the many more talks accessible from the workshop’s online programme — go have a look! The third (and last) workshop taking place as part of the Newton semester will be a workshop on “Quantum Algorithms”, organized by Steve Brierley, Richard Jozsa and Andreas Winter in late November.

Posted in Conferences, Quantum, Talks | Tagged , , , , | 4 Comments