## Where on earth?

Tricky one!

Posted in Tourism, Where on earth | 4 Comments

## Prospective students & postdocs: apply to Caltech!

It’s job application season again! While I happily suffered through the application process in Fall 2006, Fall 2010 and Fall 2012, this year I will be participating in the festivities from the other side of the divide. My message is simple: all of you brilliant (aspiring to brilliance is also good), young (of mind) computer scientists, mathematicians, physicists or otherwise, interested in theoretical computer science, discrete math, quantum information and related topics, apply to Caltech — you couldn’t dream of a better place in which to set up and develop your research! (Not to mention the beach, mountains, movies, and weather, of course.)

I will be extremely fortunate to be one of two new faculty joining the world’s no. 1 university in June 2014, as an assistant professor of computer science in the Department of Computing + Mathematical Sciences (CMS), and I am looking forward to some great applicants this year, both at the Ph.D. and postdoc levels. I encourage anyone interested to browse the websites of Caltech’s CMS, Center for the Mathematics of Information (CMI), Institute for Quantum Information and Matter (IQIM), and Physics department. (You should also check out IQIM’s excellent blog for a lively snapshot of what its members are up to.) You’ll find that, in spite of its small size, the interests of Caltech faculty span an impressive range, and I hope that, just as I did, you’ll suddenly realize that you need to get over here right away and start talking to these people!

Concretely, here is what you can do:

**Undergraduate students applying for a Ph.D.** Apply directly on the Caltech admissions website. The deadline for applications is Dec. 15th.

**Ph.D. students or postdocs applying for postdoctoral positions.** Both CMI and IQIM, institutes in which CMS faculty take part, have openings for postdoc positions next year. For CMI, applications are due Dec. 20th here. For IQIM, applications are due Dec. 6th here.

I hope you’ll consider us, and look forward to welcoming some of you next Fall!

## Quantum marginals

As I mentioned in the last post, last week Matthias Christandl, Michael Walter and Andreas Winter organized a workshop entitled “Quantum Marginals” as part of the semester-long programme Mathematical Challenges in Quantum Information at the Newton Institute. In his introductory talk Michael emphasized that the quantum marginal problem can be approached from a wide array of areas of mathematics and computer science, ranging from information theory to complexity theory through representation theory, algebraic geometry, random matrix theory, convex analysis and more. In this post I will first attempt to explain what the quantum marginal problem is, and then describe a few of these connections, following some of the talks held during the workshop.

The classical marginal problem

The marginal problem can already be stated as an interesting problem on classical distributions. Broadly speaking, the question is the following: which families of distributions, defined on possibly intersecting sets of variables, can be realized as the marginals of a joint underlying distribution? More precisely, for integers ${n}$ and ${d}$, suppose we are given a family of distributions ${\{p_{S_i}\}}$ defined on ${\{1,\ldots,d\}^{S_i}}$ respectively, where ${S_i \subset\{1,\ldots, n\}}$. When does there exist a distribution ${p}$ on ${\{1,\ldots,d\}^n}$ such that for every ${i}$, ${p_{S_i}}$ is obtained as the marginal of ${p}$ on the variables in ${S_i}$?

While the problem is trivial if ${|S_i|=1}$ for every ${i}$, this seems to be essentially the only easy case. Consider for instance the problem when each ${|S_i|\leq 2}$. Clearly, a necessary condition for the existence of ${p}$ is that marginals sharing a variable are consistent. But this is not sufficient: for instance, one can derive additional constraints by expanding out an expression such as ${\text{Var}(x_1+x_2+x_3)\geq 0}$ in terms of pairwise correlations only, placing further constraints on the marginals. In fact, it is easy to show that the decision problem for ${|S_i|\leq 2}$ is NP-hard; I’ll leave the simple reduction from any two-local constraint satisfaction problem as an exercise.
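To see concretely how pairwise consistency can fail to guarantee a joint distribution, here is a small brute-force check (my own illustration; the perfect anti-correlation example is a standard one, not from the workshop). Suppose each claimed pairwise marginal puts all its mass on the two bits disagreeing; every single-variable marginal is then uniform, so the marginals are pairwise consistent, yet no joint distribution exists:

```python
from itertools import product

# Claimed pairwise marginals on (x1, x2), (x2, x3), (x1, x3): each puts
# probability 1/2 on (0, 1) and 1/2 on (1, 0), i.e. the two bits disagree
# with certainty. A joint distribution realizing all three could only be
# supported on triples of bits that are pairwise distinct.
support = [x for x in product([0, 1], repeat=3)
           if x[0] != x[1] and x[1] != x[2] and x[0] != x[2]]
print(support)  # -> []: empty support, so no joint distribution exists
```

The variance trick reaches the same conclusion: encoding the bits as ${\pm 1}$ variables, perfect pairwise anti-correlation would force ${\text{Var}(x_1+x_2+x_3) = 3 + 2\cdot 3\cdot(-1) = -3 < 0}$, a contradiction.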

The marginal problem has a long history (Klyachko traces it back to work of Hoeffding in the 40s), but it seems that very few general statements are known. I found it hard to find modern references on the topic; maybe this is because, as the foreword to the proceedings of a (somewhat) recent conference on the subject put it,

“The subject matter of these conference proceedings comes in many guises. Some view it as the study of probability distributions with fixed marginals; those coming to the subject from probabilistic geometry see it as the study of copulas; experts in real analysis think of it as the study of doubly stochastic measures; functional analysts think of it as the study of Markov operators; and statisticians say it is the study of possible dependence relations between pairs of random variables. All are right since all these topics are isomorphic.”

Among all these points of view the closest to ours is probably that taken by statisticians, who think more generally of the marginal problem as one of causal inference: given empirical observations of the correlations between certain subsets of a collection of random variables, can one reconstruct links of causal effect between the random variables (for instance by constructing a graphical model that reproduces the observed correlations)? The latter is a topic of very active research in machine learning, where one tries to identify the most relevant variables based only on partial information about their joint distribution (motivating scenarios for this abound in e.g. biology).

The quantum marginal problem

The quantum marginal problem, in turn, asks the following: given a set of density matrices ${\rho_{S_i}}$ on ${\otimes_{j\in S_i} {\mathbb C}^d}$ (i.e. ${d^{|S_i|}\times d^{|S_i|}}$ positive semidefinite matrices with trace ${1}$), where as before each ${S_i\subset\{1,\ldots,n\}}$, when does there exist a joint density matrix ${\rho}$ on ${({\mathbb C}^d)^{\otimes n}}$ having all the ${\rho_{S_i}}$ as its reduced densities (i.e. the partial trace ${{\mbox{\rm Tr}}_{\overline{S_i}}(\rho) = \rho_{S_i}}$)? While in the classical scenario the assumption that only certain marginal distributions are observed may seem like an artificial constraint imposed by so-called “experimental conditions”, in quantum theory it arises very naturally: we are used to the fact that certain observables can be jointly measured (e.g. measurements on isolated particles) while others cannot (e.g. position and momentum of the same electron).
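As a minimal numpy sketch of the partial-trace operation (my own illustration, not from the workshop), the following computes the reduced density of a Bell pair: a pure global state whose one-qubit marginal is maximally mixed.

```python
import numpy as np

def partial_trace_last(rho, dim_keep, dim_trace):
    # Trace out the last tensor factor of a density matrix on
    # C^dim_keep (x) C^dim_trace.
    r = rho.reshape(dim_keep, dim_trace, dim_keep, dim_trace)
    return np.einsum('ikjk->ij', r)

# Bell pair (|00> + |11>)/sqrt(2), a pure state on C^2 (x) C^2
psi = np.zeros(4)
psi[0] = psi[3] = 1 / np.sqrt(2)
rho = np.outer(psi, psi)
rho_A = partial_trace_last(rho, 2, 2)
print(rho_A)  # -> 0.5 * identity: the maximally mixed single-qubit state
```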

A good deal of motivation for studying the quantum marginal problem comes from a particular variant called the ${N}$-representability problem. The latter can be traced back to some observations made by quantum chemists in the 40s, though its importance was first properly understood in the 60s, when it was first explicitly stated as a major open problem by Coulson (see Chapter 1 of the book Reduced Density Matrices for more historical background). In modern terms, the ${N}$-representability problem asks the following: given a single two-body reduced density matrix ${\rho_{12}}$ on ${{\mathbb C}^d\otimes {\mathbb C}^d}$, when does there exist an ${N}$-body density ${\rho}$ on ${({\mathbb C}^d)^{\otimes N}}$ that is supported on the antisymmetric subspace of ${({\mathbb C}^d)^{\otimes N}}$ and has ${\rho_{12}}$ as reduced density on any two systems? (Note that the condition that ${\rho}$ is antisymmetric is important to make the problem nontrivial, as otherwise of course any single reduced density can be realized as the partial trace of a larger density matrix.)

The relevance of the ${N}$-representability problem for quantum chemistry relates to the fact that many elementary chemical systems can be characterized as a system of ${N}$ identical particles subject to permutation-invariant two-body interactions. The particles will often be fermions, which are subject to the Pauli exclusion principle and thus follow the fermionic exchange rule: the state of a system obtained by exchanging two fermions picks up a minus sign (hence the state must be ${0}$ as soon as there are two identical particles — apparently, the idea of relating the exclusion principle to antisymmetry dates back to Heisenberg and Dirac). This explains the condition of antisymmetry placed on ${\rho}$. Next, the assumption that the ${N}$ particles are subject to permutation-invariant two-body interactions implies that the Hamiltonian of the system can be expressed as ${H = \sum_{i,j} H_{i,j}}$, where all the ${H_{i,j}}$ terms are identical (except that they act on different pairs of particles ${i,j}$). This permutation-invariance then implies that the energy of the system can be computed as ${\min_{\rho_{12}} {\mbox{\rm Tr}}(H_{12}\rho_{12})}$, where particles ${1}$ and ${2}$ are arbitrarily chosen representatives. This is a huge boon for chemistry: one can apparently compute the minimal energy of a complex system of ${N}$ interacting particles by solving a constant-size problem!
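The accounting behind that last claim can be made explicit (writing the interaction as a sum over unordered pairs): permutation-invariance of ${\rho}$ makes all two-body reduced densities equal, so that

$\displaystyle {\mbox{\rm Tr}}(H\rho) \,=\, \sum_{i<j} {\mbox{\rm Tr}}\big(H_{ij}\,\rho_{ij}\big) \,=\, \binom{N}{2}\, {\mbox{\rm Tr}}\big(H_{12}\,\rho_{12}\big),$

and minimizing ${{\mbox{\rm Tr}}(H_{12}\rho_{12})}$ over all ${N}$-representable two-body densities recovers the ground-state energy, up to the combinatorial factor.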

Of course there is a hitch. Looking at it more closely, we only shifted the difficulty, from optimizing over a simple but exponentially large set (${N}$-particle antisymmetric density matrices) to that of a small but highly complex set (${2}$-particle density matrices that can be realized as the reduced density of an antisymmetric ${N}$-particle density matrix). As it turns out, there is not much hope of getting around the difficulty: just a few years ago the ${N}$-representability problem was shown QMA-hard by Yi-Kai Liu, this in spite of encouraging numerical evidence from the chemists, who have successfully developed a range of practical algorithms for specific instances of the problem.

The quantum marginal problem itself gained prominence in the quantum information literature in the early 2000s, stimulated by questions about the type of entanglement that can arise in bipartite, and more generally multipartite, quantum states. (The discovery that entanglement becomes more and more constrained as the number of particles, or sub-systems, increases, nowadays identified as the monogamy of entanglement, has had — and continues to have — major repercussions in all areas in which quantum mechanics plays a role, from condensed matter physics to quantum computing through cryptography and the physics of black holes.) Soon after Higuchi and Sudbery and Linden et al. discovered explicit examples of incompatible sets of reduced density matrices for three-qubit systems, Higuchi et al. and Bravyi were apparently the first to give a complete characterization of the set of single-qubit reduced density matrices that can arise from a single pure ${n}$-qubit state: the condition is that the smallest eigenvalue of each reduced density is at most the sum of the smallest eigenvalues of all other reduced densities. Note how there already arises a crucial distinction with the classical setting, where a “pure” (deterministic) distribution on ${n}$ bits always gives rise to “pure” marginals, and vice-versa. In contrast, in the quantum case “maximal knowledge of the whole does not necessarily include the maximal knowledge of its parts” (Schrödinger; quote stolen from a nice review on the topic by Klyachko), and the problem of determining even which single-particle densities can be obtained from a global pure state is highly non-trivial (of course if one allows the global state to be mixed as well then here, as in the classical case, one has to include at least some two-particle reduced densities for the problem to become non-trivial).
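The single-qubit condition is easy to check numerically on actual states. The following sketch (my own, assuming only numpy) samples a random pure 3-qubit state and verifies that the smallest eigenvalue of each single-qubit reduced density is at most the sum of the smallest eigenvalues of the other two:

```python
import numpy as np

rng = np.random.default_rng(0)

# A Haar-distributed random pure state on 3 qubits (normalized complex Gaussian)
psi = rng.normal(size=8) + 1j * rng.normal(size=8)
psi /= np.linalg.norm(psi)
t = psi.reshape(2, 2, 2)

# Single-qubit reduced densities rho_1, rho_2, rho_3 via tensor contraction
rhos = [np.einsum('ijk,ljk->il', t, t.conj()),
        np.einsum('jik,jlk->il', t, t.conj()),
        np.einsum('jki,jkl->il', t, t.conj())]
lmins = [np.linalg.eigvalsh(r)[0].real for r in rhos]

# Condition for marginals of a pure n-qubit state: each smallest eigenvalue
# is at most the sum of the smallest eigenvalues of the other marginals
for i in range(3):
    assert lmins[i] <= sum(lmins) - lmins[i] + 1e-9
print([round(float(x), 4) for x in lmins])
```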

Physical motivations aside, the quantum marginal problem is fundamentally one of convex geometry: to each marginal density one can associate a convex set of global densities that reduce to it, and the question is to characterize the cases when the intersection of those sets is non-empty. It is, however, such a complex and rich question that, depending on the type of settings one wishes to focus on (two systems, large dimension? qubits, many systems? generic scenario or highly specific constraints? constrained or unconstrained global states? etc.) the problem is best approached by (and of interest to) different areas of mathematics. I will give three examples taken from talks given during the workshop.

Constraints on the spectrum: representation theory

Strong connections between the quantum marginal problem and algebraic geometry and representation theory already arise when one considers the following variant of the two-body scenario: suppose ${\rho_{AB}}$ is a density matrix on ${{\mathbb C}^d\otimes {\mathbb C}^d}$ with marginals ${\rho_A}$ on the first copy of ${{\mathbb C}^d}$ and ${\rho_B}$ on the second. Which vectors of eigenvalues ${\vec{\lambda}_A}$ for ${\rho_A}$, ${\vec{\lambda}_B}$ for ${\rho_B}$, and ${\vec{\lambda}_{AB}}$ for the global density ${\rho_{AB}}$, are compatible? As mentioned above, this is fundamentally a question of geometry: by considering all possible “lifts” of ${\vec{\lambda}_A}$ and ${\vec{\lambda}_B}$ respectively we can view all three sets as convex subsets of (the unit simplex in) ${{\mathbb R}^{d^2}}$, and the question is when these sets intersect.

In his talk Aram Harrow sketched how one could approach this question using the representation theory of the symmetric group and Schur-Weyl duality (see also this paper by Klyachko). Because the spectrum of the reduced densities is invariant under the action of the unitary group on either subsystem, we can decompose the space ${{\mathbb C}^d\otimes {\mathbb C}^d}$ on which ${\rho_{AB}}$ lives as a direct sum of the tensor product of irreducible representations corresponding to this action: ${{\mathbb C}^d\otimes {\mathbb C}^d=\oplus_{\lambda,\mu} H_{\lambda} \otimes H_{\mu} \otimes {\mathbb C}^{n_{\lambda}n_\mu}}$, where ${\lambda}$ (resp. ${\mu}$) ranges over labels for irreducible representations of the unitary group on ${H_A}$ (resp. ${H_B}$), ${n_\lambda}$ (resp. ${n_\mu}$) is the corresponding multiplicity, and ${\rho_{AB}}$ is block-diagonal in this decomposition. Since the tensor product of the unitary groups acting on the two copies of ${{\mathbb C}^d}$ is a subgroup of the full unitary group, each factor ${H_{\lambda} \otimes H_{\mu}}$ itself decomposes into a direct sum of irreducible representations ${H_\nu}$ for the full unitary group. The multiplicity with which ${H_\nu}$ appears in the decomposition of ${H_{\lambda} \otimes H_{\mu}}$ is called the Kronecker coefficient ${g(\lambda,\mu,\nu)}$. 
And now comes the magic: it turns out — although unfortunately Aram didn’t have time to explain this — that compatible sets of eigenvalues for ${\rho_{AB}}$, ${\rho_A}$ and ${\rho_B}$ bear a tight connection to those triples ${(\lambda,\mu,\nu)}$ for which the Kronecker coefficient is non-zero; somehow there is a way to read the latter as precisely the eigenvalues themselves (up to some normalization), though I admit how this works goes beyond my understanding of the situation (Klyachko refers to this paper of his for the resolution of a closely related problem on the spectrum of sums of Hermitian matrices, and to Heckman’s theorem for the present problem, but I am not familiar with either).

Aram also showed how one could throw a little Schur-Weyl duality into the picture: assuming the density ${\rho_{AB}}$ is invariant under permutation of the two subsystems (hence ${\rho_A=\rho_B}$), we can consider not only the action of the unitary group on ${H_A}$ and ${H_B}$ but also that of the group of permutations of the subsystems (${\mathcal{S}_2}$ in the case of two subsystems). Since this action commutes with that of the unitary group on individual factors one gets a more refined decomposition of ${{\mathbb C}^d\otimes {\mathbb C}^d}$ into irreducible representations, in which for instance all the multiplicities are ${0}$ or ${1}$. Again, studying which representations occur leads to further constraints on the possible marginal spectra.

The generic case: random matrix theory

What can we say about the marginal problem for the case of a “generic” state? Here by generic we simply mean a state generated according to a natural random distribution. Guillaume Aubrun and Stanislaw Szarek gave talks presenting results, obtained with Deping Ye (see also the short exposition), in which they investigated the following scenario: choose a pure state on ${{\mathbb C}^d \otimes {\mathbb C}^d \otimes {\mathbb C}^s}$ at random (according to the Haar measure); how entangled is its reduced state on ${{\mathbb C}^d \otimes {\mathbb C}^d}$? They can show the existence of a very sharp threshold: there is a constant ${c>0}$ such that, for any ${\epsilon>0}$, if ${s<(1-\epsilon)c\,d^{3/2}}$ then with high probability (as ${d\rightarrow\infty}$) the state is entangled; if ${s>(1+\epsilon)c\,d^{3/2}}$ then it is separable. Note that the extremes make sense: when ${s}$ is small we do expect there to be entanglement between the two copies of ${{\mathbb C}^d}$ (e.g. if ${s=1}$ the state is a random pure state, which will be close to maximally entangled), whereas if ${s}$ becomes very large we expect the reduced density on ${{\mathbb C}^d\otimes {\mathbb C}^d}$ to be close to the totally mixed state, which is separable.
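Separability is hard to test directly, but one can get a feel for the phenomenon with a crude numerical proxy (my own sketch, not from the talks: the PPT test below is only a necessary condition for separability, and the dimensions are far too small to exhibit the actual threshold). Tracing out a tiny environment leaves a state whose partial transpose has a negative eigenvalue (hence entangled), while a huge environment leaves a near-maximally-mixed state whose partial transpose is comfortably positive:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_induced_state(d, s):
    # rho on C^d (x) C^d obtained by tracing a dimension-s environment
    # out of a random pure state (Gaussian model of the Haar measure)
    W = rng.normal(size=(d * d, s)) + 1j * rng.normal(size=(d * d, s))
    rho = W @ W.conj().T
    return rho / np.trace(rho)

def min_eig_partial_transpose(rho, d):
    # Partial transpose on the second C^d factor, then smallest eigenvalue
    r = rho.reshape(d, d, d, d).transpose(0, 3, 2, 1).reshape(d * d, d * d)
    return float(np.linalg.eigvalsh(r)[0].real)

d = 3
small_env = min_eig_partial_transpose(random_induced_state(d, 2), d)
large_env = min_eig_partial_transpose(random_induced_state(d, 2000), d)
print(small_env < 0, large_env > 0)  # tiny environment: NPT; huge: PPT
```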

How to prove such a result? Guillaume and Stanislaw took the time to give us a very detailed sketch of the proof, and I highly recommend watching their respective talks. In the first talk Guillaume proved the “easy” part, which is the large ${s}$ regime, and in the second talk Stanislaw showed how Guillaume’s arguments could be refined to obtain a sharp threshold.

The main idea is the following. Let ${\mathcal{S}}$ be the set of separable states in ${{\mathbb C}^n = {\mathbb C}^d\otimes {\mathbb C}^d}$, and let ${\rho\mapsto\|\rho\|_{\mathcal{S}}}$ denote the gauge of ${\mathcal{S}}$, i.e. the smallest ${t>0}$ such that ${\rho\in t\,\mathcal{S}}$. Our goal can be reformulated as a two-step process: 1) show that ${\|\rho\|_{\mathcal{S}}}$ exhibits a sharp threshold: there is a value ${s_0}$ such that, if ${s\geq (1+\epsilon)s_0}$ (resp. ${s\leq (1-\epsilon)s_0}$) then ${\|\rho\|_{\mathcal{S}}\ll 1}$, i.e. the state is separable (resp. ${\|\rho\|_{\mathcal{S}}\gg 1}$, i.e. the state is entangled), with high probability, and 2) compute the expectation ${\textsc{E}[\|\rho\|_{\mathcal{S}}]}$, as a function of ${s}$, and determine as precisely as possible the ${s_0}$ for which it is close to ${1}$ (by point 1) we already know there will be a sharp transition).

Let’s focus on 1), as similar ideas are used for 2). Concentration of the function ${\rho\mapsto\|\rho\|_{\mathcal{S}}}$ around its median follows from “standard” arguments in concentration of measure, so the interesting part consists in showing the existence of an ${s_0}$ such that, for ${s}$ larger (resp. smaller) than ${s_0}$, the median is much smaller (resp. much larger) than ${1}$. The starting point consists in, first, arguing that the distribution of ${\rho - I/n}$ is very well approximated by that of a random matrix ${A}$ taken from the Gaussian Unitary Ensemble (GUE) (entries are i.i.d. complex Gaussians, with, in this case, standard deviation ${1/(n\sqrt{s})}$), and second, relating ${\textsc{E}[\|A\|_{\mathcal{S}}]}$ to the mean width ${w(\mathcal{S}_0)}$, where ${\mathcal{S}_0 = \mathcal{S}-\{I/n\}}$. The mean width intuitively measures precisely what its name indicates: the average maximum distance of elements of ${\mathcal{S}_0}$ from its centroid, measured in the norm derived from the Hilbert-Schmidt inner product (the same inner product used to define the polar ${\mathcal{S}^\circ_0}$ of ${\mathcal{S}_0}$, which plays an important role in the remainder of the argument). In order to estimate this mean width some fairly involved arguments from convex geometry, involving the use of the polar ${\mathcal{S}^\circ_0}$, kick in, but I shall refer you to the talks for more detail.

I found the structure of the argument very elegant because it starts with what, at first, looks (at least to me) like a completely intractable question — when is a random state separable?; uses concentration of measure to reformulate it as what now looks like a straightforward calculation of expectation — compute ${\textsc{E}[\|\rho\|_{\mathcal{S}}]}$; relates this expectation to a geometric parameter of ${\mathcal{S}}$, its mean width, which is fairly intuitive; and finally, uses a sequence of tools from the analysis of convex bodies to estimate the width (note this last step is the only one in which the precise nature of the set of separable states comes into play, the others applying to any “nice enough” convex body). And the result is surprisingly sharp: the fact that the threshold ${s_0}$ for ${s}$ is tight at least up to multiplicative ${1+o(1)}$ factors implies that simply giving one more qubit — just one, out of ${s}$, which can be arbitrarily large — to the environment marks the transition from states that are entangled with high probability, to states that are separable with high probability.

Causal inference: information theory

In his talk David Gross drew a larger picture around the quantum marginal problem by framing it in the context of causal inference. I already mentioned this classical problem: given a restricted set of observations (= some marginal distributions), can one reconstruct links of causation between the variables? Here we may as well assume that the marginals are a priori consistent with a global distribution (in fact they could be consistent with many global distributions); rather the goal is to infer properties of any matching global distribution just from the marginals themselves. Typically, one may desire to construct a directed acyclic graph whose nodes are the variables and edges indicate causation: the distribution of a given node should be independent of its ancestors when one conditions on the parents.

Unfortunately conditional independence imposes complicated (quadratic) constraints on the marginals, leading to a computationally hard inference problem. So instead of thinking of the random variables and their distributions as the unknowns of the problem, David advocated the use of the mutual information between subsets of variables as the fundamental unknown. A major advantage of this approach is that it turns conditional independence into a straightforward equality: ${I(X:Y|Z)=0}$. Moreover, it is easy to relax the condition to ${I(X:Y|Z)\leq \epsilon}$, adding robustness to the scenario while preserving linearity. Note however that, although the inequalities themselves are simple to write down, deciding whether a given set of inequalities is satisfiable or not remains a potentially hard problem, as in order to combine different inequalities to check compatibility one may need to expand each of the mutual information terms to take into account more and more variables, eventually requiring the introduction of all ${2^n}$ subset entropies. This approach is also a relaxation of the original problem, in the sense that not all information from the marginal distributions is captured in the entropies. Nevertheless, David demonstrated that in some interesting cases the marginal entropies suffice to distinguish whether a matching global distribution with some interesting property exists or not (his example was the question of whether a distribution on three variables requires the existence of a common (hidden) “ancestor”, or whether the three variables could have been pairwise generated from three independent ancestors).
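To make the appeal of entropic variables concrete, here is a small sketch (my own, assuming numpy) computing ${I(X:Y|Z) = H(XZ)+H(YZ)-H(XYZ)-H(Z)}$ from a joint table. For perfectly correlated variables ${X=Y=Z}$ the unconditional mutual information is a full bit, yet the conditional one vanishes, which is exactly the kind of linear equality advocated here:

```python
import numpy as np

def entropy(q):
    # Shannon entropy in bits of a (possibly multi-dimensional) table
    q = q[q > 0]
    return float(-np.sum(q * np.log2(q)))

def cond_mutual_info(p):
    # I(X:Y|Z) = H(XZ) + H(YZ) - H(XYZ) - H(Z) for a joint table p[x, y, z]
    return (entropy(p.sum(axis=1)) + entropy(p.sum(axis=0))
            - entropy(p) - entropy(p.sum(axis=(0, 1))))

# X = Y = Z, uniform on {0, 1}
p = np.zeros((2, 2, 2))
p[0, 0, 0] = p[1, 1, 1] = 0.5
mi = (entropy(p.sum(axis=(1, 2))) + entropy(p.sum(axis=(0, 2)))
      - entropy(p.sum(axis=2)))  # unconditional I(X:Y)
print(mi, cond_mutual_info(p))  # -> 1.0 0.0
```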

Putting issues of causality aside, this approach suggests a similar relaxation of the marginal problem: before asking whether a set of marginals is consistent, one could start by asking the (in general weaker) question of whether the induced set of marginal entropies is consistent. Interestingly, one can apply this idea to the study of entropic versions of the Bell inequalities. Such study was apparently pioneered by Braunstein and Caves, who obtained the following “entropic CHSH inequality”:

$\displaystyle H(A_0B_0)+H(A_0B_1)+H(A_1B_0)-H(A_1B_1) \geq -2.$

(Note the mysterious mirror symmetry with the CHSH inequality itself… see David Gross’s paper for an explanation of its origin.) An advantage of this approach, explored in David’s paper in the setting of a different Bell inequality, is that it naturally allows one to introduce additional restrictions on the correlations between the random variables; he and his co-authors consider e.g. the condition that they are generated using a certain limited amount of shared randomness.

Unfortunately David did not get to discussing the possibility of deriving entropic analogues of Tsirelson inequalities. Presumably these would quantify the Shannon entropy of the outcome distribution obtained by applying arbitrary observables to an arbitrary state, with the additional restriction that some of the observables must be measured on different subsystems. Or one could imagine inequalities involving the von Neumann entropy of the state itself, although I don’t see how the possibility of using different settings would be taken into account (or how a dependence on the dimension of the state would be avoided). Deriving such inequalities may not be an easy task… Indeed, I believe the marginal problem for quantum von Neumann entropies (finding a complete set of inequalities characterizing the cone of ${2^n}$-dimensional vectors of marginal entropies that are mutually consistent) is even less well understood than its classical counterpart.

I only discussed four of the many talks accessible from the workshop’s online programme — go have a look! The third (and last) workshop taking place as part of the Newton semester will be on “Quantum Algorithms”, organized by Steve Brierley, Richard Jozsa and Andreas Winter in late November.

Posted in Conferences, Quantum, Talks | 4 Comments

## The Newton Institute

From Charles to Cam

The Charles river

Over the summer I crossed the Atlantic, from new Cambridge to old Cambridge, where I am fortunate to be spending the Fall semester as a “visiting fellow” at the Isaac Newton Institute (INI). The INI “runs research programmes on selected themes in mathematics and the mathematical sciences”, and I am participating in the semester-long “Mathematical Challenges in Quantum Information” programme organized by Richard Jozsa, Noah Linden, Peter Shor and Andreas Winter. The INI typically runs two programmes simultaneously: when we started out, we shared the institute’s beautiful space with a group of researchers on the dynamics of infectious diseases; after that it was with physicists studying the holographic principle, soon to be replaced by participants in the “Mathematics for the fluid earth” programme — a wide range of programmes on mathematics and its applications.

River Cam

Both Cambridges are similar in at least two respects. First, much of their lifeblood flows from the large student bodies attending their universities, in both cases among the oldest in the nation (Oxford beats Cambridge by some measly 40 years, and Harvard’s status as the oldest in the US is only mildly disputed). Second, both cities are traversed by beautiful rivers (though one is much larger than the other) on which the aforementioned student bodies spend an inordinate amount of time engaged in the strangest nautical activities, in which they seem to place much pride. Punting, anyone?

In fact the demographics are remarkably similar: Cambridge, UK has about 19,000 students, or ${23\%}$ of its total population; Cambridge, MA has roughly 35,000 students, accounting for ${28\%}$ of its own. Despite the two being of roughly the same size, the English town feels much smaller. Paradoxically this may partly be due to it being more spread out, which is more conducive to a “small town” feeling; of course Cambridge, MA is surrounded by Boston and its suburbs, while Cambridge, UK is right in the middle of… well, not much… Although London is not far (45 minutes by train), this relative geographic isolation, combined with the particular structure of the university, does give the overall sensation of being in a much more secluded world than that of the Harvard and MIT of the other Cambridge. Students here spend much of their time inside the many gated colleges that are the heart of the university and which offer them room, board, tutoring, and more. Most of the colleges are pristine, with five-century-old Gothic cathedral-like buildings surrounded by impeccably tended lawns. (Harvard can give that feeling a little as well, but it has much more modern-looking buildings than Cambridge University; also the population on campus, discounting the busloads of tourists that it shares with Cambridge University, seems much more diverse — though I don’t have any statistics to back this up.)

The Newton Institute

View from my living room…

This is my first academic visit to Cambridge (my first time here was some 15 years ago already, for a summer job at a company situated barely a quarter of a mile from where the INI is housing me), and the INI is the perfect place to be spending it. Although the INI predates it by about a decade, the institute is located on the same site as Cambridge University’s Centre for Mathematical Sciences (CMS), home to the university’s Faculty of Mathematics. Apparently the INI was originally built “far” to the West of Cambridge, in a secluded location appropriate for the hermit lifestyle for which great mathematicians are famed. In fact, residents around the INI apparently secured an obligation for the INI to turn off its lights after a certain time at night, so as not to pollute their own tranquility! When it was decided to build the larger CMS at the same location, some protested that this tranquility would forever be lost… Well, I am happy to report that, although Cambridge University’s West Cambridge site is fast expanding, tranquility is never too far (who said too close?): see the picture for a view from my apartment, a ten-minute walk from the INI. In fact, Cambridge’s many footpaths, which crisscross the city as well as the surrounding countryside and villages, are a delight to explore and get lost in — plenty of morning runs ahead!

The INI mezzanine

The institute is a small and cozy place, built on two floors. The ground floor hosts the main seminar room, which can accommodate about 100 people, the small library, the registration desk and a few couches and blackboards. The first floor, which opens onto the ground floor around the staircase that leads up to it (this is important so that the sound of the bell announcing the beginning of talks is heard throughout), contains the offices — two dozen or so — arranged around the all-important coffee machine, as well as abundant couches and blackboards. The space has just the right combination of chaos and order: chaos, in that there are many acute angles, half-levels and blackboards in the strangest places, but also order, in that the coffee machine and the supply of chocolate and other delicious biscuits never run out. I would estimate that our programme has about 20 participants present at any given time, which makes for a lively and cohesive crowd. Morning coffee is served at 11, lunch is taken at the nearby colleges — the debate rages on as to which is better — or worse? — and afternoon tea is ready at 3pm. People come and go, but the blackboards are never long without the hopeful scribbles left by the enthusiastic crowd of computer scientists, mathematicians, and physicists alike.

Workshops

Couches on the first floor of the INI (coffee machine hidden to the right)

This semester’s programme is structured around three week-long workshops. The first, on “New Mathematical Directions for Quantum Information” and organized by Ashley Montanaro, Noah Linden and Andreas Winter, took place in early September. The workshop succeeded in introducing the participants to the themes and mathematical tools around which the semester is structured, including the application of techniques from information theory, functional analysis, random matrix theory, complexity theory and more to problems in quantum information and quantum computing such as the study of quantum channel capacities, quantum algorithms, and quantum multiplayer games. All talks at the INI are recorded, and you can view those from the workshop by navigating through the workshop’s online programme. For those not yet familiar with the topics I recommend Graeme Mitchison’s introduction to de Finetti theorems, Fernando Brandao’s two lectures on information-theoretic techniques in quantum many-body physics, Ion Nechita’s lectures on random matrix theory and Stanislaw Szarek’s tutorial on measure concentration. (Incidentally, if you complement those with Jon Keating’s lecture on random matrix theory, you get quite a good introduction to the distinctive British, French and Brazilian styles of teaching!)

The second workshop, soberly (jokingly?) entitled “Quantum Marginals” and organized by Matthias Christandl, Michael Walter and Andreas Winter, is running this week. Although the workshop is still ongoing (and I will unfortunately have to miss the last day), I found it quite instructive; although I was familiar with the basic statement of the quantum marginal problem I hadn’t appreciated how rich the problem was, and this week’s many talks gave a good sample of the many different angles under which it can be approached, using a wide array of techniques. So, what are quantum marginals about? Wait till the next post…(You can already watch the nice introductory talk by Michael Walter for a teaser.)

## Quantum PCP: a survey

On April 19th I was on a Skype call with Dorit Aharonov (yes, Skype remembers the date :-)), one of the pioneers of quantum complexity theory, and even more so of the research around the local Hamiltonian problem and the class QMA. The call was meant to be about a workshop we are organizing together with John Watrous, but towards the end of the call Dorit switched topics and told me she had just been reminded of a short survey she had agreed to write for the SIGACT complexity column hosted by Lane A. Hemaspaandra. Dorit’s column was meant to be a survey of recent developments around the quantum PCP conjecture, addressed to a wide audience from computer science (SIGACT is ACM’s

Dorit delivering a TEDx talk

“Special Interest Group on Algorithms and Computation Theory”). Although the column was originally due a couple days later, Lane had agreed to extend the deadline by about two weeks. Would I be interested in co-authoring the survey with Dorit (as well as with Itai Arad, who was also part of the effort)?

My parents were about to visit for a week, and I was right in the midst of planning second visits for my job applications, but I thought, why not — I’ll learn the topic, and writing a survey might just be the right kind of occupation to keep my mind going during these incessant travels. In any case, the deadline was at most a couple weeks away: worst comes to worst, two weeks from now, we’re done.

Well, I’m happy to report that, a mere 5 months after our initial phone call, the survey is finished and has just been posted to the arXiv! (To be fair, the guest column in SIGACT was actually submitted — in somewhat of a frenzy — much earlier than that; we sent the final version on May 14th after Lane and the SIGACT editors had graciously accepted a couple extra weeks’ delay.) If there is any lesson to learn from the experience it is probably that deadlines are good, but they must be hard deadlines. A “semi-hard” deadline, one that you feel you should really respect, but could nevertheless probably extend at the last minute by being sufficiently persuasive, does not help: it means you are constantly rushing to meet the next instantiation of the fleeting deadline without ever reaching the relief of getting the task done. Of course, in our case the delay between submission to SIGACT and posting on the arXiv is due to not having any deadline anymore, which meant other tasks took priority for each of us. Searching through my email I can find one of us encouragingly stating that “we’ll be done by the end of the week”, or something similar, every couple weeks over the past three months!

They surely knew how to set a feast back then

This being said, I am quite happy with the result we arrived at. Although the survey is far from comprehensive (we did originally have a ${20}$ page limit, which we have of course blown well past by now), I dare hope that it provides a reasonable (very) high-level introduction to the feast of quantum PCP delicacies that the courageous reader will treat herself to by letting the survey lead her through the five-course menu, with many references provided in guise of mignardises in case she has any space left. Although personally I still have a lot on my plate (and space in my stomach), I already reaped big benefits just from trying to catch up with Dorit and Itai’s much deeper understanding of the topic than mine (and indeed, being forced to do so was my original motivation for joining the writing effort). Both put a huge effort into assembling a comprehensive picture from the many recent, and often quite subtle, results touching on the conjecture. I hope that the story-line we eventually put together will be as enlightening to other computer scientists (to whom the survey is primarily addressed) as it was to me.

Before I release the starving hordes, I can briefly point to a couple aspects of the survey that I found particularly enlightening:

1. Multipartite entanglement

In the survey we spend a good deal of effort to communicate some of the beauty and magic of multipartite entanglement. (To be fair, I should really say, Dorit and Itai spent a good deal of effort…here I was definitely the learner!) It is only by first getting a flavor of the many counter-intuitive consequences of entanglement that one can really appreciate the subtleties behind the quantum PCP conjecture. A simple, but already striking, example is given by the so-called CAT states. These are two very simple ${k}$-qubit states,

$\displaystyle |CAT_{\pm}\rangle = \frac{1}{\sqrt{2}}\big( |0\cdots 0 \rangle \pm |1\cdots 1\rangle\big),$

that are orthogonal (hence perfectly distinguishable), but whose reduced density matrices on any subset of at most ${(k-1)}$ qubits are exactly identical. The only way to distinguish the two states is by having access to absolutely all qubits. No such phenomenon exists classically — two distinct strings must differ in at least one coordinate.
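The identical-marginals claim is easy to check numerically. Below is a small sketch in Python with numpy (the `partial_trace` helper and all names are my own, not from the survey) that builds both cat states on ${k=5}$ qubits, checks that they are orthogonal, and verifies that their reduced density matrices on the first ${k-1}$ qubits coincide:

```python
import numpy as np

def partial_trace(rho, keep, n):
    """Reduced density matrix of an n-qubit density matrix on the qubits in `keep`."""
    rho = rho.reshape([2] * (2 * n))
    # Trace out discarded qubits from the largest index down, so that the
    # remaining axis positions stay in sync with the qubit labels.
    for q in sorted(set(range(n)) - set(keep), reverse=True):
        m = rho.ndim // 2                       # qubits still present
        rho = np.trace(rho, axis1=q, axis2=q + m)
    d = 2 ** len(keep)
    return rho.reshape(d, d)

k = 5
zeros = np.zeros(2 ** k); zeros[0] = 1.0        # |0...0>
ones  = np.zeros(2 ** k); ones[-1] = 1.0        # |1...1>
cat_plus  = (zeros + ones) / np.sqrt(2)
cat_minus = (zeros - ones) / np.sqrt(2)

# Orthogonal, hence perfectly distinguishable globally...
print(abs(np.vdot(cat_plus, cat_minus)))        # 0.0

# ...yet identical on every subset of k-1 qubits.
rho_plus  = np.outer(cat_plus,  cat_plus.conj())
rho_minus = np.outer(cat_minus, cat_minus.conj())
keep = list(range(k - 1))                       # any k-1 qubits give the same result
print(np.allclose(partial_trace(rho_plus, keep, k),
                  partial_trace(rho_minus, keep, k)))   # True
```

Any other subset of at most ${k-1}$ qubits gives the same outcome: the off-diagonal terms ${|0\cdots 0\rangle\langle 1\cdots 1|}$, which carry all the information about the sign, vanish under the partial trace of even a single qubit.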

The toric code. Kitaev’s toric code is a particularly beautiful example of a family of four quantum states, defined on a ${k\times k}$ grid of qubits (qubits are on the edges), which exemplifies many more features of multipartite entanglement. One reason why this “code” is so beautiful is that its properties are related to the topology of the torus, with which the grid is identified by matching opposite borders. For instance, any closed “string” of Pauli ${X}$ operators which does not wrap around the torus leaves each of the four code states invariant — “indeed”, such strings are homotopic to the null string. However, the “group” of size ${4}$ consisting of strings wrapping around the torus along one or the other possible direction maps these states to one another.

Note how many quote marks I have used in the preceding paragraph; amazingly though, all these statements can be made perfectly clear and rigorous. For me, working out some of the details came as a revelation — although I had frequently heard of the toric code, I had never actually kneaded it with my own hands… I’ll refer you to Section 2 of the survey for more details (incidentally, while working on that section I looked for some good lecture notes on the toric code: surely, such notes must exist, but I couldn’t find any that were aimed at the neophyte; any suggestions?).
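For the skeptical reader, the claims above can be checked by brute force on the smallest torus, ${k=2}$, which has ${2k^2 = 8}$ edge qubits. The following numpy sketch (the edge indexing and helper names are my own conventions, not a standard library) builds the star and plaquette stabilizers, verifies that the code space is 4-dimensional, and checks that an ${X}$ string wrapping around the torus commutes with every stabilizer yet acts non-trivially inside the code space:

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0., 1.], [1., 0.]])
Z = np.array([[1., 0.], [0., -1.]])

k = 2
n = 2 * k * k                          # 8 edge qubits on the k x k torus
h = lambda i, j: i * k + j             # horizontal edge to the right of vertex (i, j)
v = lambda i, j: k * k + i * k + j     # vertical edge below vertex (i, j)

def pauli(op, edges):
    """Tensor product acting with `op` on the listed edges, identity elsewhere."""
    return reduce(np.kron, [op if e in edges else I2 for e in range(n)])

# Star operators: X on the four edges incident to each vertex.
stars = [pauli(X, {h(i, j), h(i, (j - 1) % k), v(i, j), v((i - 1) % k, j)})
         for i in range(k) for j in range(k)]
# Plaquette operators: Z on the four edges bounding each face.
plaqs = [pauli(Z, {h(i, j), h((i + 1) % k, j), v(i, j), v(i, (j + 1) % k)})
         for i in range(k) for j in range(k)]

# Projector onto the joint +1 eigenspace of all stabilizers: the code space.
P = reduce(np.matmul, [(np.eye(2 ** n) + S) / 2 for S in stars + plaqs])
print(np.linalg.matrix_rank(P))        # 4: four code states, as claimed

# An X string wrapping around the torus (it crosses every vertical edge
# in one row) commutes with every stabilizer...
L = pauli(X, {v(0, j) for j in range(k)})
print(all(np.allclose(L @ S, S @ L) for S in stars + plaqs))   # True

# ...so it preserves the code space, yet is not the identity on it:
print(np.allclose(P @ L @ P, P))       # False: a non-trivial logical operation
```

The rank count matches the counting argument: of the ${2k^2 = 8}$ stabilizers, only ${6}$ are independent (the product of all stars, like that of all plaquettes, is the identity), leaving a ${2^{8-6} = 4}$-dimensional code space.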

The NLTS conjecture. The No Low-energy Trivial States (NLTS) conjecture, recently introduced by Hastings, is weaker than the quantum PCP conjecture; it attempts to extract the physical intuition for why the latter is hard. Very roughly, the NLTS conjecture states that there exist Hamiltonians whose every low-energy state is “non-trivial”. The precise definition of non-trivial is a bit subtle, but intuitively it means that the state cannot be generated, or approximated, by a low-depth circuit; “non-trivial” thus gives a computational definition of a highly entangled state. It is clear that the quantum PCP conjecture implies the NLTS conjecture (any trivial low-energy state can serve as a classical witness for the energy of the Hamiltonian), but the converse is unknown, as the NLTS conjecture does not make any claims about the computational difficulty of approximating the ground energy of its Hamiltonians. Freedman and Hastings have proposed intricate constructions of Hamiltonians coming close to satisfying the NLTS condition, but the conjecture still stands and looks like a very good (though by no means easy) target for further research on the quantum PCP conjecture. In Section 4 of the survey we also discuss objections to the conjecture coming from the study of commuting Hamiltonians.

2. An exponential size PCP for QMA

In Section 5 of the survey we exhibit an explicit “exponential-sized PCP for QMA”. What does this mean? Take an instance ${H}$ of the ${k}$-local Hamiltonian problem, as described here. Suppose a classical verifier sets out to distinguish the two cases, ${\varepsilon_0(H)\leq a}$ or ${\varepsilon_0(H)\geq b}$, where ${\varepsilon_0}$ denotes the ground energy (the smallest eigenvalue) and ${a,b}$ are separated by an inverse polynomial in the number of qubits. To help him in this task, the verifier may ask for access to a classical proof of the statement “${\varepsilon_0(H)\leq a}$”. For instance, he could ask for a classical description of the ground state ${|\psi_0\rangle}$. A possible naïve description would be the list of all complex amplitudes of ${|\psi_0\rangle}$ in the standard basis. Unfortunately, not only does this description require exponentially many bits to be written down, but verifying it also requires reading exponentially many bits — far more verification time than our verifier cares to spend on the problem. Indeed, this “naïve” representation does not respect the locality of the problem: evaluating the energy of even a single ${k}$-local term ${H_i}$ requires taking a partial trace over the remainder of the system, an exponentially long calculation. Can we do better?

A first attempt to improve the above would be to require the proof to contain pre-computed descriptions of all possible reduced density matrices of ${|\psi_0\rangle}$, on any subset of ${k}$ qubits. This only requires polynomially many bits…which should make us suspicious: it would put the problem in NP! The catch is that, even though the energy of any local term may now be estimated efficiently, we still need to check that the reduced densities given in the proof are consistent with a single global state — itself a QMA-complete problem! We have come full circle.
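To make the contrast concrete, here is a toy numpy sketch (all names and sizes are my own choices, chosen small enough to run instantly) showing that the energy of a single ${k}$-local term can indeed be computed from the ${k}$-qubit reduced density matrix alone, with no further access to the exponentially long amplitude vector:

```python
import numpy as np

def reduced_density(psi, keep, n):
    """Reduced density matrix of an n-qubit pure state on the qubits in `keep`."""
    rho = np.outer(psi, psi.conj()).reshape([2] * (2 * n))
    for q in sorted(set(range(n)) - set(keep), reverse=True):
        m = rho.ndim // 2                        # qubits still present
        rho = np.trace(rho, axis1=q, axis2=q + m)
    d = 2 ** len(keep)
    return rho.reshape(d, d)

rng = np.random.default_rng(0)
n, k = 4, 2                                      # toy sizes: 4 qubits, 2-local terms
psi = rng.normal(size=2 ** n) + 1j * rng.normal(size=2 ** n)
psi /= np.linalg.norm(psi)                       # a "ground state" stand-in

# A 2-local term H_i acting on qubits {0, 1} (Hermitian, chosen at random).
A = rng.normal(size=(2 ** k, 2 ** k)) + 1j * rng.normal(size=(2 ** k, 2 ** k))
H_i = (A + A.conj().T) / 2
H_full = np.kron(H_i, np.eye(2 ** (n - k)))      # H_i tensored with the identity

# Energy from the full state vs. from the k-qubit reduced density matrix:
e_global = np.vdot(psi, H_full @ psi).real       # exponential-size computation
e_local = np.trace(H_i @ reduced_density(psi, [0, 1], n)).real
print(np.isclose(e_global, e_local))             # True
```

Of course, this is exactly why the "list all ${k}$-qubit reduced density matrices" proof is so tempting, and why its failure mode, the consistency problem, is so instructive.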

Nevertheless, as we recount in the survey, it follows from well-known arguments that at the very least an efficiently verifiable exponential-size PCP — an exponential-size proof that can be checked by reading only a constant number of bits — for the local Hamiltonian problem must exist; stated briefly, this is a consequence of the well-known chain of inclusions QMA ${\subseteq }$ PSPACE ${=}$ IP ${\subseteq}$ PCP${(\text{poly}(n),O(1))}$. Unwinding the arguments behind these inclusions, however, requires going through a long series of intricate reductions, and it is not immediately clear what the resulting proof looks like.

In the survey we show that one can give a very intuitive PCP; although the resulting proof and verification procedure build heavily on classical tools, such as the sum-check protocol, they still manage to keep the “quantum” structure of the problem on the surface, making crucial use of its “qubit-by-qubit” locality. Details of the construction are given in the survey, but before going there I invite you to find your own; indeed there should be many ways to obtain such a PCP, some of them more elegant than ours. Now for the major open question: can you make the PCP any shorter by allowing verifier and proof to be quantum (of course still limiting the number of queries to a constant, and requiring completeness and soundness of the PCP to be separated by a constant as well)?

## TCS+: Season 2

Last semester saw a fairly successful run of TCS+, the online seminar series that Anindya De, Oded Regev and myself co-organized over Google+ hangouts. We’re very pleased that all nine seminars were well attended. The seven “inside spots” almost always filled up; our most loyal supporters included groups from University of Michigan, ETH Zurich, NYU, Columbia, Berkeley, TTI & University of Chicago. In addition we typically had around 30 viewers connected to the live stream at any point throughout the talk (though there’s no telling whether this number was generated by a constant turn-around, or if people tended to stick around for the whole talk — hopefully the latter!). The TCS+ YouTube channel registered 3,222 page views and 28,269 minutes of video viewed (including 1129 minutes, or about 19 hours, during the month of August, which I’ll interpret as proof that archiving videos of past talks *is* a valuable service — there are an increasing number of such archival services available on the web; see here for a partial list). All in all that’s about 471 hours of talks watched, a healthy amount of attention shared among our nine excellent speakers, whom I wish to thank once more for their patience and effort in making this work: Ronald de Wolf (whose inaugurating talk, pictured above, remains the most viewed on our YouTube channel), Anup Rao, Raghu Meka, Nisheeth Vishnoi, Greg Valiant, Alexander Sherstov, C. Seshadhri, Thomas Rothvoss, and Jelani Nelson. Feedback on the TCS+ experiment has so far been overwhelmingly positive, and I thank all our viewers for their support!

You’ll have noticed that both our past choice of speakers, and the groups that have been joining us live, are quite US-centric. This is definitely not deliberate, but rather an unfortunate consequence of the way we organized the talks: all three of us were at US institutions, meaning we’d more easily have contacts with US-based speakers; in addition, scheduling the talks at a time when people on the West coast would (more or less) be awake meant it would already be early evening in Western Europe, and nighttime in Asia. I view this as unfortunate because the issue of distance between speaker and audience is exactly what TCS+ is meant to overcome. We’d like to have talks by US speakers that are widely viewed by researchers at European or Asian institutions, and vice-versa. There is clearly demand for this: guess which country has been contributing, by far, the second largest number of viewers of TCS+ seminars: India! They are a fairly anonymous crowd, watching talks scheduled at 10 or 11pm from the comfort of their dorm rooms. But they are precisely the viewers we’d like to be reaching out to. So we’ll hopefully try for a more balanced schedule this semester. This will in part be achieved by shifting the organizing team’s center of mass to the East. As a first step towards this, I transferred my own moderate mass from “new” to “old” Cambridge. More importantly, we are very pleased to be joined by Thomas Holenstein from ETH Zurich in the organization of TCS+. This should let us organize some talks at a more reasonable time for our Asian viewers, and we of course also welcome suggestions of speakers from European or Asian institutions!

The TCS+ seminars have not been without a few hiccups, including a scary fire alarm at MIT, forcing me to abandon my laptop (which was hosting Ronald de Wolf’s talk) in an empty room at MIT while we were evacuated (you can still see the flashing lights in the online recording), and a network outage at MSR New England just 10 minutes before Greg Valiant’s talk, prompting him to dash for the nearby Stata center in order to recover connectivity (luckily the outage was short-lived and Greg could deliver his talk safely from MSR). All in all though, the “hangout” feature of Google+ has served us well, proving both remarkably stable (0 crashes) and easily accessible. Some groups did sometimes have trouble joining, but except maybe on one or two occasions everyone who tried hard enough eventually made it relatively unscathed.

There remain two main unsatisfactory aspects of the experience. First, the video quality is not great. The hangouts force a particular layout in which video feeds from online viewers take up a good chunk of the screen. In addition, it often proved difficult to set up the presenter’s computer so as to efficiently juxtapose his slides and a video feed of himself (and I certainly wish I could add “or herself” here: we’re definitely behind, not only in geographical balance, but also in gender balance…let’s work on this!). Second, the hard limit of 10 online viewers, though not absolutely crippling thanks to the online feed, does result in “outside” viewers feeling not as included in the talk as they should be. This may partially explain why we did not have as many questions, and as much interaction, as we were hoping for. (Any suggestions to improve this?) While the “online” groups regularly asked questions, very few “outside” viewers used the comments feature of G+ to ask questions. Indeed, the delay in typing up the question, together with the necessity for a moderator to spot the comment, read it, (optionally) understand it, and interrupt the speaker to ask it, made it a fairly useless feature.

So we looked around a bit for alternatives. The best so far, suggested by Thomas Holenstein, has been a videoconferencing software from Radvision called Scopia. Scopia allows any number of simultaneous participants. It also makes it easy to show both a video feed from the presenter and his slides. Unfortunately, so far we are not fully satisfied with the quality of the video, and even more so of the sound, that we’ve experienced; this seems to be especially so when some of the participants have low bandwidth. Trying out alternatives made us appreciate how good the sound quality is in the hangouts, with very little interference or loss. Moreover, Google just announced that live streaming would now be available in HD, meaning that the quality for our outside viewers, and for the recorded video, will be much higher. The only remaining limitation is the 10-slot maximum. There seems to be a possibility to extend this to 15 slots, though we’re having trouble getting in touch with the right person at Google to help us obtain this extension (which we’d be happy to pay for). Can anyone help?

Another slight negative of using Google+ is that, well, not everyone uses it. This means that some of our talk announcements might go unnoticed; we’ve been trying to forward them to university mailing lists as well but it’s hard to keep track of everything. Though we’ll continue to use the TCS+ page as our main forum for the foreseeable future, we also created a more traditional mailing list to which we’ll send out announcements. This will be a low-volume list, with maybe 2 messages per week (an early announcement, and a reminder, for every talk). You can subscribe to the list now by clicking here. If there is any way announcements from the list can be forwarded automatically to the appropriate list at your institution, please let us know how! Our website also includes a Google calendar and RSS feeds that you can subscribe to.

I’ll conclude by announcing the first talk of this new season: we’ll have the pleasure of hosting Ramprasad Saptharishi, from MSR India. (Ramprasad agreed to talk at our usual 1pm EST, which will be 10:30pm for him, meaning we already wasted our first opportunity for shifting our schedule East; let’s do better next time!) Ramprasad’s talk is titled “Arithmetic Circuits: Depth reductions, chasms and escalators”. He will tell us about a very strong threshold phenomenon in circuit lower bounds that he and his co-authors have recently identified: while we do have a number of reasonable lower bounds for circuits of depth 2, strong enough lower bounds for circuits of depth just 3 would already imply lower bounds on general arithmetic circuits, a holy grail of complexity theory. Please join us for this exciting seminar!

## Leaving MIT and the Stata center

This blog has been left in a bit of a hibernation mode these past few months (well… what do the Eskimos call it, when the duration of nighttime so overtakes the duration of daytime?). I’ll spare you the usual excuses though — I suppose my mind was elsewhere. That didn’t prevent me from generating ideas for future blog posts, though, and by now I have a huge backlog that I hope to slowly work through. (Of course everyone knows where these kinds of hopes go…)

A couple weeks ago I permanently left my office in MIT’s building 32, the Ray & Maria Stata center (which not only houses CSAIL but also MIT’s linguistics and philosophy departments, including such celebrities as Noam Chomsky). The past couple weeks I was there, but I will soon be over here, before going all the way over there and then spending some time here and here, before finally reaching there… lots of fun traveling in prospect, and hopefully good science as well!

I spent 18 months in Building 32, as a postdoc under the wise supervision of Scott Aaronson. The Stata center is a unique building with a unique history (and unique occupants of course!). By way of farewell I went around the building and took a few pictures that I’ll share here, along with some impressions of the building.

Wikipedia has an interesting entry on Building 20, the building that occupied the same space before the Stata center was built in the early 2000s. Building 20 was erected in just a few months in 1943, and was meant to house research directly related to the ongoing war effort. Originally designed to be taken down 6 months after the war ended, it continued to be used until 1998… some 53 extra years! Due to its lack of official purpose, the building ended up housing all sorts of small, “speculative” projects. In spite of — thanks to — the difficult conditions, the building had an extremely successful run. The flexibility that comes with the almost total absence of design (and associated rule-keepers) enabled occupants (including, already, Chomsky) to transform their working space at will, bringing a wall down here and erecting another one there. The inscrutable floor plan encouraged unexpected collaboration, as occupants from different areas were bound to get lost in the same dark corners. Difficult conditions (heat, leaking windows, etc.) encouraged creativity.

The architects behind Building 32, Frank Gehry and his firm, sought to keep *all* these elements in their design — the flexible spaces, the abstruse floor plan, as well as the leaky finishes. Its current occupants regularly complain about some of the conditions (who has never seen a mouse trapped under their desk? Gotten lost? Received drops of water from a leaking window? Gotten lost? Been unable to fit furniture due to the weird angles? Had their lighting/AC/projector shut down automatically because “the building” said so? Gotten lost?), but overall it works remarkably well. The building is very bright, with numerous common spaces that open one onto the other, including one floor up or down. There are no long corridors, but rather small “work areas” that open one onto the other. Multiple office doors open up on any of these areas, encouraging spontaneous discussion.

Possibly the hardest, and most important, aspect to encourage in a building devoted to research is interaction. It should be the case that occupants are literally forced to pass by one another on a daily basis. This can be encouraged by placing strategic items such as the coffee machine, the elevators, or the bathrooms in central and open locations. Building 32 does this; unfortunately it is possibly a little too big (and tall), so that such areas proliferate and in the end smaller communities form around each of them. For instance, the theory group itself lives on two floors, with multiple distinct spaces. The group is large, and the space is needed, but it means that it easily lives unto itself as a small community. During my time there I had very little interaction with other large occupants of the same building — AI, optimization, etc. — not to mention the linguists!

Second to interaction, a good research space should encourage creativity. I suspect the elements that work for one person or another may vary wildly, and are harder to control for. For instance, in my case this includes a lot of light and open views (say, on the mountains), but others may need darkness to concentrate. The ability to focus also necessitates some amount of calm & quiet (though not necessarily perfect), a requirement not always easy to reconcile with interaction. Finally, I personally need what I would call a “diversity of working spaces”. I usually feel that I cannot work for more than a couple hours on the same problem at the same place. So, in addition to an office desk, I may also need a couch, a library, a cafe, a bench in a courtyard, etc.; the more the better!

All in all, Building 32 provided me with all this and much more (the one downside being a little more noise than I would have asked for), and it is one of the most original and welcoming spaces I have had the chance to work in. I am very lucky to have many other great working environments lined up for the coming year: the Newton Institute in Cambridge, Berkeley’s Simons Institute’s brand-new space in Calvin Lab, and finally the beautiful Annenberg Center on the Caltech campus!

Posted in Uncategorized