Hypercontractivity in quantum information theory

In this post I’d like to briefly mention some connections of log-Sobolev inequalities, and more specifically hypercontractivity, with quantum mechanics and quantum information. To anyone interested I recommend the video of Christopher King’s excellent overview talk at the BIRS workshop. The talk includes a nice high-level discussion of the origins of the subject in QFT (roughly 25 minutes in) that I won’t get into here.

Existence and uniqueness of ground states in quantum field theories. The very introduction of (classical!) hypercontractivity, before the name was even coined, was motivated by stability questions in quantum field theory. This approach apparently dates back to a paper by Nelson from 1966, “A quartic interaction in two dimensions” (not available online! If anyone has a copy… My source on this is these notes by Barry Simon; see Section 4). Nelson’s motivation was to establish existence and uniqueness of the ground state of an interacting boson field theory that arose as the natural quantization of a certain popular classical field theory. Let {H} be the resulting Hamiltonian. {H} acts on the Hilbert space associated with the state space of a bosonic field with {n} degrees of freedom, the symmetric Fock space {\mathcal{F}_n}.

The main technical reason that makes bosonic systems much easier to study than fermionic ones is that bosonic creation and annihilation operators commute. In the present context this commutativity manifests itself through the existence of an isomorphism, first pointed out by Bargmann, between {\mathcal{F}_n} and the space {L^2({\mathbb R}^n,\gamma^n)}, where {\gamma} is the Gaussian measure on {{\mathbb R}}. The connection arises by mapping states in {\mathcal{F}_n} to functions on the local observables that lie in the algebra generated by the elementary bosonic creation and annihilation operators. Since these operators commute (when they act on distinct modes), the space of such functions can be identified with the space of functions on the joint spectra of the operators; doing this properly (as Bargmann did) leads to {L^2({\mathbb R}^n,\gamma^n)}, with the annihilation and creation operators {c_i} and {c_i^*} mapped to {\frac{\partial}{\partial x_i}} and its adjoint {2x_i-\frac{\partial}{\partial x_i}} respectively.
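As a quick sanity check on this correspondence (my own computation, taking the maps above at face value), the canonical commutation relation is preserved up to a factor of 2:

\displaystyle \Big[\frac{\partial}{\partial x_i}\,,\,2x_i-\frac{\partial}{\partial x_i}\Big]f \,=\, \frac{\partial}{\partial x_i}\big(2x_if\big) - 2x_i\frac{\partial f}{\partial x_i} \,=\, 2f,

to be compared with {[c_i,c_i^*]=1}; the same factor of 2 resurfaces in the scaling discussed below.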

The simplest “free field” bosonic Hamiltonian is given by the number operator

\displaystyle H_0 \,=\, \sum_i c_i^* c_i, \ \ \ \ \ (1)

which under Bargmann’s isomorphism is mapped to {\sum_i 2x_i\frac{\partial}{\partial x_i} - \Delta}. Up to some mild scaling I’m not sure I understand, this is precisely the Liouvillian associated with the Ornstein-Uhlenbeck noise operator that we saw in the previous post!
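For what it’s worth, here is where the expression, and (I believe) the scaling, come from, again taking the maps above at face value:

\displaystyle H_0 \,\mapsto\, \sum_i \Big(2x_i-\frac{\partial}{\partial x_i}\Big)\frac{\partial}{\partial x_i} \,=\, \sum_i 2x_i\frac{\partial}{\partial x_i} \,-\,\Delta,

and the change of variables {x_i = u_i/\sqrt{2}} turns the right-hand side into {-2\sum_i \big(\frac{\partial^2}{\partial u_i^2} - u_i\frac{\partial}{\partial u_i}\big)}, which is {-2\mathcal{L}} for the Ornstein-Uhlenbeck Liouvillian {\mathcal{L}f = \Delta f - u\cdot\nabla f}; so {e^{-tH_0}} would be the Ornstein-Uhlenbeck semigroup run for time {2t}.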

Here is how hypercontractivity comes in. {H_0} is Hermitian, and using Bargmann’s isomorphism we can identify {e^{-t H_0}} with a positive operator on {L^2({\mathbb R}^n,\gamma^n)}. Performing this first step can already provide interesting results as, in the more general case of a Hamiltonian {H} acting on a field with infinitely many degrees of freedom, it lets one argue for the existence of a ground state of {H} by applying the (infinite-dimensional extension of the) Perron-Frobenius theorem to the positive operator {e^{-tH}}. But now suppose we already know e.g. {H\geq 0}, so that {H} does have a ground state (there is a lower bound on its spectrum), and we are interested in the existence of a ground state for {H+V}, where {V} is a small perturbation. This scenario arises for instance when building an interacting field theory as the perturbation of a non-interacting theory, the situation that Nelson was originally interested in. The question of existence of a ground state for {H+V} reduces to showing that the operator {e^{-t(H+V)}} is bounded in norm, for some {t>0}. Using {\|e^{A+B}\|\leq \|e^A e^B\|} it suffices to bound, for any {|\varphi\rangle},

\displaystyle \|e^{-tV}e^{-tH}|\varphi\rangle\|_2 \leq \|e^{-t V}\|_4 \|e^{-tH}|\varphi\rangle\|_4,

where the inequality follows from Hölder. Assuming {\|e^{-t V}\|_4} is bounded (let’s say this is how we measure “small perturbation” — in the particular case of the quantum field Nelson was interested in this is precisely the natural normalization on the perturbation), proving a bound on {\|e^{-t(H+V)}\|} reduces to showing that there is a constant {C} such that

\displaystyle \|e^{-tH}|\varphi\rangle\|_4\leq C \||\varphi\rangle\|_2 \ \ \ \ \ (2)

for any {|\varphi\rangle}, i.e. {e^{-tH}} is hypercontractive (or at least hyperbounded) in the {2\rightarrow 4} operator norm. This is precisely what Nelson did for {H} equal to the number operator (1), i.e. he proved an estimate of the form (2) for the Ornstein-Uhlenbeck process, later obtaining optimal bounds: hypercontractivity was born! (As Barry Simon recalls in his notes, the term “hypercontractive” was only coined a little later, in a paper of his and Høegh-Krohn.)
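(For completeness, the Hölder step spelled out: with exponents {\frac{1}{2} = \frac{1}{4}+\frac{1}{4}}, for any two functions {g,h},

\displaystyle \|gh\|_2^2 \,=\, \int |g|^2|h|^2 \,\leq\, \Big(\int |g|^4\Big)^{1/2}\Big(\int |h|^4\Big)^{1/2} \,=\, \|g\|_4^2\,\|h\|_4^2,

which is applied with {g=e^{-tV}}, viewed as a function through the multiplication operator it defines, and {h = e^{-tH}|\varphi\rangle}.)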

Fermionic Hamiltonians and non-commutative {L^p} spaces. This gives us motivation for studying hypercontractivity of operators on {L^2({\mathbb R}^n,\gamma^n)}: to establish stability estimates for the existence of ground states of bosonic Hamiltonians. But things start to get even more interesting when one considers fermionic Hamiltonians. Fermions are represented on the antisymmetric Fock space, and for that space there is no isomorphism similar to Bargmann’s. His isomorphism is made possible by thinking of states as functions acting on observables; if the observables commute we obtain a well-defined space of functions acting on the joint spectrum of the observables, leading to the identification with a space of functions on {{\mathbb R}^n}, where {n} is the number of degrees of freedom of the system.

But the fermionic creation and annihilation operators anti-commute: they can’t be simultaneously diagonalized. So states are naturally functions on the non-commutative algebra {\mathcal{A}} generated by these operators. Apparently Segal was the first to explore this path explicitly, and he used it as motivation to introduce non-commutative integration spaces {L^p(\mathcal{A})}. (As an aside, I find it interesting how the physics suggested the creation of such a beautiful mathematical theory. I used to believe that things more often went the other way — first the mathematicians develop structures, then the physicists find uses for them — but I’m increasingly realizing that historically the physics tends to come first, and this is especially true in all things quantum mechanical!) For the case of fermions the canonical anti-commutation relations make the algebra generated by the creation and annihilation operators into a Clifford algebra {\mathcal{C}}, and the question of existence and uniqueness of ground states now suggests exploring the hypercontractivity properties of the associated semigroup (third example in the previous post) in {L^p(\mathcal{C})}. This approach was carried out in beautiful work of Gross; the hypercontractivity estimate used by Gross was generalized, with optimal bounds, by Carlen and Lieb. (I recommend the introduction to their paper for more details on how hypercontractivity comes in, and Gross’ paper for more details on the correspondence between the original fermionic Hamiltonian and the semigroup for which hypercontractivity is needed.)

Quantum information theory. As Christopher King discussed in his talk at BIRS, aside from their use in quantum field theory, non-commutative spaces and hypercontractivity are playing an increasingly important role in quantum information theory. This direction was pioneered in work of Kastoryano and Temme, who showed that, just as classical hypercontractivity and its connection with log-Sobolev inequalities have proven extremely beneficial for the fine study of convergence properties of Markov semi-groups in the “classical” commutative setting (cf. my previous post), hypercontractivity estimates and log-Sobolev inequalities for quantum channels could lead to much improved bounds (compared with estimates based on the spectral gap) on the mixing time of the associated semi-groups. In particular Kastoryano and Temme analyzed the depolarizing channel and obtained the exact value of the associated log-Sobolev constant, extending the classical results on the Bonami-Beckner noise operator that I described in the previous post. The depolarizing channel is already very important for applications, including the analysis of the mixing time of dissipative systems and of certain quantum algorithms such as the quantum Metropolis algorithm.

For more applications and recent developments (including a few outliers…), see the workshop videos!


Workshop on hypercontractivity and log-Sobolev inequalities in quantum information theory

A couple weeks ago I attended the workshop on “Hypercontractivity and Log Sobolev Inequalities in Quantum Information Theory” organized by Patrick Hayden, Christopher King, Ashley Montanaro and Mary Beth Ruskai at the Banff International Research Station (BIRS).

The pavilion in which workshop talks were held

BIRS holds 5-day workshops in mathematics and applications throughout the year. It is itself a small component of the Banff Centre, which describes itself as the “largest arts and creativity incubator on the planet”. Sound attractive? The centre appears to be remarkably successful, managing to attract not only the brightest mathematicians like us(!), but also a range of talented artists from all over the world, reputable international conferences, and more. The setting certainly doesn’t hurt (see the pictures — and, by the way, that view was from the centre’s beautiful library!), nor does the Canadian hospitality, with hearty food provided daily in a cafeteria with impressive views of the surrounding mountains. It was my first visit, and in spite of the somewhat coldish weather (“much warmer than usual”, as they say… you bet! When “usual” means -20 Celsius, it’s still not quite clear what improvement you’re getting) I enjoyed the place, which occupies a sort of in-between: part secluded retreat in the mountains (think Dagstuhl), part bustling research centre (think the Simons Institute in Berkeley). Good balance.

Aerial view of the Banff center (in summer!)

The goal of this post is to give a brief introduction to the (first half of the) topic of the workshop, “Hypercontractivity and Log Sobolev Inequalities”. (I hope to talk a bit about the second half, “in Quantum Information Theory”, in the next post.) Log-Sobolev inequalities, be they quantum or classical, meant next to nothing to me before attending — I had heard of them, and was somewhat familiar with the uses of hypercontractivity in Boolean function analysis, but log-Sobolev inequalities themselves were much more mysterious. The workshop was successful in bringing different perspectives to bear on these inequalities, their relationship, and applications (videos available online!). For my own benefit I’ll write down some very basic facts on these inequalities, where they come from, and what the motivation is for studying them. What I write will undoubtedly appear extremely naive to even moderate experts, and for much more on the subject I highly recommend a survey by Diaconis and Saloff-Coste, the lecture notes by Guionnet and Zegarlinski, and the (quals-fueled) survey by Biswal. In particular I refer the reader to these resources for proofs of the unproven assertions made in this post (any errors introduced being mine of course), as well as appropriate credit and references for the results mentioned below.

Markov semi-groups. The fundamental objects underlying both the hypercontractive and log-Sobolev inequalities are a space of functions {\mathcal{F}} (think continuous real-valued functions defined on a nice topological space, {\mathbb{R}^d}, or your favorite graph) together with, first, a Markov semi-group {(P_t)} defined on that space and, second, a measure {\mu} on the same space that is invariant under {P_t}, i.e.

\displaystyle \mu(P_t f)\,=\,\mu(f) \ \ \ \ \ (1)

for all {f\in\mathcal{F}}. Now, even the notion of a “Markov semi-group” can be intimidating (to me), but it really corresponds to the simplest of processes. A Markov semi-group is a family of maps {P_t:\mathcal{F}\rightarrow\mathcal{F}} indexed by {t\in\mathbb{R}_+} such that applying the map for time {t}, and then for time {s}, is equivalent to applying it for time {t+s}: {P_{t+s}=P_tP_s} (together with the normalization {P_0=Id}). Simple examples are any random walk on the domain of {\mathcal{F}}, with {P_t} denoting averaging over inputs obtained by taking a “{t}-step” walk, or a diffusion process on real functions, such as convolving with a Gaussian of variance {t} or evolving {f} under the heat equation for time {t}. Some concrete examples that I’ll refer to throughout:


Random walks on regular graphs. Let {G=(V,E)} be a regular undirected graph with degree {d}, and {A} its normalized adjacency matrix: {A_{i,j} = 1/d} if {\{i,j\}\in E} and {A_{i,j}=0} otherwise. Let {\mathcal{F} = {\mathbb R}^V}. If {f\in \mathcal{F}} and {v\in V}, then {(Af)(v) = E_{u\sim v}[f(u)]}, where the expectation is taken over a random neighbor {u} of {v} in {G}. If {f} is a distribution over {V}, then after {t} steps of the discrete-time random walk on {G} the distribution is given by {A^t f}. The standard way to make this into a continuous-time random walk is to wait at a vertex {v} for an amount of time distributed as an exponential with mean {1}, and then move to a random neighbor. In other words, the number of steps taken in time {t} is distributed as a Poisson point process {N(t)} with rate {1}. Starting in an initial distribution {f}, the distribution after time {t} is given by

\displaystyle P_t f = \sum_{k=0}^\infty \Pr(N(t) = k) A^k f = \sum_{k=0}^\infty \frac{t^k}{k!} e^{-t}A^k f = e^{t(A-I)}f.

The same construction extends to any irreducible Markov chain {K(x,y)\geq 0}, {\sum_y K(x,y)=1}, for which the associated semigroup is {P_t f = e^{t(K-I)}f}.
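Here is a quick numerical illustration of the last identity (a toy sketch in Python; the 6-cycle and all the names are my own choices):

import numpy as np
from math import factorial
from scipy.linalg import expm

# Normalized adjacency matrix A of the 6-cycle (degree d = 2).
n = 6
A = np.zeros((n, n))
for v in range(n):
    A[v, (v - 1) % n] = A[v, (v + 1) % n] = 0.5

t = 10.0
P_t = expm(t * (A - np.eye(n)))  # P_t = e^{t(A - I)}

# The same operator via the Poisson-weighted sum over k-step walks.
P_t_poisson = sum(np.exp(-t) * t**k / factorial(k) * np.linalg.matrix_power(A, k)
                  for k in range(80))
assert np.allclose(P_t, P_t_poisson)

# The uniform measure is invariant, and a point mass spreads towards it.
f = np.zeros(n); f[0] = 1.0
print(P_t @ f)  # all entries close to 1/6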

Ornstein-Uhlenbeck process. Here we take {\mathcal{F}} to be the set of continuous functions from {\mathbb{R}} to itself, and

\displaystyle P_tf(x) = \int_{\mathbb{R}} f(e^{-t} x + \sqrt{1-e^{-2t}} y) \frac{e^{-y^2/2}}{\sqrt{2\pi}} dy.

This describes the evolution of a particle subjected to Brownian motion under the influence of friction. It is a continuous, Gaussian analogue of the Bonami-Beckner noise operator {T_\rho} on the hypercube that crops up in computer science applications, {T_\rho f(x) = E_{y\sim x} [f(y)]}, where {y} is obtained from {x\in\{0,1\}^n} by flipping each of its coordinates independently with probability {(1-\rho)/2} (a small numerical check of the semi-group structure follows the next example).

Time-independent evolution. Schrödinger’s equation dictates that a quantum state {\rho} subject to a fixed Hamiltonian {H} will evolve as {\rho\mapsto \rho(t) = e^{-iHt}\rho e^{iHt}} (where I set {\hbar =1} for simplicity). It is easy to check that {P_t:\rho\mapsto \rho(t)} is a Markov semi-group. Note however that this semi-group acts on a different kind of space than the previous examples: {\rho} is an operator on a Hilbert space {\mathcal{H}}, rather than a function on a certain domain. Dealing with such examples requires non-commutative analogues of the classical theory, to which I hope to return in the next post.
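As promised above, here is a minimal numerical check of the semi-group structure of the noise operator (my own toy code, with the flip probability {(1-\rho)/2} convention): composing two noise steps multiplies the parameters, {T_{\rho}T_{\rho'} = T_{\rho\rho'}}, which matches {P_tP_s=P_{t+s}} under the reparametrization {\rho = e^{-t}}.

import numpy as np
from functools import reduce

def T(rho, n):
    # One-bit noise operator (flip with probability (1 - rho)/2), tensored up.
    p_flip = (1 - rho) / 2
    T1 = np.array([[1 - p_flip, p_flip], [p_flip, 1 - p_flip]])
    return reduce(np.kron, [T1] * n)

n, rho1, rho2 = 4, 0.8, 0.6
assert np.allclose(T(rho1, n) @ T(rho2, n), T(rho1 * rho2, n))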

Ergodicity. Now that we have a cast of players, let’s see what kind of questions we’d like to answer about them. The most basic question is that of convergence: given a Markov semi-group {(P_t)} defined on a function space {\mathcal{F}}, does the sequence {(P_t f)} converge, and if so at what rate? The latter part of this question requires one to specify a distance measure on {\mathcal{F}}. For this we will use the {L^p} norms, defined as {\|f\|_p = (\int |f|^p d\mu)^{1/p}} for some measure {\mu} on the domain of functions in {\mathcal{F}}. Important special cases are {p=1,p=2} and {p=\infty}. Here the measure {\mu} will always be an invariant measure for {P_t}, i.e. it satisfies (1). There does not always exist an invariant measure, and there can be more than one; in practice though {(P_t)} and {\mu} are often defined together and for the purposes of this post I will assume there is a unique invariant {\mu} (for the example of a semi-group defined from a Markov chain, this corresponds to the chain being irreducible, and {\mu} is its unique stationary measure). The question then becomes, does {(P_t f)} converge towards its expectation {\mu(P_t f) = \mu(f)} (by invariance), and if so at what speed? This is precisely the question that log-Sobolev inequalities will let us answer.

One says that {(P_t)} is {p}-ergodic if {\int |P_t f-\mu(f)|^p d\mu \rightarrow_{t\rightarrow\infty} 0} for all {f\in\mathcal{F}}, i.e. convergence holds in the {p}-norm. This is essentially the requirement that {\|P_t-\mu\|_{\infty,p} \rightarrow_{t\rightarrow \infty} 0}, where {\|\cdot\|_{p,q}} denotes the operator norm from {L^p} to {L^q}. Since {\|f\|_p \leq \|f\|_{p'}} for all {f} and any {p\leq p'}, the notion of {p}-ergodicity becomes stronger as {p} increases. We will see three different ways to obtain bounds on the rate of convergence: the spectral gap inequality, equivalent to {2}-ergodicity; Sobolev inequalities and ultracontractivity, which imply {\infty}-ergodicity; and log-Sobolev inequalities and hypercontractivity, which are related to {p}-ergodicity for intermediate {p}.

Before getting started there is one last quantity associated with any Markov semi-group that we need to introduce, the Liouvillian {\mathcal{L}}. The Liouvillian is defined through its action on functions in {\mathcal{F}} as

\displaystyle \mathcal{L}f = \lim_{t\rightarrow 0^+}\, \frac{P_t-Id}{t}f.

Equivalently, the Liouvillian can be defined through the functional equation {\partial_t(P_t f) = \mathcal{L} (P_t f)}: it plays the role of infinitesimal generator for the semi-group {(P_t)}. For the example of the random walk the Liouvillian is simply {\mathcal{L} = A-I}. For the Ornstein-Uhlenbeck process one finds {\mathcal{L}(f)(x)=f''(x)-xf'(x)}, and for the Bonami-Beckner noise operator, viewed as a semi-group through the parametrization {\rho = e^{-t}}, one finds {\mathcal{L} = \frac{n}{2}(H-I)} where {H} is the normalized adjacency matrix of the hypercube. Finally in the example of (time-independent) Schrödinger evolution we have {\partial_t(\rho(t)) = i [\rho(t),H]}, so that the Liouvillian is {\mathcal{L}(\rho) = -i[H,\rho]}.
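These formulas are easy to test in small cases. Here is a toy check for the hypercube (my own code and conventions: {\rho=e^{-t}}, flip probability {(1-\rho)/2}, and {H} the normalized adjacency matrix of the {n}-cube):

import numpy as np
from functools import reduce
from scipy.linalg import expm

n = 3
I2 = np.eye(2)
F = np.array([[0.0, 1.0], [1.0, 0.0]])  # flips a single coordinate

def flip_at(i):
    # The operator that flips coordinate i of an n-bit string.
    mats = [I2] * n
    mats[i] = F
    return reduce(np.kron, mats)

H = sum(flip_at(i) for i in range(n)) / n  # normalized adjacency of the n-cube

def T(rho):
    # Bonami-Beckner noise operator, flip probability (1 - rho)/2 per bit.
    p_flip = (1 - rho) / 2
    T1 = np.array([[1 - p_flip, p_flip], [p_flip, 1 - p_flip]])
    return reduce(np.kron, [T1] * n)

t = 0.7
L = (n / 2) * (H - np.eye(2**n))
assert np.allclose(expm(t * L), T(np.exp(-t)))  # e^{tL} = T_{e^{-t}}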

Spectral gap inequality and {L^2}-ergodicity. A first family of inequalities is that of spectral gap inequalities, which turn out to be equivalent to {L^2}-ergodicity. The spectral gap of an invariant measure {\mu} for the semigroup generated by {\mathcal{L}} is defined as the largest {\lambda>0} (if it exists) such that

\displaystyle \mu\big(f (-\mathcal{L}) f\big)\,\geq\,\lambda\, \mu(f^2) \ \ \ \ \ (2)

holds for all {f} such that {\mu(f)=0}. In the example of the Markov chain, {\mathcal{L} = K-I} and the inequality can be rewritten as {E_{y\sim K(x,\cdot)}[ |f(x)-f(y)|^2] \geq \lambda\, E_{x,y}[|f(x)-f(y)|^2]}, where in the first expectation {x\sim \mu} and {y\sim K(x,\cdot)}, and in the second {x} and {y} are drawn independently from {\mu}. In particular for random walks on graphs we recover the standard definition of the Laplacian, {L=-\mathcal{L}=I-A}.

Applying (2) to {P_t f} we see that it is equivalent to requiring {\partial_t \mu((P_tf)^2) \leq -2\lambda \mu((P_tf)^2)}, which, by integrating and using {P_0=Id}, implies the bound {\mu((P_t f)^2) \leq e^{-2\lambda t} \mu(f^2)} for all {f} such that {\mu(f)=0}. Applying the appropriate translation the same inequality can be rewritten as

\displaystyle \mu((P_t f - \mu(f))^2) \leq e^{-2\lambda t} \mu((f-\mu(f))^2),

which is precisely an {L^2} convergence bound. Conversely, by differentiating the above inequality at {t=0} we recover (2), so that the two are equivalent.
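In the graph case all of this is easy to observe numerically (same toy 6-cycle as before; for the random walk, {\lambda} is just one minus the second-largest eigenvalue of {A}):

import numpy as np
from scipy.linalg import expm

n = 6
A = np.zeros((n, n))
for v in range(n):
    A[v, (v - 1) % n] = A[v, (v + 1) % n] = 0.5

lam = 1 - np.sort(np.linalg.eigvalsh(A))[::-1][1]  # spectral gap (= 1/2 here)

mu = np.ones(n) / n                 # uniform invariant measure
rng = np.random.default_rng(0)
f = rng.standard_normal(n)
f -= mu @ f                         # enforce mu(f) = 0

for t in [0.5, 1.0, 2.0]:
    Ptf = expm(t * (A - np.eye(n))) @ f
    assert mu @ Ptf**2 <= np.exp(-2 * lam * t) * (mu @ f**2) + 1e-12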

Rather than {L^2} convergence one would often like to derive bounds on convergence in {L^1}, or total variation, distance. One way to do this is to combine the {L^2} bound with the Cauchy-Schwarz inequality. For a reversible chain, taking {f} to be the density with respect to {\mu} of an initial point mass at {x}, so that {\mu(f)=1} and {\mu(f^2)=1/\mu(x)}, we get

\displaystyle \begin{array}{rcl} \big(2\|P_t(x,\cdot)-\mu\|_{TV}\big)^2 \,=\, \|P_t f - \mu(f)\|_1^2 &\leq& \mu\big((P_t f - \mu(f))^2\big)\\ &\leq& e^{-2\lambda t}\,\mu\big((f-\mu(f))^2\big) \,\leq\, \Big(\sup_x \frac{1}{\mu(x)}\Big)e^{-2\lambda t}. \end{array}

Thus we get an exponentially decaying bound, but the prefactor could be very large; for instance in the case of a random walk on a regular graph it equals the number of vertices. Our goal is to do better by using log-Sobolev inequalities. But before investigating those, let’s first see what a Sobolev (without the log!) inequality is.

Sobolev inequalities and {L^\infty}-ergodicity. Sobolev inequalities are the “high-{p}” generalizations of the spectral gap inequality. {\mu} satisfies a Sobolev inequality if there is a {p>2} and constants {a>0}, {b\geq 0} such that for all {f} with {\mu(f)=0},

\displaystyle \|f\|_p^2 \leq a \mu(f (-\mathcal{L}) f) + b \mu(f^2).

Sobolev inequalities imply the following form of ergodicity:

\displaystyle \|P_t f - \mu(f)\|_\infty \leq c\big(\frac{a}{2et}+b\big)^\gamma \|f-\mu(f)\|_1,

where {c,\gamma} are constants depending on {p} only (and {c\rightarrow \infty} as {p\rightarrow 2}). If one can take {b=0} (which is often the case) this directly gives a polynomial bound on {\|P_t - \mu\|_{1,\infty}} which is incomparable to the one we derived from the spectral gap inequality: on the one hand it does not have the large prefactor, but on the other it only gives polynomial decay in {t}. This form of contractivity, in the {1\rightarrow\infty} operator norm, is called “ultracontractivity”.

An important drawback of Sobolev inequalities is that they do not “tensor” well, leading to dimension-dependent bounds. What this means is that if {P_t} acts on functions of a single variable in a certain way, and we let {P_t^{\otimes n}} act on {n}-variable functions in the natural way, then the resulting semigroup will in general not satisfy a Sobolev inequality with constants {a,b} independent of {n}, even if {P_t} itself did. This contrasts with the spectral gap inequality: it is not hard to see that the spectral gap of {P_t^{\otimes n}} is the same as that of {P_t}. This makes Sobolev inequalities particularly hard to derive in the infinite-dimensional settings that originally motivated the introduction of log-Sobolev inequalities (I hope to return to this in the next post).
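The claim about the spectral gap is easy to verify in small cases: the generator of {P_t^{\otimes 2}} is {\mathcal{L}\otimes I + I\otimes \mathcal{L}}, whose eigenvalues are sums of pairs of eigenvalues of {\mathcal{L}}, so the smallest nonzero one is unchanged. A quick check (same toy walk as before):

import numpy as np

n = 6
A = np.zeros((n, n))
for v in range(n):
    A[v, (v - 1) % n] = A[v, (v + 1) % n] = 0.5

L1 = A - np.eye(n)                                    # generator of one walk
L2 = np.kron(L1, np.eye(n)) + np.kron(np.eye(n), L1)  # generator of the product

def gap(L):
    # Smallest nonzero eigenvalue of -L (all eigenvalues of -L are >= 0).
    e = np.sort(np.linalg.eigvalsh(-L))
    return e[e > 1e-9][0]

assert np.isclose(gap(L1), gap(L2))  # the gap does not degrade under tensoring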

Log-Sobolev inequalities and {L^p}-ergodicity. We finally arrive at log-Sobolev inequalities. We’ll say that {\mu} satisfies a log-Sobolev inequality if there exist constants {c>0}, {d\geq 0} such that for all non-negative {f},

\displaystyle \mu\Big(f^2 \log \frac{f^2}{\|f\|_2^2}\Big) \leq c\mu(f (-\mathcal{L}) f) + d \mu(f^2). \ \ \ \ \ (3)

Often it is required that {d=0}, and the log-Sobolev constant is {\alpha=1/c} where {c} is the smallest positive real for which (3) holds for all {f}. A log-Sobolev inequality can be understood as a “weakest” Sobolev inequality, obtained as {p\rightarrow 2}; indeed, one can show that if {\mu} satisfies a Sobolev inequality then it also satisfies a log-Sobolev inequality. Gross related the log-Sobolev inequality to hypercontractivity by showing that it is equivalent to the statement that for all {t>0} and all {2\leq q \leq 1 + e^{4\alpha t} } the bound

\displaystyle \|P_t-\mu\|_{2,q} \leq 1

holds: {(P_t)} is hypercontractive for the {2\rightarrow q} operator norm if and only if the log-Sobolev constant satisfies {\alpha \geq \ln(q-1)/(4t)}. One can also show that {\alpha \leq \lambda/2} always holds. But a direct use of the log-Sobolev inequality can give us a better convergence bound than the one we derived from the spectral gap: a more careful argument than the Cauchy-Schwarz step we used earlier easily leads to the bound

\displaystyle \big(2\|P_t(x,\cdot)-\mu\|_{TV}\big)^2\,\leq\, 2\log\Big(\sup_x \frac{1}{\mu(x)}\Big)\, e^{-2\alpha t},

a marked improvement over the prefactor in the bound derived from {\lambda} alone (provided {\alpha} and {\lambda} are of the same order, which turns out to often be the case).

Aside from this improvement a key point making log-Sobolev inequalities so useful is that, contrary to Sobolev inequalities, they tensor perfectly: if {\mu_i} satisfies a log-Sobolev inequality with coefficients {(c_i,d_i)} for {i=1,2} then {\mu_1 \otimes \mu_2} satisfies one with coefficients {c=\max(c_1,c_2)} and {d=d_1+d_2}. Thus, for example, the standard approach to proving hypercontractivity for the Bonami-Beckner noise operator on the {n}-dimensional hypercube goes as follows. First, prove the “two-point inequality”, i.e. the single-variable case: {\|T_\rho\|_{p,q} \leq 1} for any {\rho \leq \sqrt{(p-1)/(q-1)}}. This by itself is a nontrivial statement, but it boils down to a direct calculation. Then use Gross’ equivalence to deduce that {T_\rho} satisfies a log-Sobolev inequality with the appropriate constant {\alpha}. By tensorization the {n}-dimensional noise operator satisfies a log-Sobolev inequality with the same value of {\alpha}, and we conclude hypercontractivity of {T_\rho} for general {n}: {\|T_\rho\|_{2,q}\leq 1} for all {2\leq q \leq 1+\rho^{-2}}.
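The two-point inequality itself is pleasant to check numerically before trusting the calculation. A brute-force sketch (conventions as above; since {T_\rho} preserves positivity and the norms are homogeneous, it is enough to look at functions of the form {f=(1,s)}):

import numpy as np

def lp_norm(f, p):
    # L^p norm with respect to the uniform measure on a two-point space.
    return np.mean(np.abs(f) ** p) ** (1 / p)

def T(rho, f):
    # One-bit Bonami-Beckner operator, flip probability (1 - rho)/2.
    p_flip = (1 - rho) / 2
    return np.array([(1 - p_flip) * f[0] + p_flip * f[1],
                     p_flip * f[0] + (1 - p_flip) * f[1]])

p, q = 2, 4
rho = np.sqrt((p - 1) / (q - 1))  # critical noise rate, here 1/sqrt(3)
ratios = [lp_norm(T(rho, np.array([1.0, s])), q) / lp_norm(np.array([1.0, s]), p)
          for s in np.linspace(-20, 20, 20001)]
print(max(ratios))  # <= 1 (up to float error), with equality at s = 1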

I hope this got you interested in the topic. I wish I had thought of browsing through the surveys mentioned at the start of the post before the BIRS workshop… Have a look, they’re quite interesting. Next time I’ll try to get to the second half of the workshop title, “in quantum information theory”…


Where on earth?

Tough life.


(Bonus points for getting the precise location where the picture was taken from.)

More scientific content to come…


ITCS 2015

It might’ve been there for a while (it definitely should have been :-)), but I only just noticed: the list of papers accepted to ITCS 2015 is out! There were 45 accepts out of 159 submissions, a record rate for ITCS and a level of selectivity, 28%, comparable to that of STOC or FOCS (which typically accept slightly under twice as many papers, and have less than twice as many submissions). I had the pleasure and honor of serving on this year’s program committee (how else do you think I’d get a paper in?), and I was truly impressed by the overall quality of submissions. We could easily have accepted twice as many papers and still claimed a spot for ITCS as a top-tier conference in the theory of computation. If so, why didn’t we? Giving an answer requires thinking through what purpose we want ITCS, and more generally our constellation of theory (or theory-friendly) conferences, to serve. The question has been debated increasingly actively, though it has already been on the table for a while (and even featured on this blog…); I think it is an important one, and I’d like to share some thoughts.

STOC and FOCS have been giving rhythm to any decent theorist’s academic year since time immemorial (well, almost — as Dick Lipton reminded us in his Knuth Prize lecture at FOCS, some of us do remember the time when the first editions of theory conferences were making their appearance). Their respective Spring and Fall deadlines are so entrenched that I was recently asked by a graduate student how one could possibly come up with a new 6-month project every 6 months, and make sure to wrap it up by the next deadline. Of course in practice most people work on many projects in parallel, and decide which have a decent chance of being brought to a (reasonable?) state of completion by deadline time a few weeks (hours?) before the deadline. Nevertheless, these “flagship” conferences are there to constantly remind us of our bi-yearly sacrificial duty. The longer one has been going through the process, the harder it is to imagine any changes to the model: Wouldn’t any reduction in the rhythm affect our productivity and the speed at which ideas spread in the community? Conversely, wouldn’t any increase in rhythm pose a serious risk of exhaustion? Isn’t the current system just right?

Of course not — indeed, I don’t know anyone who’d claim the current model is good. But I don’t know any two researchers who’d agree on how to change it either. Any substantial proposal seems to immediately run into such a barrage of resistance — “what about…?”, “what if…?” — that it is clear any change will have to be extremely gradual. The difficulty is exacerbated by the additional level of inertia imposed by the professional societies, ACM and IEEE, that administer STOC and FOCS respectively. In this respect the recent proposal by Boaz Barak and Omer Reingold had a lot going for it: Boaz and Omer put forward a very concrete proposal that carefully avoided any changes to the backbone structure of STOC and FOCS, preserving submission deadlines, independent committee evaluation, rough numbers of accepted papers, etc., and instead focused on the format of the actual event, in substance proposing a single yearly week-long “theoryfest” to replace the current bi-yearly 3-day gatherings. Unfortunately, from the echoes I’ve had of Saturday’s meeting at FOCS it looks like even this moderate change is proving too much for the community to take, and the background grumble is likely to overtake the energetic spike proposed by Boaz and Omer.

ITCS, in contrast, should be largely free of such accumulated inertia. The conference series is still in its blooming youth: the coming edition, to be held in January at the Weizmann Institute in Israel, will be the 6th instantiation of the series. It is not attached to any professional society (though it does seem to be loosely affiliated with ACM’s SIGACT), and is instead run by a steering committee made up of prominent members of the theory community. Since its inception ITCS has been in permanent soul-searching mode: is there space for yet another theory conference? Wouldn’t the debate over a possible merger of STOC and FOCS indicate quite the opposite, that we theorists already have too many venues in which to express ourselves?

I would argue that there is, and that searching for this space can play an important role in defining the future of theory. Irrespective of changes in the format of STOC or FOCS, ITCS can continue to occupy a special niche: the “conceptual” niche (I don’t like the word — it is too overloaded — but I can’t think of a better one). ITCS is not there to absorb the excess production that didn’t make it into its big sisters. It is different, not only in its selection process but also, and just as importantly, in the format of the event itself. It should affirm this difference by continuing to build the reputation it has started to earn as a small, friendly, innovative conference in which interaction takes a prominent part, the program is light and single-track, and the presentations lead to lively discussions among the participants. Not everyone needs to go to ITCS every year — but, given the chance, most would go, and have a pleasant experience. You might not get to learn about all the recent hot results: theoryfest is made for that. But you’ll feel the pulse of the community and encounter emerging research directions you might not have come across otherwise (for instance, because you went to the other parallel session, the one with talks on your favorite area…).

There is a broader picture to keep in mind with respect to the place of theory within the CS landscape at large. By its nature our field is bound to have its relevance regularly questioned: are we looking at the “right” problems? And even if we are, are the kinds of answers we bring relevant? Or has theory lost all contact with applications? ITCS could play an important role in addressing these questions. As a “conceptual” conference it is there precisely to bring to light promising avenues and novel directions in which our theoretical ideas and concepts may find stimulation and application. The conference’s vitality will be a sign that theory is able to continually re-invent itself. Many of us strongly believe that the language we speak is universal; the question then is only whether we’re able to use that language effectively and bring it to bear on the problems that our peers across the sciences care about.

If I am to judge ITCS’s vitality by the quality of this year’s submissions, I have no reason to be worried about the future of theory. As already mentioned, the crop was excellent, and decisions were difficult. I was also impressed by how seriously the “mission” of ITCS was taken into account during the reviewing process. The PC chair, Tim Roughgarden, put a strong emphasis on selecting papers that fit the single sentence in the ITCS call for papers that seeks to distinguish it from other conferences:

ITCS (previously known as ICS) seeks to promote research that carries a strong conceptual message (e.g., introducing a new concept or model, opening a new line of inquiry within traditional or cross-interdisciplinary areas, or introducing new techniques or new applications of known techniques).

This is not to say that strong submissions would be rejected if their “conceptual message” was not apparent. That would be a mistake, for who are we to judge the presence, and even more so the future importance, of a new concept? Only time will tell. But given the large number of strong submissions, difficult decisions had to be made, and in many cases the discussion did seriously take into account what we perceived as the conceptual novelty, or appeal, of a submission. The debate was often extensive; I can assure you that most submissions received very careful attention. I wish the authors, or even the community at large, could get precise feedback on the discussions, which had the benefit of raising awareness, at least among PC members, of the above quote from the CFP.

We also had some discussions on the format of the conference; because these discussions happened after the formal publication of the CFP I’m not sure we can expect any big changes to the format this year, but I hope that they may lead to improvements in later years. Some of the suggestions that were floated around included having invited talks, varying the length of contributed talks, and having a poster session, possibly preceded by short 3-4 minute “advertisements” of the posters by their authors. Personally I think this is an excellent idea, and I hope it gets implemented in a later edition.

To get back to my opening comment — why did we accept so few papers, if the submissions were so good? Well, if you’re angry your paper got rejected, I can only promise that the accepted ones were at least as good — come and check them out! Indeed, in my mind (I am not pretending to speak for the whole PC here!) the main reason is this: ITCS will be single track, it will be relaxed and friendly, and all presentations will be of high quality. We had to make decisions; some of them were right, some were wrong, and many could have gone either way. But we had to make them, and we made them for you. So we hope to see you in January 2015 in Israel, and that you’ll enjoy hearing the talks as much as we enjoyed reading and debating the papers!


Quantum PCP conjectures

Last Spring I took part in the Simons Institute’s semester on Quantum Hamiltonian Complexity. The semester was a great success, with an excellent batch of long-term participants and many fruitful interactions. The Institute asked me to write a short “Research Vignette” presenting, to a broad audience, an example scientific outcome of the programme. You can probably guess what topic I chose for my vignette… I am reproducing it below (the specific project I described led to this preprint). See the Simons Institute’s website for the full newsletter, and in particular the other vignette, by Adi Livnat, a participant in the concurrent programme on Evolutionary Biology and the Theory of Computing, suggestively entitled “On Sex, Math, and the Origin of Life”.


The macroscopic properties of materials arise from the physical interactions between their constituent particles. For example, the Pauli exclusion principle explains how the conductivity properties of a material follow from the energy band structure of its atoms. Techniques for performing the extrapolation from microscopic laws to macroscopic predictions form the basis of statistical mechanics. The implied statistical averaging, however, is insufficient to predict certain properties, such as superconductivity, for which subtle quantum mechanical effects need to be taken into account. Without the reduction in complexity granted by the application of the law of large numbers, the study of novel phases of matter quickly runs into a fundamental computational bottleneck. The complete description of an n-particle quantum system requires exponentially many parameters: How can one hope to make meaningful predictions if one cannot even write down a full description of the state of the system? Is materials science about to hit a computational barrier?

The Simons semester on Quantum Hamiltonian Complexity brought together theoretical physicists and computer scientists motivated by this problem. The goal of the semester was to formulate, and begin to address, the key complexity-theoretic issues that arise from the study of many-body systems in condensed-matter physics. A central conjecture, the so-called quantum PCP conjecture, crystallizes many of these issues, and the conjecture was hotly debated throughout the semester. During an open problems session held as part of one of the week-long workshops, I was prompted to formulate a new variant of the conjecture whose study led to a fruitful collaboration with Joseph Fitzsimons, another program participant. Before describing this new variant, I will introduce the central computational problem in quantum Hamiltonian complexity and formulate the quantum PCP conjecture.

The local Hamiltonian problem

The computational hardness of making predictions about the global properties of quantum many-body systems is embodied in a single computational problem, the local Hamiltonian problem. This problem, introduced by Kitaev as a direct quantum analogue of classical constraint satisfaction problems, asks whether a given Hamiltonian (i.e., a collection of physical constraints—magnetic field, two-body interactions, etc.) acting on n particles has a lowest-energy state whose energy is below or above a certain threshold. Formally, each constraint is specified by a positive semidefinite matrix H_j acting on a constant-size subset of the n particles. The energy of a quantum state |ψ〉 is its overlap with the total Hamiltonian:

\displaystyle E(|\psi\rangle) \,=\, \langle\psi|\Big(\sum_j H_j\Big)|\psi\rangle.
Kitaev showed that the local Hamiltonian problem is complete for the complexity class QMA, the quantum analogue of NP: QMA is the class of decision problems that can be solved by a polynomial-time quantum computer given access to a quantum proof. Kitaev’s seminal result shows that a fundamental problem in physics, estimating the minimal energy of a local Hamiltonian, is computationally intractable. Assuming, as is widely believed, that QMA is a strictly larger class than NP, it also implies that the lowest-energy state does not have an efficient classical description from which its energy can be estimated. In particular, the state must be highly entangled; in other words it must possess complex quantum correlations spread across all its particles. Kitaev’s purely theoretical result thus has direct consequences for the properties of physical states: even the simplest such states (the lowest-energy state is the equilibrium state of the system as the temperature is driven to zero) display the most complex quantum phenomenon, many-body entanglement.
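To get a feel for the problem, and for the exponential bottleneck, here is a toy instance (entirely my own, and much simpler than anything appearing in Kitaev’s proof): a small transverse-field Ising chain, whose minimal energy we extract by brute-force diagonalization of a 2^n x 2^n matrix. Each term below can be shifted by a constant multiple of the identity to make it positive semidefinite, as in the definition above.

import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def op_at(op, i, n):
    # Embed a single-qubit operator at site i of an n-qubit chain.
    mats = [I2] * n
    mats[i] = op
    return reduce(np.kron, mats)

n = 8  # the Hamiltonian is a 2^n x 2^n matrix: already 256 x 256 here
H = sum(-op_at(Z, i, n) @ op_at(Z, i + 1, n) for i in range(n - 1)) \
    + sum(-op_at(X, i, n) for i in range(n))

print(np.linalg.eigvalsh(H)[0])  # minimal energy; hopeless beyond n ~ 20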

The quantum PCP conjecture

Kitaev’s hardness result applies to inverse-polynomial approximations to the minimal energy. What about weaker approximations? In cases where the energy scales extensively with the system size it is natural to only require an approximation that is proportional to the number of constraints acting on the system. Do such approximations remain hard to obtain, or could they be amenable to efficient, possibly quantum, algorithms?

Asking this question amounts to seeking a quantum analogue of the PCP theorem from classical complexity theory. The PCP theorem, a major breakthrough of the early 90s, asserts that constraint satisfaction problems such as 3SAT remain hard even under very strong promises on the satisfiability of the instance: Given a 3SAT formula, it is NP-hard to estimate the maximum number of clauses that can be simultaneously satisfied up to an accuracy that scales proportionally with the total number of constraints. Since the local Hamiltonian problem contains classical constraint satisfaction problems as a special case, the PCP theorem implies that constant-factor approximations to the minimal energy are NP-hard. The quantum PCP conjecture asks whether the problem is even harder: Could it be QMA-hard?

The status of the conjecture remains completely open. A positive answer would have important implications for the presence of entanglement in so-called Gibbs states, which are equilibrium states of the Hamiltonian at low but positive temperature. Indeed, interest in the conjecture partly stems from its apparent contradiction with the common physical intuition that, as temperature is increased above zero, disorder in the system tends to destroy the coherence required for maintaining entanglement. If this intuition were correct, then low-energy states would have classical descriptions that could be used to contradict the conjecture.

A new variant

Much of the strength of the PCP theorem comes from its versatility, and many equivalent formulations are known. One such formulation, closely tied to the theorem’s origins, uses the language of interactive proof systems. Here one thinks of the verification procedure (for, say, the satisfiability of a 3SAT formula given as input) as an interactive protocol between a polynomial-time verifier and all-powerful but untrusted provers. The PCP theorem implies that any language in NP can be verified through a single round of interaction with two non-communicating provers, to which the verifier sends logarithmic-length messages and receives a constant number of bits as answers. The possibility for such highly efficient proof checking (compared to reading off the linear number of bits describing a satisfying assignment) remains a wonder of modern complexity theory.

Interestingly, this formulation of the PCP theorem gives rise to a quantum analogue whose relation to the quantum PCP conjecture is a priori unclear. The new variant asks the following: Do all languages in QMA admit an efficient verification procedure of the above form, in which the verifier and communication may be quantum? Here it is important to also allow the provers to be initialized in an arbitrary entangled state, as this may be necessary to encode the QMA witness whose existence the provers are supposed to convey to the verifier.

Prompted by program participants Dorit Aharonov and Umesh Vazirani, I formulated this new variant, and the question of its relation to the traditional quantum PCP conjecture, during an open problems session that we had organized as part of the first workshop of the semester. The question immediately generated many ideas and suggestions from the audience. The problem is a tantalizing one. How can the existence of a global, n-qubit state be verified solely through the local snapshots obtained by the verifier in the interactive proof system? In the classical setting, consistency between the provers’ answers and a global proof is ensured by cross-checking the provers’ answers against each other through a complexity-theoretic implementation of the classic prisoner’s dilemma interrogation technique. In the case of a quantum proof, however, the presence of entanglement renders such cross-checking all but impossible, as the no-cloning principle prevents even honest provers from sharing “consistent” copies of the same quantum state; with a unique copy, the question of who owns which part of the proof becomes essential.

Joseph Fitzsimons, a participant from the Centre for Quantum Technologies in Singapore, suggested a scheme based on encoding the quantum proof using secret-sharing, and distributing the shares equally among the provers. I was excited by his idea, and we attempted to flesh out the details during the semester, continuing by Skype after Joe had left for Singapore. We were recently able to complete the soundness analysis of Joe’s suggested protocol, and our findings establish the second variant of the quantum PCP conjecture as a promising target for future work. Extending the naïve scheme that we have so far been able to analyze in order to obtain a protocol with better soundness guarantees is an exciting open problem. I expect its resolution to require the development of quantum codes with new properties, akin to the locally testable codes used in proofs of the PCP theorem, but possibly with an even more local twist. This would be an exciting outcome, independently of the eventual validity of the conjecture.

These explorations were made possible by the diverse mix of participants in the semester and their willingness to engage with one another. As I was stating my question in the open problems session, encouraged to do so by other participants, I more or less firmly believed that it could only lead to a dead end—but this did not take into account the generous creativity of my fellow participants!


QIP 2015 papers

The list of papers accepted to QIP 2015 has just been posted. In an attempt to make sense of the program I scouted for the papers on the arXiv and list what I found below, loosely organized according to an arbitrary categorization. (Please don’t take this too seriously…clearly many papers would have fit in more than one section, in which case I made an arbitrary decision; I also had to stretch some categories a bit in order to accommodate a fairly diverse program.)

As you can see, this year’s edition, at a standard 39 accepts by my count (including two merges), promises a healthy showing of algorithms, and even more so of quantum channel theory. We’ll also be treated to a good dose of (computational and topological aspects of) many-body physics, together with some more exotic (and as yet undetermined… a few submissions are still missing arXiv links; shame on you!) phases.

We’re promised that the technical abstracts will be made available online. For some reason the PC chair gave us a month to provide updated versions, so we’ll have to wait a bit before we learn what some of the hidden gems of the program are about…

Quantum algorithms and query complexity:

5. Ashley Montanaro. Quantum pattern matching fast on average

12. Francois Le Gall. Improved Quantum Algorithm for Triangle Finding via Combinatorial Arguments

31. Ryan O’Donnell and John Wright. Quantum Spectrum Testing

71. Han-Hsuan Lin and Cedric Yen-Yu Lin. Upper bounds on quantum query complexity inspired by the Elitzur-Vaidman bomb tester

147. Kirsten Eisentraeger, Sean Hallgren, Alexei Kitaev and Fang Song. A quantum algorithm for computing the unit group of an arbitrary degree number field

Error correction:

9. Fernando Pastawski and Beni Yoshida. Fault-tolerant logical gates in quantum error-correcting codes

89. Hector Bombin. Gauge colour codes and 90. Hector Bombin. Single-shot fault-tolerant quantum error correction

151. Courtney Brell. Self-correcting stabilizer quantum memories in 3 dimensions or (slightly) less


99. Henrik Wilming, Rodrigo Gallego and Jens Eisert. Universal operations in resource theories and local thermodynamics

Pseudorandomness and Cryptography:

103. Mario Berta, Omar Fawzi and Volkher Scholz. Quantum-proof randomness extractors via operator space theory

154. Richard Cleve, Debbie Leung, Li Liu and Chunhao Wang. Near-linear construction of exact unitary 2-designs

174. Carl Miller and Yaoyun Shi. Universal security for quantum contextual devices

Communication and channel theory:

28. Stefan Baeuml, Matthias Christandl, Karol Horodecki and Andreas Winter. Limitations on Quantum Key Repeaters

43. Toby Cubitt, David Elkouss, William Matthews, Maris Ozols, David Perez-Garcia and Sergii Strelchuk. Unbounded number of channel uses are required to see quantum capacity

45. Dave Touchette. Direct Sum Theorem for Bounded Round Quantum Communication Complexity and a New, Fully Quantum Notion of Information Complexity

47. Marco Piani and John Watrous. Einstein-Podolsky-Rosen steering provides the advantage in entanglement-assisted subchannel discrimination with one-way measurements

106. William Matthews and Debbie Leung. On the power of PPT-preserving and non-signalling codes

115. Runyao Duan and Andreas Winter. Zero-Error Classical Channel Capacity and Simulation Cost Assisted by Quantum No-Signalling Correlations

Information theory:

54. Isaac Kim. On the informational completeness of local observables

49. Andrea Mari, Vittorio Giovannetti and Alexander S. Holevo. Quantum state majorization at the output of bosonic Gaussian channels

193. Bartek Czech, Patrick Hayden, Nima Lashkari and Brian Swingle. The information theoretic interpretation of the length of a curve

Complexity theory:

63. Joseph Fitzsimons and Thomas Vidick. A multiprover interactive proof system for the local Hamiltonian problem

189. Sergey Bravyi and Matthew Hastings. On complexity of the quantum Ising model

Architectures and implementations:

69. Neil J. Ross and Peter Selinger. Optimal ancilla-free Clifford+T approximation of z-rotations

132. Adam Bouland and Scott Aaronson. Generation of Universal Linear Optics by Any Beamsplitter

191. Joel Wallman and Steve Flammia. Randomized Benchmarking with Confidence


78. Dominic Berry, Andrew Childs and Robin Kothari. Hamiltonian simulation with nearly optimal dependence on all parameters


74. Nicolas Delfosse, Jacob Bian, Philippe Guerin and Robert Raussendorf. Wigner function negativity and contextuality in quantum computation on rebits

77. Salman Beigi and Amin Gohari. Wiring of No-Signaling Boxes Expands the Hypercontractivity Ribbon

111. Rafael Chaves, Christian Majenz, Lukas Luft, Thiago O. Maciel, Dominik Janzing, Bernhard Schölkopf and David Gross. Information-Theoretic Implications of Classical and Quantum Causal Structures

Local Hamiltonians and many-body states:

76. Michael Kastoryano and Fernando Brandao. Quantum Gibbs Samplers: the commuting case

86. Xiaotong Ni, Oliver Buerschaper and Maarten Van Den Nest. A non-commuting Stabilizer Formalism

125. Fernando Brandao and Marcus Cramer. A Berry-Esseen Theorem for Quantum Lattice Systems and the Equivalence of Statistical Mechanical Ensembles

137. Mehmet Burak Şahinoğlu, Dominic Williamson, Nick Bultinck, Michael Marien, Jutho Haegeman, Norbert Schuch and Frank Verstraete. Characterizing Topological Order with Matrix Product Operators and 176. Oliver Buerschaper. Matrix Product Operators: Local Equivalence and Topological Order

139. Ramis Movassagh and Peter Shor. Power law violation of the area law in quantum spin chains

204. Dorit Aharanov, Aram Harrow, Zeph Landau, Daniel Nagaj, Mario Szegedy and Umesh Vazirani. Local tests of global entanglement and a counterexample to the generalized area law


15. Masahito Hayashi. Estimation of group action with energy constraint


Fall teaching

Caltech’s academic year started this week, with Monday being the first day of instruction for the Fall quarter (Caltech’s quarter-based division, as opposed to the more usual division into semesters, explains why we’re starting a month later than many others). So even though I moved to Pasadena in early June, it’s only now that I’m getting to experience the buzz of a surprisingly lively campus (given the small – fewer than 1k – undergraduate population), as opposed to the relative quietness of the summer. Having moved in the early summer helped a lot with settling in, and I now have a functional office (got the last piece – the chair – today), a studio (one of the perks of being at Caltech!), and a stream of students coming in to ask for signatures :-)

Now, first week of the quarter means, well, let’s get back to business! A lot has happened since the last post I wrote on this blog (including many, many aborted drafts: when I saw what the last thing I wrote was, I couldn’t believe I hadn’t brought myself to post anything in the meantime), but I’ll have to settle for the status quo and look forward, rather than backward, for regular postings. Caltech’s IQIM already maintains an excellent blog, to which I’ll attempt to contribute. But this blog will continue to live on for less scientific, or less directly quantum-related, posts, so stay tuned! Here’s some brief news on the start of the quarter:

If only I could get inspiration from my predecessors!

So, first week of instruction means, well, first week of instruction. Someone has to teach something to the hungry crowds, and this quarter I’ll bring my own modest contribution by teaching CS286: Topics in Computer Science, which I bravely (foolishly?) decided to entitle “Around the Quantum PCP Conjecture”. The goal of the class is to use the conjecture as a backdrop, a motivating theme; not one to be approached directly but one to suggest interesting questions that can be studied for their own sake.

My plan is to start by reviewing the classical PCP theorem, in enough depth to give a reasonable idea of the kind of ingredients that go into both known proofs of the theorem, though clearly without having the time to cover any of them completely. This week we (almost) covered NP in PCP(poly,1). Next week we’ll see a few of the ideas that go into Dinur’s proof. The next step will be to give an introduction to the area of quantum Hamiltonian complexity, including Kitaev’s QMA-completeness result. We’ll then be in a position to state the conjecture and discuss some of its implications.

Once the background is properly set, I’d like to use the six remaining weeks of the quarter to dive deep into two or three current topics of research, associated with the conjecture but studied independently. The three I have in mind are quantum multiprover interactive proofs, area laws, and de Finetti theorems in quantum information theory. Each of these topics could fill a full semester course by itself. Instead, I’d like to pick just one particular aspect, one result, and describe the simplest possible non-trivial instantiation of that component or result. The hope is that such a depth-first, rather than breadth-first, exploration will still give the students some interesting insight into the area and motivate them to continue the exploration on their own.

We’ll have to see how this plays out. The first two lectures had a very mixed audience: from the (courageous!) first-year undergraduate to the full professor, by way of undergraduates, graduate students and postdocs in CS and physics; hardly any two of them have matching backgrounds. This was a reason for choosing the “quantum PCP” topic in the first place: my hope is that both CS and physics students can find some angle of interest on that theme, drawing both crowds around the same preoccupation. It’s going to be challenging to keep everyone together, but it could also be a very fun and rewarding experience… I’m looking forward to it!
