[7/9/17 Update: Following a suggestion by Oded Regev I upgraded Section 1 from “probabilistic functions” to “matrix-valued functions”. This hopefully makes it a more useful, and interesting, mid-point between the classical analysis of BLR and the non-abelian extension discussed afterwards. I also fixed a bunch of typos — I apologize for the many remaining ones. The pdf has also been fixed.]
Last week Anand Natarajan from MIT presented our joint work on “A Quantum Linearity Test for Robustly Verifying Entanglement” at the STOC’17 conference in Montreal. Since we first posted our paper on the quant-ph arXiv, Anand and I discovered that the test and its analysis could be reformulated in a more general framework of tests for group relations, and rounding of approximate group representations to exact group representations. This reformulation was stimulated by a beautiful paper by Gowers and Hatami on “Inverse and stability theorems for approximate representations of finite groups”, which was first pointed out to me by William Slofstra. The purpose of this post is to present the Gowers-Hatami result as a natural extension of the Blum-Luby-Rubinfeld linearity test to the non-abelian setting, with application to entanglement testing. (Of course Gowers and Hatami are well aware of this — though maybe not of the application to entanglement tests!) My hope in doing so is to make our result more accessible, and hopefully draw some of my readers from theoretical computer science into a beautiful area.
I will strive to make the post self-contained and accessible, with no quantum information background required — indeed, most of the content is purely — dare I say elegantly — mathematical. In the interest of being precise (and working out better parameters for our result than appear in our paper) I include essentially full proofs, though I may allow myself to skip a line or two in some of the calculations.
Given the post remains rather equation-heavy, here is a pdf with the same contents; it may be more convenient to read.
I am grateful to Anand, and Oded Regev and John Wright, for helpful comments on a preliminary version of this post.
1. Linearity testing
The Blum-Luby-Rubinfeld linearity test provides a means to certify that a function is close to a linear function. The test can be formulated as a two-player game:
BLR linearity test:
- (a) The referee selects $a, b \in \{0,1\}^n$ uniformly at random. He sends the pair $(a,b)$ to one player, and either $a$, $b$, or $a+b$ (chosen uniformly at random) to the other.
- (b) The first player replies with two bits, and the second player with a single bit. The referee accepts if and only if the players’ answers satisfy the natural consistency constraint.
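The test is simple enough to simulate. The sketch below (pure Python, my own illustration rather than anything from the paper) runs the BLR test exhaustively against the honest deterministic strategy derived from a function $f$: the first player answers $(f(a), f(b))$ on query $(a,b)$, the second answers $f$ on its single query. A linear function passes with certainty, while the AND function, which is far from linear, fails a fixed constant fraction of the time.

```python
import itertools

n = 3

def blr_pass_probability(f):
    """Success probability in the BLR test of the honest deterministic
    strategy built from f: the first player answers (f(a), f(b)) on
    query (a, b); the second answers f on its single query a, b, or a+b."""
    points = list(itertools.product([0, 1], repeat=n))
    passed = total = 0
    for a in points:
        for b in points:
            ab = tuple(x ^ y for x, y in zip(a, b))
            # the three possible queries to the second player, each w.p. 1/3
            for query, expected in ((a, f(a)), (b, f(b)), (ab, f(a) ^ f(b))):
                total += 1
                passed += (f(query) == expected)
    return passed / total

c = (1, 0, 1)
linear = lambda x: sum(ci * xi for ci, xi in zip(c, x)) % 2
print(blr_pass_probability(linear))   # 1.0: linear functions always pass

and_f = lambda x: x[0] & x[1]         # far from linear
print(blr_pass_probability(and_f))    # 0.875
```

The value $0.875$ is exact here: checks against the queries $a$ and $b$ always pass for an honest strategy, and the parity check $f(a+b) = f(a) \oplus f(b)$ passes for exactly $10/16$ of the pairs when $f$ is AND.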
This test, as all others considered here, treats both players symmetrically. This allows us to restrict our attention to the case of players who both apply the same strategy, an assumption I will systematically make from now on.
Blum et al.’s result states that any strategy for the players in the linearity test must provide answers chosen according to a function that is close to linear. In this section I will provide a slight “matrix-valued” extension of the BLR result, that follows almost directly from the usual Fourier-analytic proof but will help clarify the extension to the non-abelian case.
1.1. Matrix-valued strategies
The “classical” analysis of the BLR test starts by modeling an arbitrary strategy for the players as a pair of functions $f: \{0,1\}^n \to \{0,1\}$ (for the second player, who receives a single string as query) and $g: \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^2$ (for the first player, who receives a pair of strings as query). In doing so we are making an assumption: that the players are deterministic. More generally, we should allow “probabilistic strategies”, which can be modeled via “probabilistic functions” $f: \{0,1\}^n \times \Omega \to \{0,1\}$ and $g: \{0,1\}^n \times \{0,1\}^n \times \Omega \to \{0,1\}^2$ respectively, where $\Omega$ is an arbitrary probability space which plays the role of shared randomness between the players. Note that the usual claim that “probabilistic strategies are irrelevant because they can succeed no better than deterministic strategies” is somewhat moot here: the point is not to investigate success probabilities — it is easy to pass the BLR test with probability $1$ — but rather to derive structural consequences from the assumption that a certain strategy passes the test. In this respect, enlarging the kinds of strategies we consider valid can shed new light on the strengths, and weaknesses, of the test.
Thus, and with an eye towards the “quantum” analysis to come, let us consider an even broader set of strategies, which I’ll refer to as “matrix-valued” strategies. A natural matrix-valued analogue of a function $f: \{0,1\}^n \to \{0,1\}$ is a function $F: \{0,1\}^n \to \mathcal{O}_d$, where $\mathcal{O}_d$ is the set of $d \times d$ Hermitian matrices that square to identity (equivalently, have all eigenvalues in $\{\pm 1\}$); these matrices are called “observables” in quantum mechanics. Similarly, we may generalize a function $g$ to a function $G: \{0,1\}^n \times \{0,1\}^n \to \mathcal{O}_d \times \mathcal{O}_d$. Here we’ll impose an additional requirement: any pair $(G_1, G_2)$ in the range of $G$ should be such that $G_1$ and $G_2$ commute. The latter condition is important so that we can make sense of the function $G$ as a strategy for the provers: we should be able to ascribe a probability distribution on outcomes to any query sent to the players. This is achieved by defining

$$\Pr\big((a_1,a_2),\,b\big) \,=\, \frac{1}{d}\,\mathrm{Tr}\big(G_1^{a_1}(x,y)\,G_2^{a_2}(x,y)\,F^{b}(z)\big), \qquad (1)$$
where for any observable $O$ we denote by $O^0$ and $O^1$ the projections on the $+1$ and $-1$ eigenspaces of $O$, respectively (so that $O = O^0 - O^1$ and $O^0 + O^1 = \mathrm{Id}$). The condition that $G_1$ and $G_2$ commute ensures that this expression is always non-negative; moreover it is easy to check that for all queries it specifies a well-defined probability distribution on $\{0,1\}^2 \times \{0,1\}$. Observe also that in case $d=1$ we recover the classical deterministic case, for which with our notation $F(x) = (-1)^{f(x)}$. If all $F(x)$ and $G(x,y)$ are simultaneously diagonal matrices we recover the probabilistic case, with the role of $\Omega$ (the shared randomness) played by the rows of the matrices (hence the normalization of $\frac{1}{d}$; we will see later how to incorporate the use of non-uniform weights).
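To make the formalism concrete, here is a minimal sketch of how a matrix-valued strategy assigns outcome probabilities. The specific (diagonal, hence commuting) observables are hypothetical illustrations of mine; the eigenprojections $O^0, O^1$ and the dimension-normalized trace are as in the text.

```python
# A matrix-valued strategy: observables are Hermitian (here real symmetric)
# matrices squaring to the identity; outcome probabilities come from their
# eigenprojections, with the trace normalized by the dimension d.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def projections(O):
    """O^0, O^1: projections on the +1 / -1 eigenspaces of an observable,
    computed as (Id +/- O) / 2."""
    d = len(O)
    Id = [[float(i == j) for j in range(d)] for i in range(d)]
    plus = [[(Id[i][j] + O[i][j]) / 2 for j in range(d)] for i in range(d)]
    minus = [[(Id[i][j] - O[i][j]) / 2 for j in range(d)] for i in range(d)]
    return plus, minus

# two commuting (here: diagonal) observables for the pair-query player,
# and one observable for the single-query player
G1 = [[1.0, 0.0], [0.0, -1.0]]
G2 = [[1.0, 0.0], [0.0, 1.0]]
F = [[1.0, 0.0], [0.0, -1.0]]

d = 2
probs = {}
for a1, P1 in enumerate(projections(G1)):
    for a2, P2 in enumerate(projections(G2)):
        for b, Q in enumerate(projections(F)):
            probs[(a1, a2, b)] = trace(matmul(matmul(P1, P2), Q)) / d

print(abs(sum(probs.values()) - 1.0) < 1e-12)    # a well-defined distribution
print(all(p >= -1e-12 for p in probs.values()))  # with non-negative weights
```

Since the three resolutions of identity each sum to $\mathrm{Id}$, the probabilities sum to $\frac{1}{d}\mathrm{Tr}(\mathrm{Id}) = 1$; commutation of $G_1$ and $G_2$ guarantees non-negativity.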
With these notions in place we establish the following simple lemma, which states the only consequence of the BLR test we will need.
Lemma 1 Let $n$ be an integer, $\varepsilon \geq 0$, and $F$ and $G$ a matrix strategy for the BLR test, such that players determining their answers according to this strategy (specifically, according to (1)) succeed in the test with probability at least $1-\varepsilon$. Then

$$\mathbb{E}_{x,y}\,\big\| F(x)F(y) - F(x+y) \big\|^2 \,=\, O(\varepsilon),$$

where the expectation is taken over uniformly random $x, y \in \{0,1\}^n$.
Introducing a normalized inner product $\langle A, B\rangle = \frac{1}{d}\mathrm{Tr}(A^* B)$ on the space of $d \times d$ matrices with complex entries (the $*$ designates the conjugate-transpose), the conclusion of the lemma is that $\mathbb{E}_{x,y}\, \mathrm{Re}\,\langle F(x)F(y), F(x+y)\rangle \geq 1 - O(\varepsilon)$.
Proof: Success with probability $1-\varepsilon$ in the test implies the three conditions

$$\mathbb{E}_{x,y}\,\big\langle G_1(x,y), F(x)\big\rangle \,\geq\, 1-O(\varepsilon), \qquad \mathbb{E}_{x,y}\,\big\langle G_2(x,y), F(y)\big\rangle \,\geq\, 1-O(\varepsilon), \qquad \mathbb{E}_{x,y}\,\big\langle G_1(x,y)G_2(x,y), F(x+y)\big\rangle \,\geq\, 1-O(\varepsilon),$$

one for each of the three possible queries to the second player.
To conclude, use the triangle inequality as

$$\big\| F(x)F(y) - F(x+y) \big\| \,\leq\, \big\| \big(F(x) - G_1(x,y)\big)F(y) \big\| + \big\| G_1(x,y)\big(F(y) - G_2(x,y)\big) \big\| + \big\| G_1(x,y)G_2(x,y) - F(x+y) \big\|,$$

where $\|A\| = (\frac{1}{d}\mathrm{Tr}(A^*A))^{1/2}$ denotes the dimension-normalized Frobenius norm. Expanding each squared norm and using the preceding conditions and $F(x)^2 = \mathrm{Id}$ for all $x$ proves the lemma.
1.2. The BLR theorem for matrix-valued strategies
Before stating a BLR theorem for matrix-valued strategies we need to define what it means for such a function to be linear. Consider first the case of probabilistic functions, i.e. $F$ such that all $F(x)$ are diagonal in the same basis. Any such $F$ whose every diagonal entry is of the form $(-1)^{c \cdot x}$, for some $c \in \{0,1\}^n$ which may depend on the row/column number, will pass the BLR test. This shows that we cannot hope to force $F$ to be a single linear function; we must allow “mixtures”. Formally, call $L: \{0,1\}^n \to \mathcal{O}_{d'}$ linear if $L(x) = \sum_c (-1)^{c \cdot x} P_c$ for some decomposition of the identity, i.e. the $P_c$ are pairwise orthogonal projections such that $\sum_c P_c = \mathrm{Id}$. Note that this indeed captures the probabilistic case; in fact, up to a basis change it is essentially equivalent to it. Thus the following may come as a surprise.
Note the role of the dimension $d'$ here, and the lack of control on it (more on both aspects later). Even if $F$ is a deterministic function, i.e. $d=1$ and $F(x) = (-1)^{f(x)}$, the function $L$ returned by the theorem may be matrix-valued. In this case the isometry $V$ is simply a unit vector $v \in \mathbb{C}^{d'}$, and expanding out the squared norm in the conclusion of the theorem yields the equivalent conclusion

$$\mathbb{E}_x \sum_c (-1)^{f(x) + c \cdot x}\, \|P_c\, v\|^2 \,\geq\, 1 - O(\varepsilon),$$
where we expanded $L(x)$ using our definition of a linear matrix-valued function. Note that $(\|P_c v\|^2)_c$ defines a probability distribution on $\{0,1\}^n$. Thus by an averaging argument there must exist a $c$ such that $f(x) = c \cdot x$ for a fraction at least $1-O(\varepsilon)$ of all $x \in \{0,1\}^n$: the usual conclusion of the BLR theorem is recovered.
Proof: The proof of the theorem follows the classic Fourier-analytic proof of Bellare et al. Our first step is to define the isometry $V$. For a vector $u \in \mathbb{C}^d$, define

$$V u \,=\, \sum_{c \in \{0,1\}^n} e_c \otimes \big(\hat{F}(c)\, u\big),$$

where $\hat{F}(c) = \mathbb{E}_x\, (-1)^{c \cdot x} F(x)$ is the matrix-valued Fourier coefficient of $F$ at $c$ and $(e_c)$ an arbitrary orthonormal basis of $\mathbb{C}^{2^n}$. An easily verified extension of Parseval’s formula shows $\sum_c \hat{F}(c)^* \hat{F}(c) = \mathbb{E}_x\, F(x)^2 = \mathrm{Id}$ (recall $F(x)^2 = \mathrm{Id}$ for all $x$), so that $\|Vu\| = \|u\|$: $V$ is indeed an isometry.
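The Fourier-analytic ingredients can be checked numerically. The following sketch (an illustration of mine; the helper names are not from the post) computes the Fourier coefficients of a scalar ($d = 1$) function in $\pm 1$ form and verifies the Parseval identity that makes $V$ an isometry.

```python
import itertools

n = 3
points = list(itertools.product([0, 1], repeat=n))

def chi(c, x):
    """Character of the group {0,1}^n: chi_c(x) = (-1)^(c.x)."""
    return (-1) ** (sum(ci * xi for ci, xi in zip(c, x)) % 2)

def fourier(F):
    """Fourier coefficients hat-F(c) = E_x chi_c(x) F(x) of a +/-1-valued
    (scalar, d = 1) function; matrix-valued coefficients work entrywise."""
    return {c: sum(chi(c, x) * F(x) for x in points) / len(points)
            for c in points}

F = lambda x: (-1) ** (x[0] & x[1])   # the AND function, written in +/-1 form
coeffs = fourier(F)

# Parseval: since F(x)^2 = 1 the squared coefficients sum to exactly 1,
# which is what makes the map V in the proof an isometry.
parseval = sum(v * v for v in coeffs.values())
print(parseval)            # 1.0
print(coeffs[(0, 0, 0)])   # 0.5: the largest coefficient of AND
```

For AND no coefficient exceeds $1/2$ in absolute value, which is why it fails the test by a constant: a large success probability would force a single dominant coefficient, i.e. a close-by character.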
Next, define the linear probabilistic function $L: \{0,1\}^n \to \mathcal{O}_{2^n d}$ by $L(x) = \sum_c (-1)^{c \cdot x} P_c$, where $P_c = e_c e_c^* \otimes \mathrm{Id}_d$ forms a partition of identity. We can evaluate

$$\mathbb{E}_x\, \big\langle F(x), V^* L(x) V \big\rangle \,=\, \mathbb{E}_x \sum_c\, (-1)^{c \cdot x}\, \big\langle F(x), \hat{F}(c)^* \hat{F}(c) \big\rangle \,=\, \mathbb{E}_{x,y}\, \big\langle F(x)F(y), F(x+y) \big\rangle,$$
where the last equality follows by expanding the Fourier coefficients and noticing the appropriate cancellation. Together with (2), this proves the theorem.
At the risk of sounding yet more pedantic, it might be useful to comment on the relation between this proof and the usual argument. The main observation in Bellare et al.’s proof is that approximate linearity, expressed by (2), implies a lower bound on the sum of the cubes of the Fourier coefficients of . Together with Parseval’s formula, this bound implies the existence of a large Fourier coefficient, which identifies a close-by linear function.
The proof I gave decouples the argument. Its first step, the construction of the isometry depends on , but does not use anything regarding approximate linearity. It only uses Parseval’s formula to argue that the isometry is well-defined. A noteworthy feature of this step is that the function on the extended space is always well-defined as well: given a function , it is always possible to consider the linear matrix-valued function which “samples according to ” and then returns . The second step of the proof evaluates the correlation of with the “pull-back” of , and observes that this correlation is precisely our measure of “approximate linearity” of , concluding the proof without having had to explicitly notice that there existed a large Fourier coefficient.
1.3. The group-theoretic perspective
Let’s re-interpret the proof we just gave using group-theoretic language. A linear function is, by definition, a mapping which respects the additive group structure on $\{0,1\}^n$, namely it is a representation. Since $\{0,1\}^n$ is an abelian group, it has $2^n$ irreducible $1$-dimensional representations, given by the characters $\chi_c: x \mapsto (-1)^{c \cdot x}$. As such, the linear function $L$ defined in the proof of Theorem 2 is nothing but a list of all irreducible representations of $\{0,1\}^n$.
The condition (2) derived in the proof of the theorem can be interpreted as the condition that $F$ is an “approximate representation” of $\{0,1\}^n$. Let’s make this a general definition. For $d$-dimensional matrices $A, B$ and $\sigma$ such that $\sigma$ is positive semidefinite, write

$$\langle A, B\rangle_\sigma \,=\, \mathrm{Tr}\big(A B^* \sigma\big),$$
where we use $*$ to denote the conjugate-transpose. The following definition considers arbitrary finite groups (not necessarily abelian).
The condition (3) in the definition is very closely related to Gowers’s norm
While a large Gowers norm implies closeness to an affine function, we are interested in testing linear functions, and the condition (3) will arise naturally from our calculations in the next section.
If $G = \{0,1\}^n$, the product $xy$ should be written additively as $x+y$, so that the condition (2) is precisely that $F$ is an $(\varepsilon, \sigma)$-representation of $\{0,1\}^n$, where $\sigma = \frac{1}{d}\,\mathrm{Id}$. Theorem 2 can thus be reformulated as stating that for any $\varepsilon$-approximate representation $F$ of the abelian group $\{0,1\}^n$ there exists an isometry $V: \mathbb{C}^d \to \mathbb{C}^{d'}$ and an exact representation $L$ of $\{0,1\}^n$ on $\mathbb{C}^{d'}$ such that $F$ is well-approximated by the “pull-back” of $L$ to $\mathbb{C}^d$. In the next section I will make the words in quotes precise and generalize the result to the case of arbitrary finite groups.
2. Approximate representations of non-abelian groups
2.1. The Gowers-Hatami theorem
In their paper Gowers and Hatami consider the problem of “rounding” approximate group representations to exact representations. I highly recommend the paper, which gives a thorough introduction to the topic, including multiple motivations. Here I will state and prove a slightly more general, but quantitatively weaker, variant of their result inspired by the somewhat convoluted analysis of the BLR test given in the previous section.
Gowers and Hatami limit themselves to the case of $\sigma = \frac{1}{d}\,\mathrm{Id}$, which corresponds to the dimension-normalized Frobenius norm. In this scenario they in addition obtain a tight control of the dimension $d'$, and show that one can always take $d' = (1+O(\varepsilon))\,d$ in the theorem. I will give a much shorter proof than theirs (the proof is implicit in their argument) that does not seem to allow one to recover this estimate. (It is possible to adapt their proof to keep control of $d'$ even in the case of general $\sigma$, but I will not explain this here.) Essentially the same proof as the one sketched below has been extended to some classes of infinite groups by De Chiffre, Ozawa and Thom in a recent preprint.
Note that, contrary to the BLR theorem, where the “embedding” step is not strictly necessary (if $\varepsilon$ is small enough we can identify a single close-by linear function), as noted by Gowers and Hatami Theorem 4 does not in general hold with $d' = d$. The reason is that it is possible for a group to have an approximate representation in some dimension $d$, but no exact representation of the same dimension: to obtain an example of this, take any group that has all non-trivial irreducible representations of large enough dimension, and create an approximate representation in, e.g., dimension one less by “cutting off” one row and column from an exact representation. The dimension normalization induced by the norm will barely notice this, but it will be impossible to “round” the approximate representation obtained to an exact one without modifying the dimension.
The necessity of the embedding helps distinguish the Gowers-Hatami result from other extensions of the linearity test to the non-abelian setting, such as the work by Ben-Or et al. on non-abelian homomorphism testing (I thank Oded Regev for pointing me to the paper). In that paper the authors show that a function $f: G \to H$, where $G$ and $H$ are finite non-abelian groups, which satisfies $\Pr_{x,y}\big(f(x)f(y) = f(xy)\big) \geq 1-\varepsilon$, is $O(\varepsilon)$-close to a homomorphism $g: G \to H$. The main difference with the setting for the Gowers-Hatami result is that since $H$ is finite, Ben-Or et al. use the Kronecker $\delta$ function as distance on $H$. This allows them to employ combinatorial arguments, and provide a rounding procedure that does not need to modify the range space ($H$). In contrast, here the unitary group is infinite.
In case $G$ is abelian, all irreducible representations have dimension $1$, the tensor product is a product, and (4) reduces to the usual definition of Fourier coefficient. The only property of irreducible representations we will need is that they satisfy the relation
Proof: As in the proof of Theorem 2 our first step is to define an isometry $V$ by
where the direct sum ranges over all irreducible representations $\rho$ of $G$ and $(e_i)$ is the canonical basis. Note what $V$ does: it “embeds” any vector $u \in \mathbb{C}^d$ into a direct sum, over irreducible representations $\rho$, of a $d_\rho$-dimensional vector of matrices. Each (matrix) entry of this vector can be thought of as the Fourier coefficient of the corresponding entry of the vector associated with $\rho$. If $G = \{0,1\}^n$ and $\rho$ ranges over the characters $\chi_c$ this recovers the isometry defined in the proof of Theorem 2. And indeed, the fact that $V$ is an isometry again follows from the appropriate extension of Parseval’s formula:
Following the same steps as in the proof of Theorem 2, we next define
a direct sum over all irreducible representations of $G$ (hence itself a representation). Let’s first compute its “pull-back” by $V$: following a similar calculation as above, for any $x \in G$,
where the last equality uses (5). It then follows that
This relates the correlation of the approximate representation with the pull-back of the exact one to the quality of the approximation, and proves the theorem.
2.2. Application: the Weyl-Heisenberg group
In quantum information we care a lot about the Pauli group. For our purposes it will be sufficient (and much more convenient, allowing us to avoid some trouble with complex conjugation) to consider the Weyl-Heisenberg group $H$, or “Pauli group modulo complex conjugation”, which is the $8$-element group whose multiplication table matches that of the matrices

$$\sigma_X \,=\, \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad\text{and}\qquad \sigma_Z \,=\, \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

together with $\pm \mathrm{Id}$ and $\pm \sigma_X\sigma_Z$. This group has four $1$-dimensional representations, uniquely specified by the image of $\sigma_X$ and $\sigma_Z$ in $\{\pm 1\}$, and a single irreducible $2$-dimensional representation, given by the matrices defined above. We can also consider the “$n$-qubit Weyl-Heisenberg group” $H^{(n)}$, the matrix group generated by $n$-fold tensor products of the matrices identified above. The irreducible representations of $H^{(n)}$ are easily computed from those of $H$; for us the only thing that matters is that the only irreducible representation which satisfies $\rho(-\mathrm{Id}) = -\mathrm{Id}$ has dimension $2^n$ and is given by the defining matrix representation (in fact, it is the only irreducible representation in dimension larger than $1$).
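These group relations are mechanical to check. The sketch below (an illustration, not code from the paper) verifies the defining relations of the single-qubit generators and closes them under multiplication to recover the $8$ group elements; real integer matrices suffice once we work modulo complex conjugation.

```python
# The single-qubit Pauli matrices X and Z, and the group they generate.
X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))
I = ((1, 0), (0, 1))

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def neg(A):
    return tuple(tuple(-v for v in row) for row in A)

assert mul(X, X) == I and mul(Z, Z) == I   # order-2 generators
assert mul(X, Z) == neg(mul(Z, X))         # anti-commutation: XZ = -ZX

# close {X, Z} under multiplication: the group {+/-I, +/-X, +/-Z, +/-XZ}
group = {X, Z}
while True:
    bigger = group | {mul(A, B) for A in group for B in group}
    if bigger == group:
        break
    group = bigger
print(len(group))   # 8
```

The count $8$ is consistent with representation theory: $4 \cdot 1^2 + 2^2 = 8$, i.e. four $1$-dimensional irreducibles plus one $2$-dimensional one.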
With the upcoming application to entanglement testing in mind, I will state a version of Theorem 4 tailored to the group $H^{(n)}$ and a specific choice of presentation for the group relations. Towards this we first need to recall the notion of Schmidt decomposition of a bipartite state (i.e. unit vector) $\psi \in \mathbb{C}^d \otimes \mathbb{C}^d$. The Schmidt decomposition states that any such vector can be written as

$$\psi \,=\, \sum_i \lambda_i\, u_i \otimes v_i \qquad (7)$$

for some orthonormal bases $(u_i)$ and $(v_i)$ of $\mathbb{C}^d$ (the “Schmidt vectors”) and non-negative coefficients $\lambda_i$ (the “Schmidt coefficients”). The decomposition can be obtained by “reshaping” $\psi$ into a $d \times d$ matrix and performing the singular value decomposition. To $\psi$ we associate the (uniquely defined) positive semidefinite matrix

$$\sigma_\psi \,=\, \sum_i \lambda_i^2\, u_i u_i^*. \qquad (8)$$
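The “reshaping” recipe is short enough to spell out. The sketch below is an illustration of mine, restricted to two-qubit states with real entries so that a closed-form $2 \times 2$ eigenvalue computation can stand in for a general SVD; it recovers the Schmidt coefficients of an EPR pair and of a product state.

```python
import math

def schmidt_coefficients(psi):
    """Schmidt coefficients of a two-qubit state, following the recipe in
    the text: reshape psi (length 4, real entries for simplicity) into a
    2x2 matrix M and take its singular values, computed here from the
    eigenvalues of the positive semidefinite matrix M M^T."""
    a, b, c, d = psi   # M = [[a, b], [c, d]]: psi = sum_ij M_ij e_i (x) e_j
    # entries of the symmetric 2x2 matrix M M^T
    p, q, r = a * a + b * b, a * c + b * d, c * c + d * d
    mean, gap = (p + r) / 2, math.hypot((p - r) / 2, q)
    eigs = [max(mean + gap, 0.0), max(mean - gap, 0.0)]
    return sorted((math.sqrt(e) for e in eigs), reverse=True)

epr = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]   # Schmidt rank 2
product = [1, 0, 0, 0]                             # e_0 (x) e_0: rank 1
print(schmidt_coefficients(epr))       # ~ [0.707, 0.707]
print(schmidt_coefficients(product))   # [1.0, 0.0]
```

The squares of the printed coefficients are the eigenvalues of the matrix $\sigma_\psi$ of (8), so the Schmidt rank can be read off as the number of non-zero entries.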
Corollary 5 Let be integer, , a unit vector, the positive semidefinite matrix associated to as in (8), and . For let , , and assume for all (we call such operators, unitaries with eigenvalues in , observables). Suppose that the following inequalities hold: consistency
Note that the conditions (10) and (11) in the corollary are very similar to the conditions required of an approximate representation of the group $H^{(n)}$; in fact it is easy to convince oneself that their exact analogue suffices to imply all the group relations. The reason for including only those relations is that they are the ones we will be able to test; see the next section for this. Condition (9) is necessary to derive the conditions of Theorem 4 from (10) and (11), and is also testable; see the proof.
Proof: To apply Theorem 4 we need to construct an $\varepsilon$-representation of the group $H^{(n)}$. Using that any element of $H^{(n)}$ has a unique representative of the form $\pm X(a)Z(b)$ for $a, b \in \{0,1\}^n$, we define the candidate representation on such a representative by mapping it to $\pm$ the product of the players’ observables associated with $a$ and $b$. Next we need to verify (3). Let $x, y \in H^{(n)}$ be given by representatives determined by the $n$-bit strings $(a, b)$ and $(a', b')$ respectively. Up to phase, we can exploit successive cancellations to decompose the relevant product as
(It is worth staring at this sequence of equations for a little bit. In particular, note the “player-switching” that takes place in the 2nd, 4th and 6th lines; this is used as a means to “commute” the appropriate unitaries, and is the reason for including (9) among the assumptions of the corollary.) Evaluating each term on the vector , taking the squared Euclidean norm, and then the expectation over uniformly random , the inequality and the assumptions of the theorem let us bound the overlap of each term in the resulting summation by . Using by definition, we obtain the bound
We are thus in a position to apply Theorem 4, which gives an isometry and exact representation such that
Using that is a representation, . It follows from (12) that , so we may restrict the range of to the subspace where without introducing much additional error.
3. Entanglement testing
Our discussion so far has barely touched upon the notion of entanglement. Recall the Schmidt decomposition (7) of a unit vector $\psi \in \mathbb{C}^d \otimes \mathbb{C}^d$, and the associated reduced density matrix defined in (8). The state $\psi$ is called entangled if this matrix has rank larger than $1$; equivalently, if there is more than one non-zero coefficient in (7). The Schmidt rank of $\psi$ is the rank of this matrix, i.e. the number of non-zero terms in (7). It is a crude, but convenient, measure of entanglement; in particular it provides a lower bound on the local dimension $d$. A useful observation is that the Schmidt rank is invariant under local unitary operations: these may affect the Schmidt vectors, but not the number of non-zero terms.
3.1. A certificate for high-dimensional entanglement
Among all entangled states in dimension $d$, the maximally entangled state is the one which maximizes entanglement entropy, defined as the Shannon entropy of the distribution induced by the squares of the Schmidt coefficients:

$$\psi_d \,=\, \frac{1}{\sqrt{d}} \sum_{i=1}^{d} e_i \otimes e_i,$$

with entropy $\log d$. The following lemma gives a “robust” characterization of the maximally entangled state in dimension $2^n$ as the unique common eigenvalue-$1$ eigenvector of all operators of the form $W \otimes W$, where $W$ ranges over the elements of the unique $2^n$-dimensional irreducible representation of the Weyl-Heisenberg group $H^{(n)}$, i.e. the Pauli matrices (taken modulo complex conjugation).
Proof: Consider the case $n=1$. The “swap” matrix

$$S \,=\, \frac{1}{2}\big( \sigma_X \otimes \sigma_X + \sigma_Z \otimes \sigma_Z + \sigma_X\sigma_Z \otimes \sigma_X\sigma_Z - \mathrm{Id} \otimes \mathrm{Id} \big)$$

squares to identity and has a unique eigenvalue-$1$ eigenvector, the vector $\frac{1}{\sqrt{2}}(e_0 \otimes e_0 + e_1 \otimes e_1)$ (a.k.a. “EPR pair”). Thus the assumption of the lemma implies that $\psi$ is close to this vector. The same argument for general $n$ shows that $\psi$ is close to the maximally entangled state in dimension $2^n$. Any unit vector of Schmidt rank much lower than $2^n$ has correspondingly small overlap with the maximally entangled state, concluding the proof.
Lemma 6 provides an “experimental road-map” for establishing that a bipartite system is in a highly entangled state:
- (i) Select a random $n$-qubit Pauli observable $W$;
- (ii) Measure both halves of $\psi$ using $W$;
- (iii) Check that the outcomes agree.
To explain the connection between the above “operational test” and the lemma I should review what a measurement in quantum mechanics is. For our purposes it is enough to talk about binary measurements (i.e. measurements with two outcomes, $+1$ and $-1$). Any such measurement is specified by a pair of orthogonal projections, $A^0$ and $A^1$, on $\mathbb{C}^d$ such that $A^0 + A^1 = \mathrm{Id}$. The probability of obtaining outcome $(-1)^a$ when measuring $\psi$ is $\|A^a \psi\|^2$. We can represent a binary measurement succinctly through the observable $A = A^0 - A^1$. (In general, an observable is a Hermitian matrix which squares to identity.) It is then the case that if an observable $A$ is applied on the first half of a state $\psi \in \mathbb{C}^d \otimes \mathbb{C}^d$, and another observable $B$ is applied on the second half, then the probability of agreement, minus the probability of disagreement, between the outcomes obtained is precisely $\langle \psi, (A \otimes B)\,\psi\rangle$, a number which lies in $[-1, 1]$. Thus the condition that the test described above accepts with high probability when performed on a state $\psi$ is precisely equivalent to the assumption (13) of Lemma 6.
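The agreement formula can be checked directly on the EPR pair: measuring both halves with the same Pauli observable gives perfectly correlated outcomes, while mismatched observables give uncorrelated ones. A sketch (illustration only):

```python
import math

# Agreement minus disagreement of outcomes, when observable A is measured
# on the first half of psi and B on the second, equals <psi, (A (x) B) psi>.
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def kron(A, B):
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

def correlation(A, B, psi):
    w = apply(kron(A, B), psi)
    return sum(u * v for u, v in zip(psi, w))

epr = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]
print(correlation(Z, Z, epr))   # ~ 1.0: outcomes always agree
print(correlation(X, X, epr))   # ~ 1.0
print(correlation(X, Z, epr))   # ~ 0.0: uncorrelated outcomes
```

A correlation of $1$ for every choice $W \in \{X, Z, XZ\}$ is exactly the statistics the operational test (i)-(iii) looks for.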
Even though this provides a perfectly fine test for entanglement in principle, practitioners in the foundations of quantum mechanics know all too well that their opponents — e.g. “quantum-skeptics” — will not be satisfied with such an experiment. In particular, who is to guarantee that the measurement performed in step (ii) is really the Pauli observable $W$, as claimed? At the very least, doesn’t this already implicitly assume that the measured system has dimension $2^n$?
This is where the notion of device independence comes in. Briefly, in this context the idea is to obtain the same conclusion (a certificate of high-dimensional entanglement) without any assumption on the measurement performed: the only information to be trusted is classical data (statistics generated by the experiment), but not the operational details of the experiment itself.
This is where Corollary 5 enters the picture. Reformulated in the present context, the corollary provides a means to verify that arbitrary measurements “all but behave” as Pauli measurements, provided they generate the right statistics. To explain how this can be done we need to provide additional “operational tests” that can be used to certify the assumptions of the corollary.
3.2. Testing the Weyl-Heisenberg group relations
Corollary 5 makes three assumptions about the players’ observables: that they satisfy approximate consistency (9), linearity (10), and anti-commutation (11). In this section I will describe two (somewhat well-known) tests that make it possible to certify these relations based only on the fact that the measurements generate statistics which pass the tests.
- (a) The referee selects $W \in \{X, Z\}$ and $a, b \in \{0,1\}^n$ uniformly at random. He sends $(W, a, b)$ to one player and $(W, a)$, $(W, b)$, or $(W, a+b)$ to the other.
- (b) The first player replies with two bits, and the second with a single bit. The referee accepts if and only if the players’ answers are consistent.
As always in this note, the test treats both players symmetrically. As a result we can (and will) assume that the players’ strategy is symmetric, and is specified by a permutation-invariant state $\psi$ and a measurement for each question: an observable associated with questions of the form $(W, a)$, and a more complicated four-outcome measurement associated with questions of the form $(W, a, b)$ (it will not be necessary to go into the details of the formalism for such measurements).
The linearity test described above is exactly identical to the BLR linearity test described earlier, but for the use of the basis label $W$. The lemma below is a direct analogue of Lemma 1, and extends the analysis to the setting of players sharing entanglement. The lemma was first introduced in a joint paper with Ito, where we used an extension of the linearity test, Babai et al.’s multilinearity test, to show the inclusion of complexity classes NEXP $\subseteq$ MIP*.
Lemma 7 Suppose that a family of observables $W(a)$, for $W \in \{X, Z\}$ and $a \in \{0,1\}^n$, generates outcomes that succeed in the linearity test with probability $1-\varepsilon$, when applied on a bipartite state $\psi$. Then the following hold: approximate consistency
Testing anti-commutation is slightly more involved. We will achieve this by using a two-player game called the Magic Square game. This is a fascinating game, but just as for the linearity test I will treat it superficially and only recall the part of the analysis that is useful for us (see e.g. the paper by Wu et al. for a description of the game and a proof of Lemma 8 below).
Lemma 8 (Magic Square) The Magic Square game is a two-player game with nine possible questions (with binary answers) for one player and six possible questions (with two-bit answers) for the other player which has the following properties. The distribution on questions in the game is uniform. Two of the questions to the first player are labelled $X$ and $Z$ respectively. For any strategy for the players that succeeds in the game with probability at least $1-\varepsilon$ using a bipartite state $\psi$ and observables $A$ and $B$ for questions $X$ and $Z$ respectively, it holds that
Based on the Magic Square game we devise the following “anti-commutation test”.
- (a) The referee selects $a, b \in \{0,1\}^n$ uniformly at random under the condition that $a \cdot b = 1$. He plays the Magic Square game with both players, with the following modifications: if the question to the first player is $X$ or $Z$ he sends $(X, a)$ or $(Z, b)$ instead; in all other cases he sends the original label of the question in the Magic Square game together with both strings $a$ and $b$.
- (b) Each player provides answers as in the Magic Square game. The referee accepts if and only if the players’ answers would have been accepted in the game.
Using Lemma 8 it is straightforward to show the following.
Lemma 9 Suppose a strategy for the players succeeds in the anti-commutation test with probability at least $1-\varepsilon$, when performed on a bipartite state $\psi$. Then the observables $X(a)$ and $Z(b)$ applied by the player upon receipt of questions $(X, a)$ and $(Z, b)$ respectively satisfy
3.3. A robust test for high-dimensional entangled states
We are ready to state, and prove, our main theorem: a test for high-dimensional entanglement that is “robust”, meaning that a success probability within a constant of the optimal value suffices to certify that the underlying state is within a constant distance of the target state — in this case, a tensor product of EPR pairs. Although arguably a direct “quantization” of the BLR result, this is the first test known which achieves constant robustness — all previous $n$-qubit tests required success that is inverse-polynomially (in $n$) close to the optimum in order to provide any meaningful conclusion.
$n$-qubit Pauli braiding test: With probability $\frac{1}{2}$ each,
- (a) Execute the linearity test.
- (b) Execute the anti-commutation test.
Theorem 10 Suppose that a family of observables $W(a)$, for $W \in \{X, Z\}$ and $a \in \{0,1\}^n$, and a state $\psi \in \mathbb{C}^d \otimes \mathbb{C}^d$, generate outcomes that pass the $n$-qubit Pauli braiding test with probability at least $1-\varepsilon$. Then $d \geq (1 - O(\sqrt{\varepsilon}))\,2^n$.
As should be apparent from the proof it is possible to state a stronger conclusion for the theorem, which includes a characterization of the observables and the state up to local isometries. For simplicity I only recorded the consequence on the dimension of .
Proof: Using Lemma 7 and Lemma 9, success with probability $1-\varepsilon$ in the test implies that conditions (9), (10) and (11) in Corollary 5 are all satisfied, up to error $O(\varepsilon)$. (In fact, Lemma 9 only implies (11) for strings such that $a \cdot b = 1$. The condition for strings such that $a \cdot b = 0$ follows from the other conditions.) The conclusion of the corollary is that there exists an isometry $V$ such that the observables $X(a)$ and $Z(b)$ satisfy
Using again the consistency relations (9) that follow from part (a) of the test together with the above we get
Applying Lemma 6, the state $(V \otimes V)\psi$ has Schmidt rank at least $(1 - O(\sqrt{\varepsilon}))\,2^n$. But $V$ is a local isometry, which cannot increase the Schmidt rank.