387x Filetype PDF File size 0.32 MB Source: www.cs.tau.ac.il
TheApproximateRankofaMatrixanditsAlgorithmic
Applications
NogaAlon∗ Troy Lee Adi Shraibman
Schools of Mathematics and Centre for Quantum School of Computer Science
Computer Science Technologies Academic College of Tel-Aviv
Tel Aviv University Singapore Yaffo, Israel
Tel Aviv 69978, Israel troyjlee@gmail.com adish@mta.ac.il
nogaa@tau.ac.il
†
Santosh Vempala
School of Computer Science
Georgia Tech
Atlanta 30332
vempala@gatech.edu
ABSTRACT Keywords
We study the ǫ-rank of a real matrix A, defined for any Approximate rank, Nash equilibria, covering number, con-
ǫ > 0 as the minimum rank over matrices that approxi- vex body
mate every entry of A to within an additive ǫ. This pa-
rameter is connected to other notions of approximate rank 1. INTRODUCTION
and is motivated by problems from various topics includ-
ing communication complexity, combinatorial optimization, 1.1 Background
game theory, computational geometry and learning theory.
Here we give bounds on the ǫ-rank and use them for algo- Alargebodyofworkintheoreticalcomputersciencedeals
rithmic applications. Our main algorithmic results are (a) with various ways of approximating a matrix by a simpler
polynomial-time additive approximation schemes for Nash one. The motivation from the design of approximation algo-
equilibria for 2-player games when the payoff matrices are rithms is clear. When the input to a computational problem
positive semidefinite or have logarithmic rank and (b) an is a matrix (that may represent a weighted graph, a payoff
additive PTAS for the densest subgraph problem for similar matrix in a two-person game or a weighted constraint sat-
classes of weighted graphs. We use combinatorial, geometric isfaction problem), the hope is that it is easier to solve or
and spectral techniques; our main new tool is an algorithm approximately solve the computational problem for the ap-
for efficiently covering a convex body with translates of an- proximating matrix, which is simpler. If the notion of ap-
other convex body. proximation is suitable for the problem at hand, then the
solution will be an approximate solution for the original in-
put matrix as well.
General Terms A typical example of this reasoning is the application of
cut decomposition and that of regular decomposition of ma-
Algorithms, Theory trices. The cut-norm of a matrix A with a set of rows R and
a set of columns C is
∗ max | X A |.
Research supported in part by an ERC Advanced grant, by S⊂R,T⊂T ij
a USA-Israeli BSF grant, and by the Hermann Minkowski i∈S,j∈T
Minerva Center for Geometry in Tel Aviv University. A cut matrix B(S,T;r) is a matrix B for which B = r
†Supported in part by NSF awards CCF-0915903 and CCF- ij
iff i ∈ S,j ∈ T and B =0otherwise. Frieze and Kannan
1217793. ij
showed that any n by m matrix B with entries in [−1,1]
can be approximated by a sum of at most 1 cut matrices,
ǫ2
in the sense that the cut norm of the difference between B
and this sum is at most ǫmn. Such an approximation can
Permission to make digital or hard copies of all or part of this work for be found efficiently and leads to several approximation algo-
personal or classroom use is granted without fee provided that copies are rithms for dense graphs—see [20]. Similar approximations of
not made or distributed for profit or commercial advantage and that copies matrices can be given using variants of the regularity lemma
bear this notice and the full citation on the first page. To copy otherwise, to of Szemer´edi. These provide a more powerful approximation
republish, to post on servers or to redistribute to lists, requires prior specific at the expense of increasing the complexity of the approx-
permission and/or a fee. imating matrix, and supply approximation algorithms for
STOC’13,June1–4,2013,PaloAlto, California, USA.
Copyright 2013 ACM 978-1-4503-2029-0/13/06 ...$15.00. additional problems see [6, 19, 7].
All these methods, however, deal with global properties, Boolean function is bounded from below by logǫ-rank(A)/2
as the approximation obtained by all these variants of the where A is the communication matrix of the function. Using
regularity lemma are not sensitive to local changes in the this connection, it has been proven that for fixed ǫ the 2n
√
matrix. In particular, these methods cannot provide ap- by 2n set-disjointness matrix has ǫ-rank 2Θ( n), using the
proximate solutions to problems like that of finding an ap- quantumprotocolofAaronsonandAmbainis[1]fordisjoint-
proximate Nash equilibrium in a two person game, or that of ness, and Razborov’s lower bound for the problem (see [36],
approximating the maximum possible density of a subgraph where there is a lower bound for the trace-norm of any ma-
√
on, say, nvertices in a given n vertex weighted graph. Mo- trix that ǫ-approximates the disjointness matrix restricted
tivated by applications of this type, it is natural to consider to the sets of size n/4.) In another work, Lee and Shraib-
a stronger notion of approximation of a matrix, an approx- man[29] have given algorithmic bounds on ǫ-rank via the γ2
imation in the infinity norm, by a matrix of low rank. This norm. The γ -norm of a real matrix A, denoted by γ (A), is
2 2
motivates the following definition. the minimum possible value of the product c(X)c(Y), where
c(Z) is the maximum ℓ -norm of a column of a matrix Z,
Definition 1.1. For a real n×n matrix A, the ǫ-rank of 2
and the minimum is taken over all factorizations of A of the
Ais defined as follows: form A = XtY. For a sign matrix A and for α ≥ 1, let
n×n γα(A) denote the minimum possible value of γ (B), where
ǫ-rank(A) = min{rank(B) : B ∈ ℜ , kA −Bk ≤ǫ}. 2 2
∞ Branges over all matrices of the same dimension as A that
We will usually assume that the matrix A has entries in satisfy 1 ≤ A ·B ≤αforalladmissible i,j. Let rank (A)
ij ij α
[−1,1], but the definition holds for any real matrix. Define denote the minimum possible rank of such a matrix B. Note
the density norm of a matrix A to be that for α = 1 + ǫ this is (roughly) the ǫ/2-rank of A. In
[29] it is shown that rank (A) and γα(A) are polynomially
|xTAy| α 2
den(A) = max . related for any sign matrix A, up to a poly-logaritmic fac-
x,y∈ℜn kxk1kyk1 α
+ tor in the dimension of A. Since γ2 (A) can be computed
It is easy to verify that the following definition of ǫ-rank is efficiently using semi-definite programming, this provides a
equivalent to Definition 1.1: (rough) approximation algorithm for the ǫ-rank of a given
n×n sign matrix.
ǫ-rank(A) = {minrank(B) : B ∈ ℜ , den(A−B) ≤ ǫ}. For the special case of the n by n identity matrix the
The investigation of notions of simple matrices that ap- ǫ-rank has been studied and provided several applications.
In [2] it is shown that it is at least Ω( logn ) and at most
proximate given ones is motivated not only by algorithmic ǫ2 log(1/ǫ)
applications, but by applications in complexity theory as O(logn). This is used in [2] to derive several applications
ǫ2
well. Following Valiant [45] call a matrix A (r,s)-rigid if for in geometry, coding theory, extremal finite set theory and
any matrix B of the same dimensions as A and rank at most the study of sample spaces supporting nearly independent
r, A−B contains a row with at least s nonzero entries. Here randomvariables. See also [13] for a more recent application
√
the notion of simple matrix is thus a matrix of low rank, and of the lower bound (for the special case ǫ < 1/ n) in com-
the notion of approximation is to allow a limited number of binatorial geometry and in the study of locally correctable
changes in each row. Valiant proved that if an n by n matrix codes over real and complex numbers.
is (Ω(n),nΩ(1))-rigid, then there is no arithmetic circuit of The notion of the ǫ-rank of a matrix is also related to
linear size and logarithmic depth that computes Ax for any learning and to computational geometry. Indeed, the prob-
given input x. Therefore, the main problem in this context lem of computing the ǫ-rank of a given n by n matrix A is
(which is still wide open) is to give an explicit construction equivalent to the geometric problem of finding the minimum
of such a rigid matrix. possible dimension of a linear subspace of Rn that intersects
Another notion that received a considerable amount of the aligned cubes of edge length 2ǫ centered at the columns
attention is the sign-rank of a real matrix. For a matrix of A. In learning, the problem of learning with margins, the
A, rank±(A) is defined as the minimum rank over matrices fat-shattering dimension of a family of functions and the
each of whose entries has the same sign as the correspond- problem of learning functions approximately are all related
ing entry in the original matrix. The notion of approxima- to this notion (see e.g., [4] and the references therein).
tion here refers to keeping the signs of the entries, while the Thefocusofourpaperisalgorithmicapplicationsofǫ-rank;
simplicity of the approximating matrix is measured by its we state our results in the next section.
rank. The sign rank has played a useful role in the study of 1.2 Results
the unbounded error communication complexity of Boolean
functions (see, e.g., [5, 21] and their references), in establish- We begin with bounds on the ǫ-rank. A well known re-
ing lower bounds in learning theory and in providing lower sult of Forster [21] asserts that the sign-rank of any n by n
√
bounds for the size of threshold-of-majority circuits com- Hadamard matrix H is at least Ω( n). This clearly implies
puting a function in AC0 (see [37]). It is clear that for −1,1 the same lower bound for the ǫ-rank of any such matrix for
matrices or matrices with entries of absolute value exceeding any ǫ < 1. Using the approximate γ norm, Linial et al.
2
ǫ, ǫ-rank(A) ≥ rank (A), and simple examples show that in [30] further show that ǫ-rank(H) ≥ (1−2ǫ)n. The following
± gives a slightly stronger estimate, which is tight.
many cases the ǫ-rank is far larger than the sign-rank.
Another line of work on ǫ-rank is motivated by commu- Theorem 1.2. For any n × n Hadamard matrix H and
nication complexity [17, 29, 36]. The ǫ-error private-coin any 0 < ǫ < 1, ǫ-rank(H) ≥ (1−ǫ2)n.
communicationcomplexityofaBooleanfunctionisbounded
from below by the logarithm of the ǫ-rank of the correspond- Next we show a lower bound on the approximate rank of
ing communication matrix (see [28]). In addition, the ǫ- a random d-regular graph. Let A denote the adjacency
G
¯
error private-coin quantum communication complexity of a matrix of a graph G, and A denote the“signed”adjacency
G
matrix where the (i,j) entry is 1 for an edge and −1 for a can be computed using linear programming. The setting
non-edge. We show, in fact, a stronger statement that for a of A + B having constant rank was investigated by Kan-
¯
random d-regular graph the sign rank of A is Ω(d). nan and Theobald [27], who gave a PTAS for the case when
G
Aclosely related result was shown by Linial and Shraib- the rank is a constant. Their algorithm has running time
man [31]. Similarly to the sign rank, define γ∞(A) as the poly(d,1/ǫ) d
2 n ; ours has complexity [O(1/ǫ)] poly(n), giving a
minimumγ normofamatrixthathasthesamesignpattern
2 PTAS for ǫ-rank d = O(logn).
as A and all entries at least 1 in magnitude. It is known that
rank (A) = O(γ∞(A)log(mn)) [14] and also that the sign Theorem 1.6. Let A,B ∈ [−1,1]n×n be the payoff ma-
± 2 ∞ trices of a 2-player game. If A + B is positive semidefinite,
rank can be exponentially smaller than γ2 [16, 42]. Linial
∞ √ then an ǫ-Nash equilibrium of the game can be computed by
and Shraibman show that γ (A ) = Ω( d) for a random
2 G
d-regular graph, which also implies an Ω(d) lower bound on a Las Vegas randomized algorithm using poly(n) space and
the ǫ-approximate rank for any constant ǫ < 1/2. expected time
Our proof is different from these γ techniques and relies, 2
2 O(log(1/ǫ)/ǫ )
as in previous lower bounds on the sign rank [5], on Warren’s n
theorem from real algebraic geometry [47]. i.e., there is a PTAS to compute an ǫ-Nash equilibrium.
Theorem 1.3. For almost all d-regular graphs G on n The above theorem can be recovered by a similar algorithm
¯ using the γ2-norm approach.
vertices rank (A ) = Ω(d) for the adjacency matrix of G.
± G The next theorem is more general (the γ approach seems
2
By“almostall”wemeanthatthefractionofd-regulargraphs to achieve only a weaker bound of (n/ǫ)O(d) here).
on n vertices for which the statement holds tends to 1 as n n×n
¯ Theorem 1.7. Let A,B ∈ [−1,1] be the payoff ma-
tends to infinity. Note that as A =2A −J,whereJ isthe
G G
all ones matrix, this result also implies that ǫ-rank(A ) = trices of a 2-player game. Suppose (ǫ/2)-rank(A + B) =
G
Ω(d) for any ǫ < 1/2. The lower bound here is tight for d and suppose we have a matrix C of rank d satisfying
kA+B−Ck ≤ǫ/2. Then, an ǫ-Nash equilibrium of the
ǫ-rank up to a logn factor: it follows from results in [30, 29] ∞
that for any fixed ǫ bounded away from zero the ǫ-rank of game can be computed by a Las Vegas randomized algorithm
A foreveryd-regulargraphonnverticesisO(dlogn). The using poly(n) space and expected time
G
lower bound is tight for sign rank, as a result of [5] implies
1 O(d)
that the sign rank is bounded by the maximal number of ǫ poly(n).
sign changes in each row.
The ǫ-rank of any positive semidefinite matrix can be Notethatinparticular if the rank of A+B is d ≤ O(logn)
bounded from above as stated in the next theorem. The we can simply take C = A + B and get a polynomial time
theorem follows via a direct application of the Johnson- algorithm.
Lindenstrauss lemma [26]. Our second application is to finding an approximately
Theorem 1.4. Forasymmetricpositivesemi-definite n× densest (biparite) subgraph, a problem that has thus far
n matrix A with |A | ≤ 1, we have evaded a PTAS even in the dense setting. In fact, there are
ij hardness results indicating that even the dense case is hard
ǫ-rank(A) ≤ 9logn. to approximate to within any constant factor, see [3]. Here
ǫ2 −ǫ3 we observe that we can get efficiently a good additive ap-
Note that this is nearly tight, by the above mentioned lower proximation for the special case that the input matrix has a
bound for the ǫ-rank of the identity matrix. small ǫ-rank (assuming we are given an approximating ma-
The last theorem can be extended to linear combinations trix of low rank). For a matrix A with entries in [0,1] and
subsets S,T of rows and columns, let A be the submatrix
of positive semi-definite (=PSD) matrices. S,T
induced by S and T. The density of the submatrix A is
P S,T
Corollary 1.5. Let A = m αB where |α | ≤ 1 are P
i=1 i i i A
scalars and Bi are n×n PSD matrices with entries at most density(A ) = i∈S,j∈T ij
1 in magnitude. Then S,T |S||T|
2logn the average of the entries of the submatrix.
ǫ-rank(A) ≤ Cm ǫ2 Theorem 1.8. Let A be an n×n real matrix with entries
for an absolute constant C. in [0,1]. Then for any integer 1 ≤ k ≤ n, there is a Las
Theresultsaboveprovideseveralalgorithmicapplications. Vegas randomized algorithm to find subsets S,T of rows and
Our first application is finding approximate Nash equilibria columns with |S| = |T| = k s.t.,
in 2-player games. Lipton et al. [32] showed that an ǫ-Nash density(A ) ≥ max density(A ) −ǫ.
S,T U,V
2 |U|=|V|=k
O(logn/ǫ )
for any 2-player game can be computed in time n
2
and it has been an important open question to determine O(log(1/ǫ)/ǫ )
Its expected time complexity is bounded by n if
whether this problem has a PTAS (i.e., an algorithm of com- A is PSD and by 1 O(d)poly(n) where d = (ǫ/2)-rank(A)
plexity of nf(ǫ)). Our result establishes a PTAS when A+B ǫ
(and we are given an ǫ/2-approximation of rank d), and its
is PSD or when A+B has ǫ-rank O(logn) (and we are given space complexity is poly(n).
an ǫ-approximating matrix C of A+B with rank O(logn)),
where A and B are the payoff matrices of the game. Note Note that this is a bipartite version of the usual densest
that the special case when A + B = 0 corresponds to zero- subgraph problem in which the objective is to find the den-
sum games, a class for which the exact Nash equilibrium sity of the densest subgraph on (say) 2k vertices in a a given
(possibly weighted) input graph. It is easy to see that the where the minimum is taken over all subspaces U of dimen-
answers to these two problems can differ by at most a factor sion m−i+1. Similarly,
of 2, and as the best known polynomial time approximation T
λ (C) = min max x Cx.
algorithm for this problem, given in [15], only provides an j
1/4+o(1) W, dim(W)=m−j+1x∈W,kxk=1
O(n )-approximation for an n-vertex graph, this bi- Put V = U ∩ W. Clearly, the dimension of V is at least
partite version also appears to be very difficult for general m−i−j+2andforanyx∈V,kxk=1,
graphs.
Ourresults for general matrices are based on an algorithm T T T
x (B+C)x=x Bx+x Cx≤λ(B)+λ (C).
to efficiently find a near-optimal cover of a convex body A i j
Therefore, λ (B+C)is equal to
bytranslates of another convex body B. We state this result i+j−1
here as it seems to be of independent interest. Let N(A,B) T
min max x (B+C)x≤λ(B)+λ (C).
denote the minimum number of translates of B required to i j
cover A. Z, dim(Z)=m−i−j+2x∈Z,kxk=1
✷
Theorem 1.9. For any two centrally symmetric convex Proof. (of Theorem 1.2). Let E be an n by n matrix,
bodies A,B in ℜd, a cover of A using translates of B of size
O(d) kEk∞ ≤ǫsothatthe rank of H+E is the ǫ-rank of H. Let
N(A,B)2 can be enumerated by a Las Vegas randomized BandC bethefollowing two symmetric 2n by 2n matrices
O(d)
algorithm with expected running time N(A,B)2 and us-
ing space poly(d). B= 0 H , (2)
HT 0
In essence, this theorem allows us to find and use covers of
O(d) O(d)
size (1/ǫ) rather than (d/ǫ) that can be constructed
more easily. Although we do not prove it here, we expect C= 0 E . (3)
that the theorem can be extended to asymmetric convex ET 0
bodies. Since H is a Hadamard matrix, BTB = B2 is n times the
2
1.3 Organization 2n by 2n identity matrix. It follows that λ (B) = n for all i,
i
The rest of the paper is organized as follows. In Section and by Lemma 2.1, part (i), exactly n eigenvalues of B are
√ √ √
n and exactly n are − n. In particular λ (B) = − n.
2 we describe lower and upper bounds for the ǫ-rank of a n+1
matrix, presenting the proofs of Theorems 1.2, 1.3, 1.4 and The absolute value of every entry of E is at most ǫ, and
2 2
Corollary 1.5. In Section 3 we describe efficient construc- thus the square of the Frobenuis norm of C is at most 2n ǫ .
As this is the trace of CTC, that is, the sum of squares of
tions of ǫ-nets which are required to derive the algorithmic 2
applications, and prove Theorem 1.9. Section 4 contains the eigenvalues of C, it follows that C has at most 2nǫ eigenval-
√
algorithmic applications including the proofs of Theorems ues of absolute value at least n. By Lemma 2.1, part (i),
2
1.6, 1.7 and 1.8. The final Section 5 contains some conclud- this implies that there are at most ǫ n eigenvalues of C of
√ √
value at least n and thus λ 2 (C) < n. By Lemma
ing remarks, open problems and plans for future extensions. ⌊ǫ n⌋+1
2.1, part (ii) we conclude that λ 2 (B + C) < 0.
n+⌊ǫ n⌋+1
2. BOUNDSONǫ-RANK Therefore B + C has at least n − ⌊ǫ2n⌋ negative eigenval-
ues and hence also at least n − ⌊ǫ2n⌋ positive eigenvalues.
2.1 Lowerbounds Therefore its rank, which is exactly twice the rank of H+E,
is at least 2(n − ⌊ǫ2n⌋), completing the proof. ✷
Westart with the proof of Theorem 1.2, which is based on
some spectral techniques. For a symmetric m by m matrix 1
Alet λ (A) ≥ λ (A) ≥ ... ≥ λ (A) denote its eigenvalues, Remark: By the last theorem, if ǫ < √n then the ǫ-rank
1 2 m of any n by n Hadamard matrix is n. This is tight in the
ordered as above. We need the following simple lemma. sense that for any n which is a power of 4 there is an n by
Lemma 2.1. (i) Let A be an n by n real matrix, then the 1
n Hadamard matrix with ǫ-rank n−1 for ǫ = √n. Indeed,
2n eigenvalues of the symmetric 2n by 2n matrix the matrix
B= 0 A (1) +1 +1 +1 −1
T
A 0 +1 −1 +1 +1
H1 = +1 +1 −1 +1 (4)
appear in pairs λ and −λ. +1 −1 −1 −1
(ii) For any two symmetric m by m matrices B and C and
all admissible values of i and j is a 4 by 4 Hadamard matrix in which the sum of elements in
λ (B+C)≤λ(B)+λ (C). every row is of absolute value 2. Thus, the tensor product of
i+j−1 i j k copies of this matrix is a Hadamard matrix of order n = 4k
Proof. (i) Let (x,y) be the eigenvector of an eigenvalue in which the sum of elements in every row is in absolute value
λ of B, where x and y are real vectors of length n. Then k 1 1
T 2 . We can thus either add or subtract k = √ to each
Ay =λx and A x=λy. It is easy to check that the vector 2 n
(x,−y) is an eigenvector of the eigenvalue −λ of B. This element of this matrix and get a matrix in which the sum of
proves part (i). elements in every row is 0, that is, a singular matrix.
(ii) The proof is similar to that of the Weyl Inequalities - c.f., In a similar vein, we can show that the ǫ-rank of any
e.g., [23], and follows from the variational characterization n-by-n matrix A with entries in [−1,1] is at most n − 1
6
of the eigenvalues. It is easy and well known that for ǫ = √ . Indeed, for any such A = (aij) there are, by
n
T the main result of [44], δj ∈ {−1,1} so that for every i,
λi(B) = min max x Bx Pn √
| a δ | ≤ 6 n. Fix such δ and define, for each i,
U, dim(U)=m−i+1x∈U,kxk=1 j=1 ij j j
no reviews yet
Please Login to review.