Timings : Wednesdays, 11:00 am - 12:00 noon
Venue : CoRE 301
Organizers : Eric Allender, Pranjal Awasthi, Michael Saks and Mario Szegedy
Seminar webpages for previous semesters can be found here.
In 1989, Allender and Hertrampf showed that every circuit of depth $d$ with AND, OR, NOT, and MOD$_p$ gates ($p$ prime) can be reduced to a depth-3 circuit of size $2^{(\log n)^{O(d)}}$. The case of MOD$_m$ gates for composite $m$ was handled a year later by Yao, and subsequently by Beigel and Tarui, but with an exponentially worse dependence on the depth: $2^{(\log n)^{2^{O(d)}}}$.
We resolve the question for composite moduli, obtaining the same asymptotic bound as Allender and Hertrampf.
Depth reduction is a fundamental question in its own right, and it also has significant implications. For example, one of its immediate consequences is an exponential depth improvement in Williams' program for proving NEXP separations.
This is joint work with Shiteng Chen.
We give an efficient structural decomposition theorem for formulas that depends on their negation complexity, and demonstrate its power with several applications.
Joint work with Ilan Komargodski.
We characterize the streaming space complexity of every symmetric norm $\ell$ (a norm on $R^n$ invariant under sign-flips and coordinate-permutations), by relating this space complexity to the measure-concentration characteristics of $\ell$. Specifically, we provide matching upper and lower bounds (up to polylog factors) on the space complexity of approximating the norm of the stream, where both bounds depend on the median and maximum of $\ell(x)$ when $x$ is drawn uniformly from the $\ell_2$ unit sphere. The same quantity governs many phenomena in high-dimensional spaces, such as large-deviation bounds and the critical dimension in Dvoretzky's Theorem. The family of symmetric norms contains several well-studied norms, such as all $\ell_p$ norms, and indeed we provide a new explanation for the disparity in space complexity between $p \le 2$ and $p>2$. In addition, we apply our general results to easily derive bounds for several norms that were not studied before in the streaming model, including for example the top-$k$ norm and the $k$-support norm, which was recently shown to be effective for machine learning tasks.
Joint work with:
Jaroslaw Blasiok, Vladimir Braverman, Stephen R. Chestnut, Robert Krauthgamer
The current version of the paper can be found at: http://arxiv.org/abs/1511.01111
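To get a feel for the quantities in the theorem above, one can estimate the median of $\ell_p(x)$ over the $\ell_2$ unit sphere by Monte Carlo. The sketch below is our own illustration (not from the paper): it uses the standard trick of normalizing Gaussian vectors, and shows why $p>2$ behaves differently, since the median of $\ell_p$ is then far below its maximum of 1.

```python
import numpy as np

def lp_norm(x, p):
    """The l_p norm of a vector."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

def median_on_sphere(n, p, trials=2000, seed=0):
    """Monte Carlo estimate of the median of l_p(x) for x drawn uniformly
    from the l_2 unit sphere (normalize Gaussian vectors)."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((trials, n))
    x = g / np.linalg.norm(g, axis=1, keepdims=True)
    return float(np.median([lp_norm(row, p) for row in x]))

# For p <= 2 the median is within polylog factors of the maximum of l_p over
# the sphere, while for p > 2 the maximum is 1 (attained at a standard basis
# vector) but the median is only about n^{1/p - 1/2}; this max/median gap is
# what forces polynomial space for p > 2.
n = 400
med1 = median_on_sphere(n, p=1)   # about sqrt(2n/pi), roughly 16 here
med4 = median_on_sphere(n, p=4)   # roughly (3/n)^{1/4}, about 0.3 here
```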
In this talk we consider Boolean circuits over the full binary basis. It is known that almost all Boolean functions of $n$ variables require circuits of size $\Theta(2^n/n)$. At the same time, the best known lower bound of $3n-o(n)$ for an explicit Boolean function (that is, a function from NP) was given by Blum in 1984. This bound, as well as most of the previous bounds, is based on a simple method called gate elimination. The idea of the method is as follows: roughly, to prove a lower bound of $cn$ for a function from a certain class one shows that in any circuit computing a function from the class one can find a substitution of a constant to an input variable that eliminates at least $c$ gates from a circuit and keeps the function in the same class; the bound then follows by induction.
In this talk, we discuss generalizations of gate elimination: we consider more involved substitutions, circuit complexity measures, and generalized circuit models that simplify and improve the analysis.
We show that affine dispersers are not computable by circuits of size $3.01n$. We also show that an explicit construction of dispersers for quadratic varieties (currently unknown) implies stronger lower bounds.
The talk is based on joint works with Magnus G. Find, Edward A. Hirsch, and Alexander S. Kulikov:
http://eccc.hpi-web.de/report/2015/166/
http://eccc.hpi-web.de/report/2015/170/
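The gate elimination argument sketched above can be made concrete on a toy example. The encoding below is ours, purely for illustration (gates as (op, in1, in2) triples), not anything from the papers: substituting a constant into an input forces some wires to constants and turns other gates into mere forwarders, and we count how many gates disappear.

```python
def eliminate(gates, n, var, const):
    """Propagate the substitution x_var = const through the circuit and
    count how many gates are eliminated (become constant or trivial).
    Wires 0..n-1 are inputs; wire n+i is the output of gates[i]."""
    value = {var: const}   # wires forced to a constant
    alias = {}             # gate outputs that just forward another wire
    eliminated = 0
    for out, (op, a, b) in enumerate(gates, start=n):
        a, b = alias.get(a, a), alias.get(b, b)
        va, vb = value.get(a), value.get(b)
        if op == "AND":
            if va == 0 or vb == 0:
                value[out] = 0; eliminated += 1
            elif va == 1 and vb == 1:
                value[out] = 1; eliminated += 1
            elif va == 1:
                alias[out] = b; eliminated += 1   # gate now computes just b
            elif vb == 1:
                alias[out] = a; eliminated += 1
        elif op == "OR":
            if va == 1 or vb == 1:
                value[out] = 1; eliminated += 1
            elif va == 0 and vb == 0:
                value[out] = 0; eliminated += 1
            elif va == 0:
                alias[out] = b; eliminated += 1
            elif vb == 0:
                alias[out] = a; eliminated += 1
        elif op == "XOR":
            if va is not None and vb is not None:
                value[out] = va ^ vb; eliminated += 1
    return eliminated

# Circuit for (x0 AND x1) OR (x0 AND x2): wires 3, 4, 5 are the gate outputs.
# Setting x0 = 0 forces both AND gates to 0 and hence the OR gate as well,
# eliminating all 3 gates.
gates = [("AND", 0, 1), ("AND", 0, 2), ("OR", 3, 4)]
count = eliminate(gates, n=3, var=0, const=0)   # -> 3
```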
As noted by Lovasz and Plummer in their classic book:
"Matching Theory is a central part of graph theory, not only because of its applications, but also because it is the source of important ideas developed during the rapid growth of combinatorics during the last several decades." We consider variants of matching in data streams, where insertions and deletions of edges are revealed in a streaming fashion.

We formulate the conditional Kolmogorov complexity of $x$ given $y$ at precision $r$, where $x$ and $y$ are points in Euclidean spaces and $r$ is a natural number. We demonstrate the utility of this notion in two ways.
1. We prove a point-to-set principle that enables one to use the (relativized, constructive) dimension of a single point in a set $E$ in a Euclidean space to establish a lower bound on the (classical) Hausdorff dimension of $E$. We then use this principle, together with conditional Kolmogorov complexity in Euclidean spaces, to give a new proof of the known, two-dimensional case of the Kakeya conjecture. This theorem of geometric measure theory, proved by Davies in 1971, says that every plane set containing a unit line segment in every direction has Hausdorff dimension 2.
2. We use conditional Kolmogorov complexity in Euclidean spaces to develop the lower and upper conditional dimensions dim($x|y$) and Dim($x|y$) of $x$ given $y$, where $x$ and $y$ are points in Euclidean spaces. Intuitively these are the lower and upper asymptotic algorithmic information densities of $x$ conditioned on the information in $y$. We prove that these conditional dimensions are robust and that they have the correct information-theoretic relationships with the well studied dimensions dim($x$) and Dim($x$) and mutual dimensions mdim($x:y$) and Mdim($x:y$).
Joint work with Jack Lutz.
In this talk, we will survey recent work on using random linear projections, a.k.a. sketches, to analyze massive graphs. Sketches are useful in a variety of computational models, including the dynamic graph stream model, where the input is defined by a stream of edge insertions and deletions that need to be processed in small space. A large number of problems have now been studied in this model, including edge and vertex connectivity, spectral sparsification, triangle counting, densest subgraph, correlation clustering, vertex cover, and matching.
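Linearity is the property that makes sketches work under deletions: a deletion is just an update with weight $-1$. The toy CountSketch-style sketch below (our own minimal illustration, not one of the algorithms from the talk) sketches the edge-indicator vector of a graph and survives an insert/delete pair exactly.

```python
import random

class EdgeSketch:
    """A CountSketch-style linear sketch of the edge-indicator vector.
    Each edge hashes to one bucket with a random sign; because the sketch
    is linear, deleting an edge is just inserting it with weight -1."""

    def __init__(self, width):
        self.width = width
        self.counts = [0] * width

    def _bucket_sign(self, edge):
        # Deterministic pseudorandom (bucket, sign) pair per edge; a
        # stand-in for a proper pairwise-independent hash family.
        r = random.Random(hash((edge, 1234)))
        return r.randrange(self.width), r.choice([-1, 1])

    def update(self, edge, delta):
        b, s = self._bucket_sign(edge)
        self.counts[b] += s * delta

    def estimate(self, edge):
        b, s = self._bucket_sign(edge)
        return s * self.counts[b]

sk = EdgeSketch(width=64)
sk.update((1, 2), +1)         # insert edge (1,2)
sk.update((3, 4), +1)         # insert edge (3,4) ...
sk.update((3, 4), -1)         # ... and delete it again
est_12 = sk.estimate((1, 2))  # -> 1 (the deletion cancelled exactly)
est_34 = sk.estimate((3, 4))  # 0 up to a possible collision with (1,2)
```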
The Stochastic Block Model, or the Planted Partition Model, is the most widely used probabilistic model for community detection and for clustering graphs in various fields, including machine learning, statistics, and the social sciences. Many existing algorithms successfully learn the communities or clusters when the data is drawn exactly according to the model, but they do not work well in the presence of errors. In this talk, I will address the following question: can we design robust polynomial-time algorithms for learning probabilistic models for community detection that work in the presence of adversarial modeling errors?
I will present robust algorithms for (partially) recovering communities or clusters in graphs drawn from the Stochastic Block Model in the presence of modeling errors or noise. These algorithms allow two types of adversarial errors: edge outlier errors (where an adversary can corrupt an arbitrary $\epsilon$ fraction of the edges), and any amount of Feige-Kilian or monotone errors. Mossel, Neeman and Sly (STOC 2015) posed an open question about whether almost exact recovery is possible when the adversary is allowed to add $o(n)$ edges. Our work answers this question affirmatively, even in the case of $k>2$ communities.
Finally, I will describe how our algorithms recover the clusters even when the modeling error is measured in Kullback-Leibler (KL) divergence: these algorithms work when the instances come from any distribution over graphs that is $\epsilon m$-close to the Stochastic Block Model in KL divergence (this result also handles adversarial errors).
This is based on joint work with Konstantin Makarychev and Yury Makarychev.
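For contrast with the robust algorithms of the talk, here is the plain (non-robust) spectral baseline on a clean two-community instance; the construction and parameter choices are ours, purely for illustration. On exact SBM data with a strong signal, the sign pattern of the second eigenvector already separates the communities.

```python
import numpy as np

def sample_sbm(n, p, q, rng):
    """Adjacency matrix of a 2-community SBM: the first n/2 vertices form
    one community, the rest the other; edges appear with probability p
    inside a community and q across."""
    labels = np.array([0] * (n // 2) + [1] * (n - n // 2))
    probs = np.where(labels[:, None] == labels[None, :], p, q)
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(float), labels

def spectral_partition(A):
    """Threshold the eigenvector of the second-largest eigenvalue of A."""
    _, vecs = np.linalg.eigh(A)        # eigenvalues in ascending order
    return (vecs[:, -2] > 0).astype(int)

rng = np.random.default_rng(1)
A, labels = sample_sbm(200, p=0.30, q=0.03, rng=rng)
guess = spectral_partition(A)
# Accuracy up to relabeling of the two communities:
acc = max(np.mean(guess == labels), np.mean(guess != labels))
```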
Many applications require recovering a ground-truth low-rank matrix from noisy observations of its entries. In practice, this is typically formulated as a weighted low-rank approximation problem and solved using non-convex optimization heuristics such as alternating minimization. Such non-convex techniques come with few guarantees. Even worse, weighted low-rank approximation is NP-hard even in the simplest case, when the ground truth is a rank-1 matrix.
Under moderate assumptions on the weight matrix and the ground truth, we prove that the vanilla alternating minimization algorithm with a simple and cheap "clipping" step, which zeroes out rows with high $\ell_2$ norms in the intermediate solutions, recovers the ground truth. In particular, we bound the spectral norm of the difference between the recovered matrix and the ground truth by the spectral norm of the weighted noise plus an additive error term that decreases exponentially with the number of rounds of alternating minimization. This provides the first recovery guarantee for weighted low-rank approximation via alternating minimization with non-binary deterministic weights. It is a significant generalization of the results for matrix completion (the special case with binary weights), since our assumptions are similar to, or weaker than, those made in existing works. Additionally, our proof applies to random initialization, unlike prior approaches that typically require an SVD-based initialization.
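A minimal rank-1 version of weighted alternating minimization looks as follows. This is a noiseless toy of our own, omitting the clipping step (which only matters in the noisy analysis); each half-step is a weighted least-squares problem with a closed form in the rank-1 case.

```python
import numpy as np

def weighted_als_rank1(M, W, iters=200, seed=0):
    """Alternating minimization for min_{u,v} sum_ij W_ij (M_ij - u_i v_j)^2.
    Each half-step is a closed-form weighted least-squares update."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(M.shape[1])
    for _ in range(iters):
        u = ((W * M) @ v) / (W @ v**2)        # u_i = argmin, holding v fixed
        v = ((W * M).T @ u) / (W.T @ u**2)    # v_j = argmin, holding u fixed
    return u, v

rng = np.random.default_rng(0)
n, m = 30, 40
u_star, v_star = rng.standard_normal(n), rng.standard_normal(m)
M = np.outer(u_star, v_star)                  # noiseless rank-1 ground truth
W = rng.uniform(0.75, 1.25, size=(n, m))      # benign non-binary weights
u, v = weighted_als_rank1(M, W)
rel_err = np.linalg.norm(np.outer(u, v) - M) / np.linalg.norm(M)
```

With all weights equal this reduces to power iteration, which recovers an exact rank-1 matrix in one full round; with mildly non-uniform positive weights it still contracts quickly.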
Consider the following algorithmic problem: You are given a degree-d real polynomial over $n$ variables $x_1,...,x_n$. You would like to (approximately) compute the fraction of points in the Boolean hypercube $\{0,1\}^n$ on which the polynomial takes a non-negative value. (Equivalently, you are given a degree-$d$ polynomial threshold function, and you would like to approximately count its satisfying assignments.)
This problem is trivial to solve using random sampling...but what if you are not allowed to use randomness? In this talk we describe an efficient deterministic algorithm for this approximate counting problem. The algorithm employs many ingredients including invariance principles, new central limit theorems for low-degree polynomials, and new structural decompositions of such polynomials. No prior background will be assumed for the talk.
Joint work with Anindya De.
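The randomized baseline that the talk derandomizes is genuinely trivial; here it is in a few lines (our own toy, with a small degree-2 example so the answer can also be computed exactly by enumeration).

```python
import itertools, random

def exact_fraction(poly, n):
    """Exact fraction of {0,1}^n where poly is non-negative (brute force)."""
    pts = itertools.product([0, 1], repeat=n)
    return sum(poly(x) >= 0 for x in pts) / 2**n

def sampled_fraction(poly, n, samples=20000, seed=0):
    """The trivial randomized estimator: sample points uniformly at random."""
    rng = random.Random(seed)
    hits = sum(poly(tuple(rng.randint(0, 1) for _ in range(n))) >= 0
               for _ in range(samples))
    return hits / samples

# Degree-2 polynomial threshold function p(x) = x0*x1 + x2 - 1 on {0,1}^3:
# non-negative on 5 of the 8 points.
poly = lambda x: x[0] * x[1] + x[2] - 1
exact = exact_fraction(poly, 3)      # -> 0.625
approx = sampled_fraction(poly, 3)   # close to 0.625 w.h.p.
```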
Probabilistic modeling of ranking data is an extensively studied problem with applications ranging from understanding user preferences in electoral systems and social choice theory, to more modern learning tasks in online web search, crowd-sourcing and recommendation systems. This work concerns learning the Mallows model -- one of the most popular probabilistic models for analyzing ranking data. In this model, the user's preference ranking is generated as a noisy version of an unknown central base ranking. The learning task is to recover the base ranking and the model parameters using access to noisy rankings generated from the model.
Although well understood in the setting of a homogeneous population (a single base ranking), the case of a heterogeneous population (a mixture of multiple base rankings) has so far resisted algorithms with guarantees on worst-case instances. In this talk I will present a polynomial-time algorithm which provably learns the parameters and the unknown base rankings of a mixture of two Mallows models.
Joint work with Avrim Blum, Or Sheffet and Aravindan Vijayaraghavan.
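For the easy homogeneous case (a single Mallows model), both sampling and recovery are simple; the sketch below is our own, using the standard repeated-insertion sampler, and recovers the base ranking just by sorting items by their average position. It is the mixture case that requires the new techniques of the talk.

```python
import random

def sample_mallows(base, phi, rng):
    """Repeated insertion method: item base[i] is inserted at position j of
    the partial ranking with probability proportional to phi**(i - j), so
    for phi < 1 the base order is the most likely outcome."""
    ranking = []
    for i, item in enumerate(base):
        weights = [phi ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(j, item)
    return ranking

def recover_base(samples, items):
    """Estimate the central ranking by sorting items by average position."""
    avg_pos = {it: 0.0 for it in items}
    for r in samples:
        for pos, it in enumerate(r):
            avg_pos[it] += pos
    return sorted(items, key=lambda it: avg_pos[it])

rng = random.Random(7)
base = [0, 1, 2, 3, 4]
samples = [sample_mallows(base, phi=0.3, rng=rng) for _ in range(3000)]
recovered = recover_base(samples, base)   # -> [0, 1, 2, 3, 4]
```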
I'll describe a fast and simple algorithm for the contextual bandit learning problem, where the learner repeatedly takes an action in response to the observed context, observing the reward only for that action. The algorithm assumes access to an oracle for solving cost-sensitive classification problems and achieves the statistically optimal regret guarantee with only $\tilde{O}(\sqrt{T})$ oracle calls across all $T$ rounds. Joint work with Alekh Agarwal, Satyen Kale, John Langford, Lihong Li, and Rob Schapire.
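The algorithm itself is beyond a short note, but the oracle-based interface is easy to illustrate with a much weaker baseline: $\epsilon$-greedy exploration with a per-action least-squares fit standing in for the cost-sensitive classification oracle. Everything below (the linear reward model, the parameters) is our own toy setup, not the talk's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, T = 3, 2, 2000
theta = rng.standard_normal((K, d))      # hidden linear reward model

def oracle_fit(pairs, d):
    """Supervised 'oracle': least-squares fit of observed rewards on
    contexts for one action (stand-in for cost-sensitive classification)."""
    if len(pairs) < d:
        return np.zeros(d)
    X = np.array([c for c, _ in pairs])
    y = np.array([r for _, r in pairs])
    return np.linalg.lstsq(X, y, rcond=None)[0]

data = [[] for _ in range(K)]            # per-action (context, reward) pairs
total_reward = 0.0
for t in range(T):
    x = rng.standard_normal(d)           # observe the context
    if rng.random() < 0.1:               # explore uniformly at random
        a = int(rng.integers(K))
    else:                                # exploit the oracle's fitted models
        a = int(np.argmax([oracle_fit(data[k], d) @ x for k in range(K)]))
    r = theta[a] @ x + 0.1 * rng.standard_normal()  # reward of chosen arm only
    data[a].append((x, r))
    total_reward += r

avg_reward = total_reward / T            # uniformly random play would average 0
```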