CS 514: Advanced Algorithms II (Sublinear Algorithms) -- Spring 2020


Instructor Sepehr Assadi
Credits 3 units
Schedule Tuesdays 12:00 PM - 3:00 PM in TIL-116 (Livingston campus)
Prerequisites Undergraduate courses on algorithms, complexity theory, discrete mathematics, and probability; mathematical maturity.
Syllabus The full course syllabus is available here. This webpage contains the highlights of the course syllabus, which may be updated as the semester progresses.

Overview

With the emergence of massive datasets across different application domains, there is a rapidly growing interest in solving various problems over immense amounts of data. However, even the most basic algorithms can become computationally prohibitive when processing massive datasets, as the inputs are often too large to be stored in one place or even read once. As a result, a new set of algorithmic tools and ideas is needed for computing with extremely constrained resources. This is the focus of sublinear algorithms, namely, algorithms whose resource requirements (e.g., time or space) are substantially smaller than the size of the input that they operate on.

In this course, we will study various advanced algorithmic ideas through the lens of sublinear algorithms. In particular, we consider the two most canonical models of sublinear algorithms, namely, sublinear time algorithms and streaming algorithms, cover several key algorithmic techniques in these (and related) models, and discuss the limitations inherent to computing with constrained resources.
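As a small illustration of the sublinear time model (not taken from the course materials), the sketch below estimates the fraction of 1s in a large 0/1 array by reading only a few randomly chosen positions. The function name and parameters are hypothetical; the sample size comes from Hoeffding's inequality and depends only on the accuracy and confidence parameters, not on the length of the input.

```python
import math
import random


def estimate_fraction_of_ones(bits, eps=0.05, delta=0.01, rng=None):
    """Estimate the fraction of 1s in `bits` by uniform random sampling.

    By Hoeffding's inequality, the estimate is within +/- eps of the true
    fraction with probability at least 1 - delta.
    Returns (estimate, number_of_positions_read).
    """
    rng = rng or random.Random()
    # k >= ln(2 / delta) / (2 * eps^2) samples suffice; k is independent
    # of len(bits), which is what makes the algorithm sublinear.
    k = math.ceil(math.log(2 / delta) / (2 * eps ** 2))
    # Sample k positions uniformly at random (with replacement).
    hits = sum(bits[rng.randrange(len(bits))] for _ in range(k))
    return hits / k, k


if __name__ == "__main__":
    # Hypothetical input: one million entries, 30% of which are 1.
    data = [1] * 300_000 + [0] * 700_000
    estimate, reads = estimate_fraction_of_ones(data)
    print(f"estimate = {estimate:.3f} after reading {reads} of {len(data)} entries")
```

The trade-off is typical of the area: the answer is approximate and only correct with high probability, but the number of entries read stays fixed as the input grows.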

Logistics

Important Update: Following Rutgers's response to the COVID-19 situation, all in-person instruction for this course is suspended, and we will continue the course through online lectures and online meetings for office hours. Please see the Canvas page for the course for more information (send me an email if you do not have access to Canvas).
This course has no recitation sections.

List of Topics

The following is a tentative list of topics that will be covered in this course. Along the way, we will learn about various key ideas, such as probabilistic analysis of algorithms, compressed sensing, dimensionality reduction, sparsification, sketching, and coresets, that are used extensively in algorithm design as a whole and in sublinear algorithms in particular.

Grading

The final grade for the course will be based on the following weights: Students are expected to follow the Rutgers academic integrity policy for all their work in this course. See the course syllabus for more information.

Problem sets: There will be two problem sets in the course, and a tentative schedule of release and due dates is available on the course calendar. Solutions must be typeset in LaTeX and submitted via Canvas by 11:59pm EST on the Tuesday the problem set is due. Problem sets can (and probably should) be done in teams of up to three students. However, (1) each student should write their solutions completely independently (in particular, you should understand and be able to explain everything that is written in your solution); (2) you should include the names of your collaborators in your solutions.

Update: Due to the recent changes in the course to account for the COVID-19 situation, we will have only two problem sets. Instead, each lecture will also contain a single practice problem on the topics of the lecture. You do not need to turn in a solution for these practice problems.

Project: There is a final project that will consist of exploring a topic of interest related to this course. In particular, this involves reading one or two recent research papers in complete detail to get a sense of the background on a research problem and then exploring ideas for addressing this problem. More details on the project will be released later in the semester in the Project section.

Scribe notes and participation: For each lecture, there will be one team (of one or two students) in charge of taking detailed notes, typing them in LaTeX, preparing any needed figures, and sending them to the Instructor by 11:59pm on the Friday after the lecture to be posted on the course website. When preparing scribe notes, please use this LaTeX template -- just edit it to include your notes. Further instructions on scribing the notes are available in the template.

The course syllabus has further information about each of these assignments.

Course Calendar

The schedule below the red line is tentative and subject to change.

# | Date | Topics | References | Lecture notes and Remarks
1 | Tue 01/21 | Introduction, Course Policy, Probabilistic Analysis | -- | Lecture Notes 1
2 | Tue 01/28 | Sublinear Time Algorithms: Connected Components, Average Degree | CRT05, F06, GR08, S15 | Lecture Notes 2
3 | Tue 02/04 | Query Complexity: OR Function and Connectivity | BW02 | Lecture Notes 3 -- Pset 1 release: [PDF]
4 | Tue 02/11 | Property Testing: Testing Sortedness, Uniform Distribution Testing | EKKRV98, BFRSW00 | Lecture Notes 4
5 | Tue 02/18 | Local Computation Algorithms (LCA): Maximal Independent Set | PR07, RTVX11 | Lecture Notes 5
6 | Tue 02/25 | Compressed Sensing and Sparse Recovery | BHRRS18, RSW18 | Lecture Notes 6 -- Pset 1 due
7 | Tue 03/03 | Streaming Algorithms: Frequency Moments Estimation | AMS96, BJKST02 | Lecture Notes 7 -- Pset 2 release: [PDF]
8 | Tue 03/10 | Communication Complexity: Equality, Index | A96, T16 | Lecture Notes 8
-- | Tue 03/17 | No Class: Spring Recess | -- | --
9 | Tue 03/24 | Streaming Algorithms: Regression via Dimensionality Reduction | CW09 | Lecture Notes 9 -- Pset 2 due
10 | Tue 03/31 | Streaming Algorithms: Clustering via Coresets | GMMMO03, G09 | Lecture Notes 10, [Practice Problem] -- Pset 2 due
11 | Tue 04/07 | Graph Streaming Algorithms: Connectivity, Shortest Paths, Coloring | FKMSZ04, ACK19 | [Practice Problem]
12 | Tue 04/14 | Detour: The Multiplicative Weight Update Method (MWU) | AHK12 | --
13 | Tue 04/21 | Multi-Pass Streaming Algorithms for Bipartite Matching via MWU | AG11 | --
14 | Tue 04/28 | Graph Sketching: AGM Sketch for Connectivity | AGM12 | --

Project

The project can take one of the following forms: A list of project ideas (including open theory problems and some directions to explore) will be posted sometime in March. However, you are strongly encouraged to approach the Instructor before this date with any project idea you have on sublinear algorithms to pick as your own project -- note that your project does not need to be limited to the topics discussed in class as long as it is loosely related to sublinear algorithms.

Project policies: Updated Project Expectations and Requirements Updated Timetable for Projects:

Resources

There is no official textbook for this course and all required materials will be posted on this webpage. The following is a list of some helpful supplementary materials (this list is by no means comprehensive): Last but not least, you should definitely check the List of Open Problems in Sublinear Algorithms as one of the best places to get recent pointers on sublinear algorithms.

Bibliography

This is a (far from comprehensive) list of the papers related to the topics discussed in the lectures. The list will be updated after each lecture to add the new relevant papers.

A96 Farid M. Ablayev, Lower Bounds for One-Way Probabilistic Communication Complexity and Their Application to Space Complexity. Theor. Comput. Sci. 1996, ICALP 1993.
AG11 Kook Jin Ahn, Sudipto Guha, Linear Programming in the Semi-streaming Model with Application to the Maximum Matching Problem. ICALP 2011.
AGM12 Kook Jin Ahn, Sudipto Guha, Andrew McGregor, Analyzing Graph Structure via Linear Measurements. SODA 2012.
AMS96 Noga Alon, Yossi Matias, Mario Szegedy, The Space Complexity of Approximating the Frequency Moments. STOC 1996.
AHK12 Sanjeev Arora, Elad Hazan, Satyen Kale, The Multiplicative Weights Update Method: a Meta-Algorithm and Applications. Theory of Computing 2012.
ACK19 Sepehr Assadi, Yu Chen, Sanjeev Khanna, Sublinear Algorithms for (Δ+1) Vertex Coloring. SODA 2019.
BFRSW00 Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, Patrick White, Testing That Distributions Are Close. FOCS 2000.
BJKST02 Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, D. Sivakumar, Luca Trevisan, Counting Distinct Elements in a Data Stream. RANDOM 2002.
BHRRS18 Paul Beame, Sariel Har-Peled, Sivaramakrishnan Natarajan Ramamoorthy, Cyrus Rashtchian, Makrand Sinha, Edge Estimation with Independent Set Oracles. ITCS 2018.
BW02 Harry Buhrman, Ronald de Wolf, Complexity Measures and Decision Tree Complexity: A Survey. Theor. Comput. Sci., 2002.
CW09 Kenneth L. Clarkson, David P. Woodruff, Numerical Linear Algebra in the Streaming Model. STOC 2009.
CRT05 Bernard Chazelle, Ronitt Rubinfeld, Luca Trevisan, Approximating the Minimum Spanning Tree Weight in Sublinear Time. SIAM Journal on Computing 2005, ICALP 2001.
EKKRV98 Funda Ergün, Sampath Kannan, Ravi Kumar, Ronitt Rubinfeld, Mahesh Viswanathan, Spot-Checkers. STOC 1998.
F06 Uriel Feige, On Sums of Independent Random Variables with Unbounded Variance and Estimating the Average Degree in a Graph. SIAM Journal on Computing 2006, STOC 2004.
FKMSZ04 Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, Jian Zhang, On Graph Problems in a Semi-streaming Model. ICALP 2004.
G09 Sudipto Guha, Tight Results for Clustering and Summarizing Data Streams. ICDT 2009.
GMMMO03 Sudipto Guha, Adam Meyerson, Nina Mishra, Rajeev Motwani, Liadan O'Callaghan, Clustering Data Streams: Theory and Practice. IEEE Trans. Knowl. Data Eng. 2003, FOCS 2000.
GR08 Oded Goldreich, Dana Ron, Approximating Average Parameters of Graphs. Random Structures and Algorithms 2008, APPROX-RANDOM 2006.
PR07 Michal Parnas, Dana Ron, Approximating the Minimum Vertex Cover in Sublinear Time and a Connection to Distributed Algorithms. Theor. Comput. Sci., 2007.
RSW18 Aviad Rubinstein, Tselil Schramm, S. Matthew Weinberg, Computing Exact Minimum Cuts Without Knowing the Graph. ITCS 2018.
RTVX11 Ronitt Rubinfeld, Gil Tamir, Shai Vardi, Ning Xie, Fast Local Computation Algorithms. I(T)CS 2011.
S15 C. Seshadhri, A Simpler Sublinear Algorithm for Approximating the Triangle Count. Available on arXiv.
T16 Tim Roughgarden, Communication Complexity (for Algorithm Designers). Foundations and Trends in Theoretical Computer Science 2016.

LaTeX

You can download LaTeX for free here. For the purpose of this course, you do not even need to install LaTeX and can instead use an online LaTeX editor such as Overleaf.

Two great introductory resources for LaTeX are A Short Introduction to LaTeX by Allin Cottrell (for general purpose LaTeX) and LaTeX for Undergraduates by Jim Hefferon (for undergraduate mathematics), accompanied by the following cheatsheet (note that this document uses the "\( MATH \)" notation rather than the perhaps more widely used "$ MATH $" -- both are completely fine in LaTeX). You can also use the wonderful tool Detexify by Daniel Kirsch for finding the LaTeX command for a symbol (just draw the symbol!).
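As a quick illustration of the two inline math notations mentioned above, here is a minimal sketch of a LaTeX file. The document class and the sentence are placeholders chosen for this example and are not part of the course template.

```latex
\documentclass{article}
\begin{document}
% Inline math using the \( ... \) delimiters:
The algorithm uses \( O(\log n) \) space.

% The same sentence using the $ ... $ delimiters:
The algorithm uses $ O(\log n) $ space.
\end{document}
```

Both sentences typeset identically, so you can use whichever notation you prefer.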

If you are interested in learning more about LaTeX (beyond what is needed for this course), check the Wikibook on LaTeX and the Wikibook on LaTeX for Mathematics.