Research papers
A list of publications can also be found in my CV  arXiv  MathSciNet  Google Scholar
Cayley graphs

Quasirandom Cayley graphs (with David Conlon)
abstract  arXiv  slides  Oberwolfach report
We prove that the properties of having small discrepancy and having small second eigenvalue are equivalent in Cayley graphs, extending a result of Kohayakawa, Rödl and Schacht, who treated the abelian case. The proof relies on a number of ingredients, including Grothendieck's inequality and nonabelian Fourier analysis. As a corollary, we also prove that a similar result holds in all vertextransitive graphs.
The GreenTao theorem and a relative Szemerédi theorem

A relative Szemerédi theorem (with David Conlon and Jacob Fox)
Geometric and Functional Analysis 25 (2015), 733–762.
abstract  arXiv  blog  Oberwolfach report
The celebrated GreenTao theorem states that there are arbitrarily long arithmetic progressions in the primes. One of the main ingredients in their proof is a relative Szemerédi theorem which says that any subset of a pseudorandom set of integers of positive relative density contains long arithmetic progressions.
In this paper, we give a simple proof of a strengthening of the relative Szemerédi theorem, showing that a much weaker pseudorandomness condition is sufficient. Our strengthened version can be applied to give the first relative Szemerédi theorem for \(k\)term arithmetic progressions in pseudorandom subsets of \(\mathbb{Z}_N\) of density \(N^{c_k}\).
The key component in our proof is an extension of the regularity method to sparse pseudorandom hypergraphs, which we believe to be interesting in its own right. From this we derive a relative extension of the hypergraph removal lemma. This is a strengthening of an earlier theorem used by Tao in his proof that the Gaussian primes contain arbitrarily shaped constellations and, by standard arguments, allows us to deduce the relative Szemerédi theorem.

An arithmetic transference proof of a relative Szemerédi theorem
Mathematical Proceedings of the Cambridge Philosophical Society 156 (2014), 255–261.
abstract  arXivRecently Conlon, Fox, and the author gave a new proof of a relative Szemerédi theorem, which was the main novel ingredient in the proof of the celebrated GreenTao theorem that the primes contain arbitrarily long arithmetic progressions. Roughly speaking, a relative Szemerédi theorem says that if $S$ is a set of integers satisfying certain conditions, and $A$ is a subset of $S$ with positive relative density, then $A$ contains long arithmetic progressions, and our recent results show that $S$ only needs to satisfy a socalled linear forms condition.
This note contains an alternative proof of the new relative Szemerédi theorem, where we directly transfer Szemerédi's theorem, instead of going through the hypergraph removal lemma. This approach provides a somewhat more direct route to establishing the result, and it gives better quantitative bounds.
The proof has three main ingredients: (1) a transference principle/dense model theorem of GreenTao and TaoZiegler (with simplified proofs given later by Gowers, and independently, ReingoldTrevisanTulsianiVadhan) applied with a discrepancy/cuttype norm (instead of a Gowers uniformity norm as it was applied in earlier works), (2) a counting lemma established by Conlon, Fox, and the author, and (3) Szemerédi's theorem as a black box.

A short proof of the multidimensional Szemerédi theorem in the primes (with
Jacob Fox)
American Journal of Mathematics 137 (2015), 1139—1145.
abstract  arXiv  blogTao conjectured that every dense subset of \(\mathcal{P}^d\), the \(d\)tuples of primes, contains constellations of any given shape. This was very recently proved by Cook, Magyar, and Titichetrakun and independently by Tao and Ziegler. Here we give a simple proof using the GreenTao theorem on linear equations in primes and the FurstenbergKatznelson multidimensional Szemerédi theorem.
Large deviations in random graphs

On replica symmetry of large deviations in random graphs (with Eyal Lubetzky)
Random Structures & Algorithms 47 (2015) 109–146.
abstract  arXiv  blog  Oberwolfach reportThe following question is due to Chatterjee and Varadhan (2011). Fix \( 0 \lt p \lt r \lt 1 \) and take \(G\sim \mathcal G(n,p)\), the ErdősRényi random graph with edge density \(p\), conditioned to have at least as many triangles as the typical \(\mathcal G(n,r)\). Is \(G\) close in cutdistance to a typical \(\mathcal G(n,r)\)? Via a beautiful new framework for large deviation principles in \(\mathcal G(n,p)\), Chatterjee and Varadhan gave bounds on the replica symmetric phase, the region of \((p,r)\) where the answer is positive. They further showed that for any small enough \(p\) there are at least two phase transitions as \(r\) varies.
We settle this question by identifying the replica symmetric phase for triangles and more generally for any fixed \(d\)regular graph. By analyzing the variational problem arising from the framework of Chatterjee and Varadhan we show that the replica symmetry phase consists of all \((p,r)\) such that \((r^d,h_p(r))\) lies on the convex minorant of \(x\mapsto h_p(x^{1/d})\) where \(h_p\) is the rate function of a binomial with parameter \(p\). In particular, the answer for triangles involves \(h_p(\sqrt{x})\) rather than the natural guess of \(h_p(x^{1/3})\) where symmetry was previously known. Analogous results are obtained for linear hypergraphs as well as the setting where the largest eigenvalue of \(G\sim\mathcal G(n,p)\) is conditioned to exceed the typical value of the largest eigenvalue of \(\mathcal G(n,r)\). Building on the work of Chatterjee and Diaconis (2012) we obtain additional results on a class of exponential random graphs including a new range of parameters where symmetry breaking occurs. En route we give a short alternative proof of a graph homomorphism inequality due to Kahn (2001) and Galvin and Tetali (2004).

On the variational problem for upper tails in sparse random graphs (with Eyal Lubetzky)
Random Structures & Algorithms, to appear.
abstract  arXiv  blogWhat is the probability that the number of triangles in $\mathcal{G}_{n,p}$, the ErdősRényi random graph with edge density $p$, is at least twice its mean? Writing it as $\exp[ r(n,p)]$, already the order of the rate function $r(n,p)$ was a longstanding open problem when $p=o(1)$, finally settled in 2012 by Chatterjee and by DeMarco and Kahn, who independently showed that $r(n,p)\asymp n^2p^2 \log (1/p)$ for $p \gtrsim \frac{\log n}n$; the exact asymptotics of $r(n,p)$ remained unknown.
The following variational problem can be related to this large deviation question at $p\gtrsim \frac{\log n}n$: for $\delta>0$ fixed, what is the minimum asymptotic $p$relative entropy of a weighted graph on $n$ vertices with triangle density at least $(1+\delta)p^3$? A beautiful large deviation framework of Chatterjee and Varadhan (2011) reduces upper tails for triangles to a limiting version of this problem for fixed $p$. A very recent breakthrough of Chatterjee and Dembo extended its validity to $n^{\alpha}\ll p \ll 1$ for an explicit $\alpha>0$, and plausibly it holds in all of the above sparse regime.
In this note we show that the solution to the variational problem is $\min\{\frac12 \delta^{2/3}\,,\, \frac13 \delta\}$ when $n^{1/2}\ll p \ll 1$ vs. $\frac12 \delta^{2/3}$ when $n^{1} \ll p\ll n^{1/2}$ (the transition between these regimes is expressed in the count of triangles minus an edge in the minimizer). From the results of Chatterjee and Dembo, this shows for instance that the probability that $\mathcal{G}_{n,p}$ for $ n^{\alpha} \leq p \ll 1$ has twice as many triangles as its expectation is $\exp[r(n,p)]$ where $r(n,p)\sim \frac13 n^2 p^2\log(1/p)$. Our results further extend to $k$cliques for any fixed $k$, as well as give the order of the upper tail rate function for an arbitrary fixed subgraph when $p\geq n^{\alpha}$.

On the lower tail variational problem for random graphs
Combinatorics, Probability and Computing, to appear.
abstract  arXivWe study the lower tail large deviation problem for subgraph counts in a random graph. Let $X_H$ denote the number of copies of $H$ in an ErdősRényi random graph $\mathcal{G}(n,p)$. We are interested in estimating the lower tail probability $\mathbb{P}(X_H \le (1\delta) \mathbb{E} X_H)$ for fixed $0 < \delta < 1$.
Thanks to the results of Chatterjee, Dembo, and Varadhan, this large deviation problem has been reduced to a natural variational problem over graphons, at least for $p \ge n^{\alpha_H}$ (and conjecturally for a larger range of $p$). We study this variational problem and provide a partial characterization of the socalled “replica symmetric” phase. Informally, our main result says that for every $H$, and $0 < \delta < \delta_H$ for some $\delta_H > 0$, as $p \to 0$ slowly, the main contribution to the lower tail probability comes from ErdősRényi random graphs with a uniformly tilted edge density. On the other hand, this is false for nonbipartite $H$ and $\delta$ close to 1.

Upper tails and independence polynomials in random graphs (with
Bhaswar B. Bhattacharya,
Shirshendu Ganguly, and Eyal Lubetzky)
abstract  arXivThe upper tail problem in the ErdősRényi random graph $G\sim\mathcal{G}_{n,p}$ asks to estimate the probability that the number of copies of a graph $H$ in $G$ exceeds its expectation by a factor $1+\delta$. Chatterjee and Dembo (2014) showed that in the sparse regime of $p\to 0$ as $n\to\infty$ with $p \geq n^{\alpha}$ for an explicit $\alpha=\alpha_H>0$, this problem reduces to a natural variational problem on weighted graphs, which was thereafter asymptotically solved by two of the authors in the case where $H$ is a clique.
Here we extend the latter work to any fixed graph $H$ and determine a function $c_H(\delta)$ such that, for $p$ as above and any fixed $\delta>0$, the upper tail probability is $\exp[(c_H(\delta)+o(1))n^2 p^\Delta \log(1/p)]$, where $\Delta$ is the maximum degree of $H$. As it turns out, the leading order constant in the large deviation rate function, $c_H(\delta)$, is governed by the independence polynomial of $H$, defined as $P_H(x)=\sum i_H(k) x^k $ where $i_H(k)$ is the number of independent sets of size $k$ in $H$. For instance, if $H$ is a regular graph on $m$ vertices, then $c_H(\delta)$ is the minimum between $\frac12 \delta^{2/m}$ and the unique positive solution of $P_H(x) = 1+\delta$. 
Upper tails for arithmetic progressions in a random set (with
Bhaswar B. Bhattacharya,
Shirshendu Ganguly, and Xuancheng Shao)
abstract  arXivLet $X_k$ denote the number of $k$term arithmetic progressions in a random subset of $\mathbb{Z}/N\mathbb{Z}$ or $\{1, \dots, N\}$ where every element is included independently with probability $p$. We determine, in certain ranges of parameters, the asymptotics of $\log \mathbb{P}(X_k \ge (1+\delta) \mathbb{E} X_k)$. Recently, Chatterjee and Dembo showed that for $p=p_N\to 0$ sufficiently slowly, this problem, for $k=3$, reduces to a natural variational problem. We solve this variational problem asymptotically, thereby obtaining the large deviation rate. More generally, for $k \ge 4$, using the inverse theorem for the Gowers norms, we determine the rate as long as $p \to 0$ extremely slowly. We solve the variational problem by reducing it to the combinatorial problem of maximizing the number of $k$APs in a set of given size, which was solved by Green and Sisask when $k = 3$, and extended to all $k$ here. Our results complement those of Warnke, who used completely different methods to estimate, for the full range of $p$, the large deviation rate up to a constant factor.
Graph limits

Hypergraph limits: a regularity approach
Random Structures & Algorithms 47 (2015), 205–226.
abstract  arXivA sequence of \(k\)uniform hypergraphs \(H_1, H_2, \dots\) is convergent if the sequence of homomorphism densities \(t(F, H_1), t(F, H_2), \dots\) converges for every \(k\)uniform hypergraph \(F\). For graphs, Lováasz and Szegedy showed that every convergent sequence has a limit in the form of a symmetric measurable function \(W \colon [0,1]^2 \to [0,1]\). For hypergraphs, analogous limits \(W \colon [0,1]^{2^k2} \to [0,1]\) were constructed by Elek and Szegedy using ultraproducts. These limits had also been studied earlier by Hoover, Aldous, and Kallenberg in the setting of exchangeable random arrays.
In this paper, we give a new proof and construction of hypergraph limits. Our approach is inspired by the original approach of Lovász and Szegedy, with the key ingredient being a weak FriezeKannan type regularity lemma.

An $L^p$ theory of sparse graph convergence I: limits, sparse random graph models, and power law distributions (with Christian Borgs,
Jennifer T. Chayes, and Henry Cohn)
abstract  arXivWe introduce and develop a theory of limits for sequences of sparse graphs based on $L^p$ graphons, which generalizes both the existing $L^\infty$ theory of dense graph limits and its extension by Bollobáas and Riordan to sparse graphs without dense spots. In doing so, we replace the no dense spots hypothesis with weaker assumptions, which allow us to analyze graphs with power law degree distributions. This gives the first broadly applicable limit theory for sparse graphs with unbounded average degrees. In this paper, we lay the foundations of the $L^p$ theory of graphons, characterize convergence, and develop corresponding random graph models, while we prove the equivalence of several alternative metrics in a companion paper.

An $L^p$ theory of sparse graph convergence II: LD convergence, quotients, and right convergence (with Christian Borgs,
Jennifer T. Chayes, and Henry Cohn)
abstract  arXivWe extend the $L^p$ theory of sparse graph limits, which was introduced in a companion paper, by analyzing different notions of convergence. Under suitable restrictions on node weights, we prove the equivalence of metric convergence, quotient convergence, microcanonical ground state energy convergence, microcanonical free energy convergence, and large deviation convergence. Our theorems extend the broad applicability of dense graph convergence to all sparse graphs with unbounded average degree, while the proofs require new techniques based on uniform upper regularity. Examples to which our theory applies include stochastic block models, power law graphs, and sparse versions of $W$random graphs.

On derivatives of graphon parameters (with László Miklós Lovász)
Journal of Combinatorial Theory, Series A 145 (2017) 364–368.
abstract  arXivWe give a short elementary proof of the main theorem in the paper "Differential calculus on graph on space" by Diao et al. (JCTA 2015), which says that any graphon parameters whose $(N+1)$th derivatives all vanish must be a linear combination of homomorphism densities $t(H, )$ over graphs $H$ on at most $N$ edges.
Graph regularity lemma and applications

On regularity lemmas and their algorithmic applications (with Jacob Fox and László Miklós Lovász)
abstract  arXivSzemerédi's regularity lemma and its variants are some of the most powerful tools in combinatorics. In this paper, we establish several results around the regularity lemma. First, we prove that whether or not we include the condition that the desired vertex partition in the regularity lemma is equitable has a minimal effect on the number of parts of the partition. Second, we use an algorithmic version of the (weak) Frieze—Kannan regularity lemma to give a substantially faster deterministic approximation algorithm for counting subgraphs in a graph. Previously, only an exponential dependence for the running time on the error parameter was known, and we improve it to a polynomial dependence. Third, we revisit the problem of finding an algorithmic regularity lemma, giving approximation algorithms for several coNPcomplete problems. We show how to use the weak Frieze—Kannan regularity lemma to approximate the regularity of a pair of vertex subsets. We also show how to quickly find, for each $\epsilon' > \epsilon$, an $\epsilon'$regular partition with $k$ parts if there exists an $\epsilon$regular partition with $k$ parts. Finally, we give a simple proof of the permutation regularity lemma which improves the towertype bound on the number of parts in the previous proofs to a single exponential bound.
Sphere packing and energy minimization

Sphere packing bounds via spherical codes (with Henry Cohn)
Duke Mathematical Journal 163 (2014), 1965–2002.
abstract  arXiv  blogThe sphere packing problem asks for the greatest density of a packing of congruent balls in Euclidean space. The current best upper bound in all sufficiently high dimensions is due to Kabatiansky and Levenshtein in 1978. We revisit their argument and improve their bound by a constant factor using a simple geometric argument, and we extend the argument to packings in hyperbolic space, for which it gives an exponential improvement over the previously known bounds. Additionally, we show that the CohnElkies linear programming bound is always at least as strong as the KabatianskyLevenshtein bound; this result is analogous to Rodemich's theorem in coding theory. Finally, we develop hyperbolic linear programming bounds and prove the analogue of Rodemich's theorem there as well.

Energyminimizing errorcorrecting codes (with Henry Cohn)
IEEE Transactions on Information Theory 60 (2014), 7442–7450.
abstract  arXiv  blogWe study a discrete model of repelling particles, and we show using linear programming bounds that many familiar families of errorcorrecting codes minimize a broad class of potential energies when compared with all other codes of the same size and block length. Examples of these universally optimal codes include Hamming, Golay, and ReedSolomon codes, among many others, and this helps explain their robustness as the channel model varies. Universal optimality of these codes is equivalent to minimality of their binomial moments, which has been proved in many cases by Ashikhmin and Barg. We highlight connections with mathematical physics and the analogy between these results and previous work by Cohn and Kumar in the continuous setting, and we develop a framework for optimizing the linear programming bounds. Furthermore, we show that if these bounds prove a code is universally optimal, then the code remains universally optimal even if one codeword is removed.
Extremal and Ramsey graph theory

Extremal results in sparse pseudorandom graphs (with David Conlon and Jacob Fox)
Advances in Mathematics 256 (2014), 206–290.
abstract  arXiv  slides  blogSzemerédi’s regularity lemma is a fundamental tool in extremal combinatorics. However, the original version is only helpful in studying dense graphs. In the 1990s, Kohayakawa and Rödl proved an analogue of Szemerédi’s regularity lemma for sparse graphs as part of a general program toward extending extremal results to sparse graphs. Many of the key applications of Szemerédi’s regularity lemma use an associated counting lemma. In order to prove extensions of these results which also apply to sparse graphs, it remained a wellknown open problem to prove a counting lemma in sparse graphs.
The main advance of this paper lies in a new counting lemma, proved following the functional approach of Gowers, which complements the sparse regularity lemma of Kohayakawa and Rödl, allowing us to count small graphs in regular subgraphs of a sufficiently pseudorandom graph. We use this to prove sparse extensions of several wellknown combinatorial theorems, including the removal lemmas for graphs and groups, the ErdősStoneSimonovits theorem and Ramsey’s theorem. These results extend and improve upon a substantial body of previous work.

The critical window for the classical RamseyTurán problem (with Jacob Fox and PoShen Loh)
Combinatorica 35 (2015) 435—476.
abstract  arXiv  blogIn the decades since its introduction, Szemerédi’s regularity lemma has been widely adopted by the combinatorial community as a powerful microscope for studying the asymptotic regimes of extremal problems. Yet this power comes at the cost of limited resolution outside of the very far asymptotic regime, as the regularity lemma’s quantitative bounds necessarily involve towertype dependencies. We investigate the very first application of Szemerédi’s regularity lemma, which was the following celebrated RamseyTurán result proved by Szemerédi in 1972: any \(K_4\)free graph on \(n\) vertices with independence number \(o(n)\) has at most \((1/8+o(1))n^2\) edges. Four years later, Bollobás and Erdős gave a surprising geometric construction, utilizing the isoperimetric inequality for the high dimensional sphere, of a \(K_4\)free graph on \(n\) vertices with independence number \(o(n)\) and \((1/8o(1))n^2\) edges.
Bollobás and Erdős asked to estimate the minimum possible independence number in the critical window, when the number of edges is about \(n^2/8\). This required a level of accuracy beyond the reach of regularitybased approaches, and remained one of the main open problems in this area, receiving considerable attention. In this paper, we develop new regularityfree methods which give rise to nearly bestpossible dependencies, and solve several longstanding open problems concerning this critical window. Along the way, we introduce a new twist on another influential combinatorial technique, known as dependent random choice, which produces substantially better bounds.
Undergraduate research
Independent sets and graph homomorphisms

The number of independent sets in a regular graph
Combinatorics, Probability and Computing 19 (2010), 315–320.
abstract  arXiv  posterWe show that the number of independent sets in an \(N\)vertex, \(d\)regular graph is at most \((2^{d+1}  1)^{N/(2d)}\) where the bound is sharp for a disjoint union of complete \(d\)regular bipartite graphs. This settles a conjecture of Alon (1991) and Kahn (2001).

The number of independent sets in a graph with small maximum degree (with David Galvin)
Graphs and Combinatorics 27 (2011), 177–186
abstract  arXivWe prove an upper bound for the number of independent sets in a graph with maximum degree at most 5. The bound for all graphs is conjectured by Kahn. 
The bipartite swapping trick on graph homomorphisms
SIAM Journal on Discrete Mathematics 25 (2011), 660–680.
abstract  arXiv  posterWe provide an upper bound to the number of graph homomorphisms from \(G\) to \(H\), where \(H\) is a fixed graph with certain properties, and \(G\) varies over all \(N\)vertex, \(d\)regular graphs. This result generalizes a recently resolved conjecture of Alon and Kahn on the number of independent sets. We build on the work of Galvin and Tetali, who studied the number of graph homomorphisms from \(G\) to \(H\) when \(H\) is bipartite. We also apply our techniques to graph colorings and stable set polytopes.
More sums than differences sets
slides
Constructing MSTD sets using bidirectional ballot sequences
Journal of Number Theory 130 (2010), 1212–1220.
abstract  arXiv  longer draftA more sums than differences (MSTD) set is a finite subset \(S\) of the integers such that \(S+S > SS\). We construct a new dense family of MSTD subsets of \(\{0, 1, 2, \dots, n1\}\). Our construction gives \(\Theta(2^n/n)\) MSTD sets, improving the previous best construction with \(\Omega(2^n/n^4)\) MSTD sets by Miller, Orosz, and Scheinerman. 
Counting MSTD sets in finite Abelian groups
Journal of Number Theory 130 (2010), 2308–2322.
abstract  arXivIn an abelian group \(G\), a more sums than differences (MSTD) set is a subset \(A\) of \(G\) such that \(A+A>AA\). We provide asymptotics for the number of MSTD sets in finite abelian groups, extending previous results of Nathanson. The proof contains an application of a recently resolved conjecture of Alon and Kahn on the number of independent sets in a regular graph. 
Sets characterized by missing sums and differences
Journal of Number Theory 131 (2011), 2107–2134.
abstract  arXiv  codeWhat is the probability that a random subset of \(\{0, 1, \dots, n\}\) has more pairwise sums than pairwise differences? What can we say about the structure of such sets? This paper addresses these questions as well as generalizations.
Other topics

Constructing numerical semigroups of a given genus
Semigroup Forum 80 (2010), 242–254.
abstract  arXivIt has been conjectured that the number of numerical semigroups of a given genus \(g\), considered as a sequence in \(g\), has Fibonaccilike properties. We propose a new approach to attacking this conjecture, and give an improved asymptotic lower bound. 
The coefficients of a truncated Fibonacci power series
Fibonacci Quarterly 46/47 (2009), 5355.
abstractWe show that the coefficients of \(\prod_{i=2}^n (1  x^{F_i})\) are all –1, 0, or 1.
Expositions and surveys

The GreenTao theorem: an exposition (with David Conlon and Jacob Fox)
EMS Surveys in Mathematical Sciences 1 (2014), 249–282.
abstract  arXiv
The celebrated GreenTao theorem states that the prime numbers contain arbitrarily long arithmetic progressions. We give an exposition of the proof, incorporating several simplifica tions that have been discovered since the original paper.

Extremal regular graphs: independent sets and graph homomorphisms
abstract  arXiv  blog
This survey concerns regular graphs that are extremal with respect to the number of independent sets, and more generally, graph homomorphisms. The main motivating problem is the following: in the family of $d$regular graphs, which graph $G$ maximize/minimize the quantity $i(G)^{1/v(G)}$, the number of independent sets in $G$ normalized exponentially by the size of $G$? What if $i(G)$ is replaced by some other graph parameter?
We review existing techniques, highlight some exciting recent developments, and discuss open problems and conjectures for future research.