
DIMACS/TRIPODS Workshop on Optimization and Machine Learning

August 13, 2018 - August 15, 2018

Location:

Iacocca Hall

Lehigh University

Bethlehem PA


Organizer(s):

Frank Curtis, Lehigh University

Stefanie Jegelka, Massachusetts Institute of Technology

Satyen Kale, Google

Edo Liberty, Amazon

Han Liu, Northwestern University

Francesco Orabona, Stony Brook University

Katya Scheinberg, Lehigh University

Martin Takáč, Lehigh University

The DIMACS/TRIPODS Workshop on Optimization and Machine Learning is being held at Lehigh University in association with two related events. It is immediately preceded by a three-day summer school for doctoral students interested in improving their theoretical and practical skills related to optimization approaches in machine learning. The workshop is followed by the annual Modeling and Optimization: Theory and Applications (MOPTA) conference, which covers related topics but is much broader in scope.

Although machine learning (ML) is still a relatively new field, it is growing at an astounding rate, and ML models are a driving force in optimization today. While some ML models—notably support vector machines and logistic regression—yield to more traditional convex optimization approaches, others—using deep neural networks—lead to nonlinear and nonconvex problems that stymie traditional solvers. This workshop will address several of the biggest optimization challenges that arise in practical ML applications and be organized around three key themes: 1) stochastic and nonconvex methods; 2) adaptive and online methods for learning under uncertainty; and 3) finding and exploiting special structure.

(1) Nonconvex and stochastic optimization methods. The optimization effort in ML applications has grown rapidly in recent years because of the difficulty and importance of training complex nonconvex models, such as deep neural networks. Stochastic methods have a significant advantage on these problems because computing the accurate function and gradient information required by deterministic methods may be prohibitive for very large data sets. At the same time, it is difficult to construct useful second-order information from stochastic estimates, and hence stochastic methods with fast convergence remain out of reach for training deep neural networks. On the other hand, exploiting second-order information seems necessary for deep neural networks to escape saddle-point regions, where training can otherwise terminate at inferior solutions. Significant progress has been made recently on stochastic second-order methods and variance-reduction methods for convex models, as well as on fast deterministic second-order methods for deep neural networks. Second-order methods also require advanced matrix-approximation and sketching techniques to run efficiently. These methods are currently being tested and extended to stochastic nonconvex settings, and their promise makes them a timely and important workshop topic.
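
Variance reduction is one concrete instance of the stochastic techniques described above. The sketch below shows, in plain NumPy, the SVRG idea of correcting cheap single-sample gradients with a periodically recomputed full gradient, applied here to regularized logistic regression. The synthetic data, step size, and epoch length are illustrative assumptions, not recommendations from the workshop.

```python
# Minimal SVRG sketch for regularized logistic regression (illustrative only).
import numpy as np

def logistic_grad(w, X, y, lam):
    """Gradient of (1/n) sum_i log(1 + exp(-y_i x_i^T w)) + (lam/2) ||w||^2."""
    z = y * (X @ w)
    coeff = -y * 0.5 * (1.0 - np.tanh(z / 2.0))   # = -y * sigmoid(-z), numerically stable
    return X.T @ coeff / len(y) + lam * w

def svrg(X, y, lam=1e-3, eta=0.1, epochs=20, inner=None, seed=0):
    """Each epoch recomputes a full gradient at a snapshot and uses it to
    correct cheap single-sample gradient estimates."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    inner = inner or n
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        full_grad = logistic_grad(w_snap, X, y, lam)       # full gradient at the snapshot
        for _ in range(inner):
            i = rng.integers(n)
            Xi, yi = X[i:i + 1], y[i:i + 1]
            g = (logistic_grad(w, Xi, yi, lam)
                 - logistic_grad(w_snap, Xi, yi, lam)
                 + full_grad)                              # variance-reduced estimate
            w -= eta * g
    return w

# Tiny synthetic demo (hypothetical data).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.sign(X @ rng.standard_normal(5))
print(np.round(svrg(X, y), 3))
```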

(2) Methods for learning under uncertainty. Data distributions that change over time are common in machine learning, especially when data are generated sequentially (as in spam filtering). Adaptive and online learning techniques that can handle changing data distributions are therefore essential in such settings. A recent trend is online learning methods that adapt to “easiness” in the data: algorithms that exploit structure in the data (when it exists) in order to learn faster, while retaining optimal worst-case convergence rates. One aim of this workshop is to investigate which notions of easiness are amenable to such algorithms. Another active area of research in adaptive online learning concerns methods that simultaneously provide competitive guarantees against predictors of arbitrary complexity, with guarantees that degrade gracefully as complexity increases. These methods are intimately connected with optimization methods and convex analysis. Beyond their performance guarantees, such algorithms also tend to be parameter-free, which makes them immensely practical because they require no tuning. A further aim of the workshop is to gain insight into the optimal competitive guarantees that can be obtained in various scenarios of learning with unbounded-complexity predictors.
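
As a concrete illustration of adaptivity, the following minimal sketch implements AdaGrad-style per-coordinate step sizes for online prediction with the logistic loss. The nonstationary data stream and the hyperparameters are hypothetical, chosen only to show how the step sizes adapt to the gradients actually observed; parameter-free methods of the kind mentioned above go further and remove the learning rate entirely.

```python
# Minimal sketch of an adaptive online learner (AdaGrad-style per-coordinate steps).
import numpy as np

def adagrad_online(stream, d, eta=1.0, eps=1e-8):
    """Process (x_t, y_t) pairs one at a time; returns final weights and total loss."""
    w = np.zeros(d)
    G = np.zeros(d)                                       # running sum of squared gradients
    total_loss = 0.0
    for x, y in stream:                                   # labels y in {-1, +1}
        margin = y * (w @ x)
        total_loss += np.logaddexp(0.0, -margin)          # log(1 + exp(-margin)), stable
        g = -y * x * 0.5 * (1.0 - np.tanh(margin / 2.0))  # gradient of the logistic loss
        G += g * g
        w -= eta * g / (np.sqrt(G) + eps)                 # rarely updated coordinates get larger steps
    return w, total_loss

# Hypothetical nonstationary stream: the underlying predictor changes halfway through.
rng = np.random.default_rng(1)
d, T = 10, 2000
w1, w2 = rng.standard_normal(d), rng.standard_normal(d)

def stream():
    for t in range(T):
        x = rng.standard_normal(d)
        w_star = w1 if t < T // 2 else w2
        yield x, np.sign(w_star @ x)

w, loss = adagrad_online(stream(), d)
print(f"average online logistic loss: {loss / T:.3f}")
```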

(3) Special structure for optimization in machine learning. The identification of convexity in machine learning problems led to a surge of machine learning methods and related optimization algorithms, with broad impact in theory and applications. Many of these rely on optimality guarantees and on the scalability of the optimization. With the rise of deep learning and other complex models, and concomitant questions about computational versus statistical complexity, there is increasing interest in understanding the properties of nonconvex learning formulations. For example, in some cases nonconvex formulations may be statistically competitive with their convex counterparts, if not preferable to them. Still, the current theoretical understanding of empirical results for nonconvex learning is very limited and in need of deeper insight.

Scalability and guarantees for (nonconvex) optimization rely on additional mathematical structure. The workshop will explore types of structure that aid optimization and their potential impact on machine learning problems. Potential topics include geometric optimization, polynomial optimization, relaxations, restricted and generalized notions of convexity, and blends of discrete and continuous optimization. Examples of the latter are optimization problems arising from learning representations of discrete objects, nonconvex relaxations of discrete optimization problems, and exploiting structure from discrete optimization in nonconvex optimization. Recent examples in machine learning include results for matrix factorization and other recovery problems, as well as for submodular optimization.
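
Submodular optimization, mentioned above, is a classical case where discrete structure yields optimization guarantees: for maximizing a monotone submodular function under a cardinality constraint, the simple greedy rule attains a (1 - 1/e) approximation. The sketch below illustrates that standard greedy rule on a hypothetical coverage function; it is not a description of any particular method presented at the workshop.

```python
# Minimal sketch of the greedy algorithm for monotone submodular maximization
# under a cardinality constraint.
def greedy_submodular(ground_set, f, k):
    """Greedily pick k elements, each time adding the one with the largest marginal gain."""
    selected = []
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for e in ground_set:
            if e in selected:
                continue
            gain = f(selected + [e]) - f(selected)   # marginal gain of adding e
            if gain > best_gain:
                best, best_gain = e, gain
        selected.append(best)
    return selected

# Hypothetical coverage function f(S) = |union of the sets indexed by S|,
# which is monotone and submodular.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
coverage = lambda S: len(set().union(*(sets[e] for e in S))) if S else 0
print(greedy_submodular(list(sets), coverage, k=2))   # picks 'c' first, then 'a'
```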

 

Monday, August 13, 2018

8:45 AM - 9:00 AM

Opening remarks

9:00 AM - 10:00 AM

Plenary talk: Tractable Nonconvex Optimization via Geometry

Suvrit Sra, Massachusetts Institute of Technology

10:00 AM - 10:30 AM
10:30 AM - 11:00 AM
11:00 AM - 11:30 AM

Break

11:30 AM - 12:00 PM

Building Algorithms by Playing Games

Jake Abernethy, University of Michigan

12:00 PM - 12:30 PM
12:30 PM - 1:30 PM

Lunch

1:30 PM - 2:30 PM
2:30 PM - 3:00 PM
3:00 PM - 3:30 PM

Data-dependent Hashing via Nonlinear Spectral Gaps

Alexandr Andoni, Columbia University

3:30 PM - 4:00 PM

Break

4:00 PM - 4:30 PM

Stochastic Optimization for AUC Maximization

Yiming Ying, University at Albany

4:30 PM - 5:00 PM
5:00 PM - 5:30 PM

Contextual Reinforcement Learning

John Langford, Microsoft Research

6:30 PM - 8:00 PM

Social time at the Comfort Suites Bar

 

Tuesday, August 14, 2018

9:00 AM - 9:30 AM
9:30 AM - 10:00 AM

Robustness and Submodularity

Stefanie Jegelka, Massachusetts Institute of Technology

10:00 AM - 10:30 AM
10:30 AM - 11:00 AM

Break

11:00 AM - 11:30 AM
11:30 AM - 12:00 PM
12:00 PM - 12:30 PM

Direct Runge-Kutta Discretization Achieves Acceleration

Aryan Mokhtari, Massachusetts Institute of Technology

12:30 PM - 1:30 PM

Lunch

1:30 PM - 2:30 PM

Plenary talk: Better Models in Optimization

John Duchi, Stanford University

2:30 PM - 3:00 PM

Frank-Wolfe Splitting via Augmented Lagrangian Method

Simon Lacoste-Julien, University of Montreal

3:00 PM - 3:30 PM

Second Order Optimization and Non-convex Machine Learning

Michael Mahoney, University of California, Berkeley

3:30 PM - 4:00 PM

Break

4:00 PM - 4:30 PM

Optimization over Nonnegative Polynomials

Amir Ali Ahmadi, Princeton University

4:30 PM - 5:00 PM
5:00 PM - 7:00 PM

Poster session and reception

Wednesday, August 15, 2018

9:00 AM - 10:00 AM

Plenary talk: Deep Learning with Dense Connectivity

Kilian Weinberger, Cornell University

10:00 AM - 10:15 AM

Break

10:15 AM - 10:45 AM

A Positive Outlook on Negative Curvature

Daniel Robinson, Johns Hopkins University

10:45 AM - 11:15 AM
11:15 AM - 11:45 AM

Convex Lens for Non-convex Problems

Benjamin Haeffele, Johns Hopkins University

11:45 AM - 12:00 PM

Break

12:00 PM - 1:00 PM

Plenary talk: Practical Conditional Gradient Algorithms

Stephen Wright, University of Wisconsin, Madison

1:00 PM - 2:00 PM

Lunch

2:00 PM - 2:30 PM
2:30 PM - 3:00 PM
3:00 PM - 3:30 PM
3:30 PM - 4:00 PM

Break

4:00 PM - 4:30 PM
4:30 PM - 5:00 PM
5:00 PM - 5:30 PM
 

For the most up-to-date information on registration and participation, please see the workshop's primary webpage.

 

The workshop will feature plenary presentations each morning and two parallel sessions of invited presentations each afternoon, as well as a contributed poster session and reception on August 14.

Registration for this event is closed.