Counter Examples for Stochastic Gradient Descent

October 19, 2022, 11:50 AM - 12:50 PM

Location:

Hill Center, Room 552

Vivak Patel, University of Wisconsin, Madison

Stochastic Gradient Descent (SGD) is a widely deployed algorithm for solving estimation problems that arise in statistics and learning. Accordingly, SGD has been analyzed from many perspectives to understand its behavior and to ensure its reliability, especially from a global convergence/consistency perspective. Unfortunately, we will show through simple examples that existing global convergence analyses make unrealistic deterministic assumptions, which lead to incorrect conclusions or to the use of inappropriate techniques. Specifically, contrary to existing results, we will construct a deterministic example under realistic assumptions for which Gradient Descent (GD) diverges catastrophically. Then, contrary to a popular technique, we will provide a deterministic example for which approximating GD with continuous GD leads to incorrect conclusions about GD. Turning to the stochastic setting, we show that existing assumptions are unrealistic for simple machine learning and statistics problems. Thus, we highlight that GD and SGD do not have an appropriate theory for learning problems. Finally, we provide a result for the global convergence of GD and SGD that addresses this gap.

Bio: Vivak Patel is an assistant professor of statistics at the University of Wisconsin-Madison. Prior to joining the faculty at UW-Madison, Vivak completed his doctorate in statistics at the University of Chicago, his master’s in mathematics at the University of Cambridge, and his Bachelor of Science in Applied Physics and Biomathematics at Rutgers University in New Brunswick. Vivak's research is at the intersection of uncertainty and computing. On the one hand, Vivak and his group analyze and improve computational tools and algorithms that are applied to problems with inherent uncertainty, such as learning and statistical estimation. On the other hand, Vivak and his group use statistical concepts and uncertainty to improve computational tools and algorithms for challenging problems. Vivak and his group apply their work to problems arising in statistics, machine learning, data assimilation, differential equations, and control.

 

This seminar is also presented online via Zoom: https://rutgers.zoom.us/j/99075124232?pwd=UDdPVjRncXZFcXpvbFE0OWJyMVdSUT09

Meeting ID: 99075124232
Password: 952486