SI151: Optimization and Machine Learning

Yuanming Shi, ShanghaiTech University, Spring 2018

Description

This course provides a broad introduction to machine learning, statistical learning, and deep learning, with particular emphasis on learning models, optimization algorithms, and statistical analysis. Topics include supervised learning (e.g., generative learning, parametric and nonparametric learning, regression, classification, support vector machines, neural networks); unsupervised learning (e.g., clustering, dimensionality reduction, kernel methods, density estimation); and statistical learning theory (e.g., the bias-variance tradeoff, VC theory, large margins). The course also introduces optimization methods (e.g., gradient methods, proximal methods, quasi-Newton methods, stochastic and randomized algorithms) suited to the large-scale problems that arise in machine learning applications.

Textbooks and Optional References

Textbooks:

Learning from Data, by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, AMLBook New York, 2012.
Convex Optimization, by S. Boyd and L. Vandenberghe, Cambridge University Press, 2004.

References:

Pattern Recognition and Machine Learning, by C. M. Bishop, Springer, 2006.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by T. Hastie, R. Tibshirani, and J. Friedman, Springer, 2009.
Deep Learning, by I. Goodfellow, Y. Bengio and A. Courville, MIT Press, 2016.
Convex Optimization: Algorithms and Complexity, by S. Bubeck, Foundations and Trends in Machine Learning, 2015.
First-order Methods in Optimization, by A. Beck, MOS-SIAM Series on Optimization, 2017.
Non-convex Optimization for Machine Learning, by P. Jain and P. Kar, Foundations and Trends in Machine Learning, 2017.

Lectures

Foundations
1. The learning problem
2. Training versus testing
3. The linear model
4. Overfitting
5. Three learning principles
Techniques
1. Similarity-based methods
2. Neural networks
3. Support vector machines
4. Learning aides
Optimization
1. Convex and nonconvex optimization
2. First-order optimization algorithms
3. Second-order optimization algorithms
4. Stochastic optimization algorithms