SI151: Optimization and Machine Learning

Yuanming Shi, ShanghaiTech University, Spring 2018


This course provides a broad introduction to machine learning, statistical learning, and deep learning, with particular emphasis on learning models, optimization algorithms, and statistical analysis. Topics include supervised learning (e.g., generative learning, parametric and nonparametric learning, regression, classification, support vector machines, neural networks); unsupervised learning (e.g., clustering, dimensionality reduction, kernel methods, density estimation); and statistical learning theory (e.g., the bias-variance tradeoff, VC theory, large margins). The course also introduces optimization methods (e.g., gradient methods, proximal methods, quasi-Newton methods, stochastic and randomized algorithms) suited to large-scale problems arising in machine learning applications.

Textbooks and Optional References


  • Learning from Data, by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin, AMLBook, New York, 2012.

  • Convex Optimization, by S. Boyd and L. Vandenberghe, Cambridge University Press, 2004.



  1. Foundations

    1. The learning problem

    2. Training versus testing

    3. The linear model

    4. Overfitting

    5. Three learning principles

  2. Techniques

    1. Similarity-based methods

    2. Neural networks

    3. Support vector machines

    4. Learning aides

  3. Optimization

    1. Convex and nonconvex optimization

    2. First-order optimization algorithms

    3. Second-order optimization algorithms

    4. Stochastic optimization algorithms