30677 - MACHINE LEARNING (MODULE I - INTRODUCTION)
Department of Decision Sciences
OMIROS PAPASPILIOPOULOS
Suggested background knowledge
Mission & Content Summary
MISSION
CONTENT SUMMARY
The course is organized along the following themes:
1. Introduction
- presentation of the goals of the course; statistics vs machine learning, data science vs artificial intelligence; supervised vs unsupervised machine learning; some case studies and toy data sets
- overview of supervised learning by showcasing prediction results and challenges on the case studies and toy examples
- models for machine learning; loss functions; learning as an optimization problem
2. Predictive modelling pt 1
- a basic linear model; learning as a least squares problem; illustrations on case studies and toy examples
- feature engineering pt 1; models of increasing complexity; evaluating predictive performance pt 1
3. Preprocessing
- categorical (input/output) variables, transformations, basis functions, data splits
- case study: predicting with text and images
- dealing with missing data
4. Predictive modelling pt 2
- bias-variance tradeoff; best subset selection and the lasso
- optimizing hyperparameters: cross-validation
- classification pt 1: main concepts and algorithms
- classification pt 2: measuring performance and multiclass
- case studies
5. Smooth lines and curves
- regression splines
6. Predictive modelling pt 3
- regression and classification trees: concepts, interpretations and training algorithms
- bagging, random forests, and ensemble methods
7. Network data and algorithms
- introduction to networks
- network statistics and connectivity properties
- visualization and community detection
- basic models for networks
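As an illustration of the themes "learning as a least squares problem", "data splits" and "evaluating predictive performance", here is a minimal sketch in Python. It is not taken from the course materials: the synthetic data, the function names `fit_line` and `mean_squared_error`, and the 150/50 train/test split are assumptions chosen only for illustration.

```python
import random


def fit_line(xs, ys):
    """Fit y = intercept + slope * x by minimising the sum of squared errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def mean_squared_error(xs, ys, intercept, slope):
    """Average squared prediction error of the fitted line on (xs, ys)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data set (assumed for illustration): y = 2 + 3x plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [2.0 + 3.0 * x + rng.gauss(0.0, 0.1) for x in xs]

# Data split: learn on the first 150 points, evaluate on the remaining 50.
x_train, y_train = xs[:150], ys[:150]
x_test, y_test = xs[150:], ys[150:]

intercept, slope = fit_line(x_train, y_train)
test_mse = mean_squared_error(x_test, y_test, intercept, slope)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, test MSE={test_mse:.4f}")
```

The fitted coefficients should land close to the values used to generate the data, and the test error should be close to the noise variance. Evaluating on held-out data rather than on the training data is what makes the error a measure of predictive performance.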
Intended Learning Outcomes (ILO)
KNOWLEDGE AND UNDERSTANDING
+ understand basic predictive algorithms
+ distinguish between prediction and causal inference
+ appreciate what missing data are and how to deal with them
+ identify network structures
+ analyze network data
APPLYING KNOWLEDGE AND UNDERSTANDING
+ build predictive algorithms
+ evaluate predictive performance
+ smooth one- and two-dimensional data
+ carry out network analytics
Teaching methods
- Lectures
- Practical Exercises
- Collaborative Works / Assignments
DETAILS
+ practical exercises: applying algorithms to real and synthetic datasets
+ collaborative work: a group project on either predictive analytics or network analytics
Assessment methods
| Continuous assessment | Partial exams | General exam |
|---|---|---|
| x | x | |
| x | | |
Teaching materials
ATTENDING AND NOT ATTENDING STUDENTS
Slides will be provided for the methodological part of the course; they form an important part of the reading material for understanding the main concepts. One book reference is
https://press.princeton.edu/books/hardcover/9780691222271/quantitative-social-science
which is used in Data Analytics but is also relevant here. An advanced but very classic textbook is
https://link.springer.com/book/10.1007/978-0-387-21606-5
which is freely available online. Note, however, that the mathematical level of the course is much more elementary than that of this book. A further book that can be consulted is
https://www.deeplearningbook.org/
which is also freely available online.