Course 2025-2026 a.y.

21016 - MACHINE LEARNING

Department of Decision Sciences

Course taught in English

DAIHS (8 credits - I sem. - OB | SECS-S/01)

Course Director:
ZORAIDA FERNANDEZ RICO

Classes: 48 (I sem.)

Instructors:
Class 48: TO BE DEFINED

Mission & Content Summary

MISSION

The course addresses core topics in statistical machine learning that are central to modern data analysis, with a particular focus on their application to health and biomedical sciences. As digitalization in healthcare leads to increasingly large, high-dimensional, and complex datasets, there is a growing need for methods that combine predictive accuracy with interpretability. This course equips students with both theoretical understanding and practical tools to meet this challenge. It covers essential models and algorithms, from classical regression and resampling to tree-based methods, support vector machines, and graphical models, along with their applications in uncertainty quantification and multiple testing. By bridging statistical theory with hands-on implementation in Python, the course provides a solid foundation for students to critically apply machine learning methods to real-world problems in health sciences, and contributes to forming professionals capable of both methodological innovation and responsible application in AI-driven healthcare.

CONTENT SUMMARY

1. An introduction to statistical machine learning. Supervised learning; prediction accuracy vs model interpretability; assessment of model accuracy; methods for evaluating predictive uncertainty.

2. Review of regression and classification.

3. Bias-variance trade-off and shrinkage methods.

4. Resampling methods: cross-validation and bootstrap.

5. Nonlinear models: polynomial regression, regression and smoothing splines, generalized additive models.

6. Tree-based methods and their applications in survival analysis: regression and classification trees; bagging; random forests; boosting. Support vector machines and kernel methods.

7. Large-scale hypothesis testing: FWER, Bonferroni, FDR, Benjamini-Hochberg, resampling-based methods.

8. Learning with directed and undirected graphical models. Identification of conditional independencies. Exact and approximate inference methods.
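
As an illustration of the hands-on Python work the course refers to, the Benjamini-Hochberg procedure listed under topic 7 can be sketched as follows. This is a minimal sketch of the standard step-up procedure, not course material: the function name `benjamini_hochberg` and the example p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Step-up procedure: sort the m p-values, find the largest rank k with
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank whose p-value falls under its threshold.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
rejections = benjamini_hochberg(p_values, alpha=0.05)
print(rejections)
```

With these eight example p-values and alpha = 0.05, the procedure rejects the two smallest, whereas a Bonferroni threshold of 0.05 / 8 = 0.00625 would reject only the first. This reflects the less conservative error control that FDR methods offer relative to FWER methods.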

Intended Learning Outcomes (ILO)

KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to:

Design and perform data-driven analyses for interpretation, prediction, and classification, addressing various aspects and subtleties of model selection.
Understand and apply the bias-variance trade-off and shrinkage methods.
Implement resampling methods for reliable model assessment.
Identify and use tree-based and kernel-based algorithms.
Explain principles of large-scale hypothesis testing and control of error rates.

APPLYING KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to:

Formalize real-world problems as precise statistical questions.
Make informed judgments by selecting appropriate machine learning tools.
Assess the statistical significance and uncertainty of analysis results.

Lifelong Learning Skills

By the end of the course, students will be able to:

● Balance the trade-off between model complexity and performance in learning algorithms.

● Develop algorithms to address relevant and evolving learning problems.

● Adapt their analytical skills to ongoing data challenges and emerging technologies in health and biomedical sciences.

Teaching methods

Lectures

DETAILS

The course will be delivered primarily through face-to-face lectures on campus, combining theoretical explanations with practical examples and interactive discussions.

Assessment methods

	Continuous assessment	Partial exams	General exam
Written individual exam (traditional/online)		x	x

ATTENDING AND NON-ATTENDING STUDENTS

All Students: The assessment consists of a single final exam composed of two parts:

● A theoretical part to test understanding of foundational concepts.

● An applied part to evaluate practical problem-solving and implementation skills.

This structure ensures alignment with both knowledge and application-oriented learning outcomes.

Teaching materials

ATTENDING AND NON-ATTENDING STUDENTS

The following books and resources will be used as primary and supplementary references:

● James, G., Witten, D., Hastie, T., Tibshirani, R. (2023). An Introduction to Statistical Learning (Python edition). Springer. (freely available in PDF format at https://www.statlearning.com/)

● Hastie, T., Tibshirani, R., Friedman, J. (2009). The Elements of Statistical Learning. Springer. (freely available in PDF format at https://hastie.su.domains/ElemStatLearn/)

● Murphy, K. (2022). Probabilistic Machine Learning: An Introduction. MIT Press. (freely available in PDF format at https://probml.github.io/pml-book/book1.html)

● Højsgaard, S., Edwards, D., Lauritzen, S. (2012). Graphical Models with R. Springer.

Last change 04/06/2025 12:23