Course 2025-2026 a.y.

20600 - DEEP LEARNING FOR COMPUTER VISION

Department of Computing Sciences

Course taught in English

DSBA (6 credits - I sem. - OP | ING-INF/05)

Course Director:
FABRIZIO IOZZI

Classes: 31 (I sem.)

Instructors:
Class 31: TO BE DEFINED

Suggested background knowledge

Students should be familiar with: Linear algebra; Rudiments of probability and statistics; Basics of machine learning and model fitting (overfitting and underfitting concepts); Neural networks (multi-layer perceptron and backpropagation); Python programming.

PREREQUISITES

Linear algebra, rudiments of probability and statistics. Basics of machine learning and model fitting (overfitting and underfitting concepts), neural networks (multi-layer perceptron and backpropagation). Good knowledge of Python.

Mission & Content Summary

MISSION

Computer Vision is a rapidly evolving field with applications in areas such as search, medicine, robotics, and autonomous vehicles. At the heart of many of these systems are visual recognition tasks like image classification, object detection, and segmentation. In recent years, deep learning has significantly advanced the performance of these tasks, often outperforming traditional hand-crafted methods. This course offers a deep dive into the use of deep neural networks for computer vision, starting with core concepts like Convolutional Neural Networks (CNNs) and progressing to more advanced models for complex vision problems. Students will gain both theoretical understanding and practical experience through hands-on assignments and projects. Students will implement and train their own models, apply them to real-world datasets, and complete a final project involving large-scale neural networks. By the end of the course, they will be equipped with the skills to tackle a wide range of visual recognition challenges using modern deep learning techniques.

CONTENT SUMMARY

Convolutional neural networks are mature, flexible, and powerful non-linear data-driven models that have been successfully applied to solve complex tasks in science and engineering. The advent of the deep learning paradigm, i.e., the use of neural networks to simultaneously learn an optimal data representation and the classification model, has further advanced the data-driven paradigm. These topics will be covered in the course according to the following detailed program:

Introduction to Computer Vision and basics of digital images
Basics of image transformations and image filtering
Image Classification with Linear Classifiers
Convolutional Neural Networks for Image Classification
CNN Architectures
Advanced Deep Learning architectures
Object Detection, Image Segmentation
Techniques for Visual Data Visualization and Interpretation
Unsupervised and Self-supervised Learning
Generative Models
Emerging Topics in Vision: Video Understanding, 3D Perception, Multimodal Models

Intended Learning Outcomes (ILO)

KNOWLEDGE AND UNDERSTANDING

At the end of the course students will be able to...

Identify the right CNN architecture to solve different visual recognition problems
Recognize best practices, leveraging popular techniques such as dropout and data augmentation
Describe and get inspiration from the most successful Deep Learning architectures
Explain the most successful Computer Vision applications solved by Deep Learning models
Illustrate complex techniques beyond the fundamental ones presented during lectures

APPLYING KNOWLEDGE AND UNDERSTANDING

At the end of the course students will be able to...

Analyze a specific Computer Vision problem and find which model best solves the task at hand
Use fundamental deep learning algorithms for Computer Vision autonomously
Compare the various models and find the most relevant one to apply to the specific problem
Examine the selected model in order to balance performance, computational complexity and overfitting
Discuss the pros and cons of different Computer Vision techniques for a specific problem
Develop new pipelines adapted to the specific problem at hand

Teaching methods

Practical Exercises
Individual works / Assignments
Collaborative Works / Assignments

DETAILS

The course follows an interactive and hands-on teaching modality with a strong emphasis on practical aspects. On top of the laboratory sessions, customarily held after most lectures, the course leverages project-based learning to enable students to apply the principles covered during lectures to real-world computer vision tasks.
During practical sessions, carefully selected code samples cover the key components of image analysis and of convolutional neural networks for image classification, segmentation, object recognition, and image generation. Students are encouraged to follow along and experiment with the code to gain a solid grasp of the underlying concepts.
Projects are assigned to groups to foster a deeper understanding of the subject. The students are divided into teams and will face a two-phase project. The first phase, which takes place during the first half of the course, is meant to teach the students how to use CNN models to solve a basic visual recognition task. In the second phase, students are invited to choose a specific computer vision problem to be solved with advanced deep learning models. The projects need to be diverse among the teams, challenging, and relevant to current real-world applications.
During project development, students are expected to take advantage of the methods and skills presented during lectures to solve their specific task. At the end of the course, each team presents its project to the entire class. This presentation fosters a collaborative learning environment where teams can learn from each other's successes and challenges.

Assessment methods

	Continuous assessment	Partial exams	General exam
Written individual exam (traditional/online)			x
Individual Works / Assignments (report, exercise, presentation, project work etc.)	x		

ATTENDING AND NOT ATTENDING STUDENTS

Individual assignments test the students' ability to:

- analyze a specific Computer Vision problem and find which model best solves the task at hand

- use fundamental deep learning algorithms for Computer Vision autonomously

- develop new pipelines adapted to the specific problem at hand

The written exam will also test the students' proficiency in:

- comparing the various models and finding the most relevant one to apply to the specific problem

- examining the selected model in order to balance performance, computational complexity and overfitting

- discussing the pros and cons of different Computer Vision techniques for a specific problem

Teaching materials

ATTENDING AND NOT ATTENDING STUDENTS

Learning materials will be provided by the instructors on the online platform.

Last change 14/05/2025 14:27