Course 2025-2026 a.y.

20600 - DEEP LEARNING FOR COMPUTER VISION

Department of Computing Sciences

Course taught in English
DSBA (6 credits - I sem. - OP  |  ING-INF/05)
Course Director:
FABRIZIO IOZZI

Classes: 31 (I sem.)
Instructors:
Class 31: TO BE DEFINED


Suggested background knowledge

Students should be familiar with: linear algebra; rudiments of probability and statistics; basics of machine learning and model fitting (overfitting and underfitting concepts); neural networks (multi-layer perceptron and backpropagation); Python programming.

PREREQUISITES

Linear algebra, rudiments of probability and statistics. Basics of machine learning and model fitting (overfitting and underfitting concepts), neural networks (multi-layer perceptron and backpropagation). Good knowledge of Python.

Mission & Content Summary

MISSION

Computer Vision is a rapidly evolving field with applications in areas such as search, medicine, robotics, and autonomous vehicles. At the heart of many of these systems are visual recognition tasks like image classification, object detection, and segmentation. In recent years, deep learning has significantly advanced the performance of these tasks, often outperforming traditional hand-crafted methods. This course offers a deep dive into the use of deep neural networks for computer vision, starting with core concepts like Convolutional Neural Networks (CNNs) and progressing to more advanced models for complex vision problems. Students will gain both theoretical understanding and practical experience through hands-on assignments and projects. Students will implement and train their own models, apply them to real-world datasets, and complete a final project involving large-scale neural networks. By the end of the course, they will be equipped with the skills to tackle a wide range of visual recognition challenges using modern deep learning techniques.

CONTENT SUMMARY

Convolutional neural networks are mature, flexible, and powerful non-linear data-driven models that have been successfully applied to solve complex tasks in science and engineering. The advent of the deep learning paradigm, i.e., the use of neural networks to simultaneously learn an optimal data representation and the classification model, has further advanced the data-driven paradigm. These topics will be described in the course according to the following detailed program:

  • Introduction to Computer Vision and basics of digital images

  • Basics of image transformations and image filtering

  • Image Classification with Linear Classifiers

  • Convolutional Neural Networks for Image Classification

  • CNN Architectures

  • Advanced Deep Learning architectures

  • Object Detection, Image Segmentation

  • Techniques for Visual Data Visualization and Interpretation

  • Unsupervised and Self-supervised Learning

  • Generative Models

  • Emerging Topics in Vision: Video Understanding, 3D Perception, Multimodal Models


Intended Learning Outcomes (ILO)

KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to...
  • Identify the right CNN architecture to solve different visual recognition problems 

  • Recognize best practices, leveraging the most popular techniques such as dropout and data augmentation

  • Describe and get inspiration from the most successful Deep Learning architectures 

  • Explain the most successful Computer Vision applications to be solved by Deep Learning models 

  • Illustrate complex techniques beyond the fundamental ones presented during lectures 

 

APPLYING KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to...
  • Analyze a specific Computer Vision problem and find which model best solves the task at hand 
  • Use fundamental deep learning algorithms for Computer Vision autonomously 
  • Compare the various models and find the most relevant one to apply to the specific problem
  • Examine the selected model in order to balance performance, computational complexity and overfitting 
  • Discuss the pros and cons of different Computer Vision techniques for a specific problem 
  • Develop new pipelines adapting to the specific problem at hand 

Teaching methods

  • Practical Exercises
  • Individual works / Assignments
  • Collaborative Works / Assignments

DETAILS

The course follows an interactive and hands-on teaching modality with a strong emphasis on practical aspects. On top of the laboratory sessions, customarily held after most lectures, the course leverages project-based learning to enable students to apply the principles covered during lectures to real-world computer vision tasks. 
During practical sessions, carefully selected code samples cover the key components of image analysis and of convolutional neural networks for image classification, segmentation, object recognition, and image generation. Students are encouraged to follow along and experiment with the code to gain a solid grasp of the underlying concepts.
Projects are assigned to groups to foster a deeper understanding of the subject. Students are divided into teams and work on a two-phase project. The first phase, which takes place during the first half of the course, teaches students how to use CNN models to solve a basic visual recognition task. In the second phase, students are invited to choose a specific computer vision problem to solve with advanced deep learning models. The projects need to be diverse among the teams, challenging, and relevant to current real-world applications.
During the project development, students are expected to take advantage of the methods and skills presented during lectures for solving their specific task. At the end of the course, each team presents their projects to the entire class. This presentation fosters a collaborative learning environment where teams can learn from each other's successes and challenges. 


Assessment methods

                                                                                      Continuous assessment  Partial exams  General exam
  • Written individual exam (traditional/online)                                                                                 x
  • Individual Works/ Assignment (report, exercise, presentation, project work etc.)            x

ATTENDING AND NOT ATTENDING STUDENTS

Individual assignments test the students' ability to:

- analyze a specific Computer Vision problem and find which model best solves the task at hand 

- use fundamental deep learning algorithms for Computer Vision autonomously 

- develop new pipelines adapting to the specific problem at hand 

The written exam will also test the students' proficiency in:

- comparing the various models and finding the most relevant one to apply to the specific problem

- examining the selected model in order to balance performance, computational complexity and overfitting

- discussing the pros and cons of different Computer Vision techniques for a specific problem 

 


Teaching materials


ATTENDING AND NOT ATTENDING STUDENTS

Learning materials will be provided by the teachers on the online platform.

Last change 14/05/2025 14:27