Course, 2024-2025 academic year

20878 - COMPUTER VISION AND IMAGE PROCESSING

Department of Computing Sciences

Course taught in English

AI (8 credits - II sem. - OB | ING-INF/05)

Course Director:
GIACOMO BORACCHI

Classes: 29 (I/II sem.)

Class instructors:
Class 29: FABRIZIO IOZZI

Suggested background knowledge

Familiarity with the following topics is a definite plus for attending this course: linear algebra, rudiments of probability and statistics, basics of machine learning and model fitting (overfitting and underfitting concepts), neural networks (multi-layer perceptron and backpropagation), and Python programming.

Mission and content summary

MISSION

In recent years, computer vision and image processing have gained a lot of attention thanks to the advent of deep neural networks. These machine learning models have demonstrated outstanding performance in solving many complex tasks, achieving results that were hard to believe a few years ago. Specifically, in many Computer Vision applications, like Image Recognition, Object Recognition, Image Segmentation and Image Generation, deep learning approaches outperform traditional hand-crafted algorithms, approaching human performance. This course aims at providing a clear understanding of how it is possible to make computers interpret and analyze digital images. Students will become acquainted with the theoretical background and the practical skills to understand and use Deep Learning models, and in particular Convolutional Neural Networks, for solving visual recognition problems. On top of that, the course covers the basic principles of Geometric Computer Vision and of Image Processing, to provide a solid grounding for imaging specialists, who can recognize whether a problem at hand needs to be solved using deep learning or traditional techniques. Overall, this course offers a broad overview of Computer Vision and Image Processing, illustrating the flagship problems addressed in these domains. Particular emphasis will be given to the image formation process from both a geometric (pinhole camera) and a photometric (sensor noise) perspective.

CONTENT SUMMARY

The course develops the following detailed program:

● Course introduction and a general overview of imaging science.

● Basics of digital images, the image formation process from a photometric perspective.

● Basics of image filtering (correlation and convolution).

● The geometry of image formation (pinhole camera model, homogeneous coordinates).

● Principles of single-view geometry.

● Principles of two-view and multi-view geometry.

● The Image Classification Problem and image classification by hand-crafted features.

● Convolutional Neural Networks for Image Classification.

● Famous CNN architectures.

● CNN training with data scarcity: transfer learning and data augmentation.

● CNN Visualization and CNN Explanations.

● Fully Convolutional CNN and CNN for Image Segmentation.

● Object Detection Networks.

● Unsupervised Models, Autoencoders.

● Generative Adversarial Networks.

● Introduction to Vision Transformers.

● Multimodal Imaging Models: CLIP.
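
As an illustration of the image filtering topic listed above (correlation and convolution), the following is a minimal sketch in plain Python, not course material. The function names `correlate2d` and `convolve2d` and the toy `image` and `kernel` are assumptions made up for this example; it uses "valid" mode (no padding) and shows that convolution is correlation with the kernel flipped along both axes.

```python
def correlate2d(image, kernel):
    """Valid-mode cross-correlation of two 2D lists (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Slide the kernel over the image and sum the element-wise products.
            sum(image[i + u][j + v] * kernel[u][v]
                for u in range(kh) for v in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def convolve2d(image, kernel):
    """Convolution: correlation with the kernel flipped on both axes."""
    flipped = [row[::-1] for row in kernel[::-1]]
    return correlate2d(image, flipped)


# Hypothetical toy data: a unit impulse and an asymmetric kernel.
image = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
kernel = [[1, 2],
          [3, 4]]

corr = correlate2d(image, kernel)  # flipped copy of the kernel
conv = convolve2d(image, kernel)   # the kernel itself

print(corr)
print(conv)
```

With an impulse as input, convolution reproduces the kernel while correlation returns its flipped copy; for symmetric kernels the two operations coincide.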

Intended Learning Outcomes (ILO)

KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to...

Determine whether a given imaging problem needs to be solved by traditional Computer Vision / Image Processing techniques or by Deep Learning models, and then identify the right CNN architecture to solve different visual recognition problems.
Recognize the best practices for CNN training.
Describe, and draw inspiration from, the most successful Deep Learning architectures.
Explain the most successful Computer Vision applications solved by Deep Learning models.
Illustrate complex techniques beyond the fundamental ones presented during lectures.

APPLYING KNOWLEDGE AND UNDERSTANDING

At the end of the course, students will be able to...

Analyze a specific Computer Vision problem and find which model best solves the task at hand.
Use fundamental deep learning algorithms for Computer Vision autonomously.
Compare the various models and identify the one most relevant to the specific problem.
Examine the selected model in order to balance performance, computational complexity and overfitting.
Discuss the pros and cons of different Computer Vision techniques for a specific problem.
Develop new pipelines adapted to the specific problem at hand.

Teaching methods

Practical exercises
Group assignments

DETAILS

The course follows an interactive and hands-on teaching modality with a strong emphasis on practical aspects. On top of the laboratory sessions, customarily held after most lectures, the course leverages project-based learning to enable students to apply the principles covered during lectures to real-world computer vision tasks. During the practical sessions, carefully selected sample code will cover the key components of computer vision and image processing, as well as deep learning (convolutional neural networks for image classification, segmentation, object recognition, and image generation). Students are encouraged to follow along and experiment with the code to gain a solid grasp of the underlying concepts. Projects are assigned to groups to foster a deeper understanding of the subject. The students are divided into teams, and each team will face a two-phase project. The first phase, which takes place during the first half of the course, is meant to teach the students how to use CNN models for solving a basic visual recognition task. In the second phase, students are invited to choose a specific computer vision problem to be solved by Geometric Computer Vision/Image Processing techniques *and* advanced deep learning models. The projects need to differ across teams and be challenging and relevant to current real-world applications. During project development, students are expected to take advantage of the methods and skills presented during lectures to solve their specific task. At the end of the course, each team presents its project to the entire class. This presentation fosters a collaborative learning environment where teams can learn from each other's successes and challenges.

Assessment methods

	Continuous assessment	Partial exams	General exam
Individual written exam (traditional/online)			x
Individual assignment (report, exercise, demonstration, project, etc.)	x

ATTENDING AND NON-ATTENDING STUDENTS

Individual assignments test the students' ability to:

- analyze a specific Computer Vision problem and find which model best solves the task at hand

- use fundamental deep learning algorithms for Computer Vision autonomously

- develop new pipelines adapted to the specific problem at hand

The written exam will also test the students' proficiency in:

- comparing the various models and identifying the one most relevant to the specific problem

- examining the selected model in order to balance performance, computational complexity and overfitting

- discussing the pros and cons of different Computer Vision techniques for a specific problem

Teaching materials

ATTENDING AND NON-ATTENDING STUDENTS

Slides and links to reference papers will be distributed. Colab notebooks will also be provided.

Last modified on 17/09/2024 16:02