20570 - DATA ANALYTICS AND VISUALIZATION
Department of Decision Sciences
RAFFAELLA PICCARRETA
Suggested background knowledge
Mission & Content Summary
MISSION
CONTENT SUMMARY
Data analytics is a broad term covering the activities involved in analysing data to draw meaningful and actionable insights. It involves a number of steps and procedures, including:
• Data manipulation and analysis, aimed at discovering the salient patterns in data
• Visualisation (i.e. effective presentation) of results, and their interpretation and communication to stakeholders, in order to drive business strategy and outcomes.
The course introduces exploratory techniques to efficiently analyse, summarize and visualize (relatively) large data sets. The goal is to reduce the dimension of the data while preserving information about their most salient/distinctive features. Such simplification applies both to variables (dimensionality reduction) and to cases (clustering).
The course is articulated as follows:
· Introduction to multivariate data
In the first part of the course, summaries of data collected on many variables will be introduced, by extending measures of central tendency and dispersion to the multivariate case.
· Dimensionality reduction techniques
We will introduce Principal Components Analysis and Factor Analysis, two techniques aimed at discovering low-dimensional indicators/summaries that capture some structure underlying the (possibly high-dimensional) input data.
· Clustering techniques
The last part of the course introduces techniques to group cases based on their similarities or differences.
Beyond traditional classes, the course features hands-on classes, where the statistical software R - and in particular the integrated development environment (IDE) RStudio - is used to apply the considered techniques (Principal Components Analysis, Factor Analysis, and Cluster Analysis) to real data, and to properly interpret and present results, via suitable visualisation tools.
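As a rough illustration of the first two topics in the outline above, the sketch below computes a mean vector and a covariance matrix (the multivariate counterparts of central tendency and dispersion) and then extracts the first principal component. It is only a minimal sketch under stated assumptions: the hands-on classes use R, whereas this example uses Python's standard library; the toy data are hypothetical; the component is extracted from the covariance matrix of unstandardized variables; and power iteration stands in for the full eigen-decomposition that statistical software would normally perform.

```python
from statistics import fmean

# Hypothetical toy data (not course data): 5 cases (rows), 3 variables (columns).
data = [
    [2.0, 4.1, 1.0],
    [3.0, 6.2, 0.5],
    [4.0, 7.9, 1.5],
    [5.0, 10.1, 0.8],
    [6.0, 12.0, 1.2],
]


def mean_vector(rows):
    """Multivariate central tendency: one mean per variable."""
    return [fmean(column) for column in zip(*rows)]


def covariance_matrix(rows):
    """Multivariate dispersion: sample covariances (divisor n - 1)."""
    n = len(rows)
    means = mean_vector(rows)
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    p = len(means)
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_principal_component(cov, iterations=500):
    """Leading eigenvector and eigenvalue of cov, found by power iteration."""
    p = len(cov)
    v = [1.0] * p  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v' * cov * v gives the variance along direction v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return v, eigenvalue


cov = covariance_matrix(data)
direction, variance = first_principal_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
share = variance / total_variance  # proportion of variance retained by one component

print("Mean vector:", mean_vector(data))
print("First component direction:", direction)
print("Share of total variance:", share)
```

The share of variance retained by the first component is the kind of quantity used to judge how much information a low-dimensional summary preserves.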
Intended Learning Outcomes (ILO)
KNOWLEDGE AND UNDERSTANDING
· Identify the technique most suitable to simplify relevant information in a dataset with reference to a specific goal of analysis.
· Recognize appropriate and inappropriate applications and approaches with reference to a specific goal of analysis.
· Justify the adoption of a specific path of analysis and of the choices made during the analysis.
· Compare the results obtained using different approaches and evaluate the stability of results.
· Write R scripts to analyse data.
APPLYING KNOWLEDGE AND UNDERSTANDING
· Design/develop scripts in the R programming language to read, manipulate, analyse and visualise data.
· Interpret and critically analyse results, emphasizing the most relevant conclusions both from a technical and from an interpretative point of view.
· Effectively present the output, using suitable visualization tools allowing an immediate and unbiased understanding of the most salient features in data.
Teaching methods
- Face-to-face lectures
- Exercises (exercises, database, software etc.)
- Group assignments
DETAILS
During the course, there will be 3 blocks of hands-on classes, one for each of the three techniques taught during the course. For each technique, teams of students will work on an assignment concerning a substantive problem using data analysis.
Such assignments aim at assessing the ability to design a workflow to analyse data using the software R, as well as the ability to draw substantive conclusions based on the software output.
During each hands-on class, teams will answer the specific questions presented in class by writing a memorandum, to be uploaded to Bboard by the end of the class.
Each block of hands-on classes will be followed by a session where individual tests will be administered containing questions on the theoretical aspects of the considered technique, on the results obtained during the hands-on classes, and on the aspects taken into account to develop the analysis presented by the students with their team.
Students who actively participate in group work and take the individual tests can take the exam as attending students (see the section on assessment methods for details and rules).
Assessment methods
Continuous assessment | Partial exams | General exam | |
---|---|---|---|
|
x | x | |
|
x | ||
|
x | ||
|
x |
ATTENDING STUDENTS
Effective class participation includes attendance, preparation, making an active and constructive contribution to the class discussion, asking questions, making constructive comments, and having a positive attitude toward learning.
To be considered attending, students must participate in the activities described below.
- During the course, there will be 3 blocks of hands-on classes, one for each of the three techniques taught during the course. For each technique, teams of students will work on an assignment concerning a substantive problem using data analysis.
Such assignments aim at assessing the ability to design a workflow to analyse data using the software R, as well as the ability to draw substantive conclusions based on the software output.
Students must be able and ready to contribute to their team’s assignment, both with respect to the R commands needed to perform the required analyses and with respect to the knowledge of the technique, so as to contribute to the definition of the path of analysis and to the interpretation and critical evaluation of the obtained results. During each hands-on class, teams will answer the specific questions presented in class by writing a memorandum, to be uploaded to Bboard by the end of the class.
- Each block of hands-on classes will be followed by a session where individual tests will be administered containing questions on the theoretical aspects of the considered technique, on the results obtained during the hands-on classes, and on the aspects taken into account to develop the analysis presented by the students with their team.
Such tests aim at assessing the knowledge of the techniques introduced in the course, also with respect to the obtained output.
To measure the acquisition of the learning outcomes, the students’ assessment is based on three main components:
- The team assignments count for 20% of the final grade (6 points overall, 2 points for each block of hands-on classes). Students should be aware that a peer review process will be in place, and that critical situations reported by peers may lead to a substantial reduction of the final grade.
- The individual tests taken during the course count for 30% of the final grade (9 points overall, 3 points for each individual test).
- A final exam (denoted as S, scritto, on the Bocconi website) at the end of the course, counting for 50% of the final grade (15 points overall) and consisting of an in-class (lab) computer assignment.
Students will use their own laptops to analyse a set of data using the techniques illustrated during the course, writing a script from scratch using the software R and preparing a short report on their analysis, including a substantive interpretation of the obtained results.
The exam aims at assessing the individual ability to apply the techniques illustrated during the course, to coherently design a workflow to analyse data using the software R, and to draw substantive conclusions on the data at hand based on the software output.
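The arithmetic behind the three components above can be sketched as follows. This is only an illustrative sketch in Python: the function name and the example scores are hypothetical, and it deliberately ignores anything the syllabus does not quantify (peer-review reductions, proportional reductions for skipped classes, rounding of the final grade).

```python
# Maximum points per component, as stated in the syllabus (total: 30 points).
MAX_TEAM_PER_BLOCK = 2    # 3 blocks -> 6 points, 20% of the final grade
MAX_TEST_PER_BLOCK = 3    # 3 tests  -> 9 points, 30% of the final grade
MAX_FINAL_EXAM = 15       # 50% of the final grade
NUMBER_OF_BLOCKS = 3      # PCA, FA, CA


def attending_grade(team_points, test_points, final_points):
    """Sum the three components for an attending student (hypothetical helper)."""
    if len(team_points) != NUMBER_OF_BLOCKS or len(test_points) != NUMBER_OF_BLOCKS:
        raise ValueError("expected one score per block (PCA, FA, CA)")
    if any(not 0 <= p <= MAX_TEAM_PER_BLOCK for p in team_points):
        raise ValueError("team assignment scores must lie between 0 and 2")
    if any(not 0 <= p <= MAX_TEST_PER_BLOCK for p in test_points):
        raise ValueError("individual test scores must lie between 0 and 3")
    if not 0 <= final_points <= MAX_FINAL_EXAM:
        raise ValueError("final exam score must lie between 0 and 15")
    return sum(team_points) + sum(test_points) + final_points


# Hypothetical scores for one student: team work, individual tests, final exam.
example = attending_grade([2, 1.5, 2], [3, 2, 2.5], 12)
print("Grade out of 30:", example)
```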
Below are the dates/hours scheduled for the 3 hands-on classes and the individual tests for the three modules (PCA: Principal Components Analysis; FA: Factor Analysis; CA: Cluster Analysis):
PCA: 5/3/2024 (10.15), 6/3/2024 (8.30), 7/3/2024 (16.30), 8/3/2024 (14.45)
FA: 9/4/2024 (10.15), 10/4/2024 (8.30), 11/4/2024 (16.30), 12/4/2024 (14.45)
CA: 7/5/2024 (10.15), 8/5/2024 (8.30), 9/5/2024 (16.30), 10/5/2024 (12.00)
Important:
- Students who skip more than one block of hands-on classes and more than one individual test cannot qualify as attending. Partial participation in team work will imply a proportional reduction of the grade on the team assignment.
- Students who skip one or more hands-on classes will have their grade in the group work proportionally reduced.
- Students who skip an individual test will not be allowed to retake it.
- Students who sign as attending and are not present in class - besides the consequences stated in the honour code - will not be allowed to take the exam as attending students.
- There is no midterm exam.
- To be admitted to the final exam it is mandatory to register for it. No exception will be made to this rule.
- Students of past years who already sat the final (practical) exam and/or participated in the team assignments in past years cannot qualify as attending. This is in line with the rules stated in the syllabi of past years. The same rule will apply to the students enrolled in the current academic year.
NOT ATTENDING STUDENTS
Non-attending students take a final exam at the end of the course. The exam consists of a practical part, similar to the final exam for attending students, counting for 70% of the final grade (21 points), and a theoretical part counting for 30% of the final grade (9 points).
Important:
- There is no midterm exam.
- To be admitted to the final exam it is mandatory to register for it. No exception will be made to this rule.
- Students of past years who already sat the final (practical) exam and/or participated in the team assignments in past years must take the exam as non-attending students. This is in line with the rules stated in the syllabi of past years, and is consistent with the structure of the exam in past years.
Teaching materials
ATTENDING AND NOT ATTENDING STUDENTS
to be defined