20570 - DATA ANALYTICS AND VISUALIZATION
Course taught in English
- Multivariate samples. Summary statistics for multivariate samples. Geometric interpretation of data matrices. Space of the cases and distances. Total and generalized variance and their geometric interpretation.
- Principal component analysis (PCA). PC transformation. Properties of PCs and their interpretation. Evaluation of results and graphical representations.
- Factor analysis (FA). The factor model: definition and assumptions. Parameter estimates: the principal component and the principal factor methods. Interpretation of factors: factor rotation. Factor scores and factorial maps.
- Simple correspondence analysis (SCA). Association between categorical variables. Profiles and Chi-square metric. Factors and their interpretation. Graphical representation and analysis of results.
- Cluster analysis (CA). Distance and dissimilarity matrices. Hierarchical and partitioning clustering methods. Choice of the number of clusters. Criteria for the evaluation of a partition.
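As an illustration of the first two topics above, the following is a minimal sketch in Python with NumPy. It is not course material: the data matrix `X` is a made-up toy example, and the choice of Python is an assumption, since the syllabus does not name the software used in the labs. It computes the summary statistics of a multivariate sample, the total variance (trace of the covariance matrix) and the generalized variance (its determinant), and then obtains the principal components from the eigendecomposition of the covariance matrix.

```python
import numpy as np

# Hypothetical toy data matrix: 6 cases (rows) by 3 variables (columns).
X = np.array([
    [2.0, 4.1, 1.0],
    [3.0, 6.2, 0.5],
    [4.0, 7.9, 1.5],
    [5.0, 10.1, 0.8],
    [6.0, 12.0, 1.2],
    [7.0, 13.8, 0.9],
])

# Summary statistics of the multivariate sample.
mean_vector = X.mean(axis=0)
S = np.cov(X, rowvar=False)  # sample covariance matrix (divisor n - 1)

# Total variance is the trace of S; generalized variance is its determinant.
total_variance = np.trace(S)
generalized_variance = np.linalg.det(S)

# PCA on the covariance matrix: eigenvalues are the PC variances,
# eigenvectors are the PC loadings. eigh returns ascending order,
# so reorder by decreasing eigenvalue.
eigenvalues, eigenvectors = np.linalg.eigh(S)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# PC scores: centered data projected onto the loadings.
scores = (X - mean_vector) @ eigenvectors

# Share of total variance explained by each PC.
explained_ratio = eigenvalues / total_variance

print("Total variance:", total_variance)
print("Generalized variance:", generalized_variance)
print("Explained variance ratio:", explained_ratio)
```

The eigenvalues sum to the total variance and multiply to the generalized variance, which is one way to read the geometric interpretation mentioned in the first topic. The sketch works on the covariance matrix; applying PCA to standardized variables (the correlation matrix) would generally give different components.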
The final grade for attending students is based on a practical exam and a theoretical exam, exactly as described for non-attending students. However, attending students may take the theoretical exam as two partial exams: the first covers PCA and FA, the second covers CA and SCA.
A student is considered attending if
- He/she has attended at least 4 of the lab sessions dedicated to PCA and FA (with at least one lab for each technique) and at least 4 of the lab sessions dedicated to CA and SCA (with at least one lab for each technique).
For non-attending students
The final grade is based on
- A practical exam: analysis of a real data set (PC lab session).
- A theoretical exam (written exam concerning the methodological issues discussed during the course and possibly comments on software output).
- The slides of the course are made available on the blackboard.
- R.A. JOHNSON, D.W. WICHERN, Applied Multivariate Statistical Analysis, Prentice Hall, 2002, 5th edition.
- J. LATTIN, J.D. CARROLL, P.E. GREEN, Analyzing Multivariate Data, Thomson, 2003.
- Basic notions of statistics. Descriptive statistics (univariate and bivariate). Most relevant inferential concepts (samples, statistics, estimators, hypothesis testing, p-values).
- Students are expected to be able to work with Excel and Word (basic skills).