Course 2009-2010 a.y.



Department of Decision Sciences

Course taught in English

EMIT-LS (8 credits - II sem. - CC)
Course Director:

Classes: 21 (II sem.)

Course Objectives

The key aim of this course is to provide students with basic skills in multivariate data analysis. In particular, students learn techniques and methods (e.g. cluster and factor analysis) for analyzing and summarizing data sets that are rich in both the number of variables and the number of observations. All methods are taught in hands-on classes, during which students analyze a number of databases relevant to their studies (e.g. R&D data, patent data, investment data).

Course Content Summary


  • Matrix algebra.
  • Multivariate random variables. Moments of multivariate distributions. Multivariate samples, summary statistics for multivariate samples. Geometric interpretation of data matrices. Total and generalized variance and their geometric interpretation.

Factorial Techniques

  • Principal component (PC) analysis. PC transformation. Properties of PCs and their interpretation. Evaluation of results. Sample PCs.
  • Factor analysis. The factor model: definition and assumptions. Parameter estimates: the principal component and the principal factor methods. Interpretation of factors: factor rotation. Factor scores.
  • Association for qualitative variables. Simple and multiple correspondence analysis. Profiles and the chi-square metric. Indicator matrices and the Burt matrix. Factors and their interpretation. Graphical representation and analysis of results.

Dissimilarity matrices and clustering

  • Cluster analysis. Distance and dissimilarity matrices. Hierarchical classification methods. Choice of the number of clusters. Partitioning methods: the k-means method. Evaluation of results.
  • Multidimensional scaling (MDS). Representing one or more dissimilarity matrices in a factorial plane. Relationship with factor analysis and cluster analysis.

Detailed Description of Assessment Methods

Attending students

The course grade is based on:

  • Two assignments (handed in and discussed during the lessons).
  • Practical analysis: analysis of a real data set (PC-lab session, 4 hours).
  • Theoretical exam (written exam on the methodological issues discussed during the course).

Non-attending students

A student is considered non-attending if she has not handed in and discussed both assignments.
For non-attending students, the final grade is based on:

  • Extended practical analysis
  • Extended theoretical exam.

The practical and theoretical exams must be taken in the same session. Results from different sessions cannot be combined.


Exam textbooks & Online Articles (check availability at the Library)

  • R.A. JOHNSON, D.W. WICHERN, Applied Multivariate Statistical Analysis, Prentice Hall, 2002, 5th ed.
  • J. LATTIN, J.D. CARROLL, P.E. GREEN, Analyzing Multivariate Data, Thomson, 2003.

Prerequisites

  • Basic notions of statistics. Univariate and bivariate descriptive statistics. The most relevant inferential concepts (samples, statistics, estimators, hypothesis testing, p-values).
  • Students are expected to be able to work with Excel and Word (basic skills).


Last change 26/03/2009 15:51