30416 - BIG DATA AND DATABASES
Department of Decision Sciences
LUCA MOLTENI
Mission & Content Summary
MISSION
CONTENT SUMMARY
- Introduction to data management and analytics.
- Data management architectures: relational databases (OLTP, Data warehouse and SQL language).
- Data management architectures: Big data and NoSQL databases (distributed file system, Hadoop, Spark and Data Lake concept).
- Data analytics: Data understanding and data preparation.
- Data analytics: Models and statistical techniques applied to Big Data.
- Regression and classification trees.
- Ensemble methods (random forest and boosted trees).
- Logistic regression.
- Supervised Artificial Neural Networks.
- Data analytics: Models' performance evaluation.
Intended Learning Outcomes (ILO)
KNOWLEDGE AND UNDERSTANDING
Acquire the following competences:
- Big Data ingestion and management.
- Data preparation and cleaning.
- Machine learning algorithms application.
- Machine learning model evaluation.
APPLYING KNOWLEDGE AND UNDERSTANDING
- Improve their skills in managing and taking advantage of the huge volume of data now produced by a great variety of sources.
Teaching methods
- Face-to-face lectures
- Guest speakers' talks (in class or remote)
- Exercises (exercises, database, software etc.)
- Case studies / Incidents (traditional, online)
- Group assignments
DETAILS
Two different approaches are used: theoretical and applied. A number of data ingestion procedures and machine learning data analysis case histories are shown, on both Big and Small Data, using specific data management and machine learning software. By the end of the course, students are able to replicate all the procedures and analyses by themselves.
Assessment methods
Continuous assessment | Partial exams | General exam
---|---|---
ATTENDING STUDENTS
Assessment is based on both a group assignment (to be developed during the course and submitted before the end of the lessons; 50% of the final grade) and an individual final written exam (50% of the final grade), offered in a reduced version compared to the full exam for non-attending students. The final exam mainly covers the data ingestion and management topics; the assignment mainly covers the data analysis topics.
NOT ATTENDING STUDENTS
Individual final written exam (100% weight).
Teaching materials
ATTENDING AND NOT ATTENDING STUDENTS
- M. KUHN, K. JOHNSON, Applied predictive modeling, Springer, 2013.
- P. WILTON, J. W. COLBY, Beginning SQL, Wrox, March 04, 2005.
- N. DASGUPTA, Practical Big Data Analytics, Packt Publishing, 2018.