BIG DATA AND DATA MINING
Code 1009070

Academic year 2021/22
1st year of course - Second semester
Professor
- Flavio BERTINI
Academic discipline
Computer Science (INF/01)
Field
Computer science disciplines
Type of training activity
Characterising
48 hours of face-to-face activities
6 credits
Course unit taught in Italian

Learning objectives

At the end of the course, the student should have acquired knowledge and skills related to knowledge representation techniques and data mining algorithms. In particular, the student is expected to:
- Know the main problems of Big Data and the objectives of Data Mining.
- Know the main techniques of knowledge representation.
- Know how to use formalisms appropriately for the representation of knowledge.
- Know how to use the main data mining techniques and algorithms.
- Know how to present a project.
- Be able to analyze a problem and develop a data mining project.

Prerequisites

Good knowledge of the relational data model is strongly recommended. Knowledge of imperative programming languages is also expected.

Course unit content

■ Semi-structured and unstructured data models
■ The limits of SQL and an introduction to SQL/XML and XQuery
■ Information retrieval models and web information retrieval
■ Data warehousing and data mining

Full programme

■ Part I
  ■ Introduction
  ■ Semi-structured and unstructured data models
■ Part II
  ■ Introduction to XML
  ■ SQL/XML language
  ■ XQuery language
  ■ XQuery and database management systems
  ■ NoSQL databases
■ Part III
  ■ Introduction to Information Retrieval
  ■ Ranking
  ■ Web Information Retrieval
  ■ Information Retrieval evaluation
  ■ Advanced methods
■ Part IV
  ■ Data analytics
  ■ Data warehouse
  ■ Data mining: association rules, classification and clustering

Bibliography

■ A. Møller, M. Schwartzbach - Introduzione a XML - Pearson, 2007, ISBN: 9788871923734
■ P.-N. Tan, M. Steinbach, V. Kumar - Introduction to Data Mining - Addison Wesley, 2005, ISBN: 0321420527
■ C.D. Manning, P. Raghavan, H. Schütze - Introduction to Information Retrieval - Cambridge University Press, 2008, ISBN: 0521865719
■ M. Golfarelli, S. Rizzi - Data Warehouse. Teoria e pratica della progettazione - McGraw-Hill Education, 2006, ISBN: 9788838662911

Teaching methods

Teaching activities take place partly in the classroom.

Assessment methods and criteria

The assessment consists of the discussion of a scientific article. The student explores an advanced topic starting from one of the proposed research papers and prepares a presentation to be used during the exam. The discussion focuses mainly on the topics of the chosen article. Alternatively, with the instructor's approval, the student can carry out a project on a topic of the course; the results of the project must then be discussed during the exam. To take part in an exam session, students must register at least 7 days before the exam date. Further health guidelines and restrictions may require the exam to be held remotely.
