Study-unit DATA INTENSIVE APPLICATION AND BIG DATA

Course name Computer engineering and robotics
Study-unit Code A003171
Curriculum Common to all curricula
Lecturer Fabrizio Montecchiani
Lecturers
  • Fabrizio Montecchiani
Hours
  • 48 hours - Fabrizio Montecchiani
CFU 6
Course Regulation Cohort 2023
Supplied 2024/25
Supplied other course regulation
Learning activities Characterizing
Area Computer engineering
Sector ING-INF/05
Type of study-unit Optional
Type of learning activities Single-discipline learning activity
Language of instruction Italian.
Contents -Introduction to Big Data
-Programming models and technologies for distributed computing
-Distributed databases and NoSQL technologies
Reference texts The course presents methods and technologies that are not covered by a single textbook. To support the student, the topics covered during the lectures are presented in the slides provided by the teacher.
Educational objectives The aim of the course is to provide both theoretical and practical notions on the design and development of data-intensive applications.
Prerequisites Knowledge of the design and analysis of algorithms, imperative and object-oriented programming (Java), and relational databases is required.
Teaching methods The course is divided into two main types of lessons:

Lectures (about 60% of total time): lessons held in the classroom. In each lesson new concepts are taught with the support of projected slides.

Guided laboratory exercises (about 40% of total time): lessons held in the software engineering lab. In each lesson the students design and implement new programs under the guidance of the teacher.
Other information None.
Learning verification modality The assessment methods of this course aim to evaluate the student's theoretical knowledge and their ability to apply it to solve both theoretical and practical problems. The different types of tests are described below.

- Oral test with theoretical and practical exercises

Duration: 30 minutes

Score: 15/30

Aims: Assess knowledge of the theoretical notions provided by the course and the ability to develop simple programs.


- Project

Presentation and discussion of a project work (software plus documentation)

Score: 15/30

Aims: Assess the practical abilities of the student with respect to the topics covered in the course.
Extended program The program may be updated before the beginning of the lessons.

1. Introduction
a. Introduction to Big Data
b. Scaling up vs scaling out
c. Key ideas for Big Data management
2. Part I: Programming models and technologies for distributed computing
a. The MapReduce model
b. The Hadoop platform
c. Apache Spark
3. Part II: Data models and NoSQL technologies
a. Basic principles of distributed databases
b. The CAP theorem and beyond
c. NoSQL technologies
d. Vector databases
2030 Agenda goals for sustainable development