Study-unit SIGNAL PROCESSING AND OPTIMIZATION FOR BIG-DATA

Course name Computer engineering and robotics
Study-unit Code A001256
Curriculum Data science and data engineering
Lecturer Paolo Banelli
Lecturers
  • Paolo Banelli
Hours
  • 72 hours - Paolo Banelli
CFU 9
Course Regulation Cohort 2023
Supplied 2024/25
Learning activities Related/supplementary
Area Related or supplementary learning activities
Sector ING-INF/03
Type of study-unit Required
Type of learning activities Single-discipline learning activity
Language of instruction Italian
Contents - REVIEW of STATISTICAL SIGNAL PROCESSING BASICS
- FUNDAMENTALS of CONVEX OPTIMIZATION
- BIG-DATA REDUCTION and SAMPLING
- GRAPH-BASED SIGNAL/DATA PROCESSING
- DISTRIBUTED OPTIMIZATION and SIGNAL PROCESSING for LEARNING over NETWORKS
Reference texts Most of the class content is drawn from selected chapters and sections of these books:
- S. Kay, Fundamentals of Statistical Signal Processing, Vol. I & II, Prentice Hall, 1993-1998;
- S. Theodoridis, Machine Learning: A Bayesian and Optimization Perspective;
- T. Hastie et al., The Elements of Statistical Learning: Data Mining, Inference, and Prediction;
- M. E. J. Newman, Networks: An Introduction;
- S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004;
- S. Boyd et al., Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, Foundations and Trends in Machine Learning, 3(1):1–122, 2011.
Furthermore, some lecture notes by the teacher will be available.
Educational objectives Understand and apply the basics of statistical inference and convex optimization to (big-)data analytics. Understand the concept of data reduction/sampling and the conditions under which statistical inference and information reconstruction are not significantly degraded by reduction/sampling. Extend the knowledge of classical signal processing to signals defined over graphs, a natural representation of big data that reflects their distribution over a network, their statistical similarity, or both. Understand the methodological tools for distributing complex statistical inference across parallel and distributed agents (computers, etc.) as a way to scale statistical inference to big data that may be geographically or logically distributed over a network. Learn from observed data the topological structure that characterizes their generation and evolution.
Prerequisites Mandatory: Calculus, Linear Algebra, Random Variables and Stochastic Processes, Fourier Analysis, Digital Signal Processing. Suggested: Machine Learning and Data Mining. Useful: Estimation and Detection Theory (Statistical Inference).
Teaching methods The class is taught face-to-face by the lecturer with the aid of computer slides. In addition, some of the algorithms will be implemented in PC-based simulations, interactively with the students.
Other information
Learning verification modality 1) Short thesis on a topic related to the class content, with computer-aided simulations, to be submitted one week before the oral exam.
2) Oral exam: discussion of the thesis plus, typically, two questions.
Extended program - Part I: REVIEW of BASICS OF STATISTICAL INFERENCE AND LEARNING (6 hours)
Review of estimators: frequentist and Bayesian approaches, performance indicators, and common estimators (MVUE, MLE, MMSE, LS, etc.).
Review of binary hypothesis testing: likelihood ratio test (LRT), Neyman-Pearson and Bayesian perspectives (minimum error probability, MAP, Bayes risk).
Statistical learning and its relationship with machine learning: linear regression, K-means, etc.

- Part II: FUNDAMENTALS OF (DISTRIBUTED) CONVEX OPTIMIZATION (15 hours)
Basics of convex optimization: convex sets, convex functions, convex optimization problems.
Duality theory: Lagrange dual problem, Slater's constraint qualification, KKT conditions.
Optimization algorithms: primal methods (steepest descent, gradient projection, Newton's method), primal-dual methods (dual ascent, alternating direction method of multipliers).
Examples of applications: approximation and fitting, statistical estimation and detection, adaptive filtering, supervised and unsupervised learning from data.
Distributed optimization: consensus and sharing; primal and primal-dual methods.
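
As an illustration of the dual ascent method listed above, the following is a minimal sketch and not course material. It assumes a toy problem chosen here for simplicity: minimize 0.5*||x||^2 subject to a^T x = b. The function name `dual_ascent`, the step size, and the iteration count are hypothetical choices.

```python
def dual_ascent(a, b, step=0.1, iters=200):
    """Minimize 0.5*||x||^2 subject to a^T x = b by dual ascent.

    Assumption of this sketch: the step size satisfies
    0 < step < 2 / ||a||^2, which is what makes the dual iteration
    converge for this particular toy problem.
    """
    y = 0.0  # dual variable (Lagrange multiplier) of the equality constraint
    x = [0.0] * len(a)
    for _ in range(iters):
        # Primal step: x minimizes the Lagrangian
        # 0.5*||x||^2 + y*(a^T x - b), which gives x = -y * a in closed form.
        x = [-y * ai for ai in a]
        # Dual step: gradient ascent on the dual function; its gradient is
        # the constraint residual a^T x - b.
        residual = sum(ai * xi for ai, xi in zip(a, x)) - b
        y += step * residual
    return x, y


x, y = dual_ascent([1.0, 2.0], 5.0)
print(x, y)
```

For this toy problem the closed-form solution is x = a*b/||a||^2, so the iterates can be checked against it directly.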

- Part III: BIG-DATA REDUCTION (9 hours)
Compressed sampling/sensing and reconstruction. Statistical inference by sparse sensing, classification by principal component analysis, canonical correlation analysis, and information bottleneck.

- Part IV: GRAPH-BASED SIGNAL PROCESSING (15 hours)
Signals on graphs: motivating examples. Algebraic graph theory, graph features. Signal processing on graphs: Fourier transform, smoothing, sampling, and data compression on graphs.
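
To make the notion of smoothness of a signal on a graph concrete, here is a small sketch, not taken from the course notes. It assumes an undirected weighted graph given as a list of (i, j, weight) edges and uses the combinatorial Laplacian L = D - W. The helper names `laplacian`, `quadratic_form`, and `edge_smoothness` are hypothetical. The sketch checks numerically that x^T L x equals the weighted sum of squared differences across edges.

```python
def laplacian(n, edges):
    """Build the combinatorial Laplacian L = D - W of an undirected graph."""
    L = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        L[i][i] += w  # degree terms on the diagonal
        L[j][j] += w
        L[i][j] -= w  # negative edge weights off the diagonal
        L[j][i] -= w
    return L


def quadratic_form(L, x):
    """Compute x^T L x with plain nested sums."""
    n = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


def edge_smoothness(edges, x):
    """Compute sum over edges of w_ij * (x_i - x_j)^2."""
    return sum(w * (x[i] - x[j]) ** 2 for i, j, w in edges)


# Path graph 0 - 1 - 2 with unit weights (an assumed toy example).
edges = [(0, 1, 1.0), (1, 2, 1.0)]
L = laplacian(3, edges)
signal = [1.0, 2.0, 4.0]
print(quadratic_form(L, signal), edge_smoothness(edges, signal))
```

A constant signal gives a value of zero, which is why small values of this quadratic form are read as smooth signals on the graph.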

- Part V: DISTRIBUTED OPTIMIZATION, SIGNAL PROCESSING, and LEARNING over NETWORKS (27 hours)
Average consensus: theory and algorithms.
Distributed signal processing: estimation and detection; LMS, RLS, and Kalman filtering on graphs.
Distributed supervised learning: LASSO, SVM, logistic regression.
Distributed unsupervised learning: dictionary learning and data clustering; learning of eigenvectors and eigenvalues of Laplacian matrices.
Graph learning: Gaussian Markov random fields and graphical LASSO, smoothness and total variation approaches, Gaussian processes for directed causal inference.
Matrix completion algorithms.
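
The average consensus topic of Part V can be illustrated with a short sketch, offered only as an example under stated assumptions. It assumes a connected undirected graph, synchronous updates, and a step size `eps` smaller than 1 over the maximum node degree. The function name `average_consensus` and the default parameters are hypothetical.

```python
def average_consensus(values, edges, eps=0.3, iters=500):
    """Run the linear consensus iteration
    x_i <- x_i + eps * sum_{j in N(i)} (x_j - x_i)
    synchronously on all nodes of an undirected graph.

    Assumptions of this sketch: the graph is connected and
    eps < 1 / (maximum degree), so every node converges to the
    average of the initial values.
    """
    n = len(values)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    x = list(values)
    for _ in range(iters):
        # Each node uses only the current values of its own neighbors.
        x = [x[i] + eps * sum(x[j] - x[i] for j in neighbors[i])
             for i in range(n)]
    return x


# Path graph 0 - 1 - 2 - 3 (an assumed toy topology).
result = average_consensus([1.0, 2.0, 3.0, 10.0], [(0, 1), (1, 2), (2, 3)])
print(result)
```

Each node exchanges values only with its neighbors, yet all nodes approach the global average of the initial values without any central coordinator.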