Communication engineering | University of Parma

Professor responsible for the course unit: BONONI Alberto

integrated course unit

6 credits

hub: PARMA

course unit
in ENGLISH

Learning objectives

Course Objectives:

The objective of the course is to give students an understanding of the basic principles of machine learning and, in particular, of:
- the most common statistical tests in classification among different categories
- the structure of the optimal classifier and its error analysis
- the most common feature extraction methods from input data
- the most common statistical estimators in machine learning
- the most common clustering algorithms in unsupervised learning.

Students will learn to apply this knowledge, in particular to the:
- design and performance analysis of classifiers in machine learning
- selection of the most appropriate features to discriminate among input categories
- selection of the most appropriate clustering algorithms in the design of unsupervised classifiers

Prerequisites

Pre-requisites:

Entry-level courses in linear algebra and probability theory, such as those
normally offered in the corresponding 3-year Laurea course, are
necessary pre-requisites for this course.

Course unit content

Contents:

MODULE 1 (Bononi):
Basic probability refresher. Bayesian binary and M-ary classification.
MAP and Minimax classifiers. Performance and ROC.
Gaussian case and linear discriminant rules.
Bayesian estimation (regression).
Maximum likelihood, MMSE, MMAE estimators.
Linear suboptimal estimators. Supervised learning.
Generative versus discriminative approaches.
Plug-in learning.
Bayesian learning.
Minimum empirical risk learning. Nonparametric probability density estimation.
Linear data reduction for feature extraction.

MODULE 2 (Cagnoni):
Support Vector Machines.
Classifier evaluation techniques. Unsupervised classification and clustering.
K-means and ISODATA algorithms. Self-Organizing Maps.
Learning Vector Quantization.
Kohonen networks.

Full programme

SYLLABUS (EVERY CLASS 2 HOURS)

MODULE 1:
Lec. 1. Introduction
- Problem statement and definitions
- Examples of machine learning problems
- Glossary of equivalent terms in radar detection theory, hypothesis testing and machine learning

Lec. 2. Probability refresher
- Axioms, conditional probability, total probability law, Bayes law, double conditioning, chain rule, independence and conditional independence of events.
- Discrete random variables (RV): expectation, conditional expectation. Pairs of RVs. Sum rule. Iterated expectation. Vectors of RVs. An extended example.

Lec. 3. Probability refresher
- Random vectors:
expectation, covariance and its properties, spectral decomposition of covariance matrix, whitening.
- Continuous RV.
Parallels with discrete RVs. Functions of RVs. Mixed RVs. Continuous random vectors.
- Appendix: differentiation rules for vectors and matrices.
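
As an illustration of the whitening topic listed above, here is a minimal sketch in Python with NumPy. It is not taken from the course material: the function name `whiten`, the mixing matrix `A` and the sample size are assumptions chosen for the example. It uses the spectral decomposition C = U diag(lambda) U^T of the sample covariance and maps the data through U diag(lambda^(-1/2)).

```python
import numpy as np

def whiten(X):
    """Whiten the rows of X via the spectral decomposition of the sample covariance.

    Hypothetical helper for illustration; assumes the covariance is full rank.
    """
    Xc = X - X.mean(axis=0)                 # remove the sample mean
    C = np.cov(Xc, rowvar=False)            # sample covariance (d x d)
    eigvals, eigvecs = np.linalg.eigh(C)    # C = U diag(eigvals) U^T
    W = eigvecs / np.sqrt(eigvals)          # scale column j of U by eigvals[j]**-0.5
    return Xc @ W                           # whitened data

# Correlated 2-D data obtained by mixing independent standard Gaussians.
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0], [1.5, 0.5]])      # arbitrary nonsingular mixing matrix
X = rng.standard_normal((5000, 2)) @ A.T
Z = whiten(X)
C_white = np.cov(Z, rowvar=False)           # should be close to the identity
```

Because the same sample covariance is used to build the transform and to check it, `C_white` equals the identity up to rounding error.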

Lec. 4.
- Gaussian RVs and their linear transformations. Mahalanobis distance.
Classification:
- Bayesian prediction: introduction, loss function, conditional risk, argmin/argmax rules
- Bayes classification: introduction

Lec. 5. Classification
- 0/1 loss -> maximum a posteriori (MAP) classifier. Binary MAP. Decision regions.
- Classifier performance.
- Likelihood ratio tests and receiver operating characteristic (ROC) curve
- Minimax rule
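
The MAP rule in this lecture can be sketched in a few lines of Python. The example below is an illustration only, under assumed scalar Gaussian class-conditional densities; the names `gaussian_pdf` and `map_classify` and all numerical values are hypothetical. Under 0/1 loss, the classifier picks the class that maximizes prior times likelihood, which is proportional to the posterior.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Scalar Gaussian density evaluated elementwise on x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def map_classify(x, priors, means, stds):
    """Return, for each sample in x, the index of the class maximizing prior * likelihood."""
    scores = np.array([p * gaussian_pdf(x, m, s)
                       for p, m, s in zip(priors, means, stds)])
    return np.argmax(scores, axis=0)

x = np.array([-1.0, 0.9, 1.1, 3.0])
means, stds = [0.0, 2.0], [1.0, 1.0]

# Equal priors: the decision threshold is the midpoint x = 1.
labels_equal = map_classify(x, [0.5, 0.5], means, stds)

# Unequal priors move the threshold to 1 + 0.5 * ln(9), about 2.1, favouring class 0.
labels_skewed = map_classify(x, [0.9, 0.1], means, stds)
```

For two classes this is equivalent to a likelihood ratio test whose threshold is the ratio of the priors.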

Lec. 6. Classification
- Binary Gaussian classification
- Homoscedastic case: linear discriminant analysis
- Heteroscedastic case: Bhattacharyya bound
- Bayes classification with discrete features
- Classification with missing data (composite hypothesis testing)

Lec. 7. Estimation
- Bayesian estimation: introduction
- Quadratic loss: minimum mean square error (MMSE) estimator = regression curve
- L1 loss: minimum mean absolute error (MMAE) estimator
- 0/1 loss: MAP estimator, and maximum likelihood (ML) under a uniform prior.
- Regression for vector Gaussian case
- ML estimation for Gaussian observations
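
As a small illustration of ML estimation for Gaussian observations, the sketch below computes the ML estimates of the mean and covariance of i.i.d. Gaussian vectors: the sample mean and the sample covariance normalized by N rather than N - 1. The function name `gaussian_ml` and the test data are assumptions made for the example, not course material.

```python
import numpy as np

def gaussian_ml(X):
    """ML estimates of mean and covariance from the rows of X (i.i.d. Gaussian model)."""
    n = X.shape[0]
    mu = X.mean(axis=0)            # ML mean: sample average
    Xc = X - mu
    sigma = Xc.T @ Xc / n          # ML covariance: divides by N (biased estimator)
    return mu, sigma

rng = np.random.default_rng(0)
true_mean = np.array([1.0, -2.0])  # hypothetical parameters for the demo
X = rng.standard_normal((1000, 2)) * np.array([1.0, 3.0]) + true_mean
mu_hat, sigma_hat = gaussian_ml(X)
```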

Lec. 8. Estimation
- ML for multinomial
- Conjugate priors in MAP estimation
- Estimation accuracy and ML properties, Cramér-Rao bounds.
Suboptimal (non Bayesian) estimation:
- LMMSE estimation (linear regression)
- LMMSE derivation with LDU decomposition

Lec. 9. Estimation
- LMMSE examples
- Generalized linear regression
- Example: polynomial regression
- Sample LMMSE
- Generalized sample LMMSE.
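
The polynomial regression example can be illustrated as a linear least-squares fit on polynomial features. The sketch below is only one possible formulation, assumed for illustration: the helper name `fit_polynomial` and the noise-free test data are hypothetical, and it uses ordinary least squares via `np.linalg.lstsq` rather than any derivation specific to the course.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit: linear regression on the features 1, x, ..., x**degree."""
    Phi = np.vander(x, degree + 1, increasing=True)   # design matrix, one row per sample
    coeffs, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # minimizes ||Phi @ coeffs - y||^2
    return coeffs

# Noise-free samples of y = 1 - 2x + 0.5x^2 (hypothetical data).
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 - 2.0 * x + 0.5 * x ** 2
coeffs = fit_polynomial(x, y, degree=2)               # coefficients in increasing order
```

The model is nonlinear in `x` but linear in the coefficients, which is why a linear solver is enough.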

Lec. 10. Learning
- Supervised learning: introduction
- Generative vs discriminative approaches
- Example: logistic model
- Plug-in learning
ML fitting of logistic model: logistic regression
Example: handwritten digit recognition.
- Bayesian Learning

Lec. 11.
Learning:
- Empirical risk minimization
Nonparametric density estimation:
- Parzen window estimator
- kNN estimator

Lec. 12. Linear data reduction
- Principal component analysis (PCA)
- Fisher linear classifier
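
To illustrate PCA as a linear data reduction step, the sketch below projects centred data onto the eigenvectors of the sample covariance with the largest eigenvalues. It is an assumption-laden example, not the course's own code: the function name `pca` and the synthetic data are hypothetical.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the k principal directions of the sample covariance."""
    Xc = X - X.mean(axis=0)
    C = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]         # indices of the k largest eigenvalues
    components = eigvecs[:, order]                # d x k matrix with orthonormal columns
    return Xc @ components, components, eigvals[order]

# 3-D data that varies mostly along one direction, plus small isotropic noise.
rng = np.random.default_rng(0)
direction = np.array([1.0, 2.0, 2.0]) / 3.0       # unit vector (hypothetical)
X = (rng.standard_normal((500, 1)) * 5.0) * direction \
    + 0.1 * rng.standard_normal((500, 3))
Z, components, variances = pca(X, k=2)
```

The variance of each projected coordinate equals the corresponding eigenvalue, which is what makes the leading components the most informative ones in the mean-square sense.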

MODULE 2:
Part 1: Introduction

Lesson 1: How to set up a machine learning experiment
Lesson 2: Learning-based classification

Part 2: Neural networks

Lesson 3: Introduction to neural networks
Lesson 4: Supervised and unsupervised learning
Lesson 5: Supervised learning: the Backpropagation algorithm
Lesson 6: Unsupervised learning and clustering
Lesson 7: Kohonen's self-organizing maps (SOM)
Lesson 8: Learning Vector Quantization

Part 3: Other learning-based classifiers
Lesson 9: Support Vector Machines

Labs:
Lab 1: WEKA
Lab 2: Classifiers in WEKA: Multi-Layer Perceptrons
Lab 3: SOM-based clustering

Bibliography

Suggested Reading:

[1] C. W. Therrien, "Decision, estimation and classification", Wiley, 1989.

[2] R. O. Duda, P. E. Hart, D. G. Stork, "Pattern classification", 2nd Ed., Wiley, 2001.

[3] D. Barber, "Bayesian Reasoning and Machine Learning", Cambridge University Press, 2012.

[4] C. M. Bishop, "Pattern Recognition and Machine Learning", Springer, 2006.

[5] T. Hastie, R. Tibshirani, J. Friedman, "The Elements of Statistical Learning: Data mining, inference, and prediction", Springer, 2008.

[6] Eibe Frank, Mark A. Hall, and Ian H. Witten (2016). The WEKA Workbench. Online Appendix for "Data Mining: Practical Machine Learning Tools and Techniques", Morgan Kaufmann, Fourth Edition, 2016.

Teaching methods

Teaching methods:

Classroom teaching, 42 hours. In-class problem solving, 6 hours.
Homework is assigned regularly.

Assessment methods and criteria

Exams:

MODULE 1, Bononi:
Oral exam, to be scheduled on an individual basis. When ready, please contact the instructor by email at alberto.bononi[AT]unipr.it, specifying the requested date. The exam consists of solving some exercises and explaining the theoretical details connected with them, for a total time of about 1 hour. You may bring a summary of important formulas on one A4 sheet to consult if you wish.

MODULE 2, Cagnoni:
A practical project will be assigned, whose results will be presented and discussed by the student both as a written report and as an oral
presentation.

Other information

FURTHER INFORMATION:

1) Office Hours
Bononi: Monday 11:30-13:30 (Scientific Complex, Building 2,
floor 2, Room 2/19T).
Cagnoni: by appointment (Scientific Complex, Building 1, floor 2, email cagnoni[AT]ce.unipr.it).

2) Course website with teaching material (lecture notes, videolectures, solved exercises):

http://www.tlc.unipr.it/bononi/didattica/ML/ML.html

To get a user ID and password, please send an email to
alberto.bononi[AT]unipr.it from your university account (name@studenti.unipr.it).

MACHINE LEARNING FOR PATTERN RECOGNITION cod. 1006077