DECISION-MAKING STATISTICAL METHODS A
cod. 18196

Academic year 2008/09
1st year of course - Second semester
Professor
Academic discipline
Probability and mathematical statistics (MAT/06)
Field
Mathematics, computer science and statistics
Type of training activity
Basic
36 hours
of face-to-face activities
4 credits

Learning objectives

Part one is a thorough, advanced review of the topics presented in the basic Statistics course. We introduce new classes of tests to deal with more complex and realistic settings. Part two is a not-too-quick overview of current data analysis and management techniques that belong to statistics and exploratory data analysis.

Prerequisites

Statistics, Mathematical Analysis AB, Mathematical Analysis C.

Course unit content

PART ONE: BASIC MULTIVARIATE TOOLS
Review of random variables and statistical inference.
Classical Z, T and F tests for comparing the parameters of two normal populations.
Goodness-of-fit and independence tests (Fisher-Irwin, chi-square, contingency tables).
Regression: estimation of coefficients (linear and multiple linear models, linearization; coefficient of determination, analysis of residuals, weighted least squares); inference on coefficients (T and F tests).
Analysis of variance (one-way, two-way and with interactions).

PART TWO: EXPLORATORY DATA ANALYSIS
Graphical representation of very large and/or high-dimensional data sets (multivariate Gaussian distribution, correlation matrix, eigenvalues and eigenvectors).
Model fitting (kernel functions, chi-squared test, Kolmogorov-Smirnov test).
Cluster analysis (distances; hierarchical tree clustering, linkage; k-means algorithms; EM algorithms, mixtures of distributions, Bayesian classification).
Factor analysis (principal component analysis, common factor analysis, variable reduction, factor interpretation, factor rotation).
Discriminant function analysis (Fisher linear methods, variable reduction).
Neural networks (multilayer perceptron).
Overfitting and overlearning: when the model fits the sample rather than the population.
Non-parametric tests (sign test, signed-rank test, Wilcoxon test, tests for the independence of the sample).
Bayesian parametric tests (overview).

Full programme

- - -

Bibliography

S. Ross - Introduction to Probability and Statistics for Engineers and Scientists.
Hand, Mannila, Smyth - Principles of Data Mining.

Teaching methods

Theory lessons are supported by practical sessions at the PC on using a spreadsheet to solve statistics problems.

The exam is in two parts.
The first part, at the computer, consists of a) multiple-choice questions on the fundamental concepts of statistics; and b) some problems on the first half of the course, to be solved with MS Excel.
The second part is a written examination, in essay form, on how the advanced techniques (from both the first and second parts of the course) are used on given datasets.

Assessment methods and criteria

- - -

Other information

- - -