Robust Statistics for Data Reduction

Abstract: We will briefly introduce the main principles and ideas of robust statistics, focusing on trimming methods. The working example will be the estimation of location and scatter in multidimensional problems, together with outlier identification. We will then discuss some methods for robust clustering based on impartial trimming and snipping. Finally, we will discuss a simple robust method for dimensionality reduction. Illustrations will be based on the R software and some contributed extension packages.

Tentative schedule: Introduction to robust inference. Concepts of masking, swamping, breakdown point, Tukey-Huber contamination, and entry-wise contamination. Estimation of location and scatter based on the Minimum Covariance Determinant. The fastMCD algorithm. Outlier identification. Robust clustering: trimmed $k$-means, snipped $k$-means. The tclust and sclust algorithms. Selecting the trimming level and the number of clusters through the classification trimmed likelihood curves. Plug-in methods for dimension reduction. Brief overview of the most recent contributions and avenues for further work.

The course will be based on the book: Farcomeni, A. and Greco, L. (2015). Robust Methods for Data Reduction. Chapman & Hall/CRC Press.
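As a small illustration of the trimming and breakdown-point ideas listed in the abstract, the sketch below compares the sample mean, the median, and a symmetrically trimmed mean on a one-dimensional sample in which a single observation is replaced by a gross outlier. It is written in Python using only the standard library, not in R, and it is not taken from the course material: the function name trimmed_mean, the trimming level alpha, and the data values are all assumptions made for this example.

```python
import statistics


def trimmed_mean(values, alpha):
    """Mean after discarding the int(alpha * n) smallest and largest values.

    Hypothetical helper for illustration; alpha is the trimming level
    applied to each tail of the sorted sample.
    """
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must lie in [0, 0.5)")
    ordered = sorted(values)
    k = int(alpha * len(ordered))  # number of points trimmed per tail
    kept = ordered[k:len(ordered) - k]
    return statistics.fmean(kept)


# Made-up sample centred near 10.
clean = [9.8, 10.1, 9.9, 10.2, 10.0, 9.7, 10.3, 10.0, 9.9, 10.1]
# Replace one observation with a gross outlier (10% contamination).
contaminated = clean[:-1] + [1000.0]

for label, data in [("clean", clean), ("contaminated", contaminated)]:
    print(
        label,
        "mean:", statistics.fmean(data),
        "median:", statistics.median(data),
        "10% trimmed mean:", trimmed_mean(data, 0.1),
    )
```

A single outlier is enough to drag the sample mean far from the bulk of the data, whereas the median and the trimmed mean stay close to 10. The multivariate trimming methods covered in the course apply the same principle to location, scatter, and cluster estimates.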