The Robust Statistics Area - Ro.S.A. is committed to sharing new ideas and proposing new statistical methods, with an emphasis on academic and industrial applications.

FORWARD SEARCH

It is widely accepted in many fields that outliers can affect results and conclusions. The issue becomes even more relevant when the number of outliers is large and when they mask one another, so that traditional methods become unreliable. Recently, to overcome the masking effect, a new dynamic technique has been introduced: the Forward Search (FS). FS is unaffected by outliers even when their number is large and their structure is complex. Pioneering work focused on non-linear regression models and generalised linear models; more recently, many multivariate models have been brought into the FS framework. The generalisation to more structured data, with either spatial or temporal dependence, is under investigation and development.
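The idea of growing a fitted subset one observation at a time can be illustrated with a minimal sketch. It assumes a simple linear regression (not the non-linear or generalised linear models mentioned above), starts from the pair of points whose line has the smallest median squared residual, and monitors the smallest raw absolute residual among the units still outside the subset; the published FS literature uses more refined, studentized statistics. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from statistics import median


def fit_line(points):
    """Ordinary least squares for y = a + b*x on a list of (x, y) pairs.

    Assumes the x values in `points` are not all equal.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def forward_search(data):
    """Return a list of (subset size, smallest absolute residual outside the subset)."""
    n = len(data)

    def squared_residuals(a, b):
        return [(y - a - b * x) ** 2 for x, y in data]

    # Initial subset: the pair whose line has the smallest median squared residual.
    best = None
    for i, j in combinations(range(n), 2):
        if data[i][0] == data[j][0]:
            continue  # a vertical pair cannot define y = a + b*x
        a, b = fit_line([data[i], data[j]])
        score = median(squared_residuals(a, b))
        if best is None or score < best[0]:
            best = (score, [i, j])
    subset = best[1]

    trace = []
    while len(subset) < n:
        a, b = fit_line([data[i] for i in subset])
        res = squared_residuals(a, b)
        outside = [i for i in range(n) if i not in subset]
        trace.append((len(subset), min(res[i] for i in outside) ** 0.5))
        # Next subset: the m + 1 units with the smallest squared residuals.
        order = sorted(range(n), key=lambda i: res[i])
        subset = order[:len(subset) + 1]
    return trace


# Toy data: 15 points close to y = 1 + 2x, plus 3 points shifted upwards by 15.
clean = [(float(i), 1.0 + 2.0 * i + 0.1 * ((i * 7) % 5 - 2)) for i in range(15)]
shifted = [(x, 1.0 + 2.0 * x + 15.0) for x in (3.5, 7.5, 11.5)]
data = clean + shifted

trace = forward_search(data)
for size, stat in trace:
    print(size, round(stat, 2))
```

In this toy example the monitored value stays small while clean points are still waiting to enter, and rises sharply once only the shifted points remain outside the subset. Because the fit at each step uses only the units currently in the subset, a group of outliers cannot mask itself before it enters.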

ROBUST CLASSIFICATION

Clustering, or Cluster Analysis, is a set of statistical tools designed to classify a population into sub-populations such that similar objects belong to the same cluster. For a given set of predictors or factors, clustering aims to create target clusters that are well differentiated. One of the most widely used methods in data mining is k-means, which ignores the fact that predictors might be correlated, is weak when outliers are present and tends to identify target clusters of similar size. Ro.S.A. has contributed new robust classification schemes that overcome these issues: the new method is resistant to outliers, correlation can be included in the model and, as a by-product, target clusters may be of different sizes, whatever the contamination of the data.
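To make the outlier weakness of k-means concrete, here is a minimal sketch of one well-known remedy, trimmed k-means, in which a fixed fraction of the points farthest from their nearest center is left out when the centers are updated. It is not presented as the Ro.S.A. method: it addresses outliers only and does not model correlation or unequal cluster sizes. The function names, the trimming fraction and the toy data are assumptions made for illustration.

```python
def assign(points, centers):
    """Return (squared distance, label, point) rows, sorted by distance to the nearest center."""
    rows = []
    for p in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        label = min(range(len(centers)), key=lambda j: d2[j])
        rows.append((d2[label], label, p))
    return sorted(rows, key=lambda r: r[0])


def trimmed_kmeans(points, centers, trim_fraction=0.1, iterations=20):
    """Lloyd-type iterations that ignore the farthest points when updating centers."""
    n_keep = len(points) - int(trim_fraction * len(points))
    centers = list(centers)
    for _ in range(iterations):
        kept = assign(points, centers)[:n_keep]  # drop the farthest points
        for k in range(len(centers)):
            members = [p for _, label, p in kept if label == k]
            if members:  # keep the old center if a cluster lost all its points
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    trimmed = [p for _, _, p in assign(points, centers)[n_keep:]]
    return centers, trimmed


# Toy data: two tight groups and one gross outlier.
group_a = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3), (0.1, -0.2), (-0.2, -0.1)]
group_b = [(5.0, 5.0), (5.2, 5.1), (4.9, 5.3), (5.1, 4.8), (4.8, 4.9)]
points = group_a + group_b + [(30.0, 30.0)]

centers, trimmed = trimmed_kmeans(points, centers=[group_a[0], group_b[0]])
print(centers)
print(trimmed)
```

With ordinary k-means the point at (30, 30) would pull the second center towards it; here it is set aside and reported separately, so the centers are computed from the two tight groups alone.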

TIME SERIES

Rapid, dynamic changes in economic and financial markets make them challenging to model and difficult to predict. At Ro.S.A. we use time series models for this purpose, and provide new methods engineered to understand the main causes of temporal fluctuations. It is key to identify seasonal and cyclical patterns properly, and their identification can be improved by using outlier-robust methods. The natural methodological framework is that of state-space models, of which the Kalman filter is the building block. State-space models are robustified to handle outliers and other hidden structure. Approximate robust state-space models are needed when non-linear features are to be considered.
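As a small illustration of the Kalman filter as a building block, the sketch below filters a local level model (a random-walk level observed with noise). The optional `clip` argument bounds the innovation at a multiple of its standard deviation, a simple Huber-type heuristic for limiting the influence of a single aberrant observation. This is only one possible way to robustify the update and is not claimed to be the approach used at Ro.S.A.; the function name, the parameter values and the toy series are assumptions.

```python
def local_level_filter(y, var_obs, var_state, level0=0.0, var0=1e6, clip=None):
    """Kalman filter for y_t = level_t + noise, level_t = level_{t-1} + noise.

    If `clip` is given, the innovation is truncated at clip standard deviations
    before the update (a heuristic robustification, labelled as an assumption).
    """
    level, var = level0, var0
    filtered = []
    for obs in y:
        # Prediction step: the level is a random walk, so only its variance grows.
        var_pred = var + var_state
        innovation = obs - level
        innovation_var = var_pred + var_obs
        if clip is not None:
            bound = clip * innovation_var ** 0.5
            innovation = max(-bound, min(bound, innovation))
        # Update step.
        gain = var_pred / innovation_var
        level = level + gain * innovation
        var = (1.0 - gain) * var_pred
        filtered.append(level)
    return filtered


# Toy series: a stable level near 10 with one additive outlier at position 10.
series = [10.0 + 0.1 * (-1) ** t for t in range(21)]
series[10] = 60.0

plain = local_level_filter(series, var_obs=1.0, var_state=0.01)
robust = local_level_filter(series, var_obs=1.0, var_state=0.01, clip=2.0)
print(round(plain[10], 2), round(robust[10], 2))
```

In this example the unclipped filter is dragged several units away from the underlying level by the single outlier, while the clipped version moves only slightly. The trade-off is that a genuine level shift would also be absorbed more slowly, which is one reason more elaborate robust state-space methods exist.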

BIG DATA

Big Data refers to data sets so large and complex that many standard data processing applications may be unusable. With big data, capturing relevant information becomes troublesome, and checking the quality of the data is equally time-consuming. Efficient robust statistical methods can quickly detect hidden structure and highlight outliers dynamically and automatically.