Getting Started with FSDA Toolbox

Analyze and model data using robust statistical estimators

Key Features

FSDA Toolbox™ provides statisticians, engineers, scientists, researchers and financial analysts with a comprehensive set of tools to assess and understand their data. Flexible Statistics Data Analysis Toolbox™ software includes functions and interactive tools for analyzing and modeling data, and for learning and teaching statistics.

The Flexible Statistics Data Analysis Toolbox™ provides a set of routines for robust and efficient regression analysis. In addition, it offers a rich set of interactive graphical tools for exploring the connections between the features of the different forward plots.

All Flexible Statistics Data Analysis Toolbox™ functions are written in the open MATLAB® language. This means that you can inspect the algorithms, modify the source code, and create your own custom functions.

Analyze and model data using flexible, robust statistical methods

FSDA is a software library that supports robust and efficient statistical analysis of data sets, so that the output is not affected by anomalies in the data or by deviations from model assumptions.

FSDA is designed for wide applicability. Because it addresses problems centered on anomalies in the data, it is expected to be used in applications such as anti-fraud, detection of computer network intrusions, e-commerce and credit card fraud, customer and market segmentation, detection of spurious signals in data acquisition systems, chemometrics (a wide field covering biochemistry, medicine, biology and chemical engineering), and the production of official statistics (e.g. imputation and data quality checks).

System requirements

FSDA:

Main features include:

  1. Different categories of robust statistical functions.

  2. Graphical user interfaces that allow exploring the main interactive graphics features on plots generated by running different statistical functions.

  3. Didactic material in the form of movies (with audio) that can be run in a browser connected to the Internet.

  4. A rich collection of popular datasets provided in different data formats and fully documented.

  5. A comprehensive documentation system.

Description of use

FSDA is a MATLAB toolbox and is therefore used within MATLAB. Typically, the user writes new scripts and/or functions containing FSDA statements in an ordinary text file with the '.m' extension, and executes the code from the standard MATLAB Command Window with a single command (the file name, without the '.m' extension).

Many example scripts that automate the steps of typical robust statistics tasks are available for trial in the header of each FSDA m-function and in the example files.

Like all MATLAB functions, FSDA functions accept input arguments and return output arguments, for example:

function [out] = FSReda(y,X,bsb,varargin)

For many functions the set of input/output parameters is so rich that it is neither convenient nor possible to treat them comprehensively in these introductory pages. Details on a specific option can be retrieved from the MATLAB Command Window by typing

docsearchFS(file_name)

The order with which the optional input parameters are set does not matter.
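As a minimal sketch of why the order is irrelevant, assume that the optional parameters arrive as name/value pairs collected by varargin and are overlaid on a structure of default values. The option names and defaults below are illustrative placeholders, not the option list of any FSDA function, and the loop is only one common way of handling such pairs.

```matlab
% Hypothetical defaults (placeholder names, not an actual FSDA option list).
defaults = struct('nsamp', 1000, 'plots', 0, 'msg', 1);

% The same two options supplied in two different orders.
callA = {'nsamp', 500, 'plots', 1};
callB = {'plots', 1, 'nsamp', 500};

calls = {callA, callB};
resolved = cell(size(calls));
for c = 1:numel(calls)
    opts = defaults;                 % start from the defaults
    args = calls{c};
    for k = 1:2:numel(args)          % walk the name/value pairs
        opts.(args{k}) = args{k+1};  % each name selects its own field
    end
    resolved{c} = opts;
end

% Both orders lead to the same option structure.
sameOptions = isequal(resolved{1}, resolved{2});
disp(sameOptions)
```

Because every value is stored under its own name, the position of a pair in the call carries no information; options that are not mentioned (msg in this sketch) simply keep their default.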

Typically, even a well-trained practitioner will use only a few of the optional parameters available. On the other hand, a researcher can experiment with many internal variables controlled by optional parameters, without being forced to touch the source code.

The use of very flexible, and therefore elaborate, options is simplified by the adoption of data types of increasing complexity. For example, option databrush, which controls the interactive brushing features of the FSDA dynamic statistical visualization tools, can simply be omitted if the purpose is to produce traditional static plots; it can be set to a scalar (e.g. databrush=1) to make a single data selection; or it can be set to a MATLAB structure (e.g. databrush.persist='on') to make an indefinite number of selections. In general, when an option is given as a structure, its fields are automatically set to default values and the user only has to set those of interest.
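The idea of an option that grows from "absent" to scalar to structure can be sketched in base MATLAB as follows. This is not the FSDA implementation: the default values, the field color and the mode labels are invented for illustration, and only the field persist is taken from the text above.

```matlab
% Hypothetical defaults: 'persist' is the field named in the text,
% 'color' is an invented placeholder.
brushDefaults = struct('persist', '', 'color', 'red');

% Three ways the option might be supplied: omitted, scalar, structure.
supplied = {[], 1, struct('persist', 'on')};
modes = cell(size(supplied));
settings = cell(size(supplied));

for c = 1:numel(supplied)
    databrush = supplied{c};
    if isstruct(databrush)
        s = brushDefaults;                    % start from the defaults
        f = fieldnames(databrush);
        for k = 1:numel(f)
            s.(f{k}) = databrush.(f{k});      % override only what the user set
        end
        if isempty(s.persist)
            modes{c} = 'single selection';
        else
            modes{c} = 'repeated selections';
        end
    elseif isscalar(databrush) && databrush == 1
        s = brushDefaults;                    % scalar: defaults, one selection
        modes{c} = 'single selection';
    else
        s = [];                               % option omitted: no brushing
        modes{c} = 'static plot';
    end
    settings{c} = s;
end
disp(modes)
```

In the structure case the user typed a single field, yet the resulting settings contain every field, which is the behavior the paragraph above describes.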

Output parameters follow the same principle: when a function generates a lot of information, this is organized in an output structure so that the user can extract only the fields of interest.
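For instance, assuming FSDA is installed and on the MATLAB path, a forward search on simulated data might look like the sketch below. The call to LXS with option nsamp, and the fields bs and RES, are assumptions based on the examples in the FSReda help text and should be checked with docsearchFS before relying on them. The guard turns the script into a no-op when FSDA is not available.

```matlab
% Simulated regression data (purely illustrative).
rng(1);
n = 60;
X = randn(n, 2);
y = X * [1; -1] + 0.5 * randn(n, 1);

% Run the FSDA part only if the toolbox functions are visible.
hasFSDA = exist('FSReda', 'file') == 2 && exist('LXS', 'file') == 2;
if hasFSDA
    % Robust fit used to find a starting subset; 'nsamp' is passed as an
    % optional name/value pair (assumed: number of subsamples).
    outLXS = LXS(y, X, 'nsamp', 1000);

    % Forward search started from that subset (field bs is assumed).
    out = FSReda(y, X, outLXS.bs);

    % Extract only the field of interest from the output structure
    % (RES is assumed to hold the residuals monitored along the search).
    residualTrajectories = out.RES;
    disp(size(residualTrajectories))
else
    disp('FSDA not found on the MATLAB path; nothing to run.')
end
```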

A second way to use FSDA is through Graphical User Interfaces (GUIs), which let the user perform tasks interactively through controls such as buttons and sliders. The user can develop GUIs for FSDA using the standard MATLAB tools. A few GUIs are integrated in the FSDA distribution, but they are mainly designed for demonstration purposes.

Generate the documentation of your own FSDA functions

Almost all FSDA functions are written in the open MATLAB language. The use of mex files obtained by compiling code written in C or other languages is minimal, and each mex file is always accompanied by the corresponding MATLAB function. This is to facilitate the understanding of the algorithms implemented and to encourage the user to enrich the toolbox with new functions.

Of course, each new function should be documented. It is customary for a MATLAB user to document new functions in the header of the .m file. Only rarely is the user prepared to duplicate the effort and work on the corresponding .html documentation file. This is understandable, since the complete integration of new .html files in the standard MATLAB documentation system is not facilitated by built-in tools. To help the user in this time-consuming but valuable task, FSDA provides some tools, which should be used in the following order:

  1. publishFS. This is a parser that generates the .html documentation page of a structured .m file.
  2. makecontentsfileFS.m. This function generates personalized .contents files of the functions found in a folder and in selected subfolders. The files to include in the contents can be filtered according to their file name or content. makecontentsfileFS also returns an output structure containing the list of the files together with information on their location, creation dates, and so on.
    The function publishFSallFiles calls routine publishFS for the list of files generated by makecontentsfileFS.

  3. publishFunctionAlpha.m and publishFunctionCate.m. These functions generate the alphabetical and categorical index pages of the documentation system, starting from the list generated by makecontentsfileFS. An option of publishFunctionAlpha.m produces a .txt file (named function-alpha.txt) that contains the comma-separated names of all files indexed by makecontentsfileFS. The HTML pages created automatically by the parser publishFS include a JavaScript snippet that reads function-alpha.txt and adds a navigation bar to the previous and next file in alphabetical order.

  4. publishBibliography.m. This function generates the bibliography page, starting from the output of routine publishFSallFiles.
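The kind of information these tools exchange can be pictured with base MATLAB alone. The sketch below is not how makecontentsfileFS or publishFunctionAlpha are implemented; it only assumes that a contents list records name, folder and date for each .m file, and that function-alpha.txt holds the function names in alphabetical order separated by commas. The folder and file names are throwaway placeholders.

```matlab
% Throwaway folder with two empty placeholder .m files (hypothetical names).
root = tempname;
mkdir(root);
mkdir(fullfile(root, 'sub'));
fclose(fopen(fullfile(root, 'zeta.m'), 'w'));
fclose(fopen(fullfile(root, 'sub', 'alpha.m'), 'w'));

% Recursive listing of .m files (the '**' wildcard needs R2016b or later).
listing = dir(fullfile(root, '**', '*.m'));

% One record per file: name, location and date (dir reports the
% modification date).
info = struct('name', {listing.name}, ...
              'folder', {listing.folder}, ...
              'date', {listing.date});

% Alphabetical, comma-separated list of names without the extension.
names = sort(regexprep({listing.name}, '\.m$', ''));
alphaList = strjoin(names, ',');

% Write the list to a text file and read it back.
txtFile = fullfile(root, 'function-alpha.txt');
fid = fopen(txtFile, 'w');
fprintf(fid, '%s', alphaList);
fclose(fid);
fromFile = fileread(txtFile);

% Clean up the temporary folder.
rmdir(root, 's');
disp(alphaList)
```

A page that knows its own name can locate itself in such a comma-separated list and take its neighbors as the previous and next entries, which is the role the text assigns to the navigation bar.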

Do you want to contribute to the FSDA project?

If you have reached the point of writing new functions and documentation pages compliant with the FSDA philosophy, you have enough energy to take part in the FSDA project. In this case, please check our websites (https://github.com/UniprJRC/FSDA and http://rosa.unipr.it/) for open projects and feel free to contact us at fsda@unipr.it.