FSDA Toolbox™ provides statisticians, engineers, scientists, researchers and financial analysts with a comprehensive set of tools to assess and understand their data. Flexible Statistics Data Analysis Toolbox™ software includes functions and interactive tools for analyzing and modeling data, and for learning and teaching statistics.
The Flexible Statistics Data Analysis Toolbox™ provides a set of routines for robust and efficient regression analysis. In addition, it offers a rich set of interactive graphical tools for exploring the connections among the features shown in the different forward plots.
All Flexible Statistics Data Analysis Toolbox™ functions are written in the open MATLAB® language. This means that you can inspect the algorithms, modify the source code, and create your own custom functions.
FSDA is a software library that supports robust and efficient statistical analysis of data sets, with output designed to be unaffected by anomalies in the data or by deviations from model assumptions. The tool:
Is especially useful for detecting potential anomalies (outliers) in data, even when they occur in groups.
Can be used to identify sub-groups in heterogeneous data.
Extends functionality in key statistical domains that require robust analysis (cluster analysis, discriminant analysis, model selection, data transformation).
Integrates instruments for interactive data visualization and modern exploratory data analysis, designed to simplify the interpretation of the statistical results by the end user.
Provides statisticians, engineers, scientists and financial analysts with a comprehensive set of tools to assess and understand their data.
Provides practitioners, students and teachers with functions and graphical tools for modeling complex data and for learning and teaching statistics.
FSDA is developed for wide applicability. Given its capacity to address problems that focus on anomalies in the data, it is expected to be used in applications such as anti-fraud, detection of computer network intrusions, e-commerce and credit card fraud, customer and market segmentation, detection of spurious signals in data acquisition systems, chemometrics (a wide field covering biochemistry, medicine, biology and chemical engineering), issues related to the production of official statistics (e.g. imputation and data quality checks), and so on.
FSDA:
Works with MATLAB releases from the current one back to releases up to 5 years older (for example, if the current release is R2022a, MATLAB is supported from release R2017a onward) and uses its Statistics Toolbox.
Has been tested on Microsoft Windows as well as UNIX (Linux and Mac OS X) platforms.
Can be installed on Windows platforms automatically, with a setup program that also opens, inside the MATLAB editor, files containing a series of examples of use in typical statistical problems.
Can be installed on UNIX platforms manually, from a compressed tar file, or automatically from a shell script.
Regression analysis;
Multivariate analysis;
Data transformations in regression and multivariate applications;
Model selection;
Clustering;
Correspondence analysis;
Interactive statistical visualization.
Graphical user interfaces that allow exploring the main interactive graphics features on plots generated by running different statistical functions.
Didactic material in the form of movies (with audio) that can be run in a browser connected to the Internet.
A rich collection of popular datasets provided in different data formats and fully documented.
A comprehensive documentation system:
FSDA is a MATLAB toolbox and is therefore used within MATLAB. Typically, the user writes new scripts and/or functions that include FSDA statements in an ordinary text file with the '.m' extension, and executes the code from the standard MATLAB Command Window with a single command (the file name without the '.m' extension).
Many FSDA scripts that automate the steps of typical robust statistics tasks are available for trial in the header of each FSDA m-function and in the example files.
Like all MATLAB functions, FSDA functions accept input arguments and return output arguments, for example:
function [out] = FSReda(y,X,bsb,varargin)
For many functions the set of input/output parameters is so rich that it is neither convenient nor possible to treat them comprehensively in these introductory pages. Details on a specific function and its options can be retrieved from the MATLAB Command Window by typing
docsearchFS(file_name)
The order in which the optional input parameters are set does not matter.
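The following is a minimal sketch, in plain MATLAB, of why the order of name/value pairs need not matter: each pair overwrites a default by name, not by position. The option names and default values are illustrative placeholders, and the loop is an assumption about the general technique, not a copy of the FSDA source.

```matlab
% Illustrative defaults for two options (placeholder names and values,
% not the actual option list of any FSDA function).
defaults = struct('init', 10, 'intercept', true);

% The same two options supplied in two different orders.
argsA = {'init', 20, 'intercept', false};
argsB = {'intercept', false, 'init', 20};

% Overwrite each default by field name; position in the list is irrelevant.
optA = defaults;
for i = 1:2:numel(argsA)
    optA.(argsA{i}) = argsA{i+1};
end

optB = defaults;
for i = 1:2:numel(argsB)
    optB.(argsB{i}) = argsB{i+1};
end

% Both orderings lead to the same set of options.
sameOptions = isequal(optA, optB);
disp(sameOptions)
```

Because the values are matched by name, a call only has to mention the options whose defaults it wants to change.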
Typically, even a well-trained practitioner will use only a few of the optional parameters available. A researcher, on the other hand, can experiment with many internal variables controlled by optional parameters without having to touch the source code.
The use of very flexible, and therefore elaborate, options is simplified by adopting data types of increasing complexity. For example, the option databrush controls the interactive brushing features of the FSDA dynamic statistical visualization tools. It can simply be omitted if the purpose is to produce traditional static plots; it can be set to a scalar (e.g. databrush=1) to make a single data selection; or it can be set to a MATLAB structure (e.g. databrush.persist='on') to make an indefinite number of selections. In general, when an option is a structure, its fields are automatically set to default values and the user only has to set those of interest.
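The idea that a structure-valued option starts from defaults and is overridden only where the user sets a field can be sketched in plain MATLAB as follows. Only the field persist comes from the text above; the fields mode and color, and all default values, are hypothetical placeholders, and the merge loop is an assumption about the general approach, not the FSDA implementation.

```matlab
% Hypothetical defaults for a structure-valued option. Only 'persist' is
% taken from the text; 'mode' and 'color' are placeholder field names.
defaultBrush = struct('persist', '', 'mode', 'Rect', 'color', 'r');

% The user sets only the field of interest.
databrush = struct;
databrush.persist = 'on';

% Copy every user-supplied field over the defaults; fields the user did
% not mention keep their default values.
merged = defaultBrush;
userFields = fieldnames(databrush);
for i = 1:numel(userFields)
    merged.(userFields{i}) = databrush.(userFields{i});
end

disp(merged)
```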
The output parameters follow the same principle: when a function generates a lot of information, it is organized in an output structure so that the user can extract only the fields of interest.
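A short session illustrating the input and output conventions described above might look like the sketch below. It assumes that FSDA and the Statistics Toolbox are installed and on the MATLAB path. The data are simulated for illustration, and the use of LXS to obtain a starting subset, the fields bs and mdr, and the plotting function resfwdplot are assumptions that should be checked against the documentation of the installed version.

```matlab
% Assumes FSDA and the Statistics Toolbox are on the MATLAB path.
rng(1);                           % reproducible simulated data
n = 100;
X = randn(n, 3);
y = X * [1; 2; 3] + randn(n, 1);
y(1:5) = y(1:5) + 10;             % contaminate a few responses

% Robust fit used here only to obtain a starting subset (assumed field: bs).
outLXS = LXS(y, X);

% Forward search: all the results are returned in a single output structure.
out = FSReda(y, X, outLXS.bs);

% List the available fields, then extract only the one of interest
% (mdr is an assumed field name; fieldnames shows what is really there).
disp(fieldnames(out))
mdr = out.mdr;

% Static plot: the databrush option is simply omitted.
resfwdplot(out);
```

To brush interactively instead, the last call could pass the option in one of the forms described above, for example resfwdplot(out,'databrush',1).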
A second way to use FSDA is through Graphical User Interfaces (GUIs), which let the user perform tasks interactively through controls such as buttons and sliders. The user can develop GUIs for FSDA using the standard MATLAB instruments. A few GUIs are integrated in the FSDA distribution, but they are mainly designed for demonstration purposes.
Almost all FSDA functions are written in the open MATLAB language. The use of mex files obtained by compiling code written in C or other languages is minimal and is always accompanied by the corresponding MATLAB function. This makes the implemented algorithms easier to understand and encourages the user to enrich the toolbox with new functions.
Of course, each new function should be documented. It is customary for a MATLAB user to document new functions in the header of the .m file. Only rarely is the user prepared to duplicate the effort and work on the corresponding .html documentation file. This is understandable, since the complete integration of new .html files in the standard MATLAB documentation system is not facilitated by built-in tools. To help the user in this time-consuming but valuable task, FSDA provides some tools, which should be used in the following order:
makecontentsfileFS.m. This function generates personalized .contents files for the functions in a folder and selected subfolders. The files to include in the contents can be filtered according to their filename or content. makecontentsfileFS also returns an output structure containing the list of the files together with information on their location, creation dates, and so on.
publishFSallFiles.m. This function calls the routine publishFS for each file in the list generated by makecontentsfileFS.
publishFunctionAlpha.m and publishFunctionCate.m. These functions generate the alphabetical and categorical index pages of the documentation system, starting from the list generated by makecontentsfileFS. An option of publishFunctionAlpha.m makes it possible to obtain a .txt file (named function-alpha.txt) which contains the names of all the files indexed by makecontentsfileFS, separated by commas. Each HTML page created automatically by the parser publishFS includes a JavaScript snippet that reads function-alpha.txt and adds a navigation bar to the previous and next file in alphabetical order.
publishBibliography.m. This function generates the bibliography page, starting from the output of the routine publishFSallFiles.