grpstatsFS

grpstatsFS calls grpstats and reshapes the output in a much better way

Syntax

  • statTable=grpstatsFS(TBL, groupvars, whichstats)example
  • statTable=grpstatsFS(TBL, groupvars, whichstats,Name,Value)example

Description

grpstatsFS calls grpstats, but shows the output in much better way. The output of grpstatsFS is a table with a number of rows equal to the number of variables for which statistics are computed. The number of columns of this table is equal to the number of statistics which are computed. By default, the statistics which are computed are the two (non robust and robust) indexes of location, (mean and median) the two (non robust and robust) indexes of spread (standard deviation and scaled MAD) and the two (non robust and robust) indexes of skewness.

The robust index of skewness is the medcouple. The scaled MAD is defined as 1.4826(med|x-med(x)|).

In presence of a grouping variable the number of rows of the output table remains the same, but the number of columns is equal to the number of statistics multiplied by the number of groups. With option 'OutputFormat' 'nested' which is similar to option 'OutputFormat' in function pivot, it is possible to display the results using nested tables.

example

statTable =grpstatsFS(TBL, groupvars, whichstats) grpstatsFS with just one input argument.

example

statTable =grpstatsFS(TBL, groupvars, whichstats, Name, Value) grpstatsFS with second input the grouping variable.

Examples

expand all

  • grpstatsFS with just one input argument.
  • Load a table

    load citiesItaly.mat
    % Compute mean, median,  std, MAD, skewness and medcouple
    % for the 7 variables of the input table citiesItaly
    TBL=grpstatsFS(citiesItaly);
    disp(TBL)
                       mean     median     std       MAD      skewness    medcouple
                      ______    ______    ______    ______    ________    _________
    
        addedval       18096     18469    4941.6    6001.8     0.1079      -0.12451
        depos         7769.1    7661.1    2841.4    3150.5     1.0734      -0.03612
        pensions       10044    9975.2    1230.8    1170.6    0.31692      0.089163
        unemploy      10.173      6.42    7.8789    4.8778     1.0687       0.61846
        export         23.11     21.53    15.642    16.457     0.5797      0.066667
        bankrup       30.467     29.62     12.11     11.49     1.0067     -0.010969
        billsoverd    44.614      40.6    22.783    19.496    0.98031        0.1586
    
    

  • grpstatsFS with second input the grouping variable.
  • load citiesItaly.mat
    % The first 46 rows are referred to provinces located in northern Italy and
    % the remaining in centre-south Italy.
    zone=[repelem("N",46) repelem("CS",57)]';
    % Add zone to citiesItaly
    citiesItaly.zone=zone;
    TBL=grpstatsFS(citiesItaly,"zone");
    % First 6 columns are referred to province of the north.
    % The remaning columns to the other provinces.
    disp(TBL)
                      meanN     medianN     stdN      MADN     skewnessN    medcoupleN    meanCS    medianCS    stdCS     MADCS     skewnessCS    medcoupleCS
                      ______    _______    ______    ______    _________    __________    ______    ________    ______    ______    __________    ___________
    
        addedval       22071     21936     2926.4    2136.5     0.78292     -0.024878      14888      14049     3760.7    3500.3     0.83573        0.23097  
        depos         9614.6    9407.5     2214.3    1052.6      2.9736     -0.054686     6279.7     5506.5     2389.6    1810.8      1.5434        0.37286  
        pensions       10798     10658     891.71    716.05      0.8251       0.14446     9434.9     9465.1     1129.5    1103.6     0.95725       -0.15654  
        unemploy      4.4713      4.34     1.6834    1.7643     0.81049     0.0022075     14.774      15.25     7.9086    10.126     0.35304       -0.10805  
        export        30.366    29.025     13.042     13.84     0.48928      0.088586     17.255      12.94     15.193    14.366      1.1879        0.29376  
        bankrup       27.855    27.745     10.451    10.215     0.73877      -0.10754     32.575      31.64     13.009    12.691      1.0051       0.010132  
        billsoverd    31.775      30.2     14.368    10.156      1.2639      0.056604     54.975      52.34     23.127    19.422     0.68542       0.069632  
    
    

    Related Examples

    expand all

  • Example of call to grpstatsFS with personalized statistics.
  • The second element is empty, that is there is no grouping variable.

    load citiesItaly.mat
    % Just compute mean and median
    stats=["mean" "median"];
    TBL=grpstatsFS(citiesItaly,[],stats);
    disp(TBL)
                       mean     median
                      ______    ______
    
        addedval       18096     18469
        depos         7769.1    7661.1
        pensions       10044    9975.2
        unemploy      10.173      6.42
        export         23.11     21.53
        bankrup       30.467     29.62
        billsoverd    44.614      40.6
    
    

  • Example of call to grpstatsFS with function handles.
  • load citiesItaly.mat
    % multiple types of summary statistics specified as a cell
    stats={"mean","std",@skewness};
    TBL=grpstatsFS(citiesItaly,[],stats);
    disp(TBL)
                       mean      std      skewness
                      ______    ______    ________
    
        addedval       18096    4941.6     0.1079 
        depos         7769.1    2841.4     1.0734 
        pensions       10044    1230.8    0.31692 
        unemploy      10.173    7.8789     1.0687 
        export         23.11    15.642     0.5797 
        bankrup       30.467     12.11     1.0067 
        billsoverd    44.614    22.783    0.98031 
    
    

  • Example of call to grpstatsFS to create conf int for the mean.
  • load citiesItaly.mat
    % Note that in this case meanci has in output two columns
    stats={"meanci" 'mean'};
    % Confidence interval for the sample means
    TBL=grpstatsFS(citiesItaly,[],stats);
    disp(TBL)
                      meanCIinf    meanCIsup     mean 
                      _________    _________    ______
    
        addedval        17131        19062       18096
        depos          7213.7       8324.4      7769.1
        pensions       9803.2        10284       10044
        unemploy       8.6327       11.712      10.173
        export         20.053       26.167       23.11
        bankrup        28.101       32.834      30.467
        billsoverd     40.161       49.066      44.614
    
    

  • Example of call to grpstatsFS to create conf int for the mean with groups.
  • load citiesItaly.mat
    % Note that in this case meanci has in output two columns
    stats={"meanci" 'mean'};
    % The first 46 rows are referred to provinces located in northern Italy and
    % the remaining in centre-south Italy.
    zone=[repelem("N",46) repelem("CS",57)]';
    % Add zone to citiesItaly
    citiesItaly.zone=zone;
    % Confidence interval for the sample means separated for the 2 groups
    TBL=grpstatsFS(citiesItaly,"zone",stats);
    disp(TBL)
                      meanCIinfN    meanCIsupN    meanN     meanCIinfCS    meanCIsupCS    meanCS
                      __________    __________    ______    ___________    ___________    ______
    
        addedval         21202         22940       22071       13891          15886        14888
        depos             8957         10272      9614.6      5645.6         6913.7       6279.7
        pensions         10533         11063       10798      9135.3         9734.6       9434.9
        unemploy        3.9714        4.9712      4.4713      12.675         16.872       14.774
        export          26.493        34.239      30.366      13.224         21.286       17.255
        bankrup         24.752        30.959      27.855      29.124         36.027       32.575
        billsoverd      27.508        36.042      31.775      48.839         61.111       54.975
    
    

  • Example of use of option Alpha.
  • load citiesItaly.mat
    % Note that in this case meanci has in output two columns
    stats={"meanci" 'mean'};
    % The first 46 rows are referred to provinces located in northern Italy and
    % the remaining in centre-south Italy.
    zone=[repelem("N",46) repelem("CS",57)]';
    % Add zone to citiesItaly
    citiesItaly.zone=zone;
    % 99 per cent confidence intervals for the sample means separated for the 2 groups
    TBL=grpstatsFS(citiesItaly,"zone",stats,'Alpha',0.01);
    disp(TBL)
                      meanCIinfN    meanCIsupN    meanN     meanCIinfCS    meanCIsupCS    meanCS
                      __________    __________    ______    ___________    ___________    ______
    
        addedval         20911         23232       22071       13560          16217        14888
        depos           8736.5         10493      9614.6      5435.7         7123.6       6279.7
        pensions         10445         11152       10798        9036         9833.9       9434.9
        unemploy        3.8037        5.1389      4.4713       11.98         17.567       14.774
        export          25.194        35.538      30.366      11.889         22.621       17.255
        bankrup         23.711            32      27.855      27.981          37.17       32.575
        billsoverd      26.077        37.473      31.775      46.807         63.143       54.975
    
    

  • Example of the use of option DataVars.
  • load citiesItaly.mat
    % Note that in this case meanci has in output two columns
    stats={"meanci" 'mean'};
    % The first 46 rows are referred to provinces located in northern Italy and
    % the remaining in centre-south Italy.
    zone=[repelem("N",46) repelem("CS",57)]';
    % Add zone to citiesItaly
    citiesItaly.zone=zone;
    % Confidence interval for the sample means separated for the 2 groups
    TBL=grpstatsFS(citiesItaly,"zone",stats,'DataVars',["addedval" "unemploy"]);
    disp(TBL)
                    meanCIinfN    meanCIsupN    meanN     meanCIinfCS    meanCIsupCS    meanCS
                    __________    __________    ______    ___________    ___________    ______
    
        addedval       21202         22940       22071       13891          15886        14888
        unemploy      3.9714        4.9712      4.4713      12.675         16.872       14.774
    
    

  • Example of the use of option DataVars with VarNames.
  • load citiesItaly.mat
    % Note that in this case meanci has in output two columns
    stats={"median" 'mean'};
    TBL=grpstatsFS(citiesItaly,[],stats, ...
    'DataVars',[1 2 5],'VarNames', ...
    ["Robust location" "Non robust location"]);
    disp(TBL)
                    Robust location    Non robust location
                    _______________    ___________________
    
        addedval         18469                18096       
        depos           7661.1               7769.1       
        export           21.53                23.11       
    
    

  • Example of the use of option DataVars with VarNames and grouping variable.
  • load citiesItaly.mat
    % Note that in this case meanci has in output two columns
    stats={"meanci" 'mean'};
    % The first 46 rows are referred to provinces located in northern Italy and
    % the remaining in centre-south Italy.
    zone=[repelem("N",46) repelem("CS",57)]';
    % Add zone to citiesItaly
    citiesItaly.zone=zone;
    % Confidence interval for the sample means separated for the 2 groups
    TBL=grpstatsFS(citiesItaly,"zone",stats, ...
    'DataVars',["addedval" "unemploy"],'VarNames', ...
    ["Mean: lower confidence interval" "Mean: upper confidence interval" "Sample mean"]);
    disp(TBL)
                    Mean: lower confidence intervalN    Mean: upper confidence intervalN    Sample meanN    Mean: lower confidence intervalCS    Mean: upper confidence intervalCS    Sample meanCS
                    ________________________________    ________________________________    ____________    _________________________________    _________________________________    _____________
    
        addedval                  21202                               22940                     22071                     13891                                15886                      14888    
        unemploy                 3.9714                              4.9712                    4.4713                    12.675                               16.872                     14.774    
    
    

  • Example of the use of option OutputFormat.
  • load citiesItaly.mat
    % Two measures of location and dispersion
    stats={'mean' 'median' 'std' @(x)1.4826*mad(x,1)};
    % The first 46 rows are referred to provinces located in northern Italy and
    % the remaining in centre-south Italy.
    zone=[repelem("N",46) repelem("CS",57)]';
    % Add zone to citiesItaly
    citiesItaly.zone=zone;
    % Requested statistics for the 2 groups using nested tables
    TBL=grpstatsFS(citiesItaly,"zone",stats,'OutputFormat','nested');
    format bank
    disp(TBL)
                              mean                   median                  std            (x)1.4826*mad(x,1)
                      ____________________    ____________________    __________________    __________________
    
                         N           CS          N           CS          N         CS          N         CS   
                      ________    ________    ________    ________    _______    _______    _______    _______
                                                                                                              
        addedval      22071.47    14888.47    21936.30    14049.16    2926.39    3760.72    2136.45    3500.31
        depos          9614.60     6279.66     9407.50     5506.50    2214.32    2389.55    1052.64    1810.80
        pensions      10798.16     9434.94    10658.22     9465.15     891.71    1129.48     716.05    1103.65
        unemploy          4.47       14.77        4.34       15.25       1.68       7.91       1.76      10.13
        export           30.37       17.25       29.02       12.94      13.04      15.19      13.84      14.37
        bankrup          27.86       32.58       27.75       31.64      10.45      13.01      10.22      12.69
        billsoverd       31.77       54.97       30.20       52.34      14.37      23.13      10.16      19.42
    
    

    Input Arguments

    expand all

    TBL — Input data. Table.

    Table containing n observations on p variables.

    Rows of TBL represent observations, and columns represent variables. If it necessary to compute the statistics for subgroups, TBL must include at least one grouping variable, which you specify using groupvars.

    Data Types: single| double

    groupvars — grouping variable. Identifiers for the grouping variables in input TBL.

    If groupvars is [] then the output refers to the overall sample. For additional information on groupvars see the help of grpstats. For example Example - 'groupvars',2

    Data Types: character vector | string array | cell array of character vectors | vector of positive integers | logical vector | []

    whichstats — Types of summary statistics. Name of the statistics which have to be computed.

    For additional information on whichstats see the help of grpstats. If whichstats is empty or it is not specified, the summary statistics which are computed are ["mean" "median" "std" "MAD" "skewness" "medcouple"];

    Example - ["mean" "std"]

    Data Types: character vector | string array | function handle | cell array of character vectors or function handles.

    Name-Value Pair Arguments

    Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

    Example: 'Alpha',0.01 , 'DataVars',[2 4] , 'VarNames',["location" "robust location"]; , 'OutputFormat','nested';

    Alpha —Significance level.scalar in [0 1).

    Significance level for confidence and prediction intervals.

    For additional information on Alpha see the help of grpstats.

    Example: 'Alpha',0.01

    Data Types: double

    DataVars —Table variables for which to compute summary statistics.for additional information on Alpha see the help of grpstats.

    Example: 'DataVars',[2 4]

    Data Types: character vector | string array | cell array of character vectors | vector of positive integers | logical vector

    VarNames —Variable names for output table.cell array | characters | string array.

    Note that the length of VarNames must be equal to the number of statistics which are computed. Variable (column) names for the output table statTable, specified as a string array or a cell array of character vectors. By default, grpstatsFS removes the @ if it is present in the name of the statistic. In presence of a grouping variable grpstatsFS appends the name corresponding to each category of the groups.

    Example: 'VarNames',["location" "robust location"];

    Data Types: character vector | string array | cell array of character vectors | vector of positive integers | logical vector

    OutputFormat —Column hierarchy output format.'flat' (default) | 'nested' In presence of one | more grouping variable, this option specifies whether the label of each statistic must be concatenated with the categories of the grouping variables | the overall output table must contain nested tables each referred to a different statistic.

    Example: 'OutputFormat','nested';

    Data Types: character | string

    Output Arguments

    expand all

    statTable —Summary statistics for the table input TBL. table with p rows. Table

    The rows are referred to the variables of the input table and the columns to the requested statistics.

    The number of columns of statTable is equal to the number of requested statistics multiplied by the number of groups.

    References

    This page has been automatically generated by our routine publishFS