CorAna performs correspondence analysis
load smoke
[N,~,~,labels] = crosstab(smoke{:,1},smoke{:,2});
[I,J] = size(N);
if verLessThan('matlab','8.2.0')==0
    % Contingency table is supplied to CorAna in table format
    Ntable = array2table(N,'RowNames',labels(1:I,1),'VariableNames',labels(1:J,2))
    out = CorAna(Ntable);
else
    out = CorAna(N);
end
The input is the contingency table; labels for rows and columns are supplied separately.
% Data are read from the txt file
load('smoke.txt')
labels_rows = {'Senior-Managers' 'Junior-Managers' 'Senior-Employees' 'Junior-Employees' 'Secretaries'};
labels_columns = {'None' 'Light' 'Medium' 'Heavy'};
N = crosstab(smoke(:,1),smoke(:,2));
out = CorAna(N,'Lr',labels_rows,'Lc',labels_columns);
Summary
             Singular_value    Inertia       Accounted_for    Cumulative
             ______________    __________    _____________    __________
    dim_1    0.27342           0.074759      0.87756          0.87756
    dim_2    0.10009           0.010017      0.11759          0.99515
    dim_3    0.020337          0.00041357    0.0048547        1

ROW POINTS
Results for dimension: 1
                        Scores         CntrbPnt2In    CntrbDim2In
                        _________      ___________    ___________
    Senior_Managers     -0.065768      0.0032977      0.092232
    Junior_Managers     0.25896        0.083659       0.5264
    Senior_Employees    -0.38059       0.51201        0.99903
    Junior_Employees    0.23295        0.33097        0.94193
    Secretaries         -0.20109       0.070064       0.86535
Results for dimension: 2
                        Scores         CntrbPnt2In    CntrbDim2In
                        ________       ___________    ___________
    Senior_Managers     -0.19374       0.21356        0.80034
    Junior_Managers     -0.2433        0.55115        0.46468
    Senior_Employees    -0.01066       0.0029976      0.00078372
    Junior_Employees    0.057744       0.15177        0.057876
    Secretaries         0.078911       0.080522       0.13326

COLUMN POINTS
Results for dimension: 1
              Scores         CntrbPnt2In    CntrbDim2In
              ________       ___________    ___________
    None      -0.39331       0.654          0.99402
    Light     0.099456       0.03085        0.32673
    Medium    0.19632        0.16562        0.98185
    Heavy     0.29378        0.14954        0.6844
Results for dimension: 2
              Scores         CntrbPnt2In    CntrbDim2In
              _________      ___________    ___________
    None      -0.030492      0.029336       0.0059745
    Light     0.14106        0.46317        0.65729
    Medium    0.0073591      0.0017368      0.0013796
    Heavy     -0.19777       0.50575        0.31015
-----------------------------------------------------------
Overview ROW POINTS
                        Mass          Score_1       Score_2       Inertia       CntrbPnt2In_1    CntrbPnt2In_2    CntrbDim2In_1    CntrbDim2In_2
                        ________      _________     ________      _________     _____________    _____________    _____________    _____________
    Senior_Managers     0.056995      -0.065768     -0.19374      0.0026729     0.0032977        0.21356          0.092232         0.80034
    Junior_Managers     0.093264      0.25896       -0.2433       0.011881      0.083659         0.55115          0.5264           0.46468
    Senior_Employees    0.26425       -0.38059      -0.01066      0.038314      0.51201          0.0029976        0.99903          0.00078372
    Junior_Employees    0.45596       0.23295       0.057744      0.026269      0.33097          0.15177          0.94193          0.057876
    Secretaries         0.12953       -0.20109      0.078911      0.006053      0.070064         0.080522         0.86535          0.13326

Overview COLUMN POINTS
              Mass          Score_1       Score_2       Inertia       CntrbPnt2In_1    CntrbPnt2In_2    CntrbDim2In_1    CntrbDim2In_2
              _______       ________      _________     _________     _____________    _____________    _____________    _____________
    None      0.31606       -0.39331      -0.030492     0.049186      0.654            0.029336         0.99402          0.0059745
    Light     0.23316       0.099456      0.14106       0.0070588     0.03085          0.46317          0.32673          0.65729
    Medium    0.32124       0.19632       0.0073591     0.01261       0.16562          0.0017368        0.98185          0.0013796
    Heavy     0.12953       0.29378       -0.19777      0.016335      0.14954          0.50575          0.6844           0.31015
-----------------------------------------------------------
Legend
Row scores in principal coordinates
Column scores in principal coordinates
CntrbPnt2In = relative contribution of points to explain total Inertia of the latent dimension
The sum of the numbers in a column is equal to 1
CntrbDim2In = relative contribution of latent dimension to explain total Inertia of a point
CntrbDim2In_1+CntrbDim2In_2+...+CntrbDim2In_K=1
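
The columns of the Summary table are linked by simple arithmetic: the inertia of a dimension is its squared singular value, Accounted_for is that inertia divided by the total, and Cumulative is the running sum. The sketch below recomputes the last three columns from the three singular values printed for the smoke data. Because the inputs are the rounded values copied from the output, the results should agree with the printed table only up to rounding.

```matlab
% Singular values copied (as printed, i.e. rounded) from the Summary table
sv = [0.27342; 0.10009; 0.020337];

inertia    = sv.^2;                     % eigenvalues = squared singular values
accounted  = inertia / sum(inertia);    % share of total inertia per dimension
cumulative = cumsum(accounted);         % cumulative share

Summary = table(sv, inertia, accounted, cumulative, ...
    'VariableNames', {'Singular_value','Inertia','Accounted_for','Cumulative'}, ...
    'RowNames', {'dim_1','dim_2','dim_3'});
disp(Summary)
```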
Children data. Active rows = 1:14. Active columns = 1:5. Rows 15:18 and columns 6:8 are supplementary.
N=[51 64 32 29 17 59 66 70;
    53 90 78 75 22 115 117 86;
    71 111 50 40 11 79 88 177;
    1 7 5 5 4 9 8 5;
    7 11 4 3 2 2 17 18;
    7 13 12 11 11 18 19 17;
    21 37 14 26 9 14 34 61;
    12 35 19 6 7 21 30 28;
    10 7 7 3 1 8 12 8;
    4 7 7 6 2 7 6 13;
    8 22 7 10 5 10 27 17;
    25 45 38 38 13 48 59 52;
    18 27 20 19 9 13 29 53;
    35 61 29 14 12 30 63 58;
    2 4 3 1 4 nan nan nan;
    2 8 2 5 2 nan nan nan;
    1 5 4 6 3 nan nan nan;
    3 3 1 3 4 nan nan nan];
% rowslab = cell containing row labels
rowslab={'money','future','unemployment','circumstances',...
    'hard','economic','egoism','employment','finances',...
    'war','housing','fear','health','work','comfort','disagreement',...
    'world','to_live'};
% colslab = cell containing column labels
colslab={'unqualified','cep','bepc','high_school_diploma','university',...
    'thirty','fifty','more_fifty'};
tableN=array2table(N,'VariableNames',colslab,'RowNames',rowslab);
% Extract just active rows and active columns
Nactive=tableN(1:14,1:5);
% Define tables containing supplementary rows and supplementary cols
Nsupr=tableN(15:18,1:5);
Nsupc=tableN(1:14,6:8);
Sup=struct;
Sup.r=Nsupr;
Sup.c=Nsupc;
out=CorAna(Nactive,'Sup',Sup);
Summary
             Singular_value    Inertia       Accounted_for    Cumulative
             ______________    _________     _____________    __________
    dim_1    0.18815           0.035402      0.57043          0.57043
    dim_2    0.11452           0.013115      0.21132          0.78175
    dim_3    0.085447          0.0073011     0.11764          0.89939
    dim_4    0.079018          0.0062439     0.10061          1

ROW POINTS
Results for dimension: 1
                     Scores         CntrbPnt2In    CntrbDim2In
                     _________      ___________    ___________
    money            -0.11527       0.045499       0.42845
    future           0.17645        0.17567        0.71562
    unemployment     -0.21223       0.22616        0.87492
    circumstances    0.40092        0.062745       0.58397
    hard             -0.24998       0.029938       0.88369
    economic         0.35396        0.12005        0.48362
    egoism           0.059889       0.0068096      0.073339
    employment       -0.13675       0.026215       0.1643
    finances         -0.237         0.027904       0.27623
    war              0.21682        0.021688       0.74907
    housing          -0.006681      4.1183e-05     0.00072894
    fear             0.20335        0.11666        0.90069
    health           0.11165        0.020571       0.79911
    work             -0.21168       0.12005        0.75402
Results for dimension: 2
                     Scores         CntrbPnt2In    CntrbDim2In
                     __________     ___________    ___________
    money            -0.020046      0.0037146      0.012958
    future           0.097863       0.14587        0.22013
    unemployment     0.070718       0.067786       0.097145
    circumstances    -0.33099       0.11544        0.398
    hard             -0.06765       0.0059184      0.064717
    economic         -0.32072       0.26604        0.39705
    egoism           0.025667       0.0033763      0.013471
    employment       -0.21539       0.17555        0.4076
    finances         0.20598        0.056902       0.20867
    war              0.074663       0.0069419      0.088821
    housing          -0.12824       0.04096        0.26858
    fear             0.058068       0.025678       0.073446
    health           -0.0042912     8.2025e-05     0.0011804
    work             -0.10888       0.085745       0.19951

COLUMN POINTS
Results for dimension: 1
                           Scores         CntrbPnt2In    CntrbDim2In
                           ________       ___________    ___________
    unqualified            -0.20932       0.2511         0.67619
    cep                    -0.13858       0.18297        0.64492
    bepc                   0.10876        0.067579       0.3119
    high_school_diploma    0.27404        0.37976        0.75817
    university             0.23123        0.11859        0.31171
Results for dimension: 2
                           Scores         CntrbPnt2In    CntrbDim2In
                           _________      ___________    ___________
    unqualified            0.080727       0.10082        0.10058
    cep                    -0.056047      0.080794       0.10549
    bepc                   0.028483       0.012512       0.021393
    high_school_diploma    0.12134        0.20099        0.14865
    university             -0.31786       0.60488        0.589
-----------------------------------------------------------
Overview ROW POINTS
                     Mass          Score_1       Score_2       Inertia       CntrbPnt2In_1    CntrbPnt2In_2    CntrbDim2In_1    CntrbDim2In_2
                     ________      _________     __________    __________    _____________    _____________    _____________    _____________
    money            0.12123       -0.11527      -0.020046     0.0037595     0.045499         0.0037146        0.42845          0.012958
    future           0.19975       0.17645       0.097863      0.0086904     0.17567          0.14587          0.71562          0.22013
    unemployment     0.17776       -0.21223      0.070718      0.0091512     0.22616          0.067786         0.87492          0.097145
    circumstances    0.013819      0.40092       -0.33099      0.0038038     0.062745         0.11544          0.58397          0.398
    hard             0.01696       -0.24998      -0.06765      0.0011994     0.029938         0.0059184        0.88369          0.064717
    economic         0.03392       0.35396       -0.32072      0.0087874     0.12005          0.26604          0.48362          0.39705
    egoism           0.067211      0.059889      0.025667      0.0032871     0.0068096        0.0033763        0.073339         0.013471
    employment       0.049623      -0.13675      -0.21539      0.0056484     0.026215         0.17555          0.1643           0.4076
    finances         0.017588      -0.237        0.20598       0.0035763     0.027904         0.056902         0.27623          0.20867
    war              0.016332      0.21682       0.074663      0.001025      0.021688         0.0069419        0.74907          0.088821
    housing          0.032663      -0.006681     -0.12824      0.0020001     4.1183e-05       0.04096          0.00072894       0.26858
    fear             0.099874      0.20335       0.058068      0.0045852     0.11666          0.025678         0.90069          0.073446
    health           0.058417      0.11165       -0.0042912    0.00091131    0.020571         8.2025e-05       0.79911          0.0011804
    work             0.094849      -0.21168      -0.10888      0.0056364     0.12005          0.085745         0.75402          0.19951

Overview COLUMN POINTS
                           Mass          Score_1       Score_2       Inertia       CntrbPnt2In_1    CntrbPnt2In_2    CntrbDim2In_1    CntrbDim2In_2
                           ________      ________      _________     _________     _____________    _____________    _____________    _____________
    unqualified            0.20289       -0.20932      0.080727      0.013146      0.2511           0.10082          0.67619          0.10058
    cep                    0.33731       -0.13858      -0.056047     0.010044      0.18297          0.080794         0.64492          0.10549
    bepc                   0.20226       0.10876       0.028483      0.0076704     0.067579         0.012512         0.3119           0.021393
    high_school_diploma    0.17902       0.27404       0.12134       0.017732      0.37976          0.20099          0.75817          0.14865
    university             0.078518      0.23123       -0.31786      0.013468      0.11859          0.60488          0.31171          0.589
-----------------------------------------------------------
Legend
Row scores in principal coordinates
Column scores in principal coordinates
CntrbPnt2In = relative contribution of points to explain total Inertia of the latent dimension
The sum of the numbers in a column is equal to 1
CntrbDim2In = relative contribution of latent dimension to explain total Inertia of a point
CntrbDim2In_1+CntrbDim2In_2+...+CntrbDim2In_K=1
N=[80 20 90 90 5 100 40
    50 40 40 70 10 100 40
    10 70 20 90 80 99 40
    0 80 2 20 95 20 40
    35 52 38 47 48 80 40];
rl=["Dog" "Cat" "Rat" "Cockroach" "Wallaby"];
cl=["Big" "Athletic" "Friendly" "Trainable" "Resourceful" "Animal" "Lucky"];
Ntable=array2table(N,"RowNames",rl,"VariableNames",cl);
out=CorAna(Ntable);
% In the center of the map we have Wallaby and Lucky. Does this mean
% wallabies are lucky animals? No. Wallaby is pretty average on all the
% variables being measured. As it has nothing that differentiates it, the
% result is that it is in the middle of the map (i.e., near the origin).
% Similarly, Lucky does not differentiate, so it is also near the center.
% That they are both in the center tells us that they are both indistinct,
% and that is all that they have in common (in the data).
Summary
             Singular_value    Inertia       Accounted_for    Cumulative
             ______________    _________     _____________    __________
    dim_1    0.50576           0.2558        0.89448          0.89448
    dim_2    0.14914           0.022243      0.077779         0.97226
    dim_3    0.081626          0.0066627     0.023299         0.99556
    dim_4    0.03564           0.0012702     0.0044417        1

ROW POINTS
Results for dimension: 1
                 Scores         CntrbPnt2In    CntrbDim2In
                 ________       ___________    ___________
    Dog          -0.59431       0.3295         0.94186
    Cat          -0.3256        0.081449       0.81272
    Rat          0.27706        0.068913       0.57861
    Cockroach    0.95997        0.51987        0.96141
    Wallaby      0.019153       0.00027378     0.033895
Results for dimension: 2
                 Scores         CntrbPnt2In    CntrbDim2In
                 _________      ___________    ___________
    Dog          -0.12157       0.15856        0.039411
    Cat          0.079165       0.055371       0.048044
    Rat          0.22533        0.52422        0.38272
    Cockroach    -0.18988       0.23391        0.037614
    Wallaby      -0.057062      0.027946       0.30085

COLUMN POINTS
Results for dimension: 1
                   Scores         CntrbPnt2In    CntrbDim2In
                   ________       ___________    ___________
    Big            -0.68224       0.17879        0.90397
    Athletic       0.54545        0.1711         0.97126
    Friendly       -0.60693       0.15363        0.86806
    Trainable      -0.19488       0.026427       0.47685
    Resourceful    0.89767        0.42097        0.98264
    Animal         -0.2172        0.041317       0.59767
    Lucky          0.13298        0.0077629      0.55129
Results for dimension: 2
                   Scores         CntrbPnt2In    CntrbDim2In
                   _________      ___________    ___________
    Big            -0.21116       0.19698        0.086599
    Athletic       -0.042213      0.011785       0.0058171
    Friendly       -0.20525       0.20206        0.099279
    Trainable      0.17768        0.25264        0.3964
    Resourceful    -0.072334      0.031435       0.0063805
    Animal         0.16308        0.26789        0.33696
    Lucky          -0.08585       0.03721        0.22978
-----------------------------------------------------------
Overview ROW POINTS
                 Mass          Score_1       Score_2       Inertia       CntrbPnt2In_1    CntrbPnt2In_2    CntrbDim2In_1    CntrbDim2In_2
                 _______       ________      _________     _________     _____________    _____________    _____________    _____________
    Dog          0.23863       -0.59431      -0.12157      0.089486      0.3295           0.15856          0.94186          0.039411
    Cat          0.19652       -0.3256       0.079165      0.025635      0.081449         0.055371         0.81272          0.048044
    Rat          0.22965       0.27706       0.22533       0.030465      0.068913         0.52422          0.57861          0.38272
    Cockroach    0.1443        0.95997       -0.18988      0.13832       0.51987          0.23391          0.96141          0.037614
    Wallaby      0.1909        0.019153      -0.057062     0.0020661     0.00027378       0.027946         0.033895         0.30085

Overview COLUMN POINTS
                   Mass          Score_1       Score_2       Inertia       CntrbPnt2In_1    CntrbPnt2In_2    CntrbDim2In_1    CntrbDim2In_2
                   ________      ________      _________     _________     _____________    _____________    _____________    _____________
    Big            0.098259      -0.68224      -0.21116      0.050593      0.17879          0.19698          0.90397          0.086599
    Athletic       0.14711       0.54545       -0.042213     0.045062      0.1711           0.011785         0.97126          0.0058171
    Friendly       0.10668       -0.60693      -0.20525      0.04527       0.15363          0.20206          0.86806          0.099279
    Trainable      0.17799       -0.19488      0.17768       0.014176      0.026427         0.25264          0.47685          0.3964
    Resourceful    0.13363       0.89767       -0.072334     0.10958       0.42097          0.031435         0.98264          0.0063805
    Animal         0.22403       -0.2172       0.16308       0.017683      0.041317         0.26789          0.59767          0.33696
    Lucky          0.1123        0.13298       -0.08585      0.0036019     0.0077629        0.03721          0.55129          0.22978
-----------------------------------------------------------
Legend
Row scores in principal coordinates
Column scores in principal coordinates
CntrbPnt2In = relative contribution of points to explain total Inertia of the latent dimension
The sum of the numbers in a column is equal to 1
CntrbDim2In = relative contribution of latent dimension to explain total Inertia of a point
CntrbDim2In_1+CntrbDim2In_2+...+CntrbDim2In_K=1
N
— Contingency table (default) or n-by-2 input dataset.
2D array or table. 2D array, table or timetable which contains the input contingency table (say of size I-by-J) or the original data matrix X.
In the latter case N=crosstab(X(:,1),X(:,2)). By default, the procedure assumes that the input is a contingency table.
Data Types: single | double
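
As a minimal sketch of the relationship between an n-by-2 data matrix and its contingency table, the block below builds the table from a small made-up data matrix X (hypothetical values, not taken from any FSDA dataset). It uses base MATLAB accumarray, which for categories coded as consecutive positive integers should give the same counts as the crosstab call quoted above; the crosstab and CorAna calls are left as comments because they require additional toolboxes.

```matlab
% Hypothetical raw data: one row per observation, two categorical variables
% coded as consecutive integers (first variable 1..3, second variable 1..2)
X = [1 1; 1 2; 2 1; 2 1; 2 2; 3 2; 3 2; 3 1; 1 1; 2 2];

% Count how many observations fall in each (row category, column category) cell
N = accumarray(X, 1);
disp(N)

% With the Statistics and Machine Learning Toolbox the same table is
%   N = crosstab(X(:,1), X(:,2));
% and, with FSDA installed, the raw data can be passed directly as
%   out = CorAna(X, 'datamatrix', true);
```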
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'k',3, 'Lr',{'a' 'b' 'c'}, 'Lc',{'c1' 'c2' 'c3' 'c4'}, 'Sup',Sup=struct; Sup.c={'c2' 'c4'}, 'datamatrix',true, 'plots',1, 'dispresults',false, 'd1',2, 'd2',3
k
— Number of dimensions to retain. Scalar. Scalar which contains the number of dimensions to retain.
The default value of k is 2.
Example: 'k',3
Data Types: double
Lr
— Vector of row labels. Cell. Cell containing the labels of the rows of the input contingency matrix N. This option is unnecessary if N is a table, because in this case Lr=N.Properties.RowNames.
Example: 'Lr',{'a' 'b' 'c'}
Data Types: cell array of strings
Lc
— Vector of column labels. Cell. Cell containing the labels of the columns of the input contingency matrix N. This option is unnecessary if N is a table, because in this case Lc=N.Properties.VariableNames.
Example: 'Lc',{'c1' 'c2' 'c3' 'c4'}
Data Types: cell array of strings
Sup
— Structure containing indexes or names of supplementary rows or columns. Structure. Structure with the following fields.
Value | Description |
---|---|
r |
Numeric vector of row indexes, cell array of strings, table or 2D numeric array defining the supplementary rows. If indexes or a cell array of strings are supplied, the supplementary rows are assumed to belong to the contingency table N. For example: - if Sup.r=[2 5] (that is, Sup.r is a numeric vector which contains row indexes), rows 2 and 5 of the input contingency table are used as supplementary rows. - if Sup.r={'Junior-Managers' 'Senior-Employees'} (that is, Sup.r is a cell array of strings), the rows named 'Junior-Managers' and 'Senior-Employees' of the input contingency table are used as supplementary rows. In both cases the length of Sup.r must be smaller than half the number of rows of the contingency matrix. - if Sup.r is a table or a 2D array, the supplementary rows do not belong to N. If Sup.r is a table, the row labels are taken directly from the table. If Sup.r is a matrix, the names of the supplementary rows can be given in Sup.Lr as a cell array of strings. |
Lr |
cell array of strings containing the labels of the supplementary units if Sup.r is a 2D numeric array. |
c |
Numeric vector of column indexes, cell array of strings, table or 2D numeric array defining the supplementary columns. If indexes or a cell array of strings are supplied, the supplementary columns are assumed to belong to the contingency table N. For example: - if Sup.c=[2 3] (that is, Sup.c is a numeric vector which contains column indexes), columns 2 and 3 of the input contingency table are used as supplementary columns. - if Sup.c={'Smokers' 'NonSmokers'} (that is, Sup.c is a cell array of strings), the columns labeled 'Smokers' and 'NonSmokers' of the input contingency table N are used as supplementary columns. In both cases the length of Sup.c must be smaller than half the number of columns of the contingency matrix. - if Sup.c is a table or a 2D array, the supplementary columns do not belong to N. If Sup.c is a table, the column labels are taken directly from the table. If Sup.c is a matrix, the names of the supplementary columns can be given in Sup.Lc as a cell array of strings. |
Lc |
cell array of strings containing the labels of the supplementary units if Sup.c is a 2D numeric array. REMARK: The default value of Sup is a missing value, that is, we assume that there are no supplementary rows or columns. |
Example: 'Sup',Sup=struct; Sup.c={'c2' 'c4'}
Data Types: struct
datamatrix
— Data matrix or contingency table. Boolean. If datamatrix is true, the first input argument N is interpreted as a data matrix; if datamatrix is false, N is treated as a contingency table. The default value of datamatrix is false, that is, the procedure automatically considers N as a contingency table (in array or table format). If datamatrix is true, N can be an array or a table of size n-by-2. If N has more than two columns, correspondence analysis is based on the first two columns of N (and a warning is produced).
Example: 'datamatrix',true
Data Types: logical
plots
— Plot on the screen. Scalar | structure. If plots = 1, a plot of the principal coordinates of rows and columns is shown on the screen. If plots is a structure, it may contain the following fields:
Value | Description |
---|---|
alpha |
type of plot, scalar in the interval [0 1] or a string identifying the type of coordinates to use in the plot. If $plots.alpha='rowprincipal'$, the row points are in principal coordinates and the column points are in standard coordinates. Distances between row points are (approximated) chi-squared distances (row-metric-preserving). The row points are positioned at the weighted average of the column points. 'rowprincipal' can also be specified by setting plots.alpha=1. If $plots.alpha='colprincipal'$, the column points are in principal coordinates and the row points are in standard coordinates. Distances between column points are (approximated) chi-squared distances (column-metric-preserving). The column points are positioned at the weighted average of the row points. 'colprincipal' can also be specified by setting plots.alpha=0. If $plots.alpha='symbiplot'$, the row and column coordinates are scaled similarly. The sum of weighted squared coordinates for each dimension is equal to the corresponding singular value. These coordinates are often called symmetrical coordinates. This representation is particularly useful if one is primarily interested in the relationships between categories of row and column variables rather than in the distances among rows or among columns. 'symbiplot' can also be specified by setting plots.alpha=0.5. If $plots.alpha='bothprincipal'$, both the rows and columns are depicted in principal coordinates. Such a plot is often referred to as a symmetrical plot or French symmetrical model. Note that such a symmetrical plot does not provide a feasible solution in the sense that it does not approximate matrix $D_r^{-0.5}(P-rc')D_c^{-0.5}$. |
FontSize |
scalar which specifies the font size of row (column) labels. The default value is 10. |
MarkerSize |
scalar which specifies the marker size of symbols associated with rows or columns. The default value is 10. |
Example: 'plots',1
Data Types: scalar double | struct
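
The coordinate pairs behind the alpha options can be sketched with base MATLAB alone. The block below is an illustration under stated assumptions, not the internal code of CorAna: it takes the animals table from the example above, computes standard, principal and symmetrical coordinates from the SVD of the standardized residuals (using the formulas given for RowsPri, RowsSta and RowsSym in the output description), and numerically checks two statements made for alpha: with 'rowprincipal' the row points are weighted averages of the column points, and with 'symbiplot' the weighted sum of squared coordinates of each dimension equals the singular value. Only the K = min(I,J)-1 dimensions with non-zero inertia are kept.

```matlab
% Animals contingency table (rows: Dog, Cat, Rat, Cockroach, Wallaby)
N = [80 20 90 90  5 100 40
     50 40 40 70 10 100 40
     10 70 20 90 80  99 40
      0 80  2 20 95  20 40
     35 52 38 47 48  80 40];

P = N / sum(N(:));                   % correspondence matrix
r = sum(P, 2);                       % row masses (I-by-1)
c = sum(P, 1)';                      % column masses (J-by-1)
S = diag(r.^(-0.5)) * (P - r*c') * diag(c.^(-0.5));   % standardized residuals

[U, G, V] = svd(S, 'econ');
K = min(size(N)) - 1;                % dimensions with non-zero inertia
U = U(:, 1:K);  V = V(:, 1:K);  G = G(1:K, 1:K);

RowsSta = diag(r.^(-0.5)) * U;   ColsSta = diag(c.^(-0.5)) * V;   % standard
RowsPri = RowsSta * G;           ColsPri = ColsSta * G;           % principal
RowsSym = RowsSta * sqrt(G);     ColsSym = ColsSta * sqrt(G);     % symmetrical

% alpha = 'rowprincipal' (alpha = 1): plot RowsPri together with ColsSta.
% Row points are the row-profile-weighted averages of the column points.
ProfilesRows = diag(1 ./ r) * P;
errRow = max(max(abs(RowsPri - ProfilesRows * ColsSta)));

% alpha = 'symbiplot' (alpha = 0.5): plot RowsSym together with ColsSym.
% Mass-weighted sums of squared coordinates equal the singular values.
errSym = max(max(abs(RowsSym' * diag(r) * RowsSym - G)));

fprintf('rowprincipal check: %g, symbiplot check: %g\n', errRow, errSym);
```

Both printed discrepancies should be at the level of floating-point rounding. For alpha = 'colprincipal' the roles are swapped (ColsPri with RowsSta), and for 'bothprincipal' RowsPri is plotted with ColsPri.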
dispresults
— Display results on the screen. Boolean. If dispresults is true (default), all the summary results of the analysis are displayed on the screen.
Example: 'dispresults',false
Data Types: Boolean
d1
— Dimension to show on the horizontal axis. Positive integer. Positive integer in the range 1, 2, ..., K which indicates the dimension to show on the x axis. The default value of d1 is 1.
Example: 'd1',2
Data Types: single | double
d2
— Dimension to show on the vertical axis. Positive integer. Positive integer in the range 1, 2, ..., K which indicates the dimension to show on the y axis. The default value of d2 is 2.
Example: 'd2',3
Data Types: single | double
out
— description
Structure. A structure containing the following fields:
Value | Description |
---|---|
Lr |
cell of length $I$ containing the labels of active rows (i.e. the rows which participated to the fit). |
Lc |
cell of length $J$ containing the labels of active columns (i.e. the columns which participated to the fit). |
N |
$I$-by-$J$-array containing contingency table referred to active rows and active columns (i.e. referred to the rows/columns which participated to the fit). The $(i,j)$-th element is equal to $n_{ij}$, $i=1, 2, \ldots, I$ and $j=1, 2, \ldots, J$. The sum of the elements of out.N is $n$ (the grand total). |
Ntable |
Same as out.N but in table format (with row and column names). This output is present only if your MATLAB version is R2013b or later. |
I |
Number of active rows of contingency table. |
J |
Number of active columns of contingency table. |
n |
Grand total. out.n is equal to sum(sum(out.N)). This is the number of observations. |
Nhat |
$I$-by-$J$-array containing contingency table referred to active rows (i.e. referred to the rows which participated to the fit) under the independence hypothesis. The $(i,j)$-th element is equal to $n_{i.}n_{.j}/n$, $i=1, 2, \ldots, I$ and $j=1, 2, \ldots, J$. The sum of the elements of out.Nhat is $n$ (the grand total). |
Nhattable |
Same as out.Nhat but in table format (with row and column names). |
P |
$I$-by-$J$-array containing correspondence matrix (proportions). The $(i,j)$-th element is equal to $n_{ij}/n$, $i=1, 2, \ldots, I$ and $j=1, 2, \ldots, J$. The sum of the elements of out.P is 1. |
Ptable |
Same as out.P but in table format (with row and column names). This output is present only if your MATLAB version is R2013b or later. |
r |
Vector of length $I$ containing row masses. \[ r=(f_{1.}, f_{2.}, \ldots, f_{I.})' \] $r$ is also the centroid of column profiles. |
Dr |
Square matrix of size $I$ containing on the diagonal the row masses. This is matrix $D_r$. \[ D_r=diag(r) \] |
c |
Vector of length $J$ containing column masses. \[ c=(f_{.1}, f_{.2}, \ldots, f_{.J})' \] $c$ is also the centroid of row profiles. |
Dc |
Square matrix of size $J$ containing on the diagonal the column masses. This is matrix $D_c$. \[ D_c=diag(c) \] |
ProfilesRows |
$I$-by-$J$-matrix containing row profiles. The $i,j$-th element of this matrix is given by $f_{ij}/f_{i.}=n_{ij}/n_{i.}$. Written in matrix form: \[ ProfilesRows = D_r^{-1} \times P \] |
ProfilesCols |
$I$-by-$J$-matrix containing column profiles. The $i,j$-th element of this matrix is given by $f_{ij}/f_{.j}=n_{ij}/n_{.j}$. Written in matrix form: \[ ProfilesCols = P \times D_c^{-1} \] |
K |
Scalar integer containing the maximum number of dimensions. $K = \min(I-1,J-1)$. |
k |
Scalar integer containing the number of retained dimensions. |
Residuals |
$I$-by-$J$-matrix containing standardized residuals. \[ Residuals = D_r^{1/2} (D_r^{-1} P - 1 c') D_c^{-1/2} = D_r^{-1/2} (P - r c') D_c^{-1/2} \] where $1$ is a column vector of $I$ ones (so that $D_r 1 = r$). With the singular value decomposition (SVD) we obtain that: \[ Residuals = U \Gamma V' \] |
TotalInertia |
Scalar containing total inertia. Total inertia is equal (for example) to the sum of the squares of the elements of matrix out.Residuals. |
Chi2stat |
Scalar containing Chi-square statistic for the contingency table. $Chi2stat= TotalInertia \times n$. |
CramerV |
Scalar containing Cramer's $V$ index. \[ V=\sqrt{Chi2stat/(n (\min(I,J)-1))} \] Cramer's index goes between 0 and 1. |
InertiaExplained |
matrix with 4 columns. - First column contains the singular values (the sum of the squared singular values is the total inertia). - Second column contains the eigenvalues (the sum of the eigenvalues is the total inertia). - Third column contains the variance explained by each latent dimension. - Fourth column contains the cumulative variance explained by each dimension. |
RowsPri |
$I$-by-$K$ matrix containing principal coordinates of rows. \[ RowsPri = D_r^{-1/2} \times U \times \Gamma \] |
ColsPri |
$J$-by-$K$ matrix containing principal coordinates of columns. \[ ColsPri = D_c^{-1/2} \times V \times \Gamma \] |
RowsSta |
$I$-by-$K$ matrix containing standard coordinates of rows. \[ RowsSta = RowsPri \times \Gamma^{-1} = D_r^{-1/2} U \Gamma \Gamma^{-1}= D_r^{-1/2} U \] |
ColsSta |
$J$-by-$K$ matrix containing standard coordinates of columns. \[ ColsSta = ColsPri \times \Gamma^{-1} = D_c^{-1/2} V \Gamma \Gamma^{-1}= D_c^{-1/2} V \] |
RowsSym |
$I$-by-$K$ matrix containing symmetrical coordinates of rows. \[ RowsSym = D_r^{-1/2} \times U \times \Gamma^{1/2} \] |
ColsSym |
$J$-by-$K$ matrix containing symmetrical coordinates of columns. \[ ColsSym = D_c^{-1/2} \times V \times \Gamma^{1/2} \] The symmetric plot represents the row and column profiles simultaneously in a common space (Bendixen, 2003). In this case, only the distances between row points or the distances between column points can really be interpreted. The distance between a row point and a column point is not meaningful: only general statements about the observed pattern can be made. In order to interpret the distance between column and row points, the column profiles must be presented in row space or vice versa. This type of map is called an asymmetric biplot. |
InertiaRows |
$I$-by-$2$ matrix containing absolute and relative contribution of each row to total inertia. The inertia of a point is the squared distance of point $d_i^2$ to the centroid. The absolute contribution of a point to total inertia is the inertia of the point multiplied by the point mass. 1st column = absolute contribution of each row to TotalInertia. The sum of values of the first column is equal to TotalInertia; 2nd column = relative contribution of each row to TotalInertia. The sum of the values of the second column is equal to 1. |
InertiaCols |
$J$-by-$2$ matrix containing absolute and relative contribution of each column to total inertia. The inertia of a point is the squared distance of point $d_i^2$ to the centroid. The absolute contribution of a point to total inertia is the inertia of the point multiplied by the point mass. 1st column = absolute contribution of each column to TotalInertia. The sum of values of the first column is equal to TotalInertia; 2nd column = relative contribution of each column to TotalInertia. The sum of values of the second column is equal to 1. |
Point2InertiaRows |
$I$-by-$K$ matrix containing relative contributions of rows to inertia of the dimension. The inertia of the first latent dimension is given by $\lambda_1=\gamma_{11}^2$. The inertia of the second latent dimension is given by $\lambda_2=\gamma_{22}^2$, and so on. The sum of each column of matrix Point2InertiaRows is equal to 1. Remark: the points with the larger values of Point2Inertia are those which contribute the most to the definition of the dimension. If the row contributions were uniform, the expected value would be 1/size(contingency_table,1). For a given dimension, any row with a contribution larger than this threshold could be considered as important in contributing to that dimension. |
Point2InertiaCols |
$J$-by-$K$ matrix containing relative contributions of columns to inertia of the dimension. The sum of each column of matrix Point2InertiaCols is equal to 1. |
Dim2InertiaRows |
$I$-by-$K$ matrix containing relative contributions of latent dimensions to inertia of the row points. These numbers can be interpreted as squared correlations and measure the degree of association between row points and a particular axis. The sum of each row of matrix Dim2InertiaRows is equal to 1. |
Dim2InertiaCols |
$J$-by-$K$ matrix containing relative contributions of latent dimensions to inertia of the column points. These numbers can be interpreted as squared correlations and measure the degree of association between column points and a particular axis. The sum of each row of matrix Dim2InertiaCols is equal to 1. |
cumsumDim2InertiaRows |
$I$-by-$K$ matrix containing cumulative sum of the contributions of latent dimensions to inertia of the row points. These cumulative sums are equivalent to the communalities in PCA. The last column of matrix cumsumDim2InertiaRows is equal to 1. |
cumsumDim2InertiaCols |
$J$-by-$K$ matrix containing cumulative sum of the contributions of latent dimensions to inertia of the column points. These cumulative sums are equivalent to the communalities in PCA. The last column of matrix cumsumDim2InertiaCols is equal to 1. |
sqrtDim2InertiaRows |
$I$-by-$K$ matrix containing correlation of row points with latent dimension axes. Similar to component loadings in PCA. |
sqrtDim2InertiaCols |
$J$-by-$K$ matrix containing correlation of column points with latent dimension axes. Similar to component loadings in PCA. |
Summary |
$K$-times-4 table containing summary results for correspondence analysis. First column contains the singular values (the sum of the squared singular values is the total inertia). Second column contains the eigenvalues (the sum of the eigenvalues is the total inertia). Third column contains the variance explained by each latent dimension. Fourth column contains the cumulative variance explained by each dimension. This output is present only if your MATLAB version is R2013b or later. |
OverviewRows |
$I$-times-(k*3+2) table containing an overview of row points. More precisely, if we suppose that $k=2$: First column contains the row masses (vector $r$). Second column contains the scores of first dimension. Third column contains the scores of second dimension. Fourth column contains the inertia of each point, where inertia of point is the squared distance of point $d_i^2$ to the centroid. Fifth column contains the relative contribution of each point to the explanation of the inertia of the first dimension. The sum of the elements of this column is equal to 1. Sixth column contains the relative contribution of each point to the explanation of the inertia of the second dimension. The sum of the elements of this column is equal to 1. Seventh column contains the relative contribution of the first dimension to the explanation of the inertia of the point. Eighth column contains the relative contribution of the second dimension to the explanation of the inertia of the point. |
OverviewCols |
$J$-times-(k*3+2) table containing an overview of column points. More precisely, if we suppose that $k=2$: First column contains the column masses (vector $c$). Second column contains the scores of first dimension. Third column contains the scores of second dimension. Fourth column contains the inertia of each point, where inertia of point is the squared distance of point $d_i^2$ to the centroid. Fifth column contains the relative contribution of each point to the explanation of the inertia of the first dimension. The sum of the elements of this column is equal to 1. Sixth column contains the relative contribution of each point to the explanation of the inertia of the second dimension. The sum of the elements of this column is equal to 1. Seventh column contains the relative contribution of the first dimension to the explanation of the inertia of the point. Eighth column contains the relative contribution of the second dimension to the explanation of the inertia of the point. |
LrSup |
cell containing the labels of the supplementary rows (i.e. the rows which did not participate to the fit). |
LcSup |
cell containing the labels of supplementary columns (i.e. the columns which did not participate to the fit). |
SupRowsN |
matrix of size length(LrSup)-by-c referred to supplementary rows. If there are no supplementary rows this field is not present. |
SupRowsNtable |
Same as out.SupRowsN but in table format (with row and column names). This is the contingency table referred to supplementary rows. If there are no supplementary rows this field is not present. This output is present only if your MATLAB version is R2013b or later. |
SupColsN |
matrix of size r-by-length(LcSup) referred to supplementary columns. If there are no supplementary columns this field is not present. |
SupColsNtable |
Same as out.SupColsN but in table format (with row and column names). This is the contingency table referred to supplementary columns. If there are no supplementary columns this field is not present. This output is present only if your MATLAB version is R2013b or later. |
RowsPriSup |
Principal coordinates of supplementary rows. If there are no supplementary rows this field is not present. |
RowsStaSup |
Standard coordinates of supplementary rows. If there are no supplementary rows this field is not present. |
RowsSymSup |
Symmetrical coordinates of supplementary rows. If there are no supplementary rows this field is not present. |
ColsPriSup |
Principal coordinates of supplementary columns. If there are no supplementary columns this field is not present. |
ColsStaSup |
Standard coordinates of supplementary columns. If there are no supplementary columns this field is not present. |
ColsSymSup |
Symmetrical coordinates of supplementary columns. If there are no supplementary columns this field is not present. |
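
The main output fields can be reproduced from the formulas in the table with a few lines of base MATLAB. The following is a minimal sketch, assuming no supplementary points and using the animals table from the examples; it is not the code of CorAna itself. The variable names mirror the field names for readability, and the two contribution matrices use the standard correspondence analysis definitions (mass times squared principal coordinate divided by the inertia of the dimension, and squared principal coordinate divided by the squared distance to the centroid), which is an assumption consistent with the column and row sums stated in the table.

```matlab
% Animals contingency table (rows: Dog, Cat, Rat, Cockroach, Wallaby)
N = [80 20 90 90  5 100 40
     50 40 40 70 10 100 40
     10 70 20 90 80  99 40
      0 80  2 20 95  20 40
     35 52 38 47 48  80 40];
[I, J] = size(N);
n = sum(N(:));                        % grand total
P = N / n;                            % correspondence matrix
r = sum(P, 2);                        % row masses
c = sum(P, 1)';                       % column masses

% Standardized residuals and the quantities derived from them
Residuals    = diag(r.^(-0.5)) * (P - r*c') * diag(c.^(-0.5));
TotalInertia = sum(Residuals(:).^2);
Chi2stat     = TotalInertia * n;
CramerV      = sqrt(Chi2stat / (n * (min(I, J) - 1)));

% SVD, keeping the K dimensions with non-zero inertia
K = min(I-1, J-1);
[U, Gam, V] = svd(Residuals, 'econ');
U = U(:, 1:K);  V = V(:, 1:K);
gam = diag(Gam);  gam = gam(1:K);     % singular values
lambda = gam.^2;                      % eigenvalues (inertia per dimension)
InertiaExplained = [gam, lambda, lambda/TotalInertia, cumsum(lambda)/TotalInertia];

% Principal coordinates
RowsPri = diag(r.^(-0.5)) * U * diag(gam);
ColsPri = diag(c.^(-0.5)) * V * diag(gam);

% Contributions of points to dimensions (each column sums to 1)
Point2InertiaRows = diag(r) * RowsPri.^2 * diag(1 ./ lambda);
% Contributions of dimensions to points (each row sums to 1)
Dim2InertiaRows = diag(1 ./ sum(RowsPri.^2, 2)) * RowsPri.^2;

disp(InertiaExplained)
disp([r, RowsPri(:, 1:2), Point2InertiaRows(:, 1:2), Dim2InertiaRows(:, 1:2)])
```

The first display should match the Summary table printed for the animals example, and the second the corresponding columns of Overview ROW POINTS, up to the sign of each dimension, which the SVD leaves arbitrary.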
Benzecri, J.-P. (1992), "Correspondence Analysis Handbook", New-York, Dekker.
Benzecri, J.-P. (1980), "L'analyse des donnees tome 2: l'analyse des correspondances", Paris, Bordas.
Greenacre, M.J. (1993), "Correspondence Analysis in Practice", London, Academic Press.
Gabriel, K.R. and Odoroff, C. (1990), Biplots in biomedical research, "Statistics in Medicine", Vol. 9, pp. 469-485.
Greenacre, M.J. (1993), Biplots in correspondence Analysis, "Journal of Applied Statistics", Vol. 20, pp. 251-269.
Riani, M, Atkinson A.C., Torti, F., Corbellini A. (2023), Robust Correspondence Analysis, "Journal of the Royal Statistical Society Series C: Applied Statistics", Vol. 71, pp. 1381–1401, https://doi.org/10.1111/rssc.12580
This function has been inspired by the code developed by: Urbano Lorenzo-Seva (Rovira i Virgili University, Tarragona, Spain), Michel van de Velden (Erasmus University, Rotterdam, The Netherlands), and Henk A.L. Kiers (University of Groningen, Groningen, The Netherlands) (See References).
crosstab | rcontFS | CressieRead | CorAnaplot