restrdeter computes determinant restriction
restrdeter restricts the determinants according to the constraint specified in restr. This function is called in every concentration step of function tclust when a determinant restriction is needed.
Example using all default options.
out=restrdeter(eigenvalues,niini,restr)
Determinant restriction when an eigenvalue is 0.
out=restrdeter(eigenvalues,niini,restr,tol)
Suppose v=3 and k=4, so the matrix containing the eigenvalues is 3-by-4.
First column of matrix eigenvalues contains the eigenvalues of the first group.
Second column of matrix eigenvalues contains the eigenvalues of the second group.
Third column of matrix eigenvalues contains the eigenvalues of the third group.
Fourth column of matrix eigenvalues contains the eigenvalues of the fourth group.
rng(10,'twister')
eigenvalues=abs(10*randn(3,4));
% niini is the column vector containing the sizes of the 4 groups
niini=[30;40;20;10];
out=restrdeter(eigenvalues,niini,1.1)
disp('Input matrix of unrestricted eigenvalues')
disp(eigenvalues)
disp('Output matrix of restricted eigenvalues which satisfy determinant constraint')
disp(out)
disp('Ratio between largest and smallest determinant')
disp(max(prod(eigenvalues))/min(prod(eigenvalues)))
disp('Ratio between largest and smallest restricted determinants')
disp(max(prod(out))/min(prod(out)))
out =
    4.7173   50.0736    7.6327    3.7940
   10.7021    0.2257    3.8281    3.6133
    2.8139   11.4263    4.8620   10.3628
Input matrix of unrestricted eigenvalues
    6.4581   11.1831   11.6942    7.3428
   14.6513    0.0504    5.8652    6.9930
    3.8523    2.5519    7.4491   20.0559
Output matrix of restricted eigenvalues which satisfy determinant constraint
    4.7173   50.0736    7.6327    3.7940
   10.7021    0.2257    3.8281    3.6133
    2.8139   11.4263    4.8620   10.3628
Ratio between largest and smallest determinant
  715.8613
Ratio between largest and smallest restricted determinants
    1.1000
Suppose 5 variables and six groups
av=abs(randn(5,6));
% The third eigenvalue of the second group is set to 0
av(3,2)=0;
% Maximum ratio among determinants must be equal to 1.6.
restr=1.6;
% Group sizes
niini=[30;40;20;10;50;100];
disp('Original values of the determinants')
disp(prod(av))
% Apply the restriction
a=restrdeter(av,niini,restr);
disp('Restricted eigenvalues which satisfy determinant constraint')
disp(a)
disp('Values of restricted determinants')
disp(prod(a))
disp('Maximum value of ratio among determinants')
disp(max(prod(a))/min(prod(a)))
Original values of the determinants
    0.0055         0    0.0010    0.0000    0.1825    0.0432
Restricted eigenvalues which satisfy determinant constraint
    1.3868   53.0992    0.6626    0.0566    0.6163    1.2339
    0.0670   61.2269    0.5191    0.6177    0.3601    1.0498
    0.6905    0.0000    0.4029    1.3884    0.6211    0.1682
    0.1217    7.6070    0.1578    0.7623    0.3348    0.1529
    1.1867   61.2269    0.4240    0.2507    0.3214    0.4454
Values of restricted determinants
    0.0093    0.0093    0.0093    0.0093    0.0148    0.0148
Maximum value of ratio among determinants
    1.6000
Suppose 3 variables and six groups
av=abs(randn(3,6));
% Maximum ratio among determinants must be equal to 1.6.
restr=1.6;
% Group sizes
niini=[30;40;20;10;50;100];
% Apply the restriction using a tolerance of 1e-12 and use MATLAB
% function repmat for the computations
tol=1e-12;
repm=1;
a=restrdeter(av,niini,restr,tol,repm);
Two variables and five groups.
av=abs(randn(2,5));
restr=1.6;
% Group sizes
niini=[30;40;20;10;50];
% All eigenvalues of the second group are set to 0
av(:,2)=0;
a=restrdeter(av,niini,restr);
disp('Maximum value of ratio among determinants')
disp(max(prod(a))/min(prod(a)))

% Group sizes
niini=[30;40;20;10;50];
av=abs(randn(2,5));
% All eigenvalues of the second and third groups are set to 0
av(:,2:3)=0;
restr=1.6;
a=restrdeter(av,niini,restr);
disp('Maximum value of ratio among determinants')
disp(max(prod(a))/min(prod(a)))
eigenvalues
— Eigenvalues.
Matrix. v-by-k matrix containing the eigenvalues of the covariance matrices of the k groups.
v is the number of variables of the dataset to be clustered.
Data Types: single | double
niini
— Cluster size.
Column vector. k-by-1 vector containing the sizes of the k clusters.
Data Types: single | double
restr
— Restriction factor.
Scalar (default) or vector of length 2. If restr is a scalar it defines the maximum ratio of the determinants which is allowed. In other words, we impose the constraint on the covariance matrices: \[ \frac{\max_{j=1,...,k} |\Sigma_j|}{\min_{j=1,...,k} |\Sigma_j|} \leq restr \] where $restr \geq 1$. In this case the "shape" constraint (as defined below) applied to each group is fixed to $c_{shape}=10^{10}$, to ensure the procedure is (virtually) affine equivariant. In other words, the decomposition of the $j$-th scatter matrix $\Sigma_j$ is \[ \Sigma_j=\lambda_j^{1/v} \Omega_j \Gamma_j \Omega_j' \] where $\Omega_j$ is an orthogonal matrix of eigenvectors, $\Gamma_j$ is a diagonal matrix with $|\Gamma_j|=1$ and with elements $\{\gamma_{j1},...,\gamma_{jv}\}$ in its diagonal (proportional to the eigenvalues of the $\Sigma_j$ matrix) and $|\Sigma_j|=\lambda_j$. The $\Gamma_j$ matrices are commonly known as "shape" matrices, because they determine the shape of the fitted cluster components. The following $k$ constraints are then imposed on the shape matrices: \[ \frac{\max_{l=1,...,v} \gamma_{jl}}{\min_{l=1,...,v} \gamma_{jl}}\leq c_{shape}, \text{ for } j=1,...,k. \]
The particular case $restr=1$ forces all determinants of the scatter matrices to be equal i.e. $|\Sigma_1|=...= |\Sigma_k|$.
If $restr$ is a vector of length 2, the second element refers to $c_{shape}$ of the previous equation. In other words, for example if $restr=[3, 10]$ we impose the $k+1$ constraints
\[ \frac{\max_{j=1,...,k} |\Sigma_j|}{\min_{j=1,...,k} |\Sigma_j|} \leq restr(1)=3 \] and \[ \frac{\max_{l=1,...,v} \gamma_{jl}}{\min_{l=1,...,v} \gamma_{jl}} \leq restr(2)=10, \text{ for } j=1,...,k. \] Different constrained clustering problems can be defined when varying $restr(1)$ and $restr(2)$. In particular, we are ideally searching for spherical clusters when $restr(2)=1$.
Models with variable volume and spherical clusters are handled with $1<restr(1)<\infty$ and $restr(2)=1$. The $restr(1)=restr(2)=1$ case often yields a very constrained parametrization because it implies spherical clusters with equal volumes.
Data Types: single | double
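The quantities constrained by restr can be computed directly from a matrix of eigenvalues. The following is a minimal sketch using only built-in MATLAB functions; it does not call restrdeter, and the eigenvalue matrix and the variable names lambda, gam, detRatio and shapeRatio are illustrative assumptions rather than part of the function's interface.

```matlab
% Illustrative eigenvalues for v=2 variables and k=3 groups
% (one column per group); these values are made up for the sketch.
eigenvalues = [4 1    9;
               1 0.25 1];
v = size(eigenvalues,1);

% |Sigma_j| = lambda_j is the product of the eigenvalues of group j
lambda = prod(eigenvalues,1);

% Diagonal elements gamma_{jl} of the shape matrices: eigenvalues
% rescaled by lambda_j^(1/v) so that each column has product 1
gam = eigenvalues ./ repmat(lambda.^(1/v), v, 1);

% Ratio bounded by restr (or restr(1) when restr has length 2)
detRatio = max(lambda)/min(lambda);

% Per-group ratios bounded by c_shape (restr(2) when restr has length 2)
shapeRatio = max(gam,[],1)./min(gam,[],1);

disp(detRatio)
disp(shapeRatio)
```

With these input values the determinants are 4, 0.25 and 9, so a scalar restriction such as restr=1.6 would not be satisfied by the unrestricted eigenvalues.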
tol
— Tolerance.
Scalar defining the tolerance of the procedure. The default value is 1e-8.
Example: 'tol',[1e-18]
Data Types: double
userepmat
— Use built-in repmat.
Scalar. If userepmat is true, function repmat is used instead of bsxfun inside the procedure.
Remark: repmat is built in from MATLAB 2013b, so it is faster to use repmat if the current version of MATLAB is later than 2013a.
Example: 'userepmat',1
Data Types: double
out
— Restricted eigenvalues which satisfy the determinant constraint.
Matrix. v-by-k matrix containing restricted eigenvalues.
The ratio between the largest and the smallest determinant (each determinant being the product of a column of matrix out) is not greater than restr.
Fritz, H., Garcia-Escudero, L.A. and Mayo-Iscar, A. (2013), A fast algorithm for robust constrained clustering, "Computational Statistics and Data Analysis", Vol. 61, pp. 124-136.