RandIndexFS calculates Rand-type indices to compare two partitions
Suppose we want to compare two partitions summarized by the contingency table $T=[n_{ij}]$, where $i=1, 2, ..., r$, $j=1, 2, ..., c$, and $n_{ij}$ denotes the number of data points that are in cluster $i$ in the first partition and in cluster $j$ in the second partition. Let A denote the number of pairs of data points on which the two partitions agree, that is, pairs that are either put into the same cluster by both partitions or put into different clusters by both partitions. Conversely, let D denote the number of pairs on which the partitions disagree, that is, pairs that are put into the same cluster by one partition but into different clusters by the other. Every pair falls into exactly one of these two groups, so A+D=totcomp, the total number of comparisons; with $n=\sum_{i,j} n_{ij}$ data points, totcomp $=n(n-1)/2$.
We can measure the agreement by the Rand index, A/(A+D) = A/totcomp, which is invariant with respect to permutations of the columns or rows of T.
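As an illustration of the definitions above, the following is a minimal Python sketch, not the RandIndexFS implementation, that computes A, D and the Rand index directly from a contingency table. The function name `rand_index` and the list-of-rows input format are assumptions made for this example only.

```python
from math import comb

def rand_index(table):
    """Rand index A / (A + D) for a contingency table given as a list of rows.

    Hypothetical helper for illustration; it is not part of RandIndexFS.
    """
    row_sums = [sum(row) for row in table]        # cluster sizes, first partition
    col_sums = [sum(col) for col in zip(*table)]  # cluster sizes, second partition
    n = sum(row_sums)
    totcomp = comb(n, 2)                          # total number of pairs

    # Pairs placed in the same cluster by both partitions.
    same_both = sum(comb(nij, 2) for row in table for nij in row)
    # Pairs placed in the same cluster by the first / second partition.
    same_first = sum(comb(a, 2) for a in row_sums)
    same_second = sum(comb(b, 2) for b in col_sums)

    # D: together in one partition but separated in the other.
    D = (same_first - same_both) + (same_second - same_both)
    A = totcomp - D
    return A / totcomp

# Labels c1 = [1, 1, 2, 2, 2] and c2 = [1, 1, 1, 2, 2] give this table.
T = [[2, 0],
     [1, 2]]
print(rand_index(T))                 # 0.6  (A = 6, D = 4, totcomp = 10)
print(rand_index([[1, 2], [2, 0]]))  # 0.6  (rows swapped: same value)
```

Swapping the rows of the table leaves the result unchanged, which is the permutation invariance mentioned above.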
The index has to be corrected for agreement by chance if the sizes of the clusters are not uniform (which is usually the case). Since the Rand index lies between 0 and 1, its expected value (which is not constant, but depends on the cluster sizes) must be greater than or equal to 0. The adjusted Rand index, on the other hand, has expected value zero, and its maximum value is still 1; it becomes negative when the agreement is lower than expected by chance. Hence, the adjusted Rand index can take on a wider range of values, which increases the sensitivity of the index. The formula of the adjusted Rand index (AR) is given below
\[ AR = \frac{\mbox{RI} - \mbox{Expected value of RI}}{\mbox{Max Index} - \mbox{Expected value of RI}} \]
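To make the ratio concrete, one common way to evaluate it is the pair-count form of Hubert and Arabie (1985), which assumes that the expected value is taken with the row and column totals of $T$ held fixed. Writing $a_i$ and $b_j$ for the row and column totals of $T$ and $n$ for the total number of data points (notation introduced here for the example), it reads

\[ AR = \frac{\sum_{i,j}\binom{n_{ij}}{2} - \sum_i\binom{a_i}{2}\sum_j\binom{b_j}{2}\Big/\binom{n}{2}}{\frac{1}{2}\left[\sum_i\binom{a_i}{2}+\sum_j\binom{b_j}{2}\right] - \sum_i\binom{a_i}{2}\sum_j\binom{b_j}{2}\Big/\binom{n}{2}} \]

For fixed totals the Rand index is a linear function of $\sum_{i,j}\binom{n_{ij}}{2}$, so this ratio equals the one written in terms of RI. The Python sketch below follows this form; the function name `adjusted_rand_index` is hypothetical, and the code is an illustration rather than the RandIndexFS implementation.

```python
from math import comb

def adjusted_rand_index(table):
    """Adjusted Rand index for a contingency table given as a list of rows.

    Illustrative sketch of the Hubert and Arabie (1985) pair-count form.
    The degenerate case in which the denominator is zero (for example, both
    partitions consist of a single cluster) is not handled here.
    """
    row_sums = [sum(row) for row in table]        # a_i
    col_sums = [sum(col) for col in zip(*table)]  # b_j
    n = sum(row_sums)
    totcomp = comb(n, 2)

    index = sum(comb(nij, 2) for row in table for nij in row)
    same_first = sum(comb(a, 2) for a in row_sums)
    same_second = sum(comb(b, 2) for b in col_sums)

    expected = same_first * same_second / totcomp  # chance-level value of the index
    max_index = (same_first + same_second) / 2     # value reached under perfect agreement
    return (index - expected) / (max_index - expected)

T = [[2, 0],
     [1, 2]]
print(round(adjusted_rand_index(T), 4))                 # 0.1667
print(round(adjusted_rand_index([[2, 0], [0, 3]]), 4))  # 1.0 (identical partitions)
```

For this table the unadjusted Rand index is 0.6, while the adjusted value is about 0.17, which shows how much of the raw agreement is attributable to chance.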
RandIndexFS with the two vectors as input.
AR=RandIndexFS(c1,c2,noisecluster)
Hubert L. and Arabie P. (1985), Comparing Partitions, "Journal of Classification", Vol. 2, pp. 193-218.
This function follows the lines of MATLAB code developed by David Corney (2000) D.Corney@cs.ucl.ac.uk