Semisupervised Classification With Cluster Regularization


The Problem

Semisupervised classification (SSC) learns from labeled data together with cheap unlabeled data to predict the labels of test instances. To make use of the information in unlabeled data, one must assume a relationship between the true class structure and the data distribution. One such assumption, the cluster assumption, is that data points clustered together are likely to have the same class label. In this paper, we propose a new algorithm for SSC, cluster-based regularization (ClusterReg), which takes the partition given by a clustering algorithm as a regularization term in the loss function of an SSC classifier. ClusterReg thus makes predictions according to the cluster structure together with the limited labeled data.
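To make the idea of a cluster-based regularization term concrete, here is a minimal sketch in plain Python. It is an illustration only, not the loss function defined in the paper and not code from the Matlab toolbox. The function name cluster_regularized_loss, the squared-difference penalty, the hard cluster assignments, and the weight lam are all assumptions chosen for simplicity. The loss adds a supervised cross-entropy term on the few labeled points to a penalty that grows when two points in the same cluster receive different predictions.

```python
import math


def cluster_regularized_loss(probs, labels, clusters, lam=1.0):
    """Toy loss: supervised cross-entropy plus a cluster-consistency penalty.

    probs    -- list of per-point class-probability lists (classifier outputs)
    labels   -- dict mapping the index of each labeled point to its class index
    clusters -- list of cluster ids, one per point (from any clustering algorithm)
    lam      -- hypothetical weight of the cluster penalty
    """
    # Supervised term: mean cross-entropy over the labeled points only.
    supervised = -sum(math.log(probs[i][y]) for i, y in labels.items()) / len(labels)

    # Cluster term: mean squared difference between the predictions of
    # every pair of points that the clustering placed in the same cluster.
    penalty, pairs = 0.0, 0
    n = len(probs)
    for i in range(n):
        for j in range(i + 1, n):
            if clusters[i] == clusters[j]:
                penalty += sum((a - b) ** 2 for a, b in zip(probs[i], probs[j]))
                pairs += 1
    cluster_term = penalty / pairs if pairs else 0.0

    return supervised + lam * cluster_term


# Four points in two clusters; only points 0 and 2 are labeled.
clusters = [0, 0, 1, 1]
labels = {0: 0, 2: 1}

# Predictions that agree within each cluster ...
consistent = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
# ... and predictions that disagree within each cluster.
inconsistent = [[0.9, 0.1], [0.1, 0.9], [0.1, 0.9], [0.9, 0.1]]

loss_consistent = cluster_regularized_loss(consistent, labels, clusters)
loss_inconsistent = cluster_regularized_loss(inconsistent, labels, clusters)
```

Both sets of predictions fit the two labeled points equally well, so the supervised term cannot tell them apart. Only the cluster term separates them, which is how the unlabeled points influence the classifier in this sketch.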

 

Matlab toolbox

This Matlab toolbox implements the ClusterReg algorithm, a multi-class semi-supervised classifier based on cluster regularization. The implementation has been tested on the Linux x86_64 platform. The mex files used by self-tuned spectral clustering must be recompiled for your specific platform; the other clustering algorithms are mex-free. Experiments confirmed that ClusterReg has good generalization ability on real-world problems. Its performance is excellent when the data follow the cluster assumption, and even when the clusters have misleading overlaps, it still outperforms other state-of-the-art algorithms.

 

Code available for download

The Matlab toolbox is released under an open-source license and is available at the following link:

Download the Matlab Package

 

References


Rodrigo G. F. Soares, Huanhuan Chen, and Xin Yao. Semi-supervised Classification with Cluster Regularisation. IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 11, pp. 1779-1792, November 2012.

 

 

 

All Matlab code on this page is published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.
