One-class classification

In machine learning, one-class classification, also known as unary classification, tries to identify objects of a specific class among all objects by learning from a training set that contains only objects of that class. This differs from, and is more difficult than, the traditional classification problem, which tries to distinguish between two or more classes using a training set that contains objects from all the classes. An example is classifying the operational status of a nuclear plant as 'normal':[1] in this scenario there are (fortunately) few or no examples of catastrophic system states, so only the statistics of normal operation are known. The term one-class classification was coined by Moya & Hush (1996),[2] and many applications can be found in the scientific literature, for example outlier detection, anomaly detection, and novelty detection.
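
The idea of learning only from examples of the target class can be illustrated with a minimal Python sketch. It is not the method of any cited work: it assumes a simple per-feature Gaussian description of the "normal" class and accepts a new object when its standardized distance from the training mean falls under a threshold chosen from the training data. The function names, the two features (a temperature and a pressure reading), and the 95% quantile are hypothetical choices made for illustration.

```python
import math
import random


def fit_one_class(samples, quantile=0.95):
    """Fit a description of the target class from target-class samples only.

    Assumption: features are modeled independently by their mean and
    standard deviation; this is an illustrative choice, not a standard.
    """
    n = len(samples)
    dim = len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(dim)]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    stds = [
        math.sqrt(sum((s[j] - means[j]) ** 2 for s in samples) / n) or 1.0
        for j in range(dim)
    ]

    def distance(x):
        # Standardized Euclidean distance from the training mean.
        return math.sqrt(
            sum(((x[j] - means[j]) / stds[j]) ** 2 for j in range(dim))
        )

    # Accept roughly `quantile` of the training samples as "target".
    dists = sorted(distance(s) for s in samples)
    threshold = dists[min(n - 1, int(quantile * n))]
    return distance, threshold


def is_target(model, x):
    """Return True when x is accepted as a member of the target class."""
    distance, threshold = model
    return distance(x) <= threshold


# Hypothetical training data: readings recorded during normal operation only.
random.seed(0)
normal_states = [
    (random.gauss(300.0, 5.0), random.gauss(15.0, 0.5)) for _ in range(500)
]
model = fit_one_class(normal_states)
```

With this model, `is_target(model, (300.0, 15.0))` accepts a typical reading, while a reading far outside the training range, such as `(400.0, 25.0)`, is rejected, even though no abnormal example was ever seen during training.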

A similar problem is PU learning, in which a binary classifier is learned in a semi-supervised way from only positive and unlabeled samples.[3]

PU learning

In PU learning, two sets of samples are assumed to be available for training: the positive set P and a mixed set U, which is assumed to contain both positive and negative samples, but without these being labeled as such. This contrasts with other forms of semi-supervised learning, where it is assumed that a labeled set containing examples of both classes is available in addition to unlabeled samples. A variety of techniques exist to adapt supervised classifiers to the PU learning setting, including variants of the EM algorithm. PU learning has been successfully applied to text,[4][5][6] time series,[7] and bioinformatics tasks.[8]
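
The PU setting can be sketched with a simplified two-step heuristic in Python. This is an illustrative assumption, not one of the cited methods and not an EM variant: the first step treats the unlabeled samples farthest from the positive centroid as reliable negatives, and the second step alternates between relabeling the unlabeled set with a nearest-centroid rule and refitting the two centroids. All function names, the `neg_fraction` parameter, and the synthetic data are hypothetical.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(dim)]


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pu_two_step(P, U, neg_fraction=0.3, n_iter=10):
    """Learn a binary classifier from a positive set P and an unlabeled set U."""
    pos_c = centroid(P)

    # Step 1: take the unlabeled samples farthest from the positive
    # centroid as "reliable negatives" (assumed fraction: neg_fraction).
    ranked = sorted(U, key=lambda u: dist(u, pos_c), reverse=True)
    k = max(1, int(neg_fraction * len(U)))
    neg_c = centroid(ranked[:k])

    # Step 2: alternate hard relabeling of U and refitting of centroids.
    for _ in range(n_iter):
        u_pos = [u for u in U if dist(u, pos_c) <= dist(u, neg_c)]
        u_neg = [u for u in U if dist(u, pos_c) > dist(u, neg_c)]
        if not u_neg:
            break  # Nothing left to treat as negative; keep last centroids.
        pos_c = centroid(P + u_pos)
        neg_c = centroid(u_neg)

    return lambda x: dist(x, pos_c) <= dist(x, neg_c)


# Hypothetical data: positives near (0, 0), hidden negatives near (6, 6).
random.seed(0)


def draw(center, n):
    return [
        (random.gauss(center, 1.0), random.gauss(center, 1.0)) for _ in range(n)
    ]


P = draw(0.0, 50)                  # labeled positive samples
U = draw(0.0, 50) + draw(6.0, 50)  # unlabeled mix of both classes
classify = pu_two_step(P, U)
```

On this well-separated synthetic data, `classify((0.0, 0.0))` returns `True` and `classify((6.0, 6.0))` returns `False`. Real PU problems are rarely this clean, and the choice of `neg_fraction` would need to reflect how many negatives the unlabeled set is believed to contain.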

References

  1. Tax, D. (2001) One-class classification: Concept-learning in the absence of counter-examples. Doctoral Dissertation, University of Delft, The Netherlands.
  2. Moya, M. and Hush, D. (1996). "Network constraints and multi-objective optimization for one-class classification". Neural Networks, 9(3):463–474. doi:10.1016/0893-6080(95)00120-4
  3. Liu, Bing (2007). Web Data Mining. Springer. pp. 165–178.
  4. Bing Liu, Wee Sun Lee, Philip S. Yu and Xiao-Li Li (2002). Partially supervised classification of text documents. ICML. pp. 8–12.
  5. Hwanjo Yu, Jiawei Han, Kevin Chen-Chuan Chang (2002). PEBL: positive example based learning for web page classification using SVM. ACM SIGKDD.
  6. Xiao-Li Li and Bing Liu (2003). Learning to classify text using positive and unlabeled data. IJCAI.
  7. Minh Nhut Nguyen, Xiao-Li Li, and See-Kiong Ng (2011). Positive Unlabeled Learning for Time Series Classification. IJCAI.
  8. Peng Yang, Xiao-Li Li, Jian-Ping Mei, Chee-Keong Kwoh and See-Kiong Ng (2012). Positive-Unlabeled Learning for Disease Gene Identification. Bioinformatics, Vol 28(20).


This article is issued from Wikipedia - version of the 4/16/2015. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.