Semi-Supervised Classification with Hybrid Generative/Discriminative Methods

Publication Date

2007

Journal or Book Title

KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING

Abstract

We compare two recently proposed frameworks for combining generative and discriminative probabilistic classifiers and apply them to semi-supervised classification. In both cases we explore the tradeoff between maximizing a discriminative likelihood of labeled data and a generative likelihood of labeled and unlabeled data. While prominent semi-supervised learning methods assume low density regions between classes or are subject to generative modeling assumptions, we conjecture that hybrid generative/discriminative methods allow semi-supervised learning in the presence of strongly overlapping classes and reduce the risk of modeling structure in the unlabeled data that is irrelevant for the specific classification task of interest. We apply both hybrid approaches within naively structured Markov random field models and provide a thorough empirical comparison with two well-known semi-supervised learning methods on six text classification tasks. A semi-supervised hybrid generative/discriminative method provides the best accuracy in 75% of the experiments, and the multi-conditional learning hybrid approach achieves the highest overall mean accuracy across all tasks.

DOI

https://doi.org/10.1145/1281192.1281225

Pages

280-289

This document is currently not available here.

Share

COinS