An Active Co-Training Algorithm for Biomedical Named-Entity Recognition

Tsendsuren Munkhdalai, Meijing Li, Unil Yun, Oyun-Erdene Namsrai and Keun Ho Ryu
Volume: 8, No: 4, Page: 575 ~ 588, Year: 2012
10.3745/JIPS.2012.8.4.575
Keywords: Biomedical Named-Entity Recognition, Co-Training, Semi-Supervised Learning, Feature Processing, Text Mining

Abstract
Exploiting unlabeled text data with a relatively small labeled corpus has been an active and challenging research topic in text mining, due to the recent growth of the amount of biomedical literature. Biomedical named-entity recognition is an essential prerequisite task before effective text mining of biomedical literature can begin. This paper proposes an Active Co-Training (ACT) algorithm for biomedical named-entity recognition. ACT is a semi-supervised learning method in which two classifiers based on two different feature sets iteratively learn from informative examples that have been queried from the unlabeled data. We design a new classification problem to measure the informativeness of an example in unlabeled data. In this classification problem, the examples are classified based on a joint view of a feature set to be informative/non-informative to both classifiers. To form the training data for the classification problem, we adopt a query-by-committee method. Therefore, in the ACT, both classifiers are considered to be one committee, which is used on the labeled data to give the informativeness label to each example. The ACT method outperforms the traditional co-training algorithm in terms of f-measure as well as the number of training iterations performed to build a good classification model. The proposed method tends to efficiently exploit a large amount of unlabeled data by selecting a small number of examples having not only useful information but also a comprehensive pattern.
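
The selection step described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation. The abstract does not state the exact committee criterion, so the sketch assumes an example is informative whenever the two classifiers do not both agree with the gold label. The lookup classifier, the function names (`train_lookup`, `committee_labels`, `select_informative`), and the toy features are all hypothetical stand-ins chosen to keep the example self-contained. The sketch stops at querying unlabeled examples and does not cover how the queried examples are then labeled and added to the training data.

```python
from collections import Counter, defaultdict


def train_lookup(pairs):
    """Toy classifier (assumption): predict the most frequent label seen
    for a feature value, falling back to the overall majority label."""
    by_feature = defaultdict(Counter)
    overall = Counter()
    for feature, label in pairs:
        by_feature[feature][label] += 1
        overall[label] += 1
    fallback = overall.most_common(1)[0][0]

    def predict(feature):
        counts = by_feature.get(feature)
        return counts.most_common(1)[0][0] if counts else fallback

    return predict


def committee_labels(pred_a, pred_b, gold):
    """Query-by-committee labeling on labeled data.

    Assumed criterion: 1 (informative) unless both committee members
    agree with each other and with the gold label, else 0."""
    return [int(not (a == b == g)) for a, b, g in zip(pred_a, pred_b, gold)]


def select_informative(labeled, unlabeled):
    """labeled: list of (view_a, view_b, label); unlabeled: list of (view_a, view_b)."""
    # Two classifiers, each trained on its own feature view.
    clf_a = train_lookup([(a, y) for a, b, y in labeled])
    clf_b = train_lookup([(b, y) for a, b, y in labeled])

    # The two classifiers act as one committee on the labeled data.
    pred_a = [clf_a(a) for a, b, y in labeled]
    pred_b = [clf_b(b) for a, b, y in labeled]
    gold = [y for a, b, y in labeled]
    info = committee_labels(pred_a, pred_b, gold)

    # Informativeness classifier trained on the joint view (view_a, view_b).
    selector = train_lookup([((a, b), i) for (a, b, y), i in zip(labeled, info)])

    # Query only the unlabeled examples predicted to be informative.
    return [x for x in unlabeled if selector(x) == 1]


LABELED = [
    ("IL-2", "has-digit", "PROTEIN"),
    ("cell", "lowercase", "O"),
    ("kinase", "lowercase", "PROTEIN"),
    ("the", "lowercase", "O"),
    ("binds", "lowercase", "O"),
]
UNLABELED = [("kinase", "lowercase"), ("cell", "lowercase"), ("receptor", "lowercase")]

print(select_informative(LABELED, UNLABELED))
```

In this toy data the second view mislabels "kinase" as "O", so the committee marks that labeled example as informative and the selector queries the matching unlabeled example. A realistic system would replace the lookup tables with sequence classifiers over richer feature sets.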

Cite this article
IEEE Style
Tsendsuren Munkhdalai, Meijing Li, Unil Yun, Oyun-Erdene Namsrai and Keun Ho Ryu, "An Active Co-Training Algorithm for Biomedical Named-Entity Recognition," Journal of Information Processing Systems, vol. 8, no. 4, pp. 575~588, 2012. DOI: 10.3745/JIPS.2012.8.4.575.

ACM Style
Tsendsuren Munkhdalai, Meijing Li, Unil Yun, Oyun-Erdene Namsrai and Keun Ho Ryu, "An Active Co-Training Algorithm for Biomedical Named-Entity Recognition," Journal of Information Processing Systems, 8, 4, (2012), 575~588. DOI: 10.3745/JIPS.2012.8.4.575.