Impact of Instance Selection on kNN-Based Text Categorization


Fatiha Barigou, Journal of Information Processing Systems
Vol. 14, No. 2, pp. 418-434, Apr. 2018
10.3745/JIPS.02.0080
Keywords: Classification Accuracy, Classification Efficiency, Data Reduction, Instance Selection, k-Nearest Neighbors, Text Categorization
Abstract

With the increasing use of the Internet and electronic documents, automatic text categorization has become imperative. Several machine learning algorithms have been proposed for text categorization. The k-nearest neighbor algorithm (kNN) is known to be one of the best state-of-the-art classifiers for this task. However, kNN suffers from limitations such as a high computational cost when classifying new instances, since each new instance must be compared against the stored training instances. Instance selection techniques have emerged as highly competitive methods for improving kNN through data reduction. However, previous work has evaluated those approaches only on structured datasets; their performance has not been examined in the text categorization domain, where both the dimensionality and the size of the dataset are very high. Motivated by these observations, this paper investigates and analyzes the impact of instance selection on kNN-based text categorization in terms of classification accuracy, classification efficiency, and data reduction.
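To make the idea of instance selection for kNN concrete, the following is a minimal sketch and not the paper's method: the abstract does not name the selection algorithms it evaluates, so Hart's condensed nearest neighbor rule is used here only as one well-known example. The toy documents, the bag-of-words weighting, the cosine similarity, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, query, k=1):
    """Majority vote among the k training instances most similar to query."""
    ranked = sorted(train, key=lambda item: cosine(item[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def condense(train):
    """Hart's condensed nearest neighbor rule (illustrative choice).

    Start with one instance and repeatedly add every instance that the
    current subset misclassifies with 1-NN, until a full pass adds nothing.
    """
    kept = [0]
    changed = True
    while changed:
        changed = False
        for i, (vec, label) in enumerate(train):
            if i in kept:
                continue
            subset = [train[j] for j in kept]
            if knn_predict(subset, vec, k=1) != label:
                kept.append(i)
                changed = True
    return [train[j] for j in kept]


# Hypothetical toy corpus, invented for this sketch.
docs = [
    ("the match ended with a late goal", "sport"),
    ("the team won the match", "sport"),
    ("a goal decided the final match", "sport"),
    ("the bank raised interest rates", "finance"),
    ("interest rates hit the stock market", "finance"),
    ("the stock market fell as rates rose", "finance"),
]
train = [(vectorize(text), label) for text, label in docs]
reduced = condense(train)

print(len(train), "training instances reduced to", len(reduced))
print(knn_predict(reduced, vectorize("rates and the stock market"), k=1))
```

Classifying against `reduced` instead of `train` requires fewer similarity computations per query, which is the efficiency gain instance selection aims for. Whether accuracy is preserved on real, high-dimensional text collections is the question the paper studies.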


Cite this article
[APA Style]
Fatiha Barigou (2018). Impact of Instance Selection on kNN-Based Text Categorization. Journal of Information Processing Systems, 14(2), 418-434. DOI: 10.3745/JIPS.02.0080.

[IEEE Style]
F. Barigou, "Impact of Instance Selection on kNN-Based Text Categorization," Journal of Information Processing Systems, vol. 14, no. 2, pp. 418-434, 2018. DOI: 10.3745/JIPS.02.0080.

[ACM Style]
Fatiha Barigou. 2018. Impact of Instance Selection on kNN-Based Text Categorization. Journal of Information Processing Systems, 14, 2, (2018), 418-434. DOI: 10.3745/JIPS.02.0080.