SVD-LDA: A Combined Model for Text Classification


Nguyen Cao Truong Hai, Kyung-Im Kim, Hyuk-Ro Park, Journal of Information Processing Systems Vol. 5, No. 1, pp. 5-10, Mar. 2009  

DOI: 10.3745/JIPS.2009.5.1.005
Keywords: Latent Dirichlet Allocation, Singular Value Decomposition, Input Filtering, Text Classification, Data Preprocessing.

Abstract

Text data has always accounted for a major portion of the world's information. As the volume of information increases exponentially, the amount of text data also increases significantly. Text classification therefore remains an important area of research. Latent Dirichlet Allocation (LDA) is a recent probabilistic model that has been used in many applications across many fields. For text data, LDA also has many applications, to which various enhancements have been applied. However, it seems that none of these applications pays attention to the input given to LDA. In this paper, we suggest a way to map the input space to a reduced space, which may avoid the unreliability, ambiguity, and redundancy of individual terms as descriptors. The purpose of this paper is to show that LDA can be applied effectively in a "clean and clear" space. Experiments are conducted on the 20 Newsgroups data sets. The results show that the proposed method can improve classification results when an appropriate rank is chosen for the reduced space.
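The mapping to a reduced space described above can be illustrated with a truncated singular value decomposition. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the reduced representation is passed to LDA, so only the SVD step is shown. The toy term-document matrix `counts`, the function name `rank_k_approximation`, and the chosen rank are all hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def rank_k_approximation(term_doc, k):
    """Return the rank-k truncated-SVD approximation of a term-document matrix.

    Hypothetical helper for illustration only; the paper's exact
    preprocessing is not given in the abstract.
    """
    # Thin SVD: term_doc = u @ diag(s) @ vt, singular values in descending order.
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep only the k largest singular values and their singular vectors.
    return u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]


# Toy term-document matrix (assumed data): 5 terms (rows) x 4 documents (columns).
counts = np.array(
    [
        [2, 1, 0, 0],
        [1, 2, 0, 0],
        [0, 0, 3, 1],
        [0, 0, 1, 2],
        [1, 0, 0, 1],
    ],
    dtype=float,
)

# Project the input onto a rank-2 space before any further modeling.
approx = rank_k_approximation(counts, k=2)

# Reconstruction error (Frobenius norm) for each candidate rank.
errors = [
    np.linalg.norm(counts - rank_k_approximation(counts, k))
    for k in range(1, 5)
]
```

The reconstruction error shrinks as the rank grows, which is one way to see why the choice of rank matters: too small a rank discards useful structure, while too large a rank keeps the noise the reduction was meant to remove.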






Cite this article
[APA Style]
Hai, N., Kim, K., & Park, H. (2009). SVD-LDA: A Combined Model for Text Classification. Journal of Information Processing Systems, 5(1), 5-10. DOI: 10.3745/JIPS.2009.5.1.005.

[IEEE Style]
N. C. T. Hai, K. Kim, H. Park, "SVD-LDA: A Combined Model for Text Classification," Journal of Information Processing Systems, vol. 5, no. 1, pp. 5-10, 2009. DOI: 10.3745/JIPS.2009.5.1.005.

[ACM Style]
Nguyen Cao Truong Hai, Kyung-Im Kim, and Hyuk-Ro Park. 2009. SVD-LDA: A Combined Model for Text Classification. Journal of Information Processing Systems, 5, 1, (2009), 5-10. DOI: 10.3745/JIPS.2009.5.1.005.