A Hierarchical Text Rating System for Objectionable Documents


Chi Yoon Jeong, Seung Wan Han, Taek Yong Nam, Journal of Information Processing Systems Vol. 1, No. 1, pp. 22-26, Dec. 2005  


Keywords: Objectionable documents, document analysis, text classification, hierarchical system, SVM

Abstract

In this paper, we classify objectionable texts into four ratings according to their harmfulness and propose a hierarchical text rating system for objectionable documents. Because documents in the same category share similar vocabulary, expressions, and document structure, a text rating system that relies on a single classification model has low accuracy. To solve this problem, we separate objectionable documents into several subsets according to their properties and then classify the subsets hierarchically. The proposed system consists of three layers. In each layer, we select features using the chi-square statistic; the feature weights, computed with the TF-IDF weighting scheme, are then used as the input to a non-linear SVM classifier. Because the hierarchical scheme uses different features and a different number of features in each layer, it can characterize the objectionability of documents more effectively, and we expect it to improve the performance of the rating system. We compared the proposed system with several text rating systems, and the experimental results show that the proposed system can achieve excellent classification performance.
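
The per-layer processing described in the abstract (chi-square feature selection followed by TF-IDF weighting) can be illustrated with a minimal sketch. It uses only the Python standard library and makes several assumptions that are not taken from the paper: the toy corpus, the function names (chi_square, select_features, tfidf_vector, rate), the plain tf * log(N/df) form of TF-IDF, and the cascade rule in which a layer either assigns a rating or passes the document on. The non-linear SVM is replaced here by simple stand-in decision functions, so this is not the authors' implementation.

```python
import math
from collections import Counter

def chi_square(docs, labels, term, positive):
    """Chi-square statistic of `term` against membership in class `positive`."""
    a = b = c = d = 0  # cells of the 2x2 term/class contingency table
    for tokens, label in zip(docs, labels):
        has_term, in_class = term in tokens, label == positive
        if has_term and in_class:
            a += 1
        elif has_term:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def select_features(docs, labels, positive, k):
    """Keep the k terms with the highest chi-square score (ties broken by name)."""
    vocab = {t for tokens in docs for t in tokens}
    ranked = sorted(vocab, key=lambda t: (-chi_square(docs, labels, t, positive), t))
    return ranked[:k]

def tfidf_vector(tokens, features, docs):
    """Weight each selected feature by term frequency * log(N / document frequency)."""
    counts = Counter(tokens)
    vec = []
    for t in features:
        df = sum(1 for doc in docs if t in doc)
        idf = math.log(len(docs) / df) if df else 0.0
        vec.append(counts[t] * idf)
    return vec

def rate(tokens, layers, docs, default):
    """Hypothetical cascade: each layer returns a rating or None to pass down."""
    for features, decide in layers:
        rating = decide(tfidf_vector(tokens, features, docs))
        if rating is not None:
            return rating
    return default

# Toy corpus (hypothetical), already tokenized.
docs = [
    ["explicit", "adult", "content"],
    ["adult", "explicit", "images"],
    ["weather", "report", "today"],
    ["sports", "report", "today"],
]
labels = ["objectionable", "objectionable", "normal", "normal"]

features = select_features(docs, labels, "objectionable", k=2)
vec = tfidf_vector(["adult", "adult", "explicit"], features, docs)

# Stand-in decision functions in place of trained SVM classifiers.
layers = [
    (features, lambda v: 0 if sum(v) == 0 else None),
    (features, lambda v: 1 if sum(v) < 1.0 else None),
]
harmless = rate(["weather", "today"], layers, docs, default=2)
harmful = rate(["adult", "adult", "explicit"], layers, docs, default=2)
```

In a fuller version, each layer would hold its own feature list and its own trained classifier, which is what allows the number and choice of features to differ from layer to layer as the abstract describes.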


Cite this article
[APA Style]
Jeong, C. Y., Han, S. W., & Nam, T. Y. (2005). A Hierarchical Text Rating System for Objectionable Documents. Journal of Information Processing Systems, 1(1), 22-26.

[IEEE Style]
C. Y. Jeong, S. W. Han, and T. Y. Nam, "A Hierarchical Text Rating System for Objectionable Documents," Journal of Information Processing Systems, vol. 1, no. 1, pp. 22-26, 2005.

[ACM Style]
Chi Yoon Jeong, Seung Wan Han, and Taek Yong Nam. 2005. A Hierarchical Text Rating System for Objectionable Documents. Journal of Information Processing Systems, 1, 1, (2005), 22-26.