Chinese Long-Text Classification Strategy Based on Fusion Features


Li Lin, Yidan Wang, Yan Wang, Changhua Tang, Journal of Information Processing Systems Vol. 20, No. 6, pp. 758-766, Dec. 2024  

https://doi.org/10.3745/JIPS.02.0221
Keywords: Attention Mechanism, BERT Model, Fused Features, Neural Network
Abstract

In Chinese long-text classification, the large volume of text and its complex features mean that methods suited to ordinary text classification often lack sufficient accuracy, leading to frequent classification failures in long-text settings. To address this problem, this study designed a bi-directional long short-term memory (Bi-LSTM) model that combines forward and backward passes and uses an attention mechanism to improve feature fusion. In addition, the bidirectional encoder representations from transformers (BERT) model was introduced into the text-processing stage to form a long-text classification model. Finally, the model was tested on different datasets to verify its actual classification performance. The results showed that, on the two datasets, the designed model achieved classification accuracies of 92.93% and 93.77%, respectively, the highest among models of the same type, with computation times of 85.42 seconds and 117.51 seconds, respectively, the shortest among models of the same type. The proposed model thus innovatively combines the BERT model, a convolutional neural network, a Bi-LSTM model, and an attention mechanism according to the data characteristics of long-text classification, achieving higher classification accuracy in less computation time. It also delivers better results in practical long-text classification, overcomes the classification failures caused by complex text features in long texts, and offers a viable path toward dedicated long-text classification.
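The core fusion step described above (attention applied over concatenated forward and backward Bi-LSTM hidden states to produce a single document vector) can be sketched as follows. This is a minimal illustration in NumPy, not the authors' implementation: the toy dimensions, the random stand-ins for BERT-derived token features, and the additive-attention scoring vector `w` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, hidden = 8, 4                       # toy sequence length and per-direction hidden size

# Stand-ins for token features (e.g., BERT outputs) after a Bi-LSTM pass:
# forward and backward hidden states are concatenated per token.
h_fwd = rng.normal(size=(seq_len, hidden))
h_bwd = rng.normal(size=(seq_len, hidden))
H = np.concatenate([h_fwd, h_bwd], axis=1)   # (seq_len, 2*hidden)

# Attention fusion: score each token, normalize with softmax,
# then take the weighted sum as the fused document representation.
w = rng.normal(size=(2 * hidden,))           # hypothetical learned attention vector
scores = H @ w                               # one scalar score per token
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                         # attention weights sum to 1
doc_vec = alpha @ H                          # (2*hidden,) fused vector for the classifier head
```

In a trained model, `w` (and the Bi-LSTM itself) would be learned end-to-end, and `doc_vec` would feed a softmax classification layer; the sketch only shows how attention weights turn a variable-length sequence into a fixed-size feature vector.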


