Improved Character-Based Neural Network for POS Tagging on Morphologically Rich Languages


Samat Ali, Alim Murat, Journal of Information Processing Systems Vol. 19, No. 3, pp. 355-369, Jun. 2023  

DOI: 10.3745/JIPS.02.0197
Keywords: Character Representation, Deep Neural Network, Morphologically Rich Language, POS Tagging

Abstract

Since the widespread adoption of deep learning and distributed representations, part-of-speech (POS) tagging has advanced substantially for many languages. Word representations are typically trained to capture the syntactic and semantic aspects of words, so morphology and word shape are largely ignored. However, for tasks such as POS tagging, particularly in morphologically rich and resource-limited languages, intra-word information is essential. In this study, we introduce a deep neural network (DNN) for POS tagging that learns character-level word representations and combines them with general word representations. Using the proposed approach without hand-crafted features, we achieve 90.47%, 80.16%, and 79.32% accuracy on our own dataset for three morphologically rich languages: Uyghur, Uzbek, and Kyrgyz. The experimental results show that the character-based strategy greatly improves POS tagging performance for several morphologically rich languages (MRLs) where character information is significant. Furthermore, on the METU Turkish Treebank dataset, the proposed approach slightly improves on the previously reported state-of-the-art POS tagging results for Turkish. Overall, the results indicate that character-based representations outperform word-level representations for POS tagging on MRLs. Our technique is also robust to out-of-vocabulary words and performs better on manually edited text.
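
The idea of combining a character-level word representation with a general word representation can be illustrated with a small sketch. The abstract does not say how the character-level representation is computed, so the version below assumes a convolution over character embeddings followed by max pooling, which is one common choice and not necessarily the authors' architecture. All names, dimensions, and the randomly initialised (untrained) vectors are hypothetical and serve only to show the data flow.

```python
import random

random.seed(0)  # fixed seed so the illustrative vectors are reproducible

# All sizes below are hypothetical; the abstract does not report them.
CHAR_DIM = 4     # size of each character embedding
WORD_DIM = 6     # size of each word embedding
NUM_FILTERS = 5  # number of filters = size of the character-level vector
WINDOW = 3       # number of characters each filter sees at once


def random_vector(size):
    return [random.uniform(-0.1, 0.1) for _ in range(size)]


class EmbeddingTable:
    """Assigns a random vector to each new key; a real model would learn these."""

    def __init__(self, dim):
        self.dim = dim
        self.table = {}

    def lookup(self, key):
        if key not in self.table:
            self.table[key] = random_vector(self.dim)
        return self.table[key]


char_embeddings = EmbeddingTable(CHAR_DIM)
word_embeddings = EmbeddingTable(WORD_DIM)
UNK_WORD = word_embeddings.lookup("<unk>")  # shared vector for unseen words
PAD_CHAR = [0.0] * CHAR_DIM
filters = [random_vector(WINDOW * CHAR_DIM) for _ in range(NUM_FILTERS)]


def char_level_representation(word):
    """Convolve over character windows, then max-pool (assumes a non-empty word)."""
    pad = WINDOW // 2
    vectors = ([PAD_CHAR] * pad
               + [char_embeddings.lookup(c) for c in word]
               + [PAD_CHAR] * pad)
    pooled = [float("-inf")] * NUM_FILTERS
    for start in range(len(vectors) - WINDOW + 1):
        # Flatten the embeddings of WINDOW consecutive characters.
        window = [x for vec in vectors[start:start + WINDOW] for x in vec]
        for f, weights in enumerate(filters):
            activation = sum(w * x for w, x in zip(weights, window))
            pooled[f] = max(pooled[f], activation)
    return pooled


def word_representation(word, vocabulary):
    """Concatenate the word-level vector with the character-level vector."""
    word_vec = word_embeddings.lookup(word) if word in vocabulary else UNK_WORD
    return word_vec + char_level_representation(word)


# Turkish example: "kitap" (book) is in the vocabulary, "kitaplar" (books) is not.
vocabulary = {"kitap"}
seen = word_representation("kitap", vocabulary)
unseen = word_representation("kitaplar", vocabulary)
print(len(seen), len(unseen))
```

In this sketch an out-of-vocabulary word falls back to the shared `<unk>` word vector but still receives its own character-level vector, which is the intuition behind the robustness to out-of-vocabulary words mentioned above. In a full tagger, the concatenated vectors would be passed to further network layers that predict a tag for each word.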


Cite this article
[APA Style]
Ali, S. & Murat, A. (2023). Improved Character-Based Neural Network for POS Tagging on Morphologically Rich Languages. Journal of Information Processing Systems, 19(3), 355-369. DOI: 10.3745/JIPS.02.0197.

[IEEE Style]
S. Ali and A. Murat, "Improved Character-Based Neural Network for POS Tagging on Morphologically Rich Languages," Journal of Information Processing Systems, vol. 19, no. 3, pp. 355-369, 2023. DOI: 10.3745/JIPS.02.0197.

[ACM Style]
Samat Ali and Alim Murat. 2023. Improved Character-Based Neural Network for POS Tagging on Morphologically Rich Languages. Journal of Information Processing Systems, 19, 3, (2023), 355-369. DOI: 10.3745/JIPS.02.0197.