Word Similarity Calculation by Using the Edit Distance Metrics with Consonant Normalization


Seung-Shik Kang, Journal of Information Processing Systems Vol. 11, No. 4, pp. 573-582, Aug. 2015  

10.3745/JIPS.04.0018
Keywords: Consonant Normalization, Edit Distance, Korean Character, Normalization Factor

Abstract

Edit distance metrics are widely used in applications such as string comparison and spelling error correction. Hamming distance is a metric for two strings of equal length, and Damerau-Levenshtein distance is a well-known metric for spelling correction through string-to-string comparison. These metrics seem appropriate for alphabetic languages such as English and other European languages. However, the conventional edit distance criterion is not the best method for agglutinative languages like Korean. The reason is that two or more letter units make up a Korean character, which is called a syllable. This syllable-based word construction in the Korean language makes edit distance calculation inefficient. We have therefore explored a new edit distance method that uses consonant normalization and a normalization factor.
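The syllable issue described in the abstract can be illustrated with a minimal sketch. It assumes plain Levenshtein distance and Unicode NFD decomposition of Hangul syllables into letter units (jamo). It is not the consonant normalization or the normalization factor proposed in the paper. The function names and the length-based ratio in `normalized_distance` are hypothetical choices made for illustration only.

```python
import unicodedata


def levenshtein(a, b):
    """Classic Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, 1):
        curr = [i]
        for j, item_b in enumerate(b, 1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def to_jamo(word):
    """Split precomposed Hangul syllables into conjoining jamo via NFD.

    Assumption: NFD letter units stand in for the 'letter units' of the
    abstract; the paper's own decomposition may differ.
    """
    return unicodedata.normalize("NFD", word)


def normalized_distance(a, b):
    """Hypothetical normalization: distance divided by the longer length."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


if __name__ == "__main__":
    w1, w2 = "한국", "한굴"  # the two words differ only in one final consonant
    print(levenshtein(w1, w2), normalized_distance(w1, w2))
    print(levenshtein(to_jamo(w1), to_jamo(w2)),
          normalized_distance(to_jamo(w1), to_jamo(w2)))
```

For this word pair the raw distance is 1 at both levels, but it is measured against a length of 2 syllables in one case and 6 letter units in the other. A one-letter difference therefore looks much larger when whole syllables are compared.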


Cite this article
[APA Style]
Seung-Shik Kang (2015). Word Similarity Calculation by Using the Edit Distance Metrics with Consonant Normalization. Journal of Information Processing Systems, 11(4), 573-582. DOI: 10.3745/JIPS.04.0018.

[IEEE Style]
S. Kang, "Word Similarity Calculation by Using the Edit Distance Metrics with Consonant Normalization," Journal of Information Processing Systems, vol. 11, no. 4, pp. 573-582, 2015. DOI: 10.3745/JIPS.04.0018.

[ACM Style]
Seung-Shik Kang. 2015. Word Similarity Calculation by Using the Edit Distance Metrics with Consonant Normalization. Journal of Information Processing Systems, 11, 4, (2015), 573-582. DOI: 10.3745/JIPS.04.0018.