Korean Phoneme Boundary Detection Based on Time Domain Metrics


Jae Won Lee, Journal of Information Processing Systems Vol. 20, No. 6, pp. 746-757, Dec. 2024  

https://doi.org/10.3745/JIPS.04.0325
Keywords: Bulk Metrics, Phoneme Boundary Detection, Speech Recognition, Volatility Metric

Abstract

This paper proposes a novel Korean phoneme boundary detection method that can be applied to phoneme-based Korean speech recognition systems. The proposed method uses two kinds of time-domain metrics, the volatility metric and the bulk metrics, as the foundation for phoneme boundary detection. The input speech signal is divided into blocks of 300 integer-valued samples. For each block, the volatility metric is computed by summing the changes between neighboring samples within the block. A bulk is a group of consecutive samples with the same sign. For each bulk, two bulk metrics are calculated: bulk size and bulk length. Three dedicated algorithms that use both types of metrics detect phoneme boundaries by recognizing vowels, voiced consonants, and voiceless consonants in turn. The experimental results show that the proposed method can significantly reduce the error rate compared to an existing boundary detection method.
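
The two metrics described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several details are assumptions because the abstract does not spell them out: a "change" between neighboring samples is taken as the absolute difference, bulk size is taken as the sum of absolute sample values in the bulk, bulk length is taken as the number of samples in the bulk, and a zero sample is treated as non-negative. The function names (`split_blocks`, `volatility`, `bulks`) are hypothetical.

```python
from typing import List, Tuple

BLOCK_SIZE = 300  # block length stated in the abstract


def split_blocks(signal: List[int], block_size: int = BLOCK_SIZE) -> List[List[int]]:
    """Cut the signal into consecutive blocks; the last block may be shorter."""
    return [signal[i:i + block_size] for i in range(0, len(signal), block_size)]


def volatility(block: List[int]) -> int:
    """Sum of changes between neighboring samples in one block.

    Assumption: a "change" is the absolute difference of adjacent samples.
    """
    return sum(abs(b - a) for a, b in zip(block, block[1:]))


def bulks(samples: List[int]) -> List[Tuple[int, int]]:
    """Return (bulk_size, bulk_length) for each run of same-sign samples.

    Assumptions: zero counts as non-negative, bulk size is the sum of
    absolute sample values, and bulk length is the sample count.
    """
    result: List[Tuple[int, int]] = []
    run_sign = None  # sign of the bulk currently being accumulated
    size = 0
    length = 0
    for s in samples:
        sign = s >= 0
        if run_sign is not None and sign != run_sign:
            # The sign flipped, so the current bulk ends here.
            result.append((size, length))
            size = 0
            length = 0
        run_sign = sign
        size += abs(s)
        length += 1
    if length:
        result.append((size, length))
    return result
```

For example, `volatility([1, 3, 2])` sums the changes 2 and 1, and `bulks([1, 2, -3, -1, 4])` yields three bulks, one per sign run. How these values feed the three detection algorithms for vowels, voiced consonants, and voiceless consonants is described in the full paper and is not sketched here.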


Cite this article
[APA Style]
Lee, J. W. (2024). Korean Phoneme Boundary Detection Based on Time Domain Metrics. Journal of Information Processing Systems, 20(6), 746-757. DOI: 10.3745/JIPS.04.0325.

[IEEE Style]
J. W. Lee, "Korean Phoneme Boundary Detection Based on Time Domain Metrics," Journal of Information Processing Systems, vol. 20, no. 6, pp. 746-757, 2024. DOI: 10.3745/JIPS.04.0325.

[ACM Style]
Jae Won Lee. 2024. Korean Phoneme Boundary Detection Based on Time Domain Metrics. Journal of Information Processing Systems, 20, 6, (2024), 746-757. DOI: 10.3745/JIPS.04.0325.