Semantic-Based Evaluation Framework for Topic Models: Integrated Deep Learning and LLM Validation


Seog-Min Lee, Journal of Information Processing Systems Vol. 22, No. 1, pp. 34-48, Feb. 2026  

https://doi.org/10.3745/JIPS.04.0365
Keywords: BERT Embeddings, Contemporary Topic Models, Deep Learning, LLM-based Evaluation, Semantic Evaluation Metrics

Abstract

Topic modeling has evolved from statistical methods such as latent Dirichlet allocation (LDA) to neural hybrid models including BERTopic, which utilize bidirectional encoder representations from transformers (BERT) embeddings. However, traditional statistical evaluation metrics overlook the semantic richness of these neural representations, limiting model assessment capabilities. This paper introduces semantic-based evaluation metrics that leverage deep learning embeddings and validates them through both statistical comparison and large language model (LLM)-based assessment. This study evaluated three synthetic datasets with systematically varying topic overlap and one public dataset (20 Newsgroups). Analysis across 9,608 synthetic documents with 45 topics and a stratified sample of 1,000 documents from 20 Newsgroups shows that semantic metrics achieve improved discrimination compared to statistical baselines. Specifically, semantic coherence shows a 38.1% discriminative range versus 5.0% for statistical measures, representing a 7.62× improvement. Semantic distinctiveness achieves 1.57× higher discrimination than statistical methods. Semantic methods also maintain consistent discrimination quality for diversity metrics, with stable progression across similarity levels. LLM assessments, serving as proxies for human judgment, demonstrate inter-model agreement through a weighted three-model ensemble (mean pairwise Spearman ρ=0.937) and positive correlation with semantic metrics on public datasets (ρ=0.632–0.671). Domain-specific validation and multilingual extension constitute future work.
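The two headline figures in the abstract, the 7.62× ratio of discriminative ranges and the mean pairwise Spearman ρ across LLM judges, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the judge labels, and the toy scores below are hypothetical, and the sketch averages the pairwise correlations without weights, whereas the abstract describes a weighted three-model ensemble whose weights are not given here.

```python
from itertools import combinations
from statistics import mean


def improvement_ratio(semantic_range: float, statistical_range: float) -> float:
    """Ratio of two discriminative ranges (both in percentage points)."""
    return semantic_range / statistical_range


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mean_pairwise_spearman(scores_by_model):
    """Unweighted average of rho over every unordered pair of models."""
    pairs = combinations(scores_by_model.values(), 2)
    return mean(spearman(a, b) for a, b in pairs)


# Ranges quoted in the abstract: 38.1% (semantic) versus 5.0% (statistical).
print(round(improvement_ratio(38.1, 5.0), 2))

# Made-up scores from three hypothetical LLM judges over five topics.
toy_scores = {
    "judge_a": [0.9, 0.7, 0.4, 0.8, 0.2],
    "judge_b": [0.8, 0.6, 0.5, 0.9, 0.1],
    "judge_c": [0.95, 0.75, 0.3, 0.7, 0.25],
}
print(mean_pairwise_spearman(toy_scores))
```

With three judges there are three unordered pairs, so the reported agreement is the mean of three ρ values. The toy scores are for illustration only and do not reproduce the paper's ρ=0.937.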


Cite this article
[APA Style]
Lee, S. (2026). Semantic-Based Evaluation Framework for Topic Models: Integrated Deep Learning and LLM Validation. Journal of Information Processing Systems, 22(1), 34-48. DOI: 10.3745/JIPS.04.0365.

[IEEE Style]
S. Lee, "Semantic-Based Evaluation Framework for Topic Models: Integrated Deep Learning and LLM Validation," Journal of Information Processing Systems, vol. 22, no. 1, pp. 34-48, 2026. DOI: 10.3745/JIPS.04.0365.

[ACM Style]
Seog-Min Lee. 2026. Semantic-Based Evaluation Framework for Topic Models: Integrated Deep Learning and LLM Validation. Journal of Information Processing Systems, 22, 1, (2026), 34-48. DOI: 10.3745/JIPS.04.0365.