A Novel Similarity Measure for Sequence Data


Mohammad. H. P, i, Omid Kashefi, Behrouz Minaei, Journal of Information Processing Systems Vol. 7, No. 3, pp. 413-424, Sep. 2011  

10.3745/JIPS.2011.7.3.413
Keywords: Sequence Data, Similarity Measure, Sequence Mining

Abstract

A variety of metrics have been introduced to measure the similarity of two given sequences. These metrics are widely used in applications ranging from spell correctors and categorizers to new sequence mining applications. Different metrics consider different aspects of sequences, but the essence of any sequence lies in the ordering of its elements. In this paper, we propose a novel sequence similarity measure based on all ordered pairs of one sequence and a Hasse diagram built from the other sequence. In contrast with existing approaches, the idea behind the proposed metric is to extract all ordering features in order to capture sequence properties. We designed a clustering problem to evaluate the metric. Experimental results showed that it achieves higher clustering purity than metrics such as d2, Smith-Waterman, Levenshtein, and Needleman-Wunsch. The limitation of those methods originates from sequence features that they neglect and that our proposed metric takes into account.
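To make the idea of ordering features concrete, the following Python sketch compares two sequences by the ordered pairs of elements they share, and computes clustering purity as it is commonly defined. It is an illustration only, not the measure proposed in the paper. The abstract does not describe the Hasse diagram construction, so the Jaccard overlap used here is an assumed stand-in, and the names `ordered_pairs`, `ordered_pair_similarity`, and `purity` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def ordered_pairs(seq):
    """Set of pairs (a, b) such that a occurs before b somewhere in seq."""
    # combinations() preserves positional order, so each pair is (earlier, later).
    return set(combinations(seq, 2))


def ordered_pair_similarity(s, t):
    """Jaccard overlap of the two ordered-pair sets.

    Assumption: a simple stand-in for an order-based measure; it does not
    build the Hasse diagram mentioned in the abstract.
    """
    ps, pt = ordered_pairs(s), ordered_pairs(t)
    if not ps and not pt:
        return 1.0  # two sequences with no pairs are treated as identical
    return len(ps & pt) / len(ps | pt)


def purity(cluster_labels, true_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    per_cluster = {}
    for cluster, label in zip(cluster_labels, true_labels):
        per_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(c.values()) for c in per_cluster.values())
    return majority_total / len(true_labels)


if __name__ == "__main__":
    print(ordered_pair_similarity("abc", "acb"))  # shares 2 of 4 distinct pairs
    print(ordered_pair_similarity("abc", "cba"))  # same elements, reversed order
    print(purity([0, 0, 1, 1], ["x", "x", "x", "y"]))
```

In this sketch, "abc" and "cba" contain the same elements but share no ordered pair, so they score 0. That is the kind of ordering information a purely element-based comparison would miss.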


Cite this article
[APA Style]
P, M., , Kashefi, O., & Minaei, B. (2011). A Novel Similarity Measure for Sequence Data. Journal of Information Processing Systems, 7(3), 413-424. DOI: 10.3745/JIPS.2011.7.3.413.

[IEEE Style]
M. H. P, i, O. Kashefi, B. Minaei, "A Novel Similarity Measure for Sequence Data," Journal of Information Processing Systems, vol. 7, no. 3, pp. 413-424, 2011. DOI: 10.3745/JIPS.2011.7.3.413.

[ACM Style]
Mohammad. H. P, i, Omid Kashefi, and Behrouz Minaei. 2011. A Novel Similarity Measure for Sequence Data. Journal of Information Processing Systems, 7, 3, (2011), 413-424. DOI: 10.3745/JIPS.2011.7.3.413.