Common Dissimilarity Measures are Inappropriate for Time Series Clustering

Cássio Martini Martins Pereira, Rodrigo F. de Mello

Abstract


Clustering algorithms have been actively used to identify similar time series, providing a better understanding of data. However, common clustering dissimilarity measures disregard time series correlations, yielding poor results. In this paper, we introduce a dissimilarity measure based on series partial autocorrelations. Experiments compare hierarchical clustering algorithms using the common dissimilarity measures, such as Euclidean Distance and Dynamic Time Warping, to cluster time series following Box-Jenkins Auto-Regressive models. Results show that our dissimilarity measure produces better results for both synthetic and real data sets in terms of the Adjusted Rand Index and Normalized Hubert Γ statistic. Our findings confirm that the choice of dissimilarity measure is crucial for improving time series clustering quality.
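
To make the idea of a partial-autocorrelation-based dissimilarity concrete, the following is a minimal sketch, not the authors' definition. The abstract does not state the exact formula, so the sketch assumes the dissimilarity is the Euclidean distance between the first few sample partial autocorrelation coefficients of two series, estimated with the Durbin-Levinson recursion. The function names (sample_acf, sample_pacf, pacf_dissimilarity, simulate_ar1) and the default of 10 lags are illustrative assumptions.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Sample autocorrelations rho_0..rho_max_lag of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


def sample_pacf(x, max_lag):
    """Sample partial autocorrelations at lags 1..max_lag (Durbin-Levinson)."""
    rho = sample_acf(x, max_lag)
    pacf = np.zeros(max_lag)
    phi_prev = np.zeros(0)  # AR coefficients of the order-(k-1) fit
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = np.array([phi_kk])
        else:
            num = rho[k] - np.dot(phi_prev, rho[k - 1:0:-1])
            den = 1.0 - np.dot(phi_prev, rho[1:k])
            phi_kk = num / den
            phi = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k - 1] = phi_kk
        phi_prev = phi
    return pacf


def pacf_dissimilarity(x, y, max_lag=10):
    """ASSUMPTION: Euclidean distance between sample PACF vectors.

    This is an illustrative choice, not the measure defined in the paper.
    """
    return float(np.linalg.norm(sample_pacf(x, max_lag) - sample_pacf(y, max_lag)))


def simulate_ar1(phi, n, seed):
    """Hypothetical helper: simulate x[t] = phi * x[t-1] + e[t], e ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x


if __name__ == "__main__":
    a1 = simulate_ar1(0.8, 2000, seed=1)
    a2 = simulate_ar1(0.8, 2000, seed=2)   # same model, different noise
    b = simulate_ar1(-0.8, 2000, seed=3)   # different model
    print("same model:     ", pacf_dissimilarity(a1, a2))
    print("different model:", pacf_dissimilarity(a1, b))
```

Under these assumptions, two realizations of the same Auto-Regressive model should be closer to each other than to a realization of a different model, even though their pointwise values differ. A pairwise matrix of such distances could then be passed to a hierarchical clustering routine.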



DOI: https://doi.org/10.22456/2175-2745.25070

Copyright (c) 2018 Cássio Martini Martins Pereira, Rodrigo F. de Mello

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
