Time series distance metric

Posted 2020-06-15 05:23

I want to cluster a set of time series and am looking for a suitable distance metric. I have tried several well-known metrics, but none of them fits my case.

Example: assume that my clustering algorithm extracts these three centroids [s1, s2, s3]:

I want to assign this new example [sx] to the most similar cluster:

The most similar centroid is the second one, so I need a distance function d such that d(sx, s2) < d(sx, s1) and d(sx, s2) < d(sx, s3).

edit

Here are the results with the metrics [cosine, euclidean, minkowski, dynamic time warping]: [image]

edit 2

User Pietro P suggested applying the distances to the cumulated version of the time series. The solution works; here are the plots and the metrics:

Tags: time-series distance hierarchical-clustering dtw

2 answers

够拽才男人

#2 · 2020-06-15 06:08

Nice question! Using any standard distance on R^n (Euclidean, Manhattan, or more generally Minkowski) on those time series cannot achieve the result you want, because those metrics are invariant under permutations of the coordinates of R^n: applying the same reordering of time steps to both series leaves the distance unchanged. Time, however, is strictly ordered, and that ordering is exactly the phenomenon you want to capture.
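A small sketch may make the point concrete. The three series below are hypothetical (they are not the series from the question), and `minkowski` is a helper written here for illustration. Each series has a single spike; one centroid's spike is one step away from the new series' spike, the other's is three steps away, yet a coordinate-wise metric cannot tell them apart.

```python
import math

def minkowski(a, b, p=2):
    """Minkowski distance of order p between two equal-length sequences."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

# Hypothetical data: the same spike placed at different time steps.
x      = [0, 0, 5, 0, 0, 0]  # spike at t=2
s_near = [0, 0, 0, 5, 0, 0]  # spike at t=3, one step later
s_far  = [0, 0, 0, 0, 0, 5]  # spike at t=5, three steps later

# The metric only sees which coordinates differ, not how far apart
# in time they are, so both centroids come out equally distant.
d_near = minkowski(x, s_near)
d_far = minkowski(x, s_far)
print(math.isclose(d_near, d_far))
```

The same tie occurs for p=1 (Manhattan), since swapping the time steps 3 and 5 in every series turns one comparison into the other.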

A simple trick that does what you ask is to use the cumulated version of each time series (the running sum of its values over time) and then apply a standard metric. With the Manhattan metric, the distance between two time series is then the area between their cumulated versions.
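The cumulated-series trick could be sketched as follows. The function names `cumulative_manhattan` and `nearest_centroid` and the example series are assumptions made for illustration; the series are assumed to have equal length and to be sampled at the same time steps.

```python
from itertools import accumulate

def cumulative_manhattan(a, b):
    """Manhattan distance between the running sums of two series."""
    ca = list(accumulate(a))  # cumulated version of a
    cb = list(accumulate(b))  # cumulated version of b
    return sum(abs(p - q) for p, q in zip(ca, cb))

def nearest_centroid(series, centroids):
    """Index of the centroid with the smallest cumulative distance."""
    distances = [cumulative_manhattan(series, c) for c in centroids]
    return min(range(len(centroids)), key=distances.__getitem__)

# Hypothetical data: a single spike at different time steps.
sx = [0, 0, 5, 0, 0, 0]  # new example, spike at t=2
s1 = [0, 0, 0, 0, 0, 5]  # spike at t=5
s2 = [0, 0, 0, 5, 0, 0]  # spike at t=3 (closest in time)
s3 = [5, 0, 0, 0, 0, 0]  # spike at t=0

best = nearest_centroid(sx, [s1, s2, s3])
print(best)  # index into [s1, s2, s3]
```

On this toy data the distance grows with the time shift of the spike, so the centroid whose spike is closest in time is selected.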


聊天终结者

#3 · 2020-06-15 06:11

What about using the standard Pearson correlation coefficient? You can then assign the new point to the cluster whose centroid has the highest coefficient.

correlation, p_value = scipy.stats.pearsonr(<new time series>, <centroid>)
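A minimal sketch of that assignment rule, assuming equal-length, non-constant series. To stay dependency-free it computes the coefficient by hand in a helper named `pearson` (an illustrative name); `scipy.stats.pearsonr` returns the same coefficient as the first element of its result. The data is made up.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is 0).
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def assign_by_correlation(series, centroids):
    """Return (index of the most correlated centroid, all coefficients)."""
    scores = [pearson(series, c) for c in centroids]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Hypothetical data: one decreasing, one increasing, one uncorrelated centroid.
new_series = [1, 2, 3, 4, 5]
centroids = [
    [5, 4, 3, 2, 1],
    [2, 4, 6, 8, 11],
    [1, 3, 2, 3, 1],
]

best, scores = assign_by_correlation(new_series, centroids)
print(best, scores)
```

Note that correlation ignores offset and scale, so two series with the same shape but very different magnitudes are treated as identical; whether that is desirable depends on the clustering goal.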
