Different results and performances with different

I'm comparing the libraries dtaidistance, fastdtw and cdtw for DTW computations. This is my code:

from fastdtw import fastdtw
from cdtw import pydtw
import fastdtw
import array
from timeit import default_timer as timer
from dtaidistance import dtw, dtw_visualisation as dtwvis

s1 = mySampleSequences[0] # first sample sequence consisting of 3000 samples
s2 = mySampleSequences[1] # second sample sequence consisting of 3000 samples

start = timer()
distance1 = dtw.distance(s1, s2)
end = timer()
start2 = timer()
distance2 = dtw.distance_fast(array.array('d',s1),array.array('d',s2))
end2 = timer()
start3 = timer()
distance3, path3 = fastdtw(s1,s2)
end3 = timer()
start4 = timer()
distance4 = pydtw.dtw(s1,s2).get_dist()
end4 = timer()

print("dtw.distance(x,y) time: "+ str(end - start))
print("dtw.distance(x,y) distance: "+str(distance1))
print("dtw.distance_fast(x,y) time: "+ str(end2 - start2))
print("dtw.distance_fast(x,y) distance: " + str(distance2))
print("fastdtw(x,y) time: "+ str(end3 - start3))
print("fastdtw(x,y) distance: " + str(distance3))
print("pydtw.dtw(x,y) time: "+ str(end4 - start4))
print("pydtw.dtw(x,y) distance: " + str(distance4))

This is the output I get:

dtw.distance(x,y) time: 22.16925272245262
dtw.distance(x,y) distance: 1888.8583853746156
dtw.distance_fast(x,y) time: 0.3889036471839056
dtw.distance_fast(x,y) distance: 1888.8583853746156
fastdtw(x,y) time: 0.23296659641047412
fastdtw(x,y) distance: 27238.0
pydtw.dtw(x,y) time: 0.13706478039556558
pydtw.dtw(x,y) distance: 17330.0

My question is: Why do I get different performances and different distances? Thank you very much for your comments.

// edit: The unit of the time measurements is seconds.

标签： python dtw

1条回答

放我归山

2楼-- · 2019-03-01 05:29

Edit: what are the units of the time measurements? I believe that you compared them as they were all in the same unit. Probably the dtw.distance is, for example, in microseconds, while the other answers are in milliseconds, and you thought that dtw.distance performed slower, when it is actually the opposite.

There are different methodologies to measure the distance between two points. It could be based on standard deviation or just euclidian distance. Here is a list of many of those distance.

Some of them might be more computational intensive than others, and also have different meanings. Fast dtw, for example, uses as a third input the type of distance that you want, as described on their github

distance3, path3 = fastdtw(s1, s2, dist = euclidean)

Another reason for the speed difference is the underlying code. Some of them are in pure python, while others are in C, which can be easily 100x faster. A way to speed up your dtaidistance is to set a maximum distance threshold. The algorithm will stop the calculation if it realizes that the total distance will be above a certain value:

distance2 = dtw.distance_fast(array.array('d',s1),array.array('d',s2), max_dist = your_threshold)

It is also important to note that some might be optimized for longer or shorter arrays. Looking at the example below and running it in my computer, I find different results:

from cdtw import pydtw
from dtaidistance import dtw
from fastdtw import fastdtw
from scipy.spatial.distance import euclidean
s1=np.array([1,2,3,4],dtype=np.double)
s2=np.array([4,3,2,1],dtype=np.double)

%timeit dtw.distance_fast(s1, s2)
4.1 µs ± 28.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit d2 = pydtw.dtw(s1,s2,pydtw.Settings(step = 'p0sym', window = 'palival', param = 2.0, norm = False, compute_path = True)).get_dist()
45.6 µs ± 3.39 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit d3,_=fastdtw(s1, s2, dist=euclidean)
901 µs ± 9.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

fastdtw is 219 times slower than dtaidistance lib and 20x slower than cdtw

0人赞添加讨论(0) 举报

Different results and performances with different

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间