NumPy: vectorize sum of distances to a set of points

Posted 2019-07-25 00:10

I'm trying to implement a k-medoids clustering algorithm in Python/NumPy. As part of this algorithm, I have to compute the sum of distances from objects to their "medoids" (cluster representatives).

I have: a distance matrix on five points

n_samples = 5
D = np.array([[ 0.        ,  3.04959014,  4.74341649,  3.72424489,  6.70298441],
              [ 3.04959014,  0.        ,  5.38516481,  4.52216762,  6.16846821],
              [ 4.74341649,  5.38516481,  0.        ,  1.02469508,  8.23711114],
              [ 3.72424489,  4.52216762,  1.02469508,  0.        ,  7.69025357],
              [ 6.70298441,  6.16846821,  8.23711114,  7.69025357,  0.        ]])

a set of initial medoids

medoids = np.array([0, 3])

and the cluster memberships

cl = np.array([0, 0, 1, 1, 0])

I can compute the required sum using

>>> sum(D[i, medoids[cl[i]]] for i in range(n_samples))
10.777269622938899

but that uses a Python loop. Am I missing some kind of vectorized idiom for computing this sum?

Tags: python numpy vectorization

1 answer

孤傲高冷的网名

#2 · 2019-07-25 01:02

How about:

In [17]: D[np.arange(n_samples),medoids[cl]].sum()
Out[17]: 10.777269629999999
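
To spell out why that one-liner works, here is a minimal sketch using the data from the question. The intermediate names `medoid_of_sample`, `per_sample`, `total` and `loop_total` are mine for illustration, not from the original post.

```python
import numpy as np

n_samples = 5
D = np.array([[0.        , 3.04959014, 4.74341649, 3.72424489, 6.70298441],
              [3.04959014, 0.        , 5.38516481, 4.52216762, 6.16846821],
              [4.74341649, 5.38516481, 0.        , 1.02469508, 8.23711114],
              [3.72424489, 4.52216762, 1.02469508, 0.        , 7.69025357],
              [6.70298441, 6.16846821, 8.23711114, 7.69025357, 0.        ]])
medoids = np.array([0, 3])
cl = np.array([0, 0, 1, 1, 0])

# Indexing medoids with the label array maps each sample's cluster label
# to the index of that cluster's medoid: [0, 0, 3, 3, 0].
medoid_of_sample = medoids[cl]

# Two integer index arrays of equal length are paired element-wise, so this
# picks D[0, 0], D[1, 0], D[2, 3], D[3, 3], D[4, 0]: one distance per sample.
per_sample = D[np.arange(n_samples), medoid_of_sample]
total = per_sample.sum()

# Cross-check against the explicit Python loop from the question.
loop_total = sum(D[i, medoids[cl[i]]] for i in range(n_samples))
assert np.isclose(total, loop_total)
```

Note that `D[np.arange(n_samples), medoids[cl]]` returns a 1-D array of length `n_samples`, not an `n_samples` by `n_samples` block, because NumPy broadcasts the two index arrays against each other rather than taking their outer product.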
