How to bin a matrix

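numpy.histogram(data, bins) is a very fast and efficient way to count how many elements of the array data fall into each of the bins delimited by the array bins. Is there an equivalent function for the following problem? I have a matrix with R rows and C columns. I want to bin each row of the matrix using the definition given by bins. The result should be another matrix with R rows and a number of columns equal to the number of bins.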

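I tried calling numpy.histogram(data, bins) with the matrix as input, but I found that the matrix is treated as a flat array with R*C elements. The result is then a single array with Nbins elements.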
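
To make the difference concrete, here is a small sketch of the behaviour described above next to the result being asked for. The matrix and bin edges are made up for illustration, and the plain Python loop is only a reference for the desired output, not a fast solution.

```python
import numpy as np

# Hypothetical 2 x 4 matrix and bin edges, chosen only for illustration.
data = np.array([[0.1, 0.2, 0.3, 0.9],
                 [0.6, 0.7, 0.8, 0.4]])
bins = np.array([0.0, 0.5, 1.0])

# The matrix is flattened first, so there are len(bins) - 1 counts in total.
flat_counts, _ = np.histogram(data, bins)

# Desired result: one histogram per row, stacked into an R x Nbins matrix.
per_row_counts = np.array([np.histogram(row, bins)[0] for row in data])

print(flat_counts.shape)     # (2,)
print(per_row_counts.shape)  # (2, 2)
```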

Answer 1:

If you are applying this to an array with many rows, this function will give you some speed-up at the cost of some temporary memory.

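import numpy as np

def hist_per_row(data, bins):
    data = np.asarray(data)
    bins = np.asarray(bins)  # searchsorted below is an ndarray method
    # Bin edges must be sorted in non-decreasing order.
    assert np.all(bins[:-1] <= bins[1:])
    r, c = data.shape
    # Slot per value: 0 = below bins[0], len(bins) = above bins[-1].
    # With the default side='left', a value equal to an edge goes to the lower slot.
    idx = bins.searchsorted(data)
    # Each row gets its own block of `step` slots: underflow, the bins, overflow.
    step = len(bins) + 1
    last = step * r
    idx += np.arange(0, last, step).reshape((r, 1))
    # One bincount over the flattened indices counts all rows at once.
    res = np.bincount(idx.ravel(), minlength=last)
    res = res.reshape((r, step))
    # Drop the underflow and overflow columns.
    return res[:, 1:-1]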

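The res[:, 1:-1] in the last line is there to be consistent with numpy.histogram, which returns an array of length len(bins) - 1, but you can drop it if you also want to count the values that are less than bins[0] and greater than bins[-1], respectively.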
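
As a rough illustration of the row-offset trick and of the two extra columns, the sketch below repeats the same steps on a tiny made-up input with one value below bins[0] and one above bins[-1]. The data and the name full are assumptions for the example only.

```python
import numpy as np

bins = np.array([0.0, 1.0, 2.0])           # two bins
data = np.array([[-0.5, 0.5, 1.5, 1.5],    # one value below bins[0]
                 [0.5, 0.5, 2.5, 1.5]])    # one value above bins[-1]

r = data.shape[0]
step = len(bins) + 1                       # underflow + 2 bins + overflow
idx = bins.searchsorted(data)              # slot index within each row
idx += np.arange(0, step * r, step).reshape((r, 1))  # shift row i by i * step
full = np.bincount(idx.ravel(), minlength=step * r).reshape((r, step))

print(full)            # first column: underflow, last column: overflow
print(full[:, 1:-1])   # the part that corresponds to the bins themselves
```

Keeping full instead of the sliced result is what dropping the res[:, 1:-1] slice amounts to: the first and last columns then hold the out-of-range counts for each row.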

Answer 2:

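Thank you all for your answers and comments. In the end I found a way to speed up the binning procedure. Instead of using bins.searchsorted(data), I compute np.array(data*nbins, dtype=int). After replacing that line in the code posted by Bi Rico, I found that it becomes 3 times faster. Below I post Bi Rico's function with my modification, so that other users can easily pick it up.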

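import numpy as np

def hist_per_row(data, bins):
    # Shortcut for equally spaced bins with bins[0] == 0 and all data in
    # [0, bins[-1]]; values outside that range are not handled correctly.
    data = np.asarray(data)
    bins = np.asarray(bins)
    assert np.all(bins[:-1] <= bins[1:])
    r, c = data.shape

    nbins = len(bins) - 1
    data = data / bins[-1]  # rescale to [0, 1]
    # Bin index by arithmetic instead of searchsorted; the +1 keeps the
    # same slot layout as the searchsorted version (slot 0 = underflow).
    idx = np.array(data * nbins, dtype=int) + 1

    step = len(bins) + 1
    last = step * r
    idx += np.arange(0, last, step).reshape((r, 1))
    res = np.bincount(idx.ravel(), minlength=last)
    res = res.reshape((r, step))
    return res[:, 1:-1]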
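
The arithmetic index only agrees with searchsorted when the bins are equally spaced and start at 0. The following sketch checks that on a small made-up input and shows that the agreement breaks for unequal edges. The names fast_idx, slow_idx and uneven are hypothetical, and the data deliberately avoids values that sit exactly on a bin edge.

```python
import numpy as np

nbins = 4
bins = np.linspace(0.0, 2.0, nbins + 1)    # edges 0.0, 0.5, 1.0, 1.5, 2.0
data = np.array([[0.1, 0.6, 1.9],
                 [1.2, 1.3, 0.7]])

# Arithmetic shortcut: valid because the bins are equally spaced from 0.
fast_idx = np.array(data / bins[-1] * nbins, dtype=int) + 1
# General approach: binary search against the bin edges.
slow_idx = bins.searchsorted(data)
same_here = np.array_equal(fast_idx, slow_idx)

# With unequal edges the two disagree, so the shortcut no longer applies.
uneven = np.array([0.0, 1.0, 1.25, 1.5, 2.0])
differs = not np.array_equal(fast_idx, uneven.searchsorted(data))

print(same_here, differs)
```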

Answer 3:

Something along these lines?

import numpy as np
data = np.random.rand(10,20)
print(np.apply_along_axis(lambda x: np.histogram(x)[0], 1, data))
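
When np.histogram is called without a bins argument, it uses 10 equal-width bins spanning each row's own minimum and maximum, so the columns of the result are not comparable across rows. A minimal variant that passes the same edges to every row could look like this (the data and the bin edges are made up for the example):

```python
import numpy as np

data = np.array([[0.05, 0.45, 0.55],
                 [0.60, 0.70, 0.80]])
bins = np.array([0.0, 0.5, 1.0])   # hypothetical shared bin edges

# Use the same edges for every row so that column j means the same bin everywhere.
counts = np.apply_along_axis(
    lambda row: np.histogram(row, bins=bins)[0], 1, data)

print(counts)
```

This still calls np.histogram once per row from Python, so it is mainly a readable baseline when the number of rows is small.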
