I have a very large diagonal matrix that I need to split for parallel computation. Due to data locality issues it makes no sense to iterate through the matrix and split every n-th calculation between n threads. Currently, I am dividing the k x k diagonal matrix in the following way, but it yields partitions that are unequal in terms of the number of calculations (the smallest piece takes a few times longer to compute than the largest).
def split_matrix(k, n):
    split_points = [round(i * k / n) for i in range(n + 1)]
    split_ranges = [(split_points[i], split_points[i + 1]) for i in range(len(split_points) - 1)]
    return split_ranges
import numpy as np

k = 100
arr = np.zeros((k, k))
idx = 0
for i in range(k):
    for j in range(i + 1, k):
        arr[i, j] = idx
        idx += 1
def parallel_calc(array, k, si, endi):
    for i in range(si, endi):
        for j in range(k):
            # do some expensive calculations
            pass

cpu_cnt = 4  # number of available cores
for start_i, stop_i in split_matrix(k, cpu_cnt):
    parallel_calc(arr, k, start_i, stop_i)
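To quantify the imbalance described above, one can count the inner-loop iterations (upper-triangle entries) that fall into each chunk of the even row split; this is only an illustration of the problem, using the same split points as `split_matrix`:

```python
# Count the upper-triangle work (inner-loop iterations) per chunk
# of the even row split: row i contributes k - i - 1 entries.
k, n = 100, 4
split_points = [round(i * k / n) for i in range(n + 1)]
ranges = [(split_points[i], split_points[i + 1]) for i in range(n)]
work = [sum(k - i - 1 for i in range(si, ei)) for si, ei in ranges]
print(work)  # [2175, 1550, 925, 300] -- the first chunk does ~7x the work of the last
```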
Do you have any suggestions as to the implementation or library function?
I think you should update your `split_matrix` method, as it returns one split range less than you want (setting `cpu_cnt = 4` will return only 3 tuples, and not 4).

Edit: If your data locality is not so strong, you could try this: create a `queue` of tasks, in which you add all indices/entries for which this calculation shall be performed. Then you initialize your parallel workers (e.g. using `multiprocessing`) and let them start. Each worker picks an element out of the `queue`, calculates the result, stores it (e.g. in another `queue`) and continues with the next item, and so on. If this is not working for your data, I don't think you can improve it any more.
After a number of geometrical calculations on the side, I arrived at the following partitioning, which gives roughly the same number of matrix points in each of the vertical (or horizontal, if one prefers) partitions.
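One way to derive such cut points (a sketch, assuming each row i costs k - i - 1 inner iterations, as in the question's loop; the name `split_matrix_balanced` is mine): the work in rows [s, k) of the upper triangle is roughly (k - s)^2 / 2, so choosing s_i with (k - s_i)^2 = (1 - i/n) * k^2 yields n pieces of roughly equal work.

```python
import math

def split_matrix_balanced(k, n):
    # Equal-area cuts of the triangular workload: row i holds k - i - 1
    # entries, so rows [s, k) hold about (k - s)^2 / 2 of them.  Solving
    # (k - s_i)^2 = (1 - i/n) * k^2 for s_i balances the n chunks.
    split_points = [round(k * (1 - math.sqrt(1 - i / n))) for i in range(n + 1)]
    return [(split_points[i], split_points[i + 1]) for i in range(n)]
```

For k = 100, n = 4 this gives ranges like (0, 13), (13, 29), (29, 50), (50, 100): narrow chunks where rows are long, wide chunks where rows are short, so each worker gets a contiguous, roughly equal slice.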