Delete the first element in each subview of a matrix

I have a dataset like this:

[[0,1],
 [0,2],
 [0,3],
 [0,4],
 [1,5],
 [1,6],
 [1,7],
 [2,8],
 [2,9]]

I need to delete the first element of each subview of the data, where the subviews are defined by the first column. First I take all rows that have 0 in the first column and delete the first of them, [0,1]. Then I take the rows with 1 in the first column and delete the first of them, [1,5]. Next I delete [2,8], and so on. In the end, I would like to have a dataset like this:

[[0,2],
 [0,3],
 [0,4],
 [1,6],
 [1,7],
 [2,9]]

EDIT: Can this be done in numpy? My dataset is very large so for loops on all elements take at least 4 minutes to complete.

Tags: python numpy

5 answers

啃猪蹄的小仙女

#2 · 2020-07-11 06:15

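import itertools

a = [[0,1],
 [0,2],
 [0,3],
 [0,4],
 [1,5],
 [1,6],
 [1,7],
 [2,8],
 [2,9]]

# Group consecutive rows by their first element, then drop the first row of each group.
a = [row for _, group in itertools.groupby(a, lambda r: r[0]) for row in list(group)[1:]]

print(a)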


一夜七次

#3 · 2020-07-11 06:17

My answer is:

from operator import itemgetter

l = [[0, 1], [0, 2], [0, 3], [0, 4], [1, 5], [1, 6], [1, 7], [2, 8], [2, 9]]
l.sort(key=itemgetter(0))  # first sort by the first element of each inner list
nl = []
last_key = None  # key of the group currently being processed
for row in l:
    if row[0] != last_key:
        last_key = row[0]  # first row of a new group: skip it
    else:
        nl.append(row)  # otherwise append to the new list

The output is:

>>> nl
[[0, 2], [0, 3], [0, 4], [1, 6], [1, 7], [2, 9]]


劫难

#4 · 2020-07-11 06:24

As requested, a numpy solution:

import numpy as np
a = np.array([[0,1], [0,2], [0,3], [0,4], [1,5], [1,6], [1,7], [2,8], [2,9]])
_,i = np.unique(a[:,0], return_index=True)

b = np.delete(a, i, axis=0)

(above is edited to incorporate @Jaime's solution, here is my original masking solution for posterity's sake)

m = np.ones(len(a), dtype=bool)
m[i] = False
b = a[m]
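To see why both versions work, it can help to look at what np.unique returns for the example array. The following is only a small sketch; the names keys, first_idx and mask are illustrative and not part of the answer above.

```python
import numpy as np

a = np.array([[0, 1], [0, 2], [0, 3], [0, 4],
              [1, 5], [1, 6], [1, 7],
              [2, 8], [2, 9]])

# With return_index=True, np.unique also returns, for each unique key,
# the index of its first occurrence in the input column.
keys, first_idx = np.unique(a[:, 0], return_index=True)
print(keys)       # unique values of the first column
print(first_idx)  # row index where each value first appears

# Boolean mask that is False exactly at those first occurrences.
mask = np.ones(len(a), dtype=bool)
mask[first_idx] = False
b = a[mask]
print(b)
```

Because the returned indices refer to first occurrences in the original array, this approach should not depend on the rows being sorted by the first column.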

Interestingly, the mask seems to be faster:

In [225]: def rem_del(a):
   .....:     _,i = np.unique(a[:,0], return_index=True)
   .....:     return np.delete(a, i, axis = 0)
   .....: 

In [226]: def rem_mask(a):
   .....:     _,i = np.unique(a[:,0], return_index=True)
   .....:     m = np.ones(len(a), dtype=bool)
   .....:     m[i] = False
   .....:     return a[m]
   .....: 

In [227]: timeit rem_del(a)
10000 loops, best of 3: 181 us per loop

In [228]: timeit rem_mask(a)
10000 loops, best of 3: 59 us per loop
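If rows with the same key are known to be contiguous (for example, the data is already sorted by the first column, as in the question), the mask can also be built by comparing each key with the one before it, which avoids the sort inside np.unique. This is a sketch under that assumption; rem_sorted is a hypothetical name, and it has not been timed against the two functions above.

```python
import numpy as np

def rem_sorted(a):
    """Drop the first row of each run of equal values in column 0.

    Assumes rows sharing a key are contiguous (e.g. sorted by column 0).
    """
    if len(a) == 0:
        return a
    col = a[:, 0]
    keep = np.empty(len(a), dtype=bool)
    keep[0] = False                 # the very first row always starts a group
    keep[1:] = col[1:] == col[:-1]  # keep rows whose key equals the previous row's key
    return a[keep]

a = np.array([[0, 1], [0, 2], [0, 3], [0, 4],
              [1, 5], [1, 6], [1, 7],
              [2, 8], [2, 9]])
b = rem_sorted(a)
print(b)
```

Unlike the np.unique versions, this removes the first row of every contiguous run, so on unsorted data a key that appears in several separate runs would lose one row per run.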


男人必须洒脱

#5 · 2020-07-11 06:27

You want to use itertools.groupby() with a dash of itertools.islice() and itertools.chain:

from itertools import islice, chain, groupby
from operator import itemgetter

list(chain.from_iterable(islice(group, 1, None)
                         for key, group in groupby(inputlist, key=itemgetter(0))))

The groupby() call groups the input list into chunks where the first item is the same (itemgetter(0) is the grouping key).
The islice(group, 1, None) call turns the groups into iterables where the first element will be skipped.
The chain.from_iterable() call takes each islice() result and chains them together into a new iterable, which list() turns back into a list.

Demo:

>>> list(chain.from_iterable(islice(group, 1, None) for key, group in groupby(inputlist, key=itemgetter(0))))
[[0, 2], [0, 3], [0, 4], [1, 6], [1, 7], [2, 9]]
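The demo assumes inputlist is already defined and that rows with equal keys are adjacent, since groupby() only groups consecutive items. A self-contained sketch that sorts first might look like this; drop_first_per_group is a hypothetical helper name, and the shuffled inputlist is made up for illustration.

```python
from itertools import chain, groupby, islice
from operator import itemgetter

def drop_first_per_group(rows, key=itemgetter(0)):
    # groupby() only merges *consecutive* rows with equal keys, so sort first.
    # sorted() is stable, so rows keep their relative order within each group.
    ordered = sorted(rows, key=key)
    return list(chain.from_iterable(
        islice(group, 1, None)  # skip the first row of each group
        for _, group in groupby(ordered, key=key)))

# Same rows as in the question, deliberately out of order.
inputlist = [[1, 5], [0, 1], [0, 2], [1, 6], [2, 8],
             [0, 3], [2, 9], [1, 7], [0, 4]]
result = drop_first_per_group(inputlist)
print(result)
```

If the data is already sorted by the first column, the sorted() call can be dropped and the original one-liner used directly.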


太酷不给撩

#6 · 2020-07-11 06:33

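Pass in your list of rows and the index of the column whose values define the groups.

def getsubset(rows, index):
    seen = set()   # keys whose first row has already been dropped
    result = []
    for row in rows:
        if row[index] not in seen:
            seen.add(row[index])  # first row for this key: skip it
        else:
            result.append(row)

    # Returns a new list; removing from rows while iterating over it would skip rows.
    return result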
