The huge_list parameter is something like [[12,12,14],[43,356,23]], and my code to convert each inner list to a set is:
cpdef list_to_set(list huge_list):
    cdef list ids
    cdef list final_ids = []
    for ids in huge_list:
        final_ids.append(set(ids))
    return final_ids
I have 2,800 list elements, each with 30,000 ids. It takes around 19 seconds. How can I improve the performance?
EDIT 1:
Instead of set I used numpy.unique as below, which speeds things up by ~7 seconds:
df['ids'] = df['ids'].apply(lambda x: numpy.unique(x))
Now it takes 14 seconds (previously it was ~20 seconds). I don't think this time is acceptable yet. :|
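Note that numpy.unique returns a sorted ndarray rather than a Python set, e.g.:

>>> import numpy
>>> numpy.unique([12, 12, 14])
array([12, 14])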
Cython cannot speed this up. Most of the time is spent building the sets, i.e. calculating the hash values of your elements and storing them in hash tables. That work is already done in C, so no speed-up is possible there. The pure Python version would lead to the same result.
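For reference, a minimal sketch of that pure Python equivalent (the same logic as the Cython function, just without the type declarations):

def list_to_set(huge_list):
    # build a set from each inner list, exactly as the Cython version does
    final_ids = []
    for ids in huge_list:
        final_ids.append(set(ids))
    return final_ids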
If you just want to convert the nested lists to sets, you can simply use the map function:
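# a one-liner sketch, assuming huge_list as in the question:
final_ids = list(map(set, huge_list))

Note that in Python 3 map returns a lazy iterator, so it is wrapped in list() here to get a list of sets back.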