Get unique values from a list in Python [duplicate]

Posted 2018-12-31 14:15

I want to get the unique values from the following list:

[u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']

The output which I require is:

[u'nowplaying', u'PBS', u'job', u'debate', u'thenandnow']

This code works:

output = []
for x in trends:
    if x not in output:
        output.append(x)
print(output)

Is there a better solution I should use?

Tags: python
30 answers
情到深处是孤独
#2 · 2018-12-31 14:33

To get the unique values from your list, use the code below:

trends = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
output = set(trends)
output = list(output)

IMPORTANT: The approach above won't work if any item in the list is unhashable, which is the case for mutable types such as list or dict.

trends = [{'super':u'nowplaying'}, u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
output = set(trends)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'

That means you have to be sure that the trends list always contains only hashable items; otherwise you have to use more sophisticated code:

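from copy import deepcopy

trends = [{'super':u'nowplaying'}, [u'PBS',], [u'PBS',], u'nowplaying', u'job', u'debate', u'thenandnow', {'super':u'nowplaying'}]
try:
    output = list(set(trends))
except TypeError:
    # set() failed before output was assigned, so start from an empty list
    output = []
    trends_copy = deepcopy(trends)
    # pop() walks the list from the end, so output ends up in reverse order
    while trends_copy:
        trend = trends_copy.pop()
        # keep the item only if no equal item remains earlier in the list
        if trends_copy.count(trend) == 0:
            output.append(trend)
print(output)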
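If you also want to keep the original order while coping with unhashable items, something along these lines might work. This is only a sketch in Python 3: unique_any is a hypothetical helper name (not from the answer above), and it assumes that comparing items with == is an acceptable definition of "duplicate".

```python
def unique_any(items):
    """Return items without duplicates, keeping first-occurrence order.

    Hypothetical helper: hashable items are tracked in a set (fast),
    unhashable ones (list, dict, ...) in a list compared with ==.
    """
    seen_hashable = set()
    seen_unhashable = []
    result = []
    for item in items:
        try:
            if item in seen_hashable:
                continue
            seen_hashable.add(item)
        except TypeError:
            # item is unhashable: fall back to a linear equality scan
            if item in seen_unhashable:
                continue
            seen_unhashable.append(item)
        result.append(item)
    return result


trends = [{'super': 'nowplaying'}, ['PBS'], ['PBS'], 'nowplaying',
          'job', 'debate', 'thenandnow', {'super': 'nowplaying'}]
unique_trends = unique_any(trends)
print(unique_trends)
```

The set keeps the common case fast; only the unhashable items pay for the linear scan.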
千与千寻千般痛.
#3 · 2018-12-31 14:33

A set is an unordered collection of unique elements. So you can use a set as below to get a unique list (note that the original order is not preserved):

unique_list = list(set([u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']))
孤独总比滥情好
#4 · 2018-12-31 14:34

What type is your output variable?

Python sets are just what you need. Declare output like this:

output = set()  # initialize an empty set

Then you can add elements with output.add(elem), and they are guaranteed to be unique.

Warning: sets DO NOT preserve the original order of the list.
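If the order matters, one common workaround is to keep a set for the membership test and a list for the order. A minimal sketch in Python 3, assuming every element is hashable; unique_in_order is just an illustrative name:

```python
def unique_in_order(items):
    """Return hashable items without duplicates, in first-seen order."""
    seen = set()    # fast membership test
    output = []     # remembers the order of first occurrences
    for elem in items:
        if elem not in seen:
            seen.add(elem)
            output.append(elem)
    return output


trends = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
print(unique_in_order(trends))
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
```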

只靠听说
#5 · 2018-12-31 14:34

You can use sets. To be clear about the difference between a list and a set: sets are unordered collections of unique elements. Lists are ordered collections of elements. So,

unicode_list = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
list_unique = list(set(unicode_list))
print(list_unique)
[u'nowplaying', u'job', u'debate', u'PBS', u'thenandnow']

But: do not use list or set as variable names. Doing so shadows the built-in and causes an error. For example, if you name the variable list instead of unicode_list in the code above:

list = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
list_unique = list(set(list))
print(list_unique)

    list_unique=list(set(list))
TypeError: 'list' object is not callable
初与友歌
#6 · 2018-12-31 14:35

Maintaining order:

# oneliners
# slow -> --- 14.417 seconds ---
[x for i, x in enumerate(array) if x not in array[0:i]]

# fast -> --- 0.0378 seconds ---
[x for i, x in enumerate(array) if array.index(x) == i]

# multiple lines
# fastest -> --- 0.012 seconds ---
uniq = []
[uniq.append(x) for x in array if x not in uniq]
uniq

Order doesn't matter:

# fastest-est -> --- 0.0035 seconds ---
list(set(array))
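One more option that keeps the order, assuming Python 3.7 or later and hashable elements: dict keys are unique and keep insertion order, so dict.fromkeys can do the job. This was not part of the timings above; the timeit call is only there so you can measure it on your own data.

```python
import timeit

array = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

# dict keys are unique and, from Python 3.7 on, keep insertion order
ordered_unique = list(dict.fromkeys(array))
print(ordered_unique)

# Optional: time it yourself; the numbers depend on your data and machine
elapsed = timeit.timeit(lambda: list(dict.fromkeys(array)), number=10000)
print("dict.fromkeys: %.4f seconds for 10000 runs" % elapsed)
```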
冷夜・残月
#7 · 2018-12-31 14:35

If you are using numpy in your code (which might be a good choice for larger amounts of data), check out numpy.unique:

>>> import numpy as np
>>> wordsList = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
>>> np.unique(wordsList)
array([u'PBS', u'debate', u'job', u'nowplaying', u'thenandnow'], 
      dtype='<U10')

(http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html)

As you can see, numpy supports not only numeric data; string arrays are also possible. Of course, the result is a numpy array, but that matters little, because it still behaves like a sequence:

>>> for word in np.unique(wordsList):
...     print(word)
... 
PBS
debate
job
nowplaying
thenandnow

If you really want to have a vanilla python list back, you can always call list().

However, the result is automatically sorted, as you can see from the above code fragments. Check out numpy unique without sort if retaining list order is required.
