I want to get the unique values from the following list:
[u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
The output which I require is:
[u'nowplaying', u'PBS', u'job', u'debate', u'thenandnow']
This code works:
output = []
for x in trends:
    if x not in output:
        output.append(x)
print output
Is there a better solution I should use?
As a bonus, Counter is a simple way to get both the unique values and the count for each value.
First declare your list properly, separated by commas. You can get the unique values by converting the list to a set.
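A minimal sketch of the two suggestions above, assuming the list from the question is stored in a variable named trends (the u prefixes are dropped here so the snippet reads the same on Python 3):

```python
from collections import Counter

trends = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

# Counter maps each distinct value to the number of times it occurs.
counts = Counter(trends)
print(counts['PBS'])       # 2

# Converting to a set keeps one copy of each value; order is not guaranteed.
unique_values = set(trends)
print(len(unique_values))  # 5
```

The keys of counts are the same values as the set, so Counter is handy when the counts are also needed.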
If you use it further as a list, you should convert it back to a list by doing:
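The conversion back might look like this; the names trends and unique_list are only illustrative, and the result is sorted before printing because a set has no defined order:

```python
trends = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

# set() drops the duplicates, list() turns the result back into a list.
unique_list = list(set(trends))

print(sorted(unique_list))  # ['PBS', 'debate', 'job', 'nowplaying', 'thenandnow']
```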
Another possibility, probably faster, would be to use a set from the beginning instead of a list. Then your code should be:
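A sketch of the loop from the question rewritten around a set, assuming the same trends list; the membership check disappears because adding an element that is already present does nothing:

```python
trends = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

output = set()
for x in trends:
    output.add(x)  # adding an element that is already in the set is a no-op

print(len(output))  # 5
```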
As has been pointed out, sets do not maintain the original order. If you need that, you should look into an ordered set.
If we need to keep the order of the elements, how about this:
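One way to write such an order-preserving comprehension is sketched below. It assumes a helper list named used and an input list named mylist (both names are illustrative), and relies on list.append returning None:

```python
mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

used = []
# `used.append(x) or True` records x and then evaluates to True,
# so x is kept only the first time it is seen.
unique = [x for x in mylist if x not in used and (used.append(x) or True)]

print(unique)  # ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
```

Because used is a list, each membership test is linear, so this is fine for short lists but quadratic overall.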
And one more solution using reduce and without the temporary used var.
UPDATE - Oct 1, 2016
Another solution with reduce, but this time without .append, which makes it more human-readable and easier to understand.
NOTE: Keep in mind that the more human-readable we get, the less performant the script is.
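The two reduce variants described above could be sketched as follows. The names mylist, unique_append and unique_concat are illustrative, and the import is needed on Python 3, where reduce is no longer a builtin:

```python
from functools import reduce  # a builtin on Python 2, imported from functools on Python 3

mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

# Variant 1: list.append returns None, so `l.append(x) or l` appends x and then yields l.
unique_append = reduce(lambda l, x: l.append(x) or l if x not in l else l, mylist, [])

# Variant 2: no .append; a new list l + [x] is built for every new element.
unique_concat = reduce(lambda l, x: l + [x] if x not in l else l, mylist, [])

print(unique_append)  # ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
print(unique_concat)  # ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
```

The second variant copies the accumulated list each time an element is added, which is where the extra cost comes from.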
ANSWERING COMMENTS
Because @monica asked a good question ("how is this working?"), and for everyone having trouble figuring it out, I will try to give a deeper explanation of how this works and what sorcery is happening here ;)
So she first asked:
Well it's actually working
The problem is that we are just not getting the desired results inside the unique variable, but only inside the used variable. This is because during the list comprehension .append modifies the used variable and returns None.
So in order to get the results into the unique variable, and still use the same logic with .append(x) if x not in used, we need to move this .append call to the right side of the list comprehension and just return x on the left side.
But if we are too naive and just go with:
We will get nothing in return.
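Both situations described above can be reproduced with a short sketch. The variable names follow the explanation (used, unique), with used2 and unique2 added here only to keep the two attempts apart:

```python
mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

# First attempt: .append runs for its side effect, so the results land in `used`,
# while `unique` only collects the None values that .append returns.
used = []
unique = [used.append(x) for x in mylist if x not in used]
print(used)    # ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
print(unique)  # [None, None, None, None, None]

# Naive fix: `x not in used2 and used2.append(x)` is never truthy,
# so no element passes the filter.
used2 = []
unique2 = [x for x in mylist if x not in used2 and used2.append(x)]
print(unique2)  # []
```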
Again, this is because the .append method returns None, and this gives our logical expression the following look:
x not in used and None
This will basically always be:
- False when x is in used,
- None when x is not in used.
And in both cases (False/None), this will be treated as a falsy value and we will get an empty list as a result.
But why does this evaluate to None when x is not in used? Someone may ask.
Well, it's because this is how Python's short-circuit operators work.
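The two cases can be checked directly. In this small sketch the names seen_before and first_time are illustrative; they hold the value that the and expression evaluates to in each case:

```python
used = ['PBS']

# x is already in used: `x not in used` is False, so `and` stops there and returns False.
seen_before = 'PBS' not in used and used.append('PBS')

# x is new: the left side is True, so `and` evaluates and returns the right side,
# which is the None returned by list.append (the append itself still happens).
first_time = 'job' not in used and used.append('job')

print(seen_before)  # False
print(first_time)   # None
print(used)         # ['PBS', 'job']
```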
So when x is not in used (i.e. when the first part is True), the next part of the expression (used.append(x)) will be evaluated and its value (None) will be returned.
But that's what we want in order to get the unique elements from a list with duplicates: we want to .append them to a new list only when we come across them for the first time.
So we really want to evaluate used.append(x) only when x is not in used. Maybe if there is a way to turn this None value into a truthy one we will be fine, right?
Well, yes, and here is where the 2nd type of short-circuit operator comes into play.
We know that .append(x) will always be falsy, so if we just add an or operator next to it, we will always get the next part. That's why we write:
x not in used and (used.append(x) or True)
so we can evaluate used.append(x) and get True as a result, only when the first part of the expression (x not in used) is True.
A similar pattern can be seen in the 2nd approach with the reduce method:
l.append(x) or l if x not in l else l
where we:
- Append x to l and return that l when x is not in l. Thanks to the or statement, .append is evaluated and l is returned after that.
- Return l untouched when x is in l.
I am surprised that nobody so far has given a direct order-preserving answer:
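A minimal sketch of such a generator, assuming the name unique and a parameter named sequence; the seen set is an implementation detail chosen here to keep lookups fast:

```python
def unique(sequence):
    """Yield each item of sequence once, in order of first appearance."""
    seen = set()
    for item in sequence:
        if item not in seen:
            seen.add(item)
            yield item

trends = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
print(list(unique(trends)))  # ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
```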
It will generate the values, so it works with more than just lists, e.g. unique(range(10)). To get a list, just call list(unique(sequence)).
It has the requirement that each item is hashable and not just comparable, but most stuff in Python is, and it is O(n) and not O(n^2), so it will work just fine with a long list.
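The hashability requirement can be seen in a short sketch. The generator is repeated so the snippet runs on its own, under the same assumptions about its name and shape; the inputs are made up for illustration:

```python
def unique(sequence):
    """Yield each item of sequence once, in order of first appearance."""
    seen = set()
    for item in sequence:
        if item not in seen:
            seen.add(item)
            yield item

# Works on any iterable of hashable items, not only lists.
letters = list(unique('abracadabra'))
print(letters)  # ['a', 'b', 'r', 'c', 'd']

# Items that are not hashable (such as lists) cannot be put in the set.
try:
    list(unique([[1], [1]]))
    failed = False
except TypeError:
    failed = True
print(failed)  # True
```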
set is an unordered collection of unique elements. A list of elements can be passed to set's constructor. So if we pass in a list with duplicate elements, we get a set with unique elements, and transforming it back to a list gives a list with unique elements. I can say nothing about performance and memory overhead, but I hope it's not so important with small lists.
Simple and short.
I know this is an old question, but here's my unique solution: class inheritance!:
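A sketch of what such a subclass might look like. The names UniqueList and appendunique come from the description in this answer; the method body is an assumption about how it could be written:

```python
class UniqueList(list):
    """A list with an extra method that appends only items it does not hold yet."""

    def appendunique(self, item):
        # Report failure and leave the list alone if the item is already present.
        if item in self:
            return False
        self.append(item)
        return True

trends = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

unique_trends = UniqueList()
for x in trends:
    unique_trends.appendunique(x)

print(unique_trends.appendunique('PBS'))  # False, 'PBS' is already in the list
print(unique_trends.index('job'))         # 2, ordinary list methods still work

# Copy over to a plain list.
plain = list(unique_trends)
print(plain)  # ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
```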
Then, if you want to uniquely append items to a list you just call appendunique on a UniqueList. Because it inherits from a list, it basically acts like a list, so you can use functions like index() etc. And because it returns true or false, you can find out if appending succeeded (unique item) or failed (already in the list).
To get a unique list of items from a list, use a for loop appending items to a UniqueList (then copy over to the list).
Example usage code:
Prints:
Copying to list:
Prints: