I have a dataframe along the lines of the below:
      Type Set
    1    A   Z
    2    B   Z
    3    B   X
    4    C   Y
I want to add another column to the dataframe (or generate a series) of the same length as the dataframe (i.e. the same number of records/rows) that sets the colour to 'green' if Set == 'Z' and to 'red' otherwise.
What's the best way to do this?
A list comprehension is another way to create a column conditionally. If you are working with object dtypes in columns, like in your example, list comprehensions typically outperform most other methods.
Example list comprehension:
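A minimal sketch of such a list comprehension, assuming the frame from the question and a new column named `color` (the column name is an assumption):

```python
import pandas as pd

# Frame from the question (index 1..4 as shown there).
df = pd.DataFrame(
    {"Type": ["A", "B", "B", "C"], "Set": ["Z", "Z", "X", "Y"]},
    index=[1, 2, 3, 4],
)

# One colour per row: 'green' where Set is 'Z', 'red' otherwise.
df["color"] = ["green" if s == "Z" else "red" for s in df["Set"]]

print(df["color"].tolist())  # ['green', 'green', 'red', 'red']
```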
%timeit tests:
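`%timeit` is an IPython magic; outside a notebook the standard-library `timeit` module gives a comparable measurement. The sketch below is only an assumption about what such a test could look like, not the original benchmark: the frame size and the three candidate functions are hypothetical, and no timings are quoted here because they depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark frame: the question's Set values repeated many times.
df = pd.DataFrame({"Set": ["Z", "Z", "X", "Y"] * 25_000})

def with_list_comprehension():
    return ["green" if s == "Z" else "red" for s in df["Set"]]

def with_np_where():
    return np.where(df["Set"] == "Z", "green", "red")

def with_map():
    return df["Set"].map(lambda s: "green" if s == "Z" else "red")

for func in (with_list_comprehension, with_np_where, with_map):
    # Best of 3 repeats of 10 calls each; absolute numbers vary by machine.
    best = min(timeit.repeat(func, number=10, repeat=3))
    print(f"{func.__name__}: {best / 10 * 1e3:.2f} ms per call")
```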
Here's yet another way to skin this cat, using a dictionary to map new values onto the keys in the list. Here's what it looks like:
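The original snippet is not shown, so the following is a sketch under stated assumptions: the column names `INDICATOR`, `VALUE` and `NEW_VALUE`, the helper `map_values`, and the sample data are all hypothetical. It uses `Series.apply`, since the answer later refers to an `apply` approach.

```python
import pandas as pd

# Hypothetical lookup table: each INDICATOR letter maps to a number.
values_dict = {"A": 1, "B": 2, "C": 3, "D": 4}

def map_values(key, lookup):
    """Return the replacement for one cell by looking it up in the dict."""
    return lookup[key]

df = pd.DataFrame({"INDICATOR": ["A", "B", "C", "D"], "VALUE": [10, 9, 8, 7]})

# Series.apply calls map_values once per element; extra arguments go in `args`.
df["NEW_VALUE"] = df["INDICATOR"].apply(map_values, args=(values_dict,))

print(df["NEW_VALUE"].tolist())  # [1, 2, 3, 4]
```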
This approach can be very powerful when you have many `ifelse`-type statements to make (i.e. many unique values to replace).
And of course you could always do this:
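One plausible reading of "this" is passing the dictionary straight to `Series.map`; that is an assumption, as are the column names and data below.

```python
import pandas as pd

values_dict = {"A": 1, "B": 2, "C": 3, "D": 4}  # hypothetical lookup table
df = pd.DataFrame({"INDICATOR": ["A", "B", "C", "D"], "VALUE": [10, 9, 8, 7]})

# Series.map accepts a dict directly; values missing from the dict become NaN.
df["NEW_VALUE"] = df["INDICATOR"].map(values_dict)

print(df["NEW_VALUE"].tolist())  # [1, 2, 3, 4]
```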
But that approach is more than three times as slow as the `apply` approach from above, on my machine.
And you could also do this, using `dict.get`:
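A sketch of the `dict.get` variant, again with hypothetical column names and data. Unlike indexing with `values_dict[key]`, `dict.get` does not raise on an unknown key; it returns the supplied default (here `0`, chosen only for illustration).

```python
import pandas as pd

values_dict = {"A": 1, "B": 2, "C": 3, "D": 4}  # hypothetical lookup table
# "E" is deliberately absent from the lookup table.
df = pd.DataFrame({"INDICATOR": ["A", "B", "C", "E"], "VALUE": [10, 9, 8, 7]})

# Look up each cell; fall back to 0 when the key is not in the dict.
df["NEW_VALUE"] = [values_dict.get(key, 0) for key in df["INDICATOR"]]

print(df["NEW_VALUE"].tolist())  # [1, 2, 3, 0]
```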
The following is slower than the approaches timed here, but it can compute the extra column from the contents of more than one column, and it can produce more than two values for the extra column.
Simple example using just the "Set" column:
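A minimal sketch of a row-wise function applied with `DataFrame.apply(..., axis=1)`, assuming the frame from the question; the helper name `set_color` and the column name `color` are assumptions.

```python
import pandas as pd

df = pd.DataFrame(
    {"Type": ["A", "B", "B", "C"], "Set": ["Z", "Z", "X", "Y"]},
    index=[1, 2, 3, 4],
)

def set_color(row):
    """Pick a colour for one row (a Series indexed by column name)."""
    return "green" if row["Set"] == "Z" else "red"

# axis=1 passes each row to set_color; assign returns a new frame with the column.
df = df.assign(color=df.apply(set_color, axis=1))

print(df["color"].tolist())  # ['green', 'green', 'red', 'red']
```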
Example with more colours and more columns taken into account:
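The same row-wise pattern extends to several columns and several outcomes. The rules below (a third colour, `blue`, for `Type == 'C'`) are invented for illustration and are not taken from the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"Type": ["A", "B", "B", "C"], "Set": ["Z", "Z", "X", "Y"]},
    index=[1, 2, 3, 4],
)

def set_color(row):
    """Hypothetical rules: Set is checked first, then Type, then a fallback."""
    if row["Set"] == "Z":
        return "green"
    if row["Type"] == "C":
        return "blue"
    return "red"

# Each row is passed to set_color, so any column can take part in the decision.
df = df.assign(color=df.apply(set_color, axis=1))

print(df["color"].tolist())  # ['green', 'green', 'red', 'blue']
```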
Another way in which this could be achieved is:
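The snippet for this answer is missing; one candidate, offered as an assumption, is `Series.map` with a small lambda on the `Set` column.

```python
import pandas as pd

df = pd.DataFrame(
    {"Type": ["A", "B", "B", "C"], "Set": ["Z", "Z", "X", "Y"]},
    index=[1, 2, 3, 4],
)

# map calls the lambda once per value in the Set column.
df["color"] = df["Set"].map(lambda s: "green" if s == "Z" else "red")

print(df["color"].tolist())  # ['green', 'green', 'red', 'red']
```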
If you only have two choices to select from:
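For a two-way choice, `np.where(condition, value_if_true, value_if_false)` is a natural fit; that this is the call the answer had in mind is an assumption. A sketch on a bare Series:

```python
import numpy as np
import pandas as pd

s = pd.Series(["Z", "Z", "X", "Y"], name="Set")

# Element-wise: 'green' where the condition holds, 'red' everywhere else.
colours = np.where(s == "Z", "green", "red")

print(colours.tolist())  # ['green', 'green', 'red', 'red']
```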
For example, on the frame from the question this yields:
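A self-contained sketch that adds the column to the question's frame with `np.where` and prints each row; the column name `color` matches the one used later in this answer, and the rest is an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Type": list("ABBC"), "Set": list("ZZXY")},
    index=[1, 2, 3, 4],
)

# Two outcomes only, so a single vectorised np.where call is enough.
df["color"] = np.where(df["Set"] == "Z", "green", "red")

for row in df.itertuples(index=False):
    print(row.Type, row.Set, row.color)
# A Z green
# B Z green
# B X red
# C Y red
```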
If you have more than two conditions then use `np.select`. For example, if you want `color` to be `yellow` when `(df['Set'] == 'Z') & (df['Type'] == 'A')`, otherwise `blue` when `(df['Set'] == 'Z') & (df['Type'] == 'B')`, otherwise `purple` when `(df['Type'] == 'B')`, and otherwise `black`, then pass the conditions, in that order, to `np.select`, which yields:
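A minimal sketch of that `np.select` call, assuming the frame from the question. `np.select` takes the first condition that is true for each row, so the order of the list matters; the variable names `conditions` and `choices` are my own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Type": list("ABBC"), "Set": list("ZZXY")},
    index=[1, 2, 3, 4],
)

# Conditions are checked in order; the first match wins for each row.
conditions = [
    (df["Set"] == "Z") & (df["Type"] == "A"),
    (df["Set"] == "Z") & (df["Type"] == "B"),
    (df["Type"] == "B"),
]
choices = ["yellow", "blue", "purple"]

# Rows matching none of the conditions get the default.
df["color"] = np.select(conditions, choices, default="black")

for row in df.itertuples(index=False):
    print(row.Type, row.Set, row.color)
# A Z yellow
# B Z blue
# B X purple
# C Y black
```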