I would like to select the top entries in a Pandas dataframe based on the entries of a specific column, using df_selected = df_targets.head(N).
Each entry has a target value (in order of importance): Likely Supporter, GOTV, Persuasion, Persuasion+GOTV.
Unfortunately, if I do
df_targets = df_targets.sort_values("target")
the ordering will be alphabetical (GOTV, Likely Supporter, ...).
I was hoping for a keyword like list_ordering, as in:
my_list = ["Likely Supporter", "GOTV", "Persuasion", "Persuasion+GOTV"]
df_targets = df_targets.sort_values("target", list_ordering=my_list)
To deal with this issue I create a dictionary:
from collections import OrderedDict
# Prefix each target with its rank so that alphabetical order matches the order of importance
dict_targets = OrderedDict()
dict_targets["Likely Supporter"] = "0 Likely Supporter"
dict_targets["GOTV"] = "1 GOTV"
dict_targets["Persuasion"] = "2 Persuasion"
dict_targets["Persuasion+GOTV"] = "3 Persuasion+GOTV"
However, this seems like a non-Pythonic approach.
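For reference, here is a minimal sketch of how a dictionary-based workaround along these lines might be applied. It is a variant of the idea above, not the exact code from the question: it maps each label to an integer rank instead of a prefixed string. The sample rows, the voter_id column and the temporary _rank column are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the "target" column comes from the question.
df_targets = pd.DataFrame({
    "voter_id": [1, 2, 3, 4, 5],
    "target": ["GOTV", "Persuasion+GOTV", "Likely Supporter", "Persuasion", "GOTV"],
})

# Desired order of importance, most important first.
my_list = ["Likely Supporter", "GOTV", "Persuasion", "Persuasion+GOTV"]

# Map each label to its position in my_list (0 = most important).
rank = {label: i for i, label in enumerate(my_list)}

# Sort on a temporary rank column, drop it again, then take the top N rows.
# mergesort is stable, so rows with equal rank keep their original order.
N = 3
df_selected = (
    df_targets.assign(_rank=df_targets["target"].map(rank))
    .sort_values("_rank", kind="mergesort")
    .drop(columns="_rank")
    .head(N)
)
print(df_selected)
```

This keeps the original labels untouched, at the cost of a helper column that has to be created and dropped on every sort.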
Suggestions would be much appreciated!
Thanks to jerzrael's input and references, I like this sliced solution:
I think you need Categorical with the parameter ordered=True; sorting with sort_values then works very nicely.
If you check the documentation of Categorical:

The method shown in my previous answer is now deprecated. Instead, it is best to use pandas.Categorical, as shown here. So:
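A minimal sketch of the pandas.Categorical approach, assuming the column is named target as in the question. The sample rows and the voter_id column are hypothetical and only there to make the example runnable.

```python
import pandas as pd

# Hypothetical sample data; only the "target" column comes from the question.
df_targets = pd.DataFrame({
    "voter_id": [1, 2, 3, 4, 5],
    "target": ["GOTV", "Persuasion+GOTV", "Likely Supporter", "Persuasion", "GOTV"],
})

# Desired order of importance, most important first.
my_list = ["Likely Supporter", "GOTV", "Persuasion", "Persuasion+GOTV"]

# Convert the column to an ordered categorical whose category order is my_list.
df_targets["target"] = pd.Categorical(
    df_targets["target"], categories=my_list, ordered=True
)

# sort_values now follows the category order instead of alphabetical order.
# mergesort is stable, so rows with the same target keep their original order.
df_targets = df_targets.sort_values("target", kind="mergesort")

# Select the top N entries.
N = 3
df_selected = df_targets.head(N)
print(df_selected)
```

Note that any value in the column that does not appear in my_list becomes NaN when converted this way, so the list should cover every label you expect.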