How do you do group by and pivot tables with Julia DataFrames?
Let's say I have a DataFrame:
using DataFrames
df = DataFrame(Location = ["NY", "SF", "NY", "NY", "SF", "SF", "TX", "TX", "TX", "DC"],
               Class = ["H", "L", "H", "L", "L", "H", "H", "L", "L", "M"],
               Address = ["12 Silver", "10 Fak", "12 Silver", "1 North", "10 Fak", "2 Fake", "1 Red", "1 Dog", "2 Fake", "1 White"],
               Score = ["4", "5", "3", "2", "1", "5", "4", "3", "2", "1"])
and I want to do the following:
1) a pivot table with Location and Class, which should output
Class     H  L  M
Location
DC        0  0  1
NY        2  1  0
SF        1  2  0
TX        1  2  0
2) group by "Location" and a count on the number of records in that group, which should output
    Pop
DC  1
NY  3
SF  3
TX  3
For part 2 of your question, you can use an anonymous function that returns a DataFrame in order to name the new column, for example as count:

You can use unstack to get most of the way there (DataFrames don't have an index, so Class has to remain a column, whereas in pandas it would be an Index). This seems to be DataFrames.jl's answer to pivot_table:

I'm not sure how you would fillna here (unstack doesn't have this option)...

You can do the groupby using by with the nrow (number of rows) method:

Using the pivot(df, rowFields, colField, valuesField; <keyword arguments>) function developed for this SO question, you could do:

First question:

Second question:
Package FreqTables.jl solves this:
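As a rough illustration of the frequency-table approach, here is a minimal sketch. It assumes the FreqTables.jl package is installed alongside DataFrames.jl, and it only uses the Location and Class columns from the question. The variable names tbl and pop_tbl are made up for this example.

```julia
using DataFrames, FreqTables

df = DataFrame(Location = ["NY", "SF", "NY", "NY", "SF", "SF", "TX", "TX", "TX", "DC"],
               Class = ["H", "L", "H", "L", "L", "H", "H", "L", "L", "M"])

# Two-way frequency table: rows are Location values, columns are Class values.
# Combinations that never occur are reported as 0.
tbl = freqtable(df, :Location, :Class)

# One-way frequency table: number of records per Location (part 2 of the question).
pop_tbl = freqtable(df, :Location)

# Cells can be looked up by name.
println(tbl["NY", "H"])
println(pop_tbl["TX"])
```

The result is a named array rather than a DataFrame, so it would need to be converted if further DataFrame operations are planned.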
(1) Here is my attempt to create a pivot table. I use by() to group by one column and then, in a function, count the frequency of each value of the second column.
Example:
(2) You can use by and nrow.
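Note that by was removed in DataFrames.jl 1.0, so the snippets described above will not run on current versions. The following is a sketch of the same two steps using groupby and combine instead. It assumes a recent DataFrames.jl 1.x release in which unstack accepts the fill keyword. The names counts_by_location, pair_counts, pivot_table and the intermediate column n are made up for this example.

```julia
using DataFrames

df = DataFrame(Location = ["NY", "SF", "NY", "NY", "SF", "SF", "TX", "TX", "TX", "DC"],
               Class = ["H", "L", "H", "L", "L", "H", "H", "L", "L", "M"])

# (2) Group by Location and count the rows in each group.
# `nrow => :Pop` names the resulting count column Pop.
counts_by_location = combine(groupby(df, :Location), nrow => :Pop)
sort!(counts_by_location, :Location)

# (1) Count rows per (Location, Class) pair, then spread the Class values
# into columns. `fill = 0` replaces the missing combinations with 0
# (assumed to be available in the installed DataFrames.jl version).
pair_counts = combine(groupby(df, [:Location, :Class]), nrow => :n)
pivot_table = unstack(pair_counts, :Location, :Class, :n; fill = 0)
sort!(pivot_table, :Location)

println(counts_by_location)
println(pivot_table)
```

On a version without the fill keyword, the missing cells could instead be replaced afterwards, for instance with coalesce.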