I have a package that uses pandas Panels to generate MultiIndex pandas DataFrames. However, whenever I use pandas.Panel, I get the following DeprecationWarning:
DeprecationWarning: Panel is deprecated and will be removed in a future version. The recommended way to represent these types of 3-dimensional data are with a MultiIndex on a DataFrame, via the Panel.to_frame() method. Alternatively, you can use the xarray package http://xarray.pydata.org/en/stable/. Pandas provides a .to_xarray() method to help automate this conversion.
However, I can't understand what the first recommendation here is actually recommending in order to create MultiIndex DataFrames. If Panel is going to be removed, how am I going to be able to use Panel.to_frame?
To clarify: I am not asking what deprecation is, or how to convert my Panels to DataFrames. What I am asking is this: if I am using pandas.Panel and then pandas.Panel.to_frame in a library to create MultiIndex DataFrames from 3D ndarrays, and Panel is going to be removed, then what is the best option for making those DataFrames without using the Panel API?
E.g., if I'm doing the following, with X as an ndarray with shape (N, J, K):
p = pd.Panel(X, items=item_names, major_axis=names0, minor_axis=names1)
df = p.to_frame()
this is clearly no longer a viable future-proof option for DataFrame construction, though it was the recommended method in this question.
Consider the following panel:
If you convert this to a DataFrame, this becomes:
So it takes the major and minor axes as the row MultiIndex, and the items as columns. The shape, originally (5, 3, 2), has become (6, 5). Where you put the MultiIndex is up to you, but if you want exactly the same shape, you can do the following:
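A minimal sketch of that reshape, without touching the Panel API. The random data, the labels other than 'item 1', 2015 and 'US', and the index names 'major' and 'minor' are assumptions made up for illustration; only the shapes (5, 3, 2) and (6, 5) come from the text above.

```python
import numpy as np
import pandas as pd

# Illustrative data only: 5 items, 3 major-axis labels, 2 minor-axis labels.
rng = np.random.default_rng(0)
arr = rng.standard_normal((5, 3, 2))

items = ['item %d' % i for i in range(1, 6)]   # assumed item labels
major = [2015, 2016, 2017]                      # assumed major-axis labels
minor = ['US', 'UK']                            # assumed minor-axis labels

# One row per (major, minor) pair; from_product iterates the last list fastest,
# which matches a C-order flatten of the array's last two axes.
index = pd.MultiIndex.from_product([major, minor], names=['major', 'minor'])

# (5, 3, 2) -> (5, 6), then transpose to (6, 5): rows are (major, minor), columns are items.
df = pd.DataFrame(arr.reshape(len(items), -1).T, index=index, columns=items)
```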
which yields the same DataFrame (use the names parameter of pd.MultiIndex.from_product if you want to name your indices):
Now instead of pnl['item 1'], you use df['item 1'] (optionally df['item 1'].unstack()); instead of pnl.xs(2015), you use df.xs(2015); and instead of pnl.xs('US', axis='minor'), you use df.xs('US', level=1).
As you see, this is just a matter of reshaping your initial 3D numpy array to 2D. You add the other (artificial) dimension with the help of MultiIndex.
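The selection equivalents can be tried on a small frame built the same way. This is only a sketch: the arange data and the labels beyond 'item 1', 2015 and 'US' are assumptions, chosen so each value is easy to trace back to its position in the 3D array.

```python
import numpy as np
import pandas as pd

# arr[n, j, k] == 6*n + 2*j + k, so every cell identifies its own position.
arr = np.arange(30, dtype=float).reshape(5, 3, 2)
items = ['item %d' % i for i in range(1, 6)]
index = pd.MultiIndex.from_product([[2015, 2016, 2017], ['US', 'UK']],
                                   names=['major', 'minor'])
df = pd.DataFrame(arr.reshape(5, -1).T, index=index, columns=items)

# Counterpart of pnl['item 1']: one item as a major-by-minor table.
by_item = df['item 1'].unstack()

# Counterpart of pnl.xs(2015): all items for one major-axis label.
by_year = df.xs(2015)

# Counterpart of pnl.xs('US', axis='minor'): all items for one minor-axis label.
by_country = df.xs('US', level=1)
```

Selecting by label (for example by_item.loc[2016, 'US']) rather than by position avoids depending on the column order that unstack() produces.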