Suppose I have a DataFrame with some NaNs:
>>> import pandas as pd
>>> df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])
>>> df
     0    1    2
0    1    2    3
1    4  NaN  NaN
2  NaN  NaN    9
What I need to do is replace every NaN with the first non-NaN value in the same column above it. It is assumed that the first row will never contain a NaN. So for the previous example the result would be
   0  1  2
0  1  2  3
1  4  2  3
2  4  2  9
I can just loop through the whole DataFrame column-by-column, element-by-element and set the values directly, but is there an easy (optimally a loop-free) way of achieving this?
In my case, we have time series from different devices, but some devices sent no values during some periods. So we first create NA values for every device and time period, and then apply fillna.
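One way the device scenario might look is sketched below. The column names (`device`, `time`, `value`) and the sample readings are hypothetical, and filling per device with `groupby(...).ffill()` is an assumption about the intent, chosen so that values do not carry over from one device to the next.

```python
import pandas as pd

# Hypothetical readings: device "a" sent nothing at time 3,
# device "b" sent nothing at time 2.
readings = pd.DataFrame({
    "device": ["a", "a", "b", "b"],
    "time": [1, 2, 1, 3],
    "value": [10.0, 11.0, 20.0, 22.0],
})

# Build every (device, time) combination so the gaps become explicit NaN rows.
full_index = pd.MultiIndex.from_product(
    [readings["device"].unique(), sorted(readings["time"].unique())],
    names=["device", "time"],
)
expanded = readings.set_index(["device", "time"]).reindex(full_index)

# Forward fill within each device so values never leak across devices.
expanded["value"] = expanded.groupby(level="device")["value"].ffill()
result = expanded.reset_index()

print(result)
```

A device whose first period is missing would still have a NaN there, since nothing precedes it within its own group.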
You could use the fillna method on the DataFrame and specify the method as ffill (forward fill):
This method...
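A minimal sketch of the forward fill on the question's DataFrame follows. It calls `DataFrame.ffill()` rather than `fillna(method='ffill')`, because recent pandas releases deprecate the `method` argument of `fillna`; on older versions the two should behave the same.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])

# Forward fill: each NaN takes the last valid value above it in the same column.
# The result is a new DataFrame; df itself is left unchanged.
filled = df.ffill()

print(filled)
```

Rebinding the result (`df = df.ffill()`) is usually the simplest way to keep the filled values.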
To go the opposite way, there's also a bfill method.
This method doesn't modify the DataFrame inplace - you'll need to rebind the returned DataFrame to a variable or else specify inplace=True.
Only one column version
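For the one-column case, a minimal sketch using the question's DataFrame could look like this. Column `1` is picked purely for illustration, and the result is assigned back to the column instead of relying on `inplace=True`.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])

# Fill only column 1; the other columns keep their NaNs.
df[1] = df[1].ffill()

print(df)
```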
You can use pandas.DataFrame.fillna with the method='ffill' option. 'ffill' stands for 'forward fill' and will propagate the last valid observation forward. The alternative is 'bfill', which works the same way, but backwards.
There is also a direct synonym function for this, pandas.DataFrame.ffill, to make things simpler.
The accepted answer is perfect. I had a related but slightly different situation where I had to fill in forward but only within groups. In case someone has the same need, know that fillna works on a DataFrameGroupBy object.
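A sketch of the within-group fill is shown below. The `group` and `value` columns are hypothetical, and it uses `ffill()` on the grouped column rather than `fillna` on the groupby object, since the groupby `fillna` is deprecated in recent pandas versions.

```python
import pandas as pd

# Hypothetical data: "group" is an illustrative key column.
df = pd.DataFrame({
    "group": ["x", "x", "y", "y"],
    "value": [1.0, None, None, 4.0],
})

# Forward fill inside each group. The first "y" row stays NaN because
# nothing precedes it within its own group.
df["filled"] = df.groupby("group")["value"].ffill()

print(df)
```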
One thing that I noticed when trying this solution is that if you have N/A at the very start or the end of the array, neither ffill nor bfill is enough on its own: ffill leaves leading NaNs unfilled, and bfill leaves trailing ones. You need both.
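The edge behaviour can be seen on a small Series with made-up values. Chaining the two fills, forward first and then backward, is one possible order; the reverse order would fill interior gaps from below instead.

```python
import pandas as pd

s = pd.Series([None, 1.0, None, 3.0, None])

forward_only = s.ffill()    # the leading NaN remains
backward_only = s.bfill()   # the trailing NaN remains
both = s.ffill().bfill()    # no NaN left

print(both)
```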