When slicing a dataframe using loc,
df.loc[start:end]
both start and end are included. Is there an easy way to exclude the end when using loc?
loc includes both the start and the end. One less-than-ideal workaround is to look up the index positions and slice with iloc instead (assuming the index has no duplicates):
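import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4]}, index=['a', 'b', 'c', 'd'])

# Position-based slice: the end position is excluded
df.iloc[df.index.get_loc('a'):df.index.get_loc('c')]
#    A
# a  1
# b  2

# Label-based slice: the end label is included
df.loc['a':'c']
#    A
# a  1
# b  2
# c  3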
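A possible variation on the position-based approach, as a rough sketch: Index.searchsorted also returns a position, and it does not require the label to exist. This assumes the index is sorted; the label 'bb' below is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

# searchsorted gives the position where a label would be inserted to keep
# the index sorted; side="left" places it before any equal label.
lo = df.index.searchsorted("a", side="left")
hi = df.index.searchsorted("c", side="left")
result = df.iloc[lo:hi]  # rows 'a' and 'b'; 'c' is excluded

# Assumption for illustration: 'bb' is not in the index, yet it still
# maps to a valid position (between 'b' and 'c').
hi_missing = df.index.searchsorted("bb", side="left")
result_missing = df.iloc[lo:hi_missing]
```

Unlike get_loc, this does not raise a KeyError for a missing label, but the positions are only meaningful on a sorted index.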
The easiest I can think of is df.loc[start:end].iloc[:-1], which chops off the last row.
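A small sketch of how this behaves, using a toy frame with a sorted string index (the label "bb" is a made-up label that is not in the index). It suggests the trick is only safe when end is actually present:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

# End label present: loc includes "c", and iloc[:-1] drops it again.
present = df.loc["a":"c"].iloc[:-1]

# End label absent: loc already stops at "b", so iloc[:-1]
# removes a row that should have been kept.
absent = df.loc["a":"bb"].iloc[:-1]
```

Here present holds rows a and b as intended, while absent holds only row a.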
None of the answers addresses the situation where end is not part of the index.
The more general solution is simply to compare the index to start and end; that way you can make either of them inclusive or exclusive.
df[(df.index >= start) & (df.index < end)]
For instance:
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame(
...     {
...         "x": np.arange(48),
...         "y": np.arange(48) * 2,
...     },
...     index=pd.date_range("2020-01-01 00:00:00", freq="1H", periods=48),
... )
>>> start = "2020-01-01 14:00"
>>> end = "2020-01-01 19:30" # this is not in the index
>>> df[(df.index >= start) & (df.index < end)]
                      x   y
2020-01-01 14:00:00  14  28
2020-01-01 15:00:00  15  30
2020-01-01 16:00:00  16  32
2020-01-01 17:00:00  17  34
2020-01-01 18:00:00  18  36
2020-01-01 19:00:00  19  38
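If this comes up often, the comparison could be wrapped in a small helper. The function below is only a sketch: slice_by_index and its include_start / include_end parameters are hypothetical names, not part of pandas.

```python
import pandas as pd


def slice_by_index(df, start, end, include_start=True, include_end=False):
    """Hypothetical helper: keep rows whose index lies between start and end."""
    # Choose strict or non-strict comparison for each bound.
    lower = df.index >= start if include_start else df.index > start
    upper = df.index <= end if include_end else df.index < end
    return df[lower & upper]


df = pd.DataFrame({"A": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

half_open = slice_by_index(df, "a", "c")                # rows a, b
closed = slice_by_index(df, "a", "c", include_end=True)  # rows a, b, c
```

Because it relies only on comparisons, it does not need start or end to be in the index, at the cost of scanning the whole index rather than using a sorted lookup.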