I have an Excel spreadsheet. I am trying to capture a line from the Excel sheet that contains a date, then parse the date out with datetime.strptime().
Here is the bit of the Excel sheet I'm working with:
and my relevant code:
pattern = re.compile(r'Listing(.+)', re.IGNORECASE)
a = pattern.findall(str(df))
print("a:", a)
new_a = str(a)
datetime_object = datetime.strptime(new_a, '%b %w %Y')
print("date:", datetime_object)
So I capture everything that follows LISTING and produce:
a: [' JUN 11 2013 Unnamed: 1 \\']
Then I try to extract the Jun, 11, and 2013, but I fail with:
ValueError: time data "[' JUN 11 2013 Unnamed: 1 \\\\']" does not match format '%b %w %Y'
I am fairly sure this is a simple fix, but being a beginner I can't see how exactly to fix it. Should I alter my RegEx to capture less? Or should I fix the arguments that datetime.strptime() is taking in?
The arguments seem to be right when looking at the documentation: https://docs.python.org/3.5/library/datetime.html
Thanks for any help.
You need to modify the regex you're using to get the date from the Excel file.
pattern = re.compile(r'Listing ([A-Z]+ \d{1,2} \d{4})', re.IGNORECASE)
[A-Z]+ means "one or more capital letters" (with re.IGNORECASE it also matches lowercase letters), \d{1,2} means "one or two digits", and \d{4} means "four digits".
Furthermore, the date format you're using is incorrect: %w means weekday (numbers from 0 to 6 representing weekdays from Sunday to Saturday), while you should use %d, which matches the day of the month. So it should look like this in the end:
datetime_object = datetime.strptime(a[0], '%b %d %Y')  # a[0], not str(a): findall() returns a list
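Putting the pieces together, here is a minimal sketch of the whole flow. The text variable is a hypothetical stand-in for str(df), since the actual spreadsheet isn't available here, and it assumes the date is separated from "Listing" by single spaces and that month abbreviations are parsed in an English locale. It uses pattern.search() and group(1) so that strptime() receives the captured string itself rather than the string form of a list (which would include the brackets and quotes seen in the error message).

```python
import re
from datetime import datetime

# Hypothetical stand-in for str(df); the real text would come from the spreadsheet.
text = "LISTING JUN 11 2013 Unnamed: 1 \\"

# Capture only the "MON DD YYYY" part that follows "Listing".
pattern = re.compile(r'Listing ([A-Z]+ \d{1,2} \d{4})', re.IGNORECASE)

match = pattern.search(text)
if match is None:
    raise ValueError("no listing date found in the text")

# group(1) is the captured date as a plain string, e.g. "JUN 11 2013".
date_text = match.group(1)

# %b = abbreviated month name, %d = day of the month, %Y = four-digit year.
datetime_object = datetime.strptime(date_text, '%b %d %Y')
print("date:", datetime_object)
```

If you prefer to keep findall(), pass a[0] to strptime() instead of str(a).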