Check if string has date, any format

Posted 2019-01-04 11:05

How do I check if a string can be parsed to a date?

  • Jan 19, 1990
  • January 19, 1990
  • Jan 19,1990
  • 01/19/1990
  • 01/19/90
  • 1990
  • Jan 1990
  • January1990

These are all valid dates. If the missing space in the third item ("Jan 19,1990") and in the last item ("January1990") is a concern, that is easy to fix by automatically inserting a space between letters or punctuation and the digits that follow, if needed.
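A minimal sketch of that normalization, assuming the only gaps to repair are between a letter or comma and the digit that follows it. The helper name `add_missing_space` is hypothetical.

```python
import re

# Zero-width position between a letter or comma and a following digit.
_BOUNDARY = re.compile(r'(?<=[A-Za-z,])(?=\d)')

def add_missing_space(text):
    """Insert a single space wherever a digit directly follows a letter or comma."""
    return _BOUNDARY.sub(' ', text)

print(add_missing_space('January1990'))   # January 1990
print(add_missing_space('Jan 19,1990'))   # Jan 19, 1990
```

Strings that already have the space, or that contain only digits and slashes, pass through unchanged.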

But first, the basics:

I tried putting it in an if statement:

if datetime.strptime(item, '%Y') or datetime.strptime(item, '%b %d %y') or datetime.strptime(item, '%b %d %Y')  or datetime.strptime(item, '%B %d %y') or datetime.strptime(item, '%B %d %Y'):

That statement sits inside a try-except block, and unless the item matches the first format in the if statement, it keeps printing something like this:

16343 time data 'JUNE1890' does not match format '%Y'

To clarify, I don't actually need the value of the date. I just want to know whether the string is one. Ideally, it would look something like this:

if item is date:
    print date
else:
    print "Not a date"

Is there any way to do this?

2 Answers
孤傲高冷的网名
#2 · 2019-01-04 11:12

Have a look at the parse function in dateutil.parser. It can parse most common date strings into a datetime object.

If you simply want to know whether a particular string could represent a date, you could try the following function:

from dateutil.parser import parse

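def is_date(string):
    """Return True if dateutil can parse the string as a date."""
    try:
        parse(string)
        return True
    except (ValueError, OverflowError):
        # ValueError: unrecognised format; OverflowError: value too large
        return False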

Then you have:

>>> is_date("1990-12-1")
True
>>> is_date("xyznotadate")
False

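One note of caution: parse may recognise some strings as dates which you don't want to treat as dates. For example, "23, 4" was parsed as datetime.datetime(2023, 4, 16, 0, 0): the 23 is read as a two-digit year, the 4 as the month, and the missing day is filled in from the current date, so the exact result depends on when you run it. You might need additional checks if you want to catch these cases.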
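One possible additional check, sketched with the standard library only. The rule used here (the string must contain either a standalone four-digit year or a slash- or dash-separated numeric date) is an assumption chosen for illustration, and the names `looks_plausible` and `is_date_strict` are hypothetical, not part of dateutil. The parser-based check is passed in as an argument, so any function such as is_date can be plugged in.

```python
import re

# Assumed rule: a standalone 4-digit year, or a numeric date like 01/19/90.
_PLAUSIBLE = re.compile(
    r'(?<!\d)\d{4}(?!\d)'
    r'|\d{1,2}[/-]\d{1,2}[/-]\d{2}(?!\d)'
)

def looks_plausible(text):
    """Cheap pre-filter that rejects strings with no recognisable year."""
    return _PLAUSIBLE.search(text) is not None

def is_date_strict(text, checker):
    """Accept text only if it passes the pre-filter and the given checker."""
    return looks_plausible(text) and bool(checker(text))
```

With this rule, is_date_strict("23, 4", is_date) is False because the pre-filter rejects the string before the parser sees it. The trade-off is that year-less strings such as "Jan 19" are rejected too.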

女痞
#3 · 2019-01-04 11:23

If you want to parse those particular formats, you can just match against a list of formats:

txt='''\
Jan 19, 1990
January 19, 1990
Jan 19,1990
01/19/1990
01/19/90
1990
Jan 1990
January1990'''

import datetime as dt

# Candidate formats, tried in order; the first one that matches wins.
fmts = ('%Y', '%b %d, %Y', '%B %d, %Y', '%B %d %Y', '%m/%d/%Y', '%m/%d/%y', '%b %Y', '%B%Y', '%b %d,%Y')

parsed = []
for e in txt.splitlines():
    for fmt in fmts:
        try:
            t = dt.datetime.strptime(e, fmt)
            parsed.append((e, fmt, t))
            break
        except ValueError:
            # strptime raises when the format does not match; try the next one
            pass

# check that all the cases are handled
success = {t[0] for t in parsed}
for e in txt.splitlines():
    if e not in success:
        print(e)

for t in parsed:
    print('"{:20}" => "{:20}" => {}'.format(*t))

Prints:

"Jan 19, 1990        " => "%b %d, %Y           " => 1990-01-19 00:00:00
"January 19, 1990    " => "%B %d, %Y           " => 1990-01-19 00:00:00
"Jan 19,1990         " => "%b %d,%Y            " => 1990-01-19 00:00:00
"01/19/1990          " => "%m/%d/%Y            " => 1990-01-19 00:00:00
"01/19/90            " => "%m/%d/%y            " => 1990-01-19 00:00:00
"1990                " => "%Y                  " => 1990-01-01 00:00:00
"Jan 1990            " => "%b %Y               " => 1990-01-01 00:00:00
"January1990         " => "%B%Y                " => 1990-01-01 00:00:00
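Since the question only needs a yes/no answer, the same format list can be wrapped in a function that returns a boolean. This is a minimal sketch: the name `matches_known_format` is hypothetical, and it assumes English month names (the default C locale). Note that strptime raises ValueError on a mismatch rather than returning a falsy value, which is why a chain of strptime calls joined with `or` stops at the first format that does not match.

```python
from datetime import datetime

# Formats taken from the list above; extend as needed.
FORMATS = (
    '%Y', '%b %d, %Y', '%B %d, %Y', '%B %d %Y', '%b %d,%Y',
    '%m/%d/%Y', '%m/%d/%y', '%b %Y', '%B%Y',
)

def matches_known_format(text, formats=FORMATS):
    """Return True if text parses under at least one of the formats."""
    for fmt in formats:
        try:
            datetime.strptime(text, fmt)
        except ValueError:
            continue  # wrong format, try the next one
        return True
    return False

for item in ('Jan 19,1990', 'JUNE1890', 'xyznotadate'):
    print(item, matches_known_format(item))
```

Unlike a general-purpose parser, this accepts only the listed formats, so a string like "23, 4" is rejected.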