Parsing date and timestamps in Python with time.st

Posted 2020-07-10 10:17

Question:

My cloud server logs time in this format:

[17/Dec/2011:09:48:49 -0600]

To read it into Python variables, I can say:

>>> str = '17/Dec/2011:09:48:49 -0600'
>>> import time
>>> print(time.strptime(str, "%d/%b/%Y:%H:%M:%S -0600"))

Result:

time.struct_time(tm_year=2011, tm_mon=12, tm_mday=17, tm_hour=9, tm_min=48, tm_sec=49, tm_wday=5, tm_yday=351, tm_isdst=-1)

or I can try

>>> mytime = time.strptime(str, "%d/%b/%Y:%H:%M:%S -0600")
>>> print(mytime.tm_hour)

Result:

9

What does the -0600 do? I expected it to adjust the hour value in the datetime object. Is there a wildcard I can use instead of hard-coding the -0600?

Answer 1:

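The -0600 is the offset from Greenwich Mean Time (GMT). As another SO question already says, time.strptime cannot read timezone offsets, though datetime.strftime can generate them. (That describes Python 2; from Python 3.2 on, datetime.strptime accepts a %z directive for offsets like -0600.)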

As explained at the start of the datetime module's documentation, there are two ways of approaching "time" in Python: naive or aware. When all you care about is time inside your system, dealing with naive time/datetime objects is fine (in which case you can strip out the offset as Alan suggested). When you need to compare the values inside your system with the real world's notion of time, you have to start dealing with that offset.
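To make the naive/aware distinction concrete, here is a minimal sketch that assumes Python 3.2 or later (where datetime.strptime understands %z) and an English locale for the %b month name. The variable names are only for illustration.

```python
from datetime import datetime, timezone

stamp = '17/Dec/2011:09:48:49 -0600'

# Aware: %z consumes the offset and attaches it as tzinfo.
aware = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")

# Naive: drop the offset and keep only the wall-clock fields.
naive = datetime.strptime(stamp.split(' ')[0], "%d/%b/%Y:%H:%M:%S")

# The aware value can be converted to UTC; the naive one cannot,
# because it carries no information about where the clock was.
as_utc = aware.astimezone(timezone.utc)

print(aware)   # 2011-12-17 09:48:49-06:00
print(naive)   # 2011-12-17 09:48:49
print(as_utc)  # 2011-12-17 15:48:49+00:00
```

Note that the hour stays 9 in both parsed values. The offset is stored next to the time rather than applied to it, which is probably why the original expectation of an "adjusted" hour did not hold.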

The easy way to deal with this is just to use python-dateutil. It has a parse function that will do its best to fuzzily match the date string you pass in to multiple formats and return a workable datetime instance that represents its best guess as to what you meant.

>>> from dateutil.parser import parse
>>> parse('17/Dec/2011:09:48:49 -0600', fuzzy=True)
datetime.datetime(2011, 12, 17, 9, 48, 49, tzinfo=tzoffset(None, -21600))

Normally, having software give its "best guess" is a Bad Thing. In this case, it seems justified if your input formats are stable. Dealing with time in software development is hard; just go shopping.



Answer 2:

It's the offset from GMT. If you don't want it, just strip it off:

>>> import time
>>> line = '17/Dec/2011:09:48:49 -0600'
>>> line = line.split(' ')[0]
>>> print(time.strptime(line, "%d/%b/%Y:%H:%M:%S"))
time.struct_time(tm_year=2011, tm_mon=12, tm_mday=17, tm_hour=9, tm_min=48, tm_sec=49, tm_wday=5, tm_yday=351, tm_isdst=-1)
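Since the log in the question wraps the timestamp in square brackets, the same idea might be packaged roughly like this. The helper name parse_log_time is made up for this sketch, and it assumes the field always looks like '[DD/Mon/YYYY:HH:MM:SS +HHMM]'.

```python
import time

def parse_log_time(field):
    """Parse a bracketed log timestamp, ignoring the UTC offset.

    Hypothetical helper: assumes a field such as
    '[17/Dec/2011:09:48:49 -0600]'.
    """
    # Remove the surrounding brackets, then keep only the part
    # before the space (the offset comes after it).
    stamp = field.strip('[]').split(' ')[0]
    return time.strptime(stamp, "%d/%b/%Y:%H:%M:%S")

parsed = parse_log_time('[17/Dec/2011:09:48:49 -0600]')
print(parsed.tm_hour)  # 9
```

This keeps the local wall-clock time and throws the offset away, so it is only appropriate when every line in the log uses the same offset.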


Answer 3:

strptime simply matches the directives (%d, %b, ...) with the corresponding segments of the string and then converts each matched piece to a value (an integer for the numeric directives). Anything else in the format string is treated as literal text, so in your case the -0600 only makes it so that your format string matches the input string.
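A quick way to see that the -0600 in the format is only literal text: the same format rejects a timestamp with a different offset. This is a small sketch, and the mismatch_rejected flag exists only for the demonstration.

```python
import time

fmt = "%d/%b/%Y:%H:%M:%S -0600"

# Matches, because the input ends with the literal text ' -0600'.
parsed = time.strptime('17/Dec/2011:09:48:49 -0600', fmt)

# A different offset no longer matches the literal, so strptime
# raises ValueError instead of adjusting anything.
try:
    time.strptime('17/Dec/2011:09:48:49 -0500', fmt)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(parsed.tm_hour)     # 9
print(mismatch_rejected)  # True
```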

If you want to adjust the time by a specified offset, I would recommend using a datetime object.

>>> s = '17/Dec/2011:09:48:49 -0600'
>>> from datetime import datetime, timedelta
>>> mytime = datetime.strptime(s, "%d/%b/%Y:%H:%M:%S -0600")
>>> dt = timedelta(minutes=6*60)  # 6 hours
>>> mytime -= dt
>>> print(mytime)
2011-12-17 03:48:49
>>> print(mytime.hour)
3

Also note that since str is a builtin, it is generally not advisable to reassign it.
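If the offset varies between log lines, the hard-coded six hours could be replaced by reading the offset from the string itself. The following is only a sketch: parse_with_offset is a hypothetical helper, and it assumes the offset is always a sign followed by four digits (+HHMM or -HHMM). Unlike the snippet above, it computes UTC, which means subtracting the signed offset, so -0600 ends up adding six hours.

```python
from datetime import datetime, timedelta

def parse_with_offset(stamp):
    """Return (local, utc) naive datetimes for 'DD/Mon/YYYY:HH:MM:SS +HHMM'.

    Hypothetical helper; assumes the offset is a sign plus four digits.
    """
    text, offset = stamp.rsplit(' ', 1)
    local = datetime.strptime(text, "%d/%b/%Y:%H:%M:%S")
    sign = -1 if offset[0] == '-' else 1
    delta = sign * timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
    # local = utc + offset, therefore utc = local - offset
    return local, local - delta

local, utc = parse_with_offset('17/Dec/2011:09:48:49 -0600')
print(local)  # 2011-12-17 09:48:49
print(utc)    # 2011-12-17 15:48:49
```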