get UTC timestamp in python with datetime

2020-01-30 03:45发布

问题:

Is there a way to get the UTC timestamp by specifying the date? What I would expect:

datetime(2008, 1, 1, 0, 0, 0, 0)

should result in

 1199145600

Creating a naive datetime object means that there is no time zone information. If I look at the documentation for datetime.utcfromtimestamp, creating a UTC timestamp means leaving out the time zone information. So I would guess, that creating a naive datetime object (like I did) would result in a UTC timestamp. However:

then = datetime(2008, 1, 1, 0, 0, 0, 0)
datetime.utcfromtimestamp(float(then.strftime('%s')))

results in

2007-12-31 23:00:00

Is there still any hidden time zone information in the datetime object? What am I doing wrong?

回答1:

Naïve datetime versus aware datetime

Default datetime objects are said to be "naïve": they keep time information without the time zone information. Think about naïve datetime as a relative number (ie: +4) without a clear origin (in fact your origin will be common throughout your system boundary).

In contrast, think about aware datetime as absolute numbers (ie: 8) with a common origin for the whole world.

Without timezone information you cannot convert the "naive" datetime towards any non-naive time representation (where does +4 targets if we don't know from where to start ?). This is why you can't have a datetime.datetime.toutctimestamp() method. (cf: http://bugs.python.org/issue1457227)

To check if your datetime dt is naïve, check dt.tzinfo, if None, then it's naïve:

datetime.now()        ## DANGER: returns naïve datetime pointing on local time
datetime(1970, 1, 1)  ## returns naïve datetime pointing on user given time

I have naïve datetimes, what can I do ?

You must make an assumption depending on your particular context: The question you must ask yourself is: was your datetime on UTC ? or was it local time ?

  • If you were using UTC (you are out of trouble):

    import calendar
    
    def dt2ts(dt):
        """Converts a datetime object to UTC timestamp
    
        naive datetime will be considered UTC.
    
        """
    
        return calendar.timegm(dt.utctimetuple())
    
  • If you were NOT using UTC, welcome to hell.

    You have to make your datetime non-naïve prior to using the former function, by giving them back their intended timezone.

    You'll need the name of the timezone and the information about if DST was in effect when producing the target naïve datetime (the last info about DST is required for cornercases):

    import pytz     ## pip install pytz
    
    mytz = pytz.timezone('Europe/Amsterdam')             ## Set your timezone
    
    dt = mytz.normalize(mytz.localize(dt, is_dst=True))  ## Set is_dst accordingly
    

    Consequences of not providing is_dst:

    Not using is_dst will generate incorrect time (and UTC timestamp) if target datetime was produced while a backward DST was put in place (for instance changing DST time by removing one hour).

    Providing incorrect is_dst will of course generate incorrect time (and UTC timestamp) only on DST overlap or holes. And, when providing also incorrect time, occuring in "holes" (time that never existed due to forward shifting DST), is_dst will give an interpretation of how to consider this bogus time, and this is the only case where .normalize(..) will actually do something here, as it'll then translate it as an actual valid time (changing the datetime AND the DST object if required). Note that .normalize() is not required for having a correct UTC timestamp at the end, but is probably recommended if you dislike the idea of having bogus times in your variables, especially if you re-use this variable elsewhere.

    and AVOID USING THE FOLLOWING: (cf: Datetime Timezone conversion using pytz)

    dt = dt.replace(tzinfo=timezone('Europe/Amsterdam'))  ## BAD !!
    

    Why? because .replace() replaces blindly the tzinfo without taking into account the target time and will choose a bad DST object. Whereas .localize() uses the target time and your is_dst hint to select the right DST object.

OLD incorrect answer (thanks @J.F.Sebastien for bringing this up):

Hopefully, it is quite easy to guess the timezone (your local origin) when you create your naive datetime object as it is related to the system configuration that you would hopefully NOT change between the naive datetime object creation and the moment when you want to get the UTC timestamp. This trick can be used to give an imperfect question.

By using time.mktime we can create an utc_mktime:

def utc_mktime(utc_tuple):
    """Returns number of seconds elapsed since epoch

    Note that no timezone are taken into consideration.

    utc tuple must be: (year, month, day, hour, minute, second)

    """

    if len(utc_tuple) == 6:
        utc_tuple += (0, 0, 0)
    return time.mktime(utc_tuple) - time.mktime((1970, 1, 1, 0, 0, 0, 0, 0, 0))

def datetime_to_timestamp(dt):
    """Converts a datetime object to UTC timestamp"""

    return int(utc_mktime(dt.timetuple()))

You must make sure that your datetime object is created on the same timezone than the one that has created your datetime.

This last solution is incorrect because it makes the assumption that the UTC offset from now is the same than the UTC offset from EPOCH. Which is not the case for a lot of timezones (in specific moment of the year for the Daylight Saving Time (DST) offsets).



回答2:

Another possibility is:

d = datetime.datetime.utcnow()
epoch = datetime.datetime(1970,1,1)
t = (d - epoch).total_seconds()

This works as both "d" and "epoch" are naive datetimes, making the "-" operator valid, and returning an interval. total_seconds() turns the interval into seconds. Note that total_seconds() returns a float, even d.microsecond == 0



回答3:

Also note the calendar.timegm() function as described by this blog entry:

import calendar
calendar.timegm(utc_timetuple)

The output should agree with the solution of vaab.



回答4:

If input datetime object is in UTC:

>>> dt = datetime(2008, 1, 1, 0, 0, 0, 0)
>>> timestamp = (dt - datetime(1970, 1, 1)).total_seconds()
1199145600.0

Note: it returns float i.e., microseconds are represented as fractions of a second.

If input date object is in UTC:

>>> from datetime import date
>>> utc_date = date(2008, 1, 1)
>>> timestamp = (utc_date.toordinal() - date(1970, 1, 1).toordinal()) * 24*60*60
1199145600

See more details at Converting datetime.date to UTC timestamp in Python.



回答5:

I feel like the main answer is still not so clear, and it's worth taking the time to understand time and timezones.

The most important thing to understand when dealing with time is that time is relative!

  • 2017-08-30 13:23:00: (a naive datetime), represents a local time somewhere in the world, but note that 2017-08-30 13:23:00 in London is NOT THE SAME TIME as 2017-08-30 13:23:00 in San Francisco.

Because the same time string can be interpreted as different points-in-time depending on where you are in the world, there is a need for an absolute notion of time.

A UTC timestamp is a number in seconds (or milliseconds) from Epoch (defined as 1 January 1970 00:00:00 at GMT timezone +00:00 offset).

Epoch is anchored on the GMT timezone and therefore is an absolute point in time. A UTC timestamp being an offset from an absolute time therefore defines an absolute point in time.

This makes it possible to order events in time.

Without timezone information, time is relative, and cannot be converted to an absolute notion of time without providing some indication of what timezone the naive datetime should be anchored to.

What are the types of time used in computer system?

  • naive datetime: usually for display, in local time (i.e. in the browser) where the OS can provide timezone information to the program.

  • UTC timestamps: A UTC timestamp is an absolute point in time, as mentioned above, but it is anchored in a given timezone, so a UTC timestamp can be converted to a datetime in any timezone, however it does not contain timezone information. What does that mean? That means that 1504119325 corresponds to 2017-08-30T18:55:24Z, or 2017-08-30T17:55:24-0100 or also 2017-08-30T10:55:24-0800. It doesn't tell you where the datetime recorded is from. It's usually used on the server side to record events (logs, etc...) or used to convert a timezone aware datetime to an absolute point in time and compute time differences.

  • ISO-8601 datetime string: The ISO-8601 is a standardized format to record datetime with timezone. (It's in fact several formats, read on here: https://en.wikipedia.org/wiki/ISO_8601) It is used to communicate timezone aware datetime information in a serializable manner between systems.

When to use which? or rather when do you need to care about timezones?

  • If you need in any way to care about time-of-day, you need timezone information. A calendar or alarm needs time-of-day to set a meeting at the correct time of the day for any user in the world. If this data is saved on a server, the server needs to know what timezone the datetime corresponds to.

  • To compute time differences between events coming from different places in the world, UTC timestamp is enough, but you lose the ability to analyze at what time of day events occured (ie. for web analytics, you may want to know when users come to your site in their local time: do you see more users in the morning or the evening? You can't figure that out without time of day information.

Timezone offset in a date string:

Another point that is important, is that timezone offset in a date string is not fixed. That means that because 2017-08-30T10:55:24-0800 says the offset -0800 or 8 hours back, doesn't mean that it will always be!

In the summer it may well be in daylight saving time, and it would be -0700

What that means is that timezone offset (+0100) is not the same as timezone name (Europe/France) or even timezone designation (CET)

America/Los_Angeles timezone is a place in the world, but it turns into PST (Pacific Standard Time) timezone offset notation in the winter, and PDT (Pacific Daylight Time) in the summer.

So, on top of getting the timezone offset from the datestring, you should also get the timezone name to be accurate.

Most packages will be able to convert numeric offsets from daylight saving time to standard time on their own, but that is not necessarily trivial with just offset. For example WAT timezone designation in West Africa, is UTC+0100 just like CET timezone in France, but France observes daylight saving time, while West Africa does not (because they're close to the equator)

So, in short, it's complicated. VERY complicated, and that's why you should not do this yourself, but trust a package that does it for you, and KEEP IT UP TO DATE!



回答6:

There is indeed a problem with using utcfromtimestamp and specifying time zones. A nice example/explanation is available on the following question:

How to specify time zone (UTC) when converting to Unix time? (Python)



回答7:

The accepted answer seems not work for me. My solution:

import time
utc_0 = int(time.mktime(datetime(1970, 01, 01).timetuple()))
def datetime2ts(dt):
    """Converts a datetime object to UTC timestamp"""
    return int(time.mktime(dt.utctimetuple())) - utc_0


回答8:

A simple solution without using external modules:

from datetime import datetime, timezone

dt = datetime(2008, 1, 1, 0, 0, 0, 0)
int(dt.replace(tzinfo=timezone.utc).timestamp())


回答9:

Simplest way:

>>> from datetime import datetime
>>> dt = datetime(2008, 1, 1, 0, 0, 0, 0)
>>> dt.strftime("%s")
'1199163600'

Edit: @Daniel is correct, this would convert it to the machine's timezone. Here is a revised answer:

>>> from datetime import datetime, timezone
>>> epoch = datetime(1970, 1, 1, 0, 0, 0, 0, timezone.utc)
>>> dt = datetime(2008, 1, 1, 0, 0, 0, 0, timezone.utc)
>>> int((dt-epoch).total_seconds())
'1199145600'

In fact, its not even necessary to specify timezone.utc, because the time difference is the same so long as both datetime have the same timezone (or no timezone).

>>> from datetime import datetime
>>> epoch = datetime(1970, 1, 1, 0, 0, 0, 0)
>>> dt = datetime(2008, 1, 1, 0, 0, 0, 0)
>>> int((dt-epoch).total_seconds())
1199145600