How Do I Parse a Date Time String That Includes Fr

Posted 2019-01-28 04:42

Question:

I have a date time string:

20:48:01.469 UTC MAR 31 2016

I would like to convert this string representation of time to a struct tm using strptime, but my format string isn't working.

Is there a format specifier for fractional seconds? Perhaps %S, %s, or something else?

Code snippet is below:

tm tmbuf;
const char *str = "20:48:01.469 UTC MAR 31 2016";
const char *fmt = "%H:%M:%s %Z %b %d %Y";
strptime(str,fmt,&tmbuf);

Answer 1:

Using this free, open source C++11/14 library, here is another way to deal with parsing fractional seconds:

#include "tz.h"
#include <iostream>
#include <sstream>

int main()
{
    using namespace date;
    using namespace std::chrono;
    std::istringstream str("20:48:01.469 UTC MAR 31 2016");
    sys_time<milliseconds> tp;
    parse(str, "%T %Z %b %d %Y", tp);
    std::cout << tp << '\n';
}

Output:

2016-03-31 20:48:01.469

I.e., with this tool %S and %T just work. The precision is controlled not with flags, but with the precision of the std::chrono::time_point.

If you want to find out what timezone abbreviation you parsed, that is also possible:

std::istringstream str("20:48:01.469 UTC MAR 31 2016");
sys_time<milliseconds> tp;
std::string abbrev;
parse(str, "%T %Z %b %d %Y", tp, abbrev);
std::cout << tp << ' ' << abbrev << '\n';

Output:

2016-03-31 20:48:01.469 UTC

This being said, this library is built on top of std::get_time and thus has the same portability problem that Jonathan's excellent (and upvoted) answer alludes to: Only libc++ currently parses month names in a case-insensitive manner. Hopefully that will change in the not-too-distant future.

libstdc++ bug report.

VSO#232129 bug report.

If you have to deal with time zones other than UTC, there is in general no sure-fire method, because at any one time more than one time zone can be using the same abbreviation, so the UTC offset can be ambiguous. However, here is a short article on how to use this library to narrow down an abbreviation to a list of candidate time zones, from which you might choose a unique time zone with some ad hoc logic.
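To make the ambiguity concrete, here is a minimal sketch that maps an abbreviation to its candidate UTC offsets. The function name candidate_offsets and the hard-coded table are assumptions made for illustration; they are not part of the library discussed above, and the table covers only a few well-known abbreviations.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <vector>

// Hypothetical lookup table for illustration only; a real program would take
// its candidates from the IANA time zone database rather than hard-coding them.
std::vector<std::chrono::minutes> candidate_offsets(const std::string& abbrev)
{
    using std::chrono::minutes;
    static const std::map<std::string, std::vector<minutes>> table{
        {"UTC", {minutes{0}}},
        // Central (US) -06:00, Cuba -05:00, China +08:00
        {"CST", {minutes{-360}, minutes{-300}, minutes{480}}},
        // Irish +01:00, Israel +02:00, India +05:30
        {"IST", {minutes{60}, minutes{120}, minutes{330}}},
    };
    const auto it = table.find(abbrev);
    // An unknown abbreviation yields an empty candidate list.
    return it == table.end() ? std::vector<minutes>{} : it->second;
}
```

Calling candidate_offsets("CST") returns three offsets, so the caller still needs some extra context (for example, where the data came from) to pick one of them.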



Answer 2:

Note that the tm member denoting the smallest time increment is tm_sec, an int defined only over the range:

Seconds after the minute [0,60] since C++11

So you can't store a fraction of a second in a tm; you'll just need to discard the digits following the decimal point.
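If the fraction should be kept rather than thrown away, one option is to read it into a separate variable next to the tm. The following is a rough sketch under that assumption; parse_time_with_fraction is a hypothetical helper, not a standard or library function, and it relies only on std::get_time handling the numeric %H, %M and %S fields.

```cpp
#include <cctype>
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of any library): parses "HH:MM:SS[.fff...]"
// from the front of `in` into `out` and reports the fractional part, in
// milliseconds, through `millis`. Returns false if the time is malformed.
bool parse_time_with_fraction(std::istream& in, std::tm& out, int& millis)
{
    millis = 0;
    in >> std::get_time(&out, "%H:%M:%S");
    if (in.fail()) return false;

    if (in.peek() == '.') {
        in.get();                        // consume the decimal point
        int scale = 100;                 // the first digit is worth 100 ms
        while (std::isdigit(in.peek())) {
            const int digit = in.get() - '0';
            millis += digit * scale;     // digits past the third add nothing
            scale /= 10;
        }
    }
    return true;
}
```

For the input in the question this leaves the stream positioned at " UTC MAR 31 2016", so the rest of the string can be handled by whichever of the approaches in this thread you prefer.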

As suggested by Karsten Koop, you can just read the year twice; the second %Y will stomp the first:

auto fmt = "%H:%M:%S.%Y %Z %b %d %Y";



That said, I'd recommend against using strptime, since it is a POSIX function; a standard function like get_time would be preferable. This has one minor drawback: get_time has no knowledge of time zones, but then neither does tm, with the exception of tm_isdst, which is the:

Daylight Saving Time flag. The value is positive if DST is in effect, zero if not and negative if no information is available

So you may have to assign tm_isdst independently if you pursue something like this:

tm tmbuf;
stringstream str("20:48:01.469 UTC MAR 31 2016");

str >> get_time(&tmbuf, "%H:%M:%S.%Y UTC %b %d %Y");



My get_time answer was a little hypocritical, because while I speak of the importance of standardization, I could only get it to run on libc++. As such, I thought I'd post a more universal solution, which will also discard the time zone, so again you'll need to set tm_isdst independently:

tm tmbuf{};
stringstream str("20:48:01.469 UTC MAR 31 2016");
string tm_mon;

str >> get_time(&tmbuf, "%T");

str.ignore(std::numeric_limits<std::streamsize>::max(), 'C');

str >> tm_mon >> get_time(&tmbuf, "%d %Y");

for (const auto& i : { "JAN"s, "FEB"s, "MAR"s, "APR"s, "MAY"s, "JUN"s, "JUL"s, "AUG"s, "SEP"s, "OCT"s, "NOV"s, "DEC"s }) {
    if (equal(cbegin(tm_mon), cend(tm_mon), cbegin(i), cend(i), [](const unsigned char a, const unsigned char b) { return toupper(a) == b; })) break;
    ++tmbuf.tm_mon;
}


This has 2 key dependencies:

  1. That the timezone always ends in the character 'C' (it must be uppercase)
  2. That the month abbreviations fed in match one of those in my initializer_list
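The second dependency matters because, as far as I can tell, an unrecognized month name lets the loop above run to completion and leaves tm_mon at 12, which is outside the valid [0,11] range. A small sketch of a lookup that reports the failure instead is shown here; month_index is a hypothetical helper name, not something from the standard library.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// Hypothetical helper: returns the zero-based month index expected by
// tm::tm_mon, or -1 when `name` is not a three-letter English abbreviation.
int month_index(const std::string& name)
{
    static const std::array<const char*, 12> months{
        "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
        "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"};

    // Upper-case a copy so the comparison is case-insensitive.
    std::string upper(name);
    std::transform(upper.begin(), upper.end(), upper.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });

    const auto it = std::find(months.begin(), months.end(), upper);
    return it == months.end() ? -1 : static_cast<int>(it - months.begin());
}
```

With this, the caller can check for -1 before assigning the result to tmbuf.tm_mon, rather than discovering an out-of-range month later.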