ISO 8601 Time Interval Parsing in Java

Posted 2020-01-30 08:55

Question:

ISO 8601 defines a syntax for representing a time interval.

There are four ways to express a time interval:

  • Start and end, such as "2007-03-01T13:00:00Z/2008-05-11T15:30:00Z"
  • Start and duration, such as "2007-03-01T13:00:00Z/P1Y2M10DT2H30M"
  • Duration and end, such as "P1Y2M10DT2H30M/2008-05-11T15:30:00Z"
  • Duration only, such as "P1Y2M10DT2H30M", with additional context information

If any elements are missing from the end value, they are assumed to be the same as for the start value including the time zone. This feature of the standard allows for concise representations of time intervals. For example, the date of a two-hour meeting including the start and finish times could be simply shown as "2007-12-14T13:30/15:30", where "/15:30" implies "/2007-12-14T15:30" (the same date as the start), or the beginning and end dates of a monthly billing period as "2008-02-15/03-14", where "/03-14" implies "/2008-03-14" (the same year as the start).
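To make the rule concrete, here is a minimal sketch of how an abbreviated end value could be completed from the start value before the string is handed to a parser. The class and method names (IntervalEndExpander, expandEnd, offsetIndex) are made up for illustration. The sketch assumes the extended format (hyphens and colons), a "start/end" string without a repetition prefix, and an end value that omits only leading date fields, the whole date, or the offset.

```java
final class IntervalEndExpander {

    /** Returns the interval with the end component completed from the start component. */
    static String expandEnd(String interval) {
        int slash = interval.indexOf('/');
        if (slash < 0) {
            return interval; // duration-only form, nothing to complete
        }
        String start = interval.substring(0, slash);
        String end = interval.substring(slash + 1);
        if (start.startsWith("P") || end.startsWith("P")) {
            return interval; // start/duration or duration/end, nothing to complete
        }

        int startT = start.indexOf('T');
        String startDate = startT < 0 ? start : start.substring(0, startT);
        String startTime = startT < 0 ? "" : start.substring(startT + 1);

        int endT = end.indexOf('T');
        String endDate;
        String endTime;
        if (endT >= 0) {                    // date and time given
            endDate = end.substring(0, endT);
            endTime = end.substring(endT + 1);
        } else if (end.indexOf(':') >= 0) { // time only
            endDate = "";
            endTime = end;
        } else {                            // date only
            endDate = end;
            endTime = "";
        }

        // Right-align the end date under the start date and copy the missing leading fields.
        if (endDate.length() < startDate.length()) {
            endDate = startDate.substring(0, startDate.length() - endDate.length()) + endDate;
        }
        if (endTime.isEmpty()) {
            return start + "/" + endDate;
        }
        // Inherit the offset of the start if the end does not carry its own.
        if (offsetIndex(endTime) < 0 && offsetIndex(startTime) >= 0) {
            endTime += startTime.substring(offsetIndex(startTime));
        }
        return start + "/" + endDate + "T" + endTime;
    }

    /** Index of the offset designator ('Z', '+' or '-') in a time string, or -1. */
    static int offsetIndex(String time) {
        for (int i = 0; i < time.length(); i++) {
            char c = time.charAt(i);
            if (c == 'Z' || c == '+' || c == '-') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(expandEnd("2007-12-14T13:30/15:30"));
        System.out.println(expandEnd("2008-02-15/03-14"));
    }
}
```

With the two examples above this yields "2007-12-14T13:30/2007-12-14T15:30" and "2008-02-15/2008-03-14". Basic format (no separators) and ends that omit only part of the time are not covered.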

In addition, repeating intervals are formed by adding "R[n]/" to the beginning of an interval expression, where R is used as the letter itself and [n] is replaced by the number of repetitions. Leaving out the value for [n] means an unbounded number of repetitions. So, to repeat the interval of "P1Y2M10DT2H30M" five times starting at "2008-03-01T13:00:00Z", use "R5/2008-03-01T13:00:00Z/P1Y2M10DT2H30M".
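The repetition prefix itself is easy to separate from the rest of the expression. The following is only a sketch, and the class RepeatingInterval is a hypothetical name, not part of any library. It splits "R[n]/" from the interval and represents an unbounded repetition as an empty OptionalInt.

```java
import java.util.OptionalInt;

final class RepeatingInterval {

    /** Number of repetitions; empty means unbounded. */
    final OptionalInt repetitions;

    /** The interval expression that follows the "R[n]/" prefix. */
    final String interval;

    private RepeatingInterval(OptionalInt repetitions, String interval) {
        this.repetitions = repetitions;
        this.interval = interval;
    }

    /** Splits "R[n]/interval" into the repetition count and the interval expression. */
    static RepeatingInterval parse(String text) {
        int slash = text.indexOf('/');
        if (!text.startsWith("R") || slash < 0) {
            throw new IllegalArgumentException("Not a repeating interval: " + text);
        }
        String count = text.substring(1, slash);
        // Integer.parseInt rejects non-numeric counts with a NumberFormatException.
        OptionalInt repetitions = count.isEmpty()
                ? OptionalInt.empty()
                : OptionalInt.of(Integer.parseInt(count));
        return new RepeatingInterval(repetitions, text.substring(slash + 1));
    }

    public static void main(String[] args) {
        RepeatingInterval r = parse("R5/2008-03-01T13:00:00Z/P1Y2M10DT2H30M");
        System.out.println(r.repetitions + " x " + r.interval);
    }
}
```

The remaining interval string can then be passed to whatever interval parser is in use.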

I am looking for a good Java parser (if possible compatible with the Joda-Time library) to parse this syntax. Any pointers to a good library?

Answer 1:

java.time

The java.time framework built into Java 8 and later has a Duration.parse method for parsing an ISO 8601 formatted duration:

java.time.Duration d = java.time.Duration.parse("PT1H2M34S");
System.out.println("Duration in seconds: " + d.get(java.time.temporal.ChronoUnit.SECONDS));

Prints Duration in seconds: 3754
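One caveat: java.time.Duration.parse only accepts day and time components (PnDTnHnMn.nS), so a string with years or months, such as the "P1Y2M10DT2H30M" from the question, is rejected. A possible workaround, sketched here under the assumption that the duration is positive and in the designator format, is to split the string at the "T" and parse the date part with Period.parse and the time part with Duration.parse. The class name IsoDurationSplit and the method plus are made up for this example.

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.Period;

final class IsoDurationSplit {

    /** Adds an ISO 8601 duration such as "P1Y2M10DT2H30M" to the given start. */
    static OffsetDateTime plus(OffsetDateTime start, String isoDuration) {
        int t = isoDuration.indexOf('T');
        String datePart = t < 0 ? isoDuration : isoDuration.substring(0, t); // e.g. "P1Y2M10D"
        String timePart = t < 0 ? "" : isoDuration.substring(t);             // e.g. "T2H30M"

        // Period.parse rejects a bare "P", so treat it as an empty period.
        Period period = datePart.equals("P") ? Period.ZERO : Period.parse(datePart);
        Duration duration = timePart.isEmpty() ? Duration.ZERO : Duration.parse("P" + timePart);

        // Calendar fields first, then the exact time amount.
        return start.plus(period).plus(duration);
    }

    public static void main(String[] args) {
        OffsetDateTime start = OffsetDateTime.parse("2007-03-01T13:00:00Z");
        System.out.println(plus(start, "P1Y2M10DT2H30M"));
    }
}
```

For the "start and duration" example from the question this gives an end of 2008-05-11T15:30Z.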



Answer 2:

For anyone on a project that is restricted from using third-party libraries (for licensing reasons or otherwise), Java itself provides at least a portion of this capability. The javax.xml.datatype package, available since Java 5, offers the DatatypeFactory.newDuration(String) method and the Duration class. DatatypeFactory.newDuration(String) parses a string in "PnYnMnDTnHnMnS" format. These classes are intended for XML manipulation, but since XML Schema uses ISO 8601 duration notation, they also serve as convenient duration parsing utilities.

Example:

import javax.xml.datatype.*;

// newInstance() declares the checked DatatypeConfigurationException
Duration dur = DatatypeFactory.newInstance().newDuration("PT5H12M36S");
int hours = dur.getHours(); // returns 5

I have only used the fourth format you list (duration only), so I can't vouch for whether it parses the others.
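As far as I can tell, newDuration(String) only understands the duration itself, not an interval containing "/". The "start and duration" form can still be handled by parsing the two halves separately and applying the duration with Duration.addTo(Calendar). The following sketch assumes the start is a full date-time with offset; the class XmlDurationExample and the method endOf are hypothetical names.

```java
import java.time.ZonedDateTime;
import java.util.GregorianCalendar;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.Duration;

final class XmlDurationExample {

    /** Adds an ISO 8601 duration to an ISO 8601 start and returns the end as an ISO instant. */
    static String endOf(String start, String isoDuration) {
        try {
            Duration duration = DatatypeFactory.newInstance().newDuration(isoDuration);
            // Convert the start into a Calendar, because addTo works on Calendar.
            GregorianCalendar calendar = GregorianCalendar.from(ZonedDateTime.parse(start));
            duration.addTo(calendar);
            return calendar.toInstant().toString();
        } catch (DatatypeConfigurationException e) {
            // No DatatypeFactory implementation available in this runtime.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(endOf("2007-03-01T13:00:00Z", "P1Y2M10DT2H30M"));
    }
}
```

Unlike java.time.Duration, this Duration class accepts year and month fields, so the full "P1Y2M10DT2H30M" from the question can be parsed in one go.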



Answer 3:

I take it you have already tried Joda-Time? Feeding the example strings from your question through Interval.parse(String) reveals that it can handle "start and end", "start and duration" and "duration and end", but neither implied fields nor repetition.

2007-03-01T13:00:00Z/2008-05-11T15:30:00Z => from 2007-03-01T13:00:00.000Z to 2008-05-11T15:30:00.000Z
2007-03-01T13:00:00Z/P1Y2M10DT2H30M       => from 2007-03-01T13:00:00.000Z to 2008-05-11T15:30:00.000Z
P1Y2M10DT2H30M/2008-05-11T15:30:00Z       => from 2007-03-01T13:00:00.000Z to 2008-05-11T15:30:00.000Z
2007-12-14T13:30/15:30                    => java.lang.IllegalArgumentException: Invalid format: "15:30" is malformed at ":30"
R5/2008-03-01T13:00:00Z/P1Y2M10DT2H30M    => java.lang.IllegalArgumentException: Invalid format: "R5"

The only other comprehensive date/time library that I know of is JSR-310 (which became java.time in Java 8), and it does not appear to handle intervals like these.

At this point, building your own improvements on top of Joda-Time is probably your best choice, sorry. Are there any specific ISO interval formats that you need to handle beyond those already supported by Joda-Time?



Answer 4:

The only library capable of modeling all the interval parsing features you want is actually my library Time4J (range module). Examples:

// case 1 (start/end)
System.out.println(MomentInterval.parseISO("2012-01-01T14:15Z/2014-06-20T16:00Z"));
// output: [2012-01-01T14:15:00Z/2014-06-20T16:00:00Z)

// case 1 (with some elements missing at end component and different offset)
System.out.println(MomentInterval.parseISO("2012-01-01T14:15Z/08-11T16:00+00:01"));
// output: [2012-01-01T14:15:00Z/2012-08-11T15:59:00Z)

// case 1 (with missing date and offset at end component)
System.out.println(MomentInterval.parseISO("2012-01-01T14:15Z/16:00"));
// output: [2012-01-01T14:15:00Z/2012-01-01T16:00:00Z)

// case 2 (start/duration)
System.out.println(MomentInterval.parseISO("2012-01-01T14:15Z/P2DT1H45M"));
// output: [2012-01-01T14:15:00Z/2012-01-03T16:00:00Z)

// case 3 (duration/end)
System.out.println(MomentInterval.parseISO("P2DT1H45M/2012-01-01T14:15Z"));
// output: [2011-12-30T12:30:00Z/2012-01-01T14:15:00Z)

// case 4 (duration only, in standard ISO-format); Duration here is net.time4j.Duration
Duration<IsoUnit> isoDuration = Duration.parsePeriod("P2DT1H45M");

// case 4 (duration only, in alternative representation)
// uses a different variable name so that both declarations compile in one scope
Duration<IsoUnit> altDuration = Duration.parsePeriod("P0000-01-01T15:00");
System.out.println(altDuration); // output: P1M1DT15H

Some remarks:

  • Other interval classes exist with similar parsing capabilities, for example DateInterval or TimestampInterval in the package net.time4j.range.

  • For handling durations only (which can span both calendar and clock units), see also the javadoc. There are also formatting features, see the nested class Duration.Formatter or the localized version net.time4j.PrettyTime (currently in 86 languages).

  • Interoperability is offered with Java 8 (the java.time package) but not with Joda-Time. For example, the start or end component of a MomentInterval can easily be queried with getStartAsInstant() or getEndAsInstant().

Repeating intervals are supported by the class IsoRecurrence. Example:

IsoRecurrence<MomentInterval> ir =
    IsoRecurrence.parseMomentIntervals("R5/2008-03-01T13:00:00Z/P1Y2M10DT2H30M");
ir.intervalStream().forEach(System.out::println);

Output:

[2008-03-01T13:00:00Z/2009-05-11T15:30:00Z)
[2009-05-11T15:30:00Z/2010-07-21T18:00:00Z)
[2010-07-21T18:00:00Z/2011-10-01T20:30:00Z)
[2011-10-01T20:30:00Z/2012-12-11T23:00:00Z)
[2012-12-11T23:00:00Z/2014-02-22T01:30:00Z)