I need to parse (German) dates that come in the following form:
10. Jan. 18:14
8. Feb. 19:02
1. Mär. 19:40
4. Apr. 18:55
2. Mai 21:55
5. Juni 08:25
5. Juli 20:09
1. Aug. 13:42
[...]
As you can see, the month names are abbreviated if the month has more than 4 characters. Even weirder (don't ask me why), March is shortened to Mär. although the full name is März. How can I parse this with java.time?
(The dates are formatted based on the localization of the android device that creates the list of dates. However, I'm not parsing it on Android)
My approach was to create a DateTimeFormatter like this:
DateTimeFormatter.ofPattern("d. MMMM HH:mm").withLocale(Locale.GERMAN);
// or
DateTimeFormatter.ofPattern("d. MMMMM HH:mm").withLocale(Locale.GERMAN);
But neither the MMMM nor the MMMMM pattern fits the shortened dates. I can, of course, use the pattern d. MMM. HH:mm to match the shortened months, but then I can't match the 3- and 4-character months. I am aware that I can have two formatters (MMM. and MMMMM), but I would rather have a solution with only one formatter and possibly a custom locale or something similar.
The answer to the problem is the DateTimeFormatterBuilder class and its appendText(TemporalField, Map) method. It allows any text to be associated with a value when formatting or parsing, which solves the problem effectively and elegantly:
Map<Long, String> monthNameMap = new HashMap<>();
monthNameMap.put(1L, "Jan.");
monthNameMap.put(2L, "Feb.");
monthNameMap.put(3L, "Mär.");
// ... add the remaining months in the same way
DateTimeFormatter fmt = new DateTimeFormatterBuilder()
.appendPattern("d. ")
.appendText(ChronoField.MONTH_OF_YEAR, monthNameMap)
.appendPattern(" HH:mm")
.parseDefaulting(ChronoField.YEAR, 2016)
.toFormatter();
System.out.println(LocalDateTime.parse("10. Jan. 18:14", fmt));
System.out.println(LocalDateTime.parse("8. Feb. 19:02", fmt));
Some notes:
- The monthNameMap must be populated with all 12 months.
- The formatter should normally be assigned to a static final constant, rather than being created every time.
- The parseDefaulting(YEAR, 2016) call has been added so that LocalDateTime.parse(String, DateTimeFormatter) can be used directly. Without it, there would be no year, and thus nothing more than a TemporalAccessor could be parsed (the year must be a leap year, in case 29th Feb is being parsed).
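Putting the notes above together, here is a minimal sketch with the map fully populated and the formatter held in a static final constant. The class name GermanMonthParser and the parse helper are made up for illustration. The entries for January to August are taken from the question; the texts for September to December ("Sep.", "Okt.", "Nov.", "Dez.") are an assumption and should be checked against what the device actually produces. The umlaut is written as \u00e4 only to keep the source file ASCII-safe.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.HashMap;
import java.util.Map;

class GermanMonthParser {

    // Built once and reused; DateTimeFormatter is immutable and thread-safe.
    private static final DateTimeFormatter FORMATTER = buildFormatter();

    private static DateTimeFormatter buildFormatter() {
        Map<Long, String> monthNames = new HashMap<>();
        monthNames.put(1L, "Jan.");
        monthNames.put(2L, "Feb.");
        monthNames.put(3L, "M\u00e4r.");   // March, abbreviated as in the question
        monthNames.put(4L, "Apr.");
        monthNames.put(5L, "Mai");
        monthNames.put(6L, "Juni");
        monthNames.put(7L, "Juli");
        monthNames.put(8L, "Aug.");
        // Assumption: the question does not show these four months.
        monthNames.put(9L, "Sep.");
        monthNames.put(10L, "Okt.");
        monthNames.put(11L, "Nov.");
        monthNames.put(12L, "Dez.");

        return new DateTimeFormatterBuilder()
                .appendPattern("d. ")
                .appendText(ChronoField.MONTH_OF_YEAR, monthNames)
                .appendPattern(" HH:mm")
                // The input has no year; 2016 is a leap year, so 29 Feb parses too.
                .parseDefaulting(ChronoField.YEAR, 2016)
                .toFormatter();
    }

    // Hypothetical helper: parses one line in the format shown in the question.
    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMATTER);
    }

    public static void main(String[] args) {
        System.out.println(parse("10. Jan. 18:14"));
        System.out.println(parse("1. M\u00e4r. 19:40"));
        System.out.println(parse("2. Mai 21:55"));
        System.out.println(parse("5. Juni 08:25"));
    }
}
```

Because the month texts come from the map rather than from locale data, this sketch should not depend on which locale data the JDK ships.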
You could use a DateTimeFormatterBuilder:
private static final DateTimeFormatter formatter = new DateTimeFormatterBuilder()
.appendOptional(DateTimeFormatter.ofPattern("d. MMM. HH:ss"))
.appendOptional(DateTimeFormatter.ofPattern("d. MMMM HH:ss"))
.toFormatter(Locale.GERMAN);
Running it on this:
Stream.of(("10. Jan. 18:14\n" +
"8. Feb. 19:02\n" +
"1. Mär. 19:40\n" +
"4. Apr. 18:55\n" +
"2. Mai 21:55\n" +
"5. Juni 08:25\n" +
"5. Juli 20:09\n" +
"1. Aug. 13:42").split("\n"))
.map(formatter::parse)
.forEach(System.out::println);
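you get the following (note that the patterns above use ss, so the minutes are parsed as SecondOfMinute; use HH:mm to have them parsed as MinuteOfHour):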
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=10, MonthOfYear=1, MilliOfSecond=0, SecondOfMinute=14, HourOfDay=18},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=8, MonthOfYear=2, MilliOfSecond=0, SecondOfMinute=2, HourOfDay=19},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=1, MonthOfYear=3, MilliOfSecond=0, SecondOfMinute=40, HourOfDay=19},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=4, MonthOfYear=4, MilliOfSecond=0, SecondOfMinute=55, HourOfDay=18},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=2, MonthOfYear=5, MilliOfSecond=0, SecondOfMinute=55, HourOfDay=21},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=5, MonthOfYear=6, MilliOfSecond=0, SecondOfMinute=25, HourOfDay=8},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=5, MonthOfYear=7, MilliOfSecond=0, SecondOfMinute=9, HourOfDay=20},ISO
{NanoOfSecond=0, MicroOfSecond=0, DayOfMonth=1, MonthOfYear=8, MilliOfSecond=0, SecondOfMinute=42, HourOfDay=13},ISO
As pointed out, it would be easier to use a standard and consistent format; here you are mixing long and short month names.
One option (short of using a DateTimeFormatterBuilder) is to handle both cases separately:
private static final DateTimeFormatter SHORT_MONTH = DateTimeFormatter.ofPattern("d. MMM. HH:mm", Locale.GERMAN);
private static final DateTimeFormatter LONG_MONTH = DateTimeFormatter.ofPattern("d. MMMM HH:mm", Locale.GERMAN);

private static TemporalAccessor parse(String s) {
    try {
        // Try the abbreviated form with a trailing dot ("Jan.") first
        return SHORT_MONTH.parse(s);
    } catch (DateTimeParseException e) {
        // Fall back to the full month name ("Mai", "Juni")
        return LONG_MONTH.parse(s);
    }
}
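You can use a regex to trim the month portion so that it is always 3 characters long before parsing it with "d. MMM HH:mm":
text = text.replaceFirst("^(\\S+\\s\\S{3})\\S", "$1");
Explanation of the regex: starting at the beginning of the string (^), find one or more non-whitespace characters (\S+), followed by one whitespace character (\s), followed by three non-whitespace characters (\S{3}), followed by one more non-whitespace character, and replace the match with the portion inside the first group ($1). The ^ anchor matters for three-letter months such as Mai: without it, the pattern would instead match further along the string and cut a digit from the time.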
10. Jan. 18:14 will become 10. Jan 18:14
and
5. Juni 08:25 will become 5. Jun 08:25
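As a quick check of the trimming step on its own, here is a small sketch. The class name MonthTrimmer and the method trimMonth are made up for illustration, and the pattern is anchored at the start of the string so that three-letter months such as Mai are left alone. It only covers the string normalisation; whether "d. MMM HH:mm" with Locale.GERMAN then accepts the trimmed names may depend on the locale data of the JDK in use.

```java
import java.util.regex.Pattern;

class MonthTrimmer {

    // "^" restricts the match to the leading "day month" part of the line.
    // Group 1 keeps the day, the space and the first three characters of the month;
    // the single character after it (the dot or a fourth letter) is dropped.
    private static final Pattern LONG_MONTH = Pattern.compile("^(\\S+\\s\\S{3})\\S");

    // Hypothetical helper: trims the month portion to three characters.
    static String trimMonth(String text) {
        return LONG_MONTH.matcher(text).replaceFirst("$1");
    }

    public static void main(String[] args) {
        System.out.println(trimMonth("10. Jan. 18:14")); // prints "10. Jan 18:14"
        System.out.println(trimMonth("5. Juni 08:25"));  // prints "5. Jun 08:25"
        System.out.println(trimMonth("2. Mai 21:55"));   // prints "2. Mai 21:55" (unchanged)
    }
}
```

Note that only one character is removed, so a month text longer than four characters would not end up at three characters.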