Different behavior of WeekFields on JVM 8 and JVM 10

Posted 2020-06-02 08:07

Question:

I have a really simple program here:

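 import java.time.DayOfWeek;
 import java.time.LocalDate;
 import java.time.temporal.TemporalAdjusters;
 import java.time.temporal.WeekFields;
 import java.util.Locale;

 public class WeekFieldsDemo { // class name is arbitrary
     public static void main(String[] args) {
         LocalDate year = LocalDate.ofYearDay(2022, 100);
         System.out.println(year);

         System.out.println(WeekFields.of(Locale.GERMAN).weekOfYear());

         System.out.println(year.with(WeekFields.of(Locale.GERMAN).weekOfYear(), 0));
         System.out.println(year.with(WeekFields.of(Locale.GERMAN).weekOfYear(), 0).with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY)));
     }
 }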

But it behaves differently on JVM 8 and JVM 10. The problem seems to be the implementation of WeekFields.of(Locale.GERMAN).weekOfYear().

On JVM 10 I get the following results:

JVM 10

2022-04-10
WeekOfYear[WeekFields[SUNDAY,1]]
2021-12-19
2021-12-13

whereas on JVM 8:

JVM 8

2022-04-10
WeekOfYear[WeekFields[MONDAY,4]]
2022-01-02
2021-12-27

Why is this happening? Am I doing something that could potentially cause undefined behavior? Or is this change in behavior specified somewhere?

JVM10:

$ java -version
openjdk version "10.0.2" 2018-07-17
OpenJDK Runtime Environment (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.4)
OpenJDK 64-Bit Server VM (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.4, mixed mode)

JVM8

$ java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.18.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

EDIT: JVM 9 has the same behavior as JVM 8, and JVM 11 behaves like JVM 10.

EDIT 2: I actually found the commit that changed the behavior (here on GitHub), and I'm curious why this was changed.

Answer 1:

Such week fields are highly localized, and hence dependent on the localized resources of the underlying JVM which can change from one release to another.

I think JVM 10 is more correct, because Locale.GERMAN does not refer to any country, so Java simply assumes the US (it is somewhat questionable to treat this country as the world standard, but that is how Java does it).

You should use Locale.GERMANY instead. That country does indeed use Monday as the first day of the week (in contrast to the US, where the week starts on Sunday, which is used as the fallback for GERMAN, which is just a language and not a country).
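To see the difference on a given JVM, a minimal sketch like the following prints the week definition behind each locale. The class name WeekSchemeCheck and the helper describe are made up for illustration. The result for GERMAN depends on the locale data in use, so no output is shown for it.

```java
import java.time.temporal.WeekFields;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
public class WeekSchemeCheck {
    // Formats the week definition of a locale as "FIRSTDAY,minimalDays".
    static String describe(Locale locale) {
        WeekFields wf = WeekFields.of(locale);
        return wf.getFirstDayOfWeek() + "," + wf.getMinimalDaysInFirstWeek();
    }

    public static void main(String[] args) {
        // Language only: the result depends on the locale data the JVM uses.
        System.out.println("GERMAN  -> " + describe(Locale.GERMAN));
        // Language plus country: Monday as first day, 4 minimal days.
        System.out.println("GERMANY -> " + describe(Locale.GERMANY));
    }
}
```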

Update - my research about CLDR data:

The current CLDR data list the following week definition for the fallback country/territory "001" (= worldwide): Monday as the first day of the week and 1 as the minimal number of days in the first week of the calendar year. Astonishingly, this differs from the US definition (Sunday, 1). I think Oracle has just done its own thing. Personally, I agree with @Holger and would rather expect ISO-8601 as the fallback (Monday, 4).

However, you might be able to restore the Java 8 behaviour on your JVM 10 machine by setting the following system property (not tested):

java.locale.providers=COMPAT,CLDR,SPI


Answer 2:

How to fix

The following two options are equivalent. Pick the one you find most appropriate for your situation.

  • WeekFields.ISO
  • WeekFields.of(Locale.GERMANY) using the country, Germany, instead of the language, German.
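As a sketch of the first option, here is the computation from the question with WeekFields.ISO in place of the locale lookup. The class name IsoWeekZero and the helper weekZero are only illustrative. Since ISO is fixed at Monday, 4, the result should not depend on the locale data of the JVM.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.time.temporal.WeekFields;

// Hypothetical demo class, not part of any library.
public class IsoWeekZero {
    // Moves the date to week 0 of its year under the ISO week definition.
    static LocalDate weekZero(LocalDate date) {
        return date.with(WeekFields.ISO.weekOfYear(), 0);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.ofYearDay(2022, 100); // 2022-04-10
        LocalDate inWeekZero = weekZero(date);
        System.out.println(inWeekZero); // 2022-01-02
        // Step back to the Monday of that week: 2021-12-27
        System.out.println(inWeekZero.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY)));
    }
}
```

The two printed dates match the Java 8 output from the question.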

Why is this happening? CLDR and country vs. language

Two differences are in play here:

  1. Different default locale data in different Java versions.
  2. As others have said, the difference between a language-only locale and a locale that includes country.

The definition of week schemes in different locales is part of the locale data. Java can get its locale data from up to four sources. Java has included its own locale data since early versions, and these were the default up to Java 8. From Java 8 on, CLDR (Unicode Common Locale Data Repository) data were included too, and these became the default from Java 9. This has obviously changed some functionality and broken some old code, as you have experienced. More precisely, the defaults are:

  • Java 8: JRE,SPI where JRE refers to Java’s own locale data.
  • Java 9, 10 and 11: CLDR,COMPAT where CLDR is what it says and COMPAT is just the new name for JRE data.

The defaults can be overridden by setting the system property java.locale.providers. So we can get the Java 8 behaviour in Java 9 and later by setting this property to COMPAT,SPI. Conversely we can get the Java 10 behaviour in Java 8 by setting it to CLDR,JRE. So at its base this is not so much a difference between Java versions, only between their defaults.
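To check which setting is in effect, a small sketch like this prints the property next to the resulting week definition. The class name LocaleProviderCheck and the placeholder text are my own. I am assuming the property has to be in place when the JVM starts, so I would pass it on the command line rather than set it from code.

```java
import java.time.temporal.WeekFields;
import java.util.Locale;

// Hypothetical diagnostic class, not part of any library.
public class LocaleProviderCheck {
    // Returns the configured provider list, or a placeholder when the
    // property is unset and the JVM's built-in default order applies.
    static String configuredProviders() {
        return System.getProperty("java.locale.providers", "(not set, JVM default)");
    }

    public static void main(String[] args) {
        System.out.println("java.locale.providers = " + configuredProviders());
        // The week definition printed here depends on the providers in use.
        System.out.println(WeekFields.of(Locale.GERMAN));
    }
}
```

Running it as java -Djava.locale.providers=COMPAT,SPI LocaleProviderCheck on Java 9 or later, and as java -Djava.locale.providers=CLDR,JRE LocaleProviderCheck on Java 8, should show the two behaviours swapped.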

The change from Java to CLDR data is this: The Java locale data assigned week definitions to language-only locales (like German) based on where the language is mostly spoken. In contrast the CLDR philosophy is that you may speak any language in any country in the world, and you would rather base the choice of week scheme on country than on language. As a consequence locales that don’t specify a country (like German) all use the worldwide default week definition.
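A rough way to watch the country drive the week scheme is to keep the language fixed and vary only the region. The class name RegionDrivesWeek and the helper withRegion are invented for this sketch. I have not listed expected output, because it depends on the locale data your JVM uses.

```java
import java.time.temporal.WeekFields;
import java.util.Locale;

// Hypothetical demo class, not part of any library.
public class RegionDrivesWeek {
    // Adds a region (country) to a locale while keeping its language.
    static Locale withRegion(Locale base, String region) {
        return new Locale.Builder().setLocale(base).setRegion(region).build();
    }

    public static void main(String[] args) {
        Locale germanInGermany = withRegion(Locale.GERMAN, "DE"); // same as Locale.GERMANY
        Locale germanInUs = withRegion(Locale.GERMAN, "US");
        // Same language, different country: compare the two week definitions.
        System.out.println(germanInGermany + " -> " + WeekFields.of(germanInGermany));
        System.out.println(germanInUs + " -> " + WeekFields.of(germanInUs));
    }
}
```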

Why the worldwide default week definition is “Sunday, 1” in CLDR, I don’t understand. Like others, I would have expected and preferred ISO, the international standard, “Monday, 4”. As I said in a comment, I also found a note saying that this should be the case, but it still isn’t (at least not in the CLDR versions used in Java 8 through 11).
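To see what the two definitions mean for the date in the question without involving any locale data, they can be spelled out with WeekFields.of(DayOfWeek, int). This is only a sketch; the class name ExplicitWeekDefinitions and the helper weekZero are made up.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.WeekFields;

// Hypothetical demo class, not part of any library.
public class ExplicitWeekDefinitions {
    // Moves the date to week 0 of its year under an explicit week definition.
    static LocalDate weekZero(LocalDate date, DayOfWeek firstDay, int minimalDays) {
        return date.with(WeekFields.of(firstDay, minimalDays).weekOfYear(), 0);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.ofYearDay(2022, 100); // 2022-04-10
        System.out.println(weekZero(date, DayOfWeek.SUNDAY, 1)); // 2021-12-19
        System.out.println(weekZero(date, DayOfWeek.MONDAY, 4)); // 2022-01-02
    }
}
```

Under “Sunday, 1” the date falls in week 16, so setting the week to 0 goes back 16 weeks. Under “Monday, 4” it falls in week 14, so the same call goes back only 14 weeks. That accounts for the two different dates in the question.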

Java 9 is special

As you observed, on Java 9 with default locale data you get “Monday 4” from Locale.GERMAN even though CLDR should be the first default. If on the other hand I set java.locale.providers to CLDR alone, I do get “Sunday 1” as in Java 10 and 11.

A possible explanation is that the CLDR version used in Java 9 does not include a week definition for German. So with the default providers, CLDR,COMPAT, Java falls back on COMPAT, which provides “Monday, 4” for German. When I use CLDR alone, it instead falls back on the worldwide base default, “Sunday, 1”. If this explanation is correct (which I cannot guarantee), it would seem that the CLDR data versions used in Java 10 and 11 do include a week definition for German.

Links

Documentation of LocaleServiceProvider with information about locale data providers and specification of default providers:

  • Java 8
  • Java 9
  • Java 10
  • Java 11

CLDR links:

  • CLDR - Unicode Common Locale Data Repository home page
  • Reasons for Decisions: 2012-02-22 for CLDR 21.0.1 per #993 (reopened), restore firstDay for IE (Ireland) to be Sunday mentioning “(thus defaulting to Monday, the default firstDay for 001 = the world)”, which would seem to indicate that Monday should be the default first day of week if no country is specified.


Answer 3:

The Locale class distinguishes constants meant for a language (like GERMAN) from constants meant for a country (like GERMANY). Use the former if you only want a different language setting and want to keep the local regional settings; use the latter to set both the time and language settings.