Is `Locale` needed for parsing date-time strings i

Posted 2019-01-18 07:24

Question:

  • Under what conditions do I need a Locale for parsing date-time strings in Java?
  • What does Locale have to do with time zone?

Sometimes I see Questions & Answers where the Locale was needed for the solution to a parsing problem. Yet in others there is no mention of Locale.

Answer 1:

Locale & Time Zone are unrelated

Locale and time zone are separate, orthogonal issues with respect to date-time handling.

  • Locale
    • Language
      Human language, such as Arabic, French, Farsi. Text of the names of day-of-week, names of month, and ordinal indicators. For example… Is it Monday or Lundi?
    • Culture
      Commonly used idioms in arranging the pieces of text and numbers making up a string representation of a date-time value. For example… In a short form, is the order month-date-year, date-month-year, or year-month-date? In a long form, does the day of the week come first? Does the month name have an initial cap or is it all lowercase? Does an abbreviation end with a FULL STOP (PERIOD) character or not?
  • Time Zone
    • Offset
      The number of hours and minutes difference between the wall-time used by people in one area from UTC (GMT), the primary time standard by which the world regulates clocks and time.
    • Anomalies
      The history of changes made to the offset, the currently applied rules defining the offset including adjustments such as Daylight Saving Time (DST), and confirmed plans for changes to come in the near-future.

So you can mix-and-match locale and time zone. Some examples follow.

  • A French man attending a conference in Pune, India needs to see the sessions schedule in the wall-time of India but would prefer reading "Monday" as "Lundi", in his native French.
    • French locale
    • India time zone
  • A Brazilian engineer working in Seattle wants to watch a live webinar broadcast from Turku, Finland. She needs to know when to point her web browser to the webinar. So she needs the start time in Finland adjusted to the Seattle time zone, displayed with a Brazilian locale so she can read it in her native Portuguese.
    • Locale( "pt" , "BR" ) for presentation (for generating textual representations)
    • The scheduled start time in Finland must be adjusted from Europe/Helsinki to America/Los_Angeles (Seattle time zone).
  • A newspaper in Iceland might report an event that happened in Russia as two date-times: one in the Moscow time zone and, for clarity, one in UTC. But the article will use the Icelandic language for the text, including the day-of-week.
    • Moscow time zone & Iceland locale
    • UTC time zone & Iceland locale
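
To make the mix-and-match idea concrete, here is a minimal sketch of the Brazilian engineer's case. The webinar start time (18:00 in Helsinki on 2015-10-11) is a made-up value for illustration only, and the class and method names are hypothetical. The zone adjustment and the locale are applied in two separate steps, and neither one needs the other.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

public class WebinarDemo {
    // Hypothetical start time, chosen only for illustration: 18:00 in Helsinki.
    static final ZonedDateTime START_HELSINKI =
            ZonedDateTime.of(2015, 10, 11, 18, 0, 0, 0, ZoneId.of("Europe/Helsinki"));

    // Time zone adjustment: same moment, different wall-time. No Locale involved.
    static ZonedDateTime inSeattle() {
        return START_HELSINKI.withZoneSameInstant(ZoneId.of("America/Los_Angeles"));
    }

    // Locale: affects only the generated text, never the moment itself.
    // Locale.forLanguageTag( "pt-BR" ) is equivalent to new Locale( "pt", "BR" ).
    static String datePortuguese() {
        DateTimeFormatter f = DateTimeFormatter.ofLocalizedDate(FormatStyle.FULL)
                .withLocale(Locale.forLanguageTag("pt-BR"));
        return f.format(inSeattle());
    }

    public static void main(String[] args) {
        System.out.println(inSeattle());       // Seattle wall-time, ISO 8601 style
        System.out.println(datePortuguese());  // Date text in Brazilian Portuguese
    }
}
```

The exact Portuguese text can vary between Java versions, because the bundled locale data changes over time, so treat the printed string as presentation only.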

When parsing/generating a string that is a textual representation of a date-time value, a Locale is used only in two situations:

  • Name of day-of-week, and/or name of month (or ordinal indicator, but best to avoid those)
  • Soft-coded, localized formatting

In the first case, if your string contains words like "Monday"/"Lundi", or "March"/"Mars", then a Locale is used in order to translate those strings.
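
As a rough illustration of the first case, the sketch below parses a string containing a French month name. The class name, helper methods, and the pattern "d MMMM uuuu" are my own choices for this example, not anything from the Question. The same input parses with a French locale and fails with a US locale, because "octobre" is not an English month name.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

public class MonthNameParseDemo {
    // Hard-coded pattern, but MMMM is a month *name*, so a Locale is still needed.
    static final String PATTERN = "d MMMM uuuu";

    static LocalDate parse(String input, Locale locale) {
        DateTimeFormatter f = DateTimeFormatter.ofPattern(PATTERN, locale);
        return LocalDate.parse(input, f);
    }

    // Returns false instead of throwing when the text cannot be parsed.
    static boolean parses(String input, Locale locale) {
        try {
            parse(input, locale);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("11 octobre 2015", Locale.FRENCH)); // ISO 8601 date
        System.out.println(parses("11 octobre 2015", Locale.US));    // wrong locale
        System.out.println(parses("11 October 2015", Locale.US));    // matching locale
    }
}
```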

In the second case, if you do not have an explicit formatting pattern, then a Locale is used in order to know the expected order of the parts: day-of-week, date, name-of-month, year, and so on. For example, where English-speaking Americans say "October 11", French-speaking Canadians use the reverse order, "11 octobre". By soft-coded, we mean something like DateTimeFormatter.ofLocalizedDateTime( FormatStyle.FULL ), versus a hard-coded format like DateTimeFormatter.ofPattern( "yyyy MM dd, EEE" ).
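
The difference between soft-coded and hard-coded formatting might be sketched like this. For brevity I use ofLocalizedDate with a LocalDate rather than ofLocalizedDateTime, and a digits-only pattern for the hard-coded side. The class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

public class SoftVsHardDemo {
    static final LocalDate DATE = LocalDate.of(2015, 10, 11);

    // Soft-coded: the Locale decides both the words and the order of the parts.
    static String softCoded(Locale locale) {
        return DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG)
                .withLocale(locale)
                .format(DATE);
    }

    // Hard-coded, digits only: the pattern fixes the order, and there are no words.
    static String hardCoded(Locale locale) {
        return DateTimeFormatter.ofPattern("uuuu MM dd", locale).format(DATE);
    }

    public static void main(String[] args) {
        System.out.println(softCoded(Locale.US));             // month name first
        System.out.println(softCoded(Locale.CANADA_FRENCH));  // day-of-month first
        System.out.println(hardCoded(Locale.US));             // same text either way
        System.out.println(hardCoded(Locale.CANADA_FRENCH));
    }
}
```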

So when is Locale not required? If you have an input string with all digits, such as "2015-01-23", and you are hard-coding the format to be "yyyy-MM-dd"…

String input = "2015-01-23";
DateTimeFormatter formatter = DateTimeFormatter.ofPattern( "yyyy-MM-dd" );

…then Locale is effectively irrelevant. You have no words to translate, no "Monday" or "Lundi". And you are not asking for a localized formatter that would need a locale to know if the date comes before or after the month, and other such details.

Note that you can still specify a Locale in such a situation. Indeed, I recommend you make a habit of always specifying Locale (and time zone too).
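
A quick way to convince yourself is to parse that same all-digits input under a few different locales and compare the results. This is only a sketch: the class name is hypothetical and the three locales are arbitrary picks.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class DigitsOnlyParseDemo {
    // Same input and same hard-coded pattern; only the Locale varies.
    static LocalDate parseWith(Locale locale) {
        DateTimeFormatter f = DateTimeFormatter.ofPattern("yyyy-MM-dd").withLocale(locale);
        return LocalDate.parse("2015-01-23", f);
    }

    public static void main(String[] args) {
        // Explicitly specifying a Locale is harmless here, and documents intent.
        System.out.println(parseWith(Locale.US));
        System.out.println(parseWith(Locale.FRANCE));
        System.out.println(parseWith(Locale.forLanguageTag("tr-TR")));
    }
}
```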

Implicit Locale & Time Zone

So why do you see so many date-time related Questions and Answers on StackOverflow without any locale? Because if omitted, the JVM’s current default locale is applied automatically and silently.

So if you have a string with English text and your code runs on a JVM set to a US locale, then no problem. But such dependence on the implicit locale is not recommended. Any code in any thread of any app within that JVM can call Locale.setDefault at runtime, immediately affecting all other code in that JVM. Then your code throws an Exception. Better to make a habit of explicitly specifying the expected/desired locale.
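
Here is a small sketch of that failure mode. The helper underDefault is a hypothetical name of mine; it temporarily switches the JVM default locale, much as some unrelated code might, and then restores it. The formatter built without a locale breaks under a French default, while the one with an explicit Locale.US keeps working.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.function.BooleanSupplier;

public class DefaultLocaleDemo {
    static final String INPUT = "11 October 2015";
    static final String PATTERN = "d MMMM uuuu";

    // No locale given: the formatter silently picks up the JVM's current default.
    static boolean parsesWithImplicitLocale() {
        return parses(DateTimeFormatter.ofPattern(PATTERN));
    }

    // Locale given explicitly: immune to changes of the default.
    static boolean parsesWithExplicitLocale() {
        return parses(DateTimeFormatter.ofPattern(PATTERN, Locale.US));
    }

    private static boolean parses(DateTimeFormatter f) {
        try {
            LocalDate.parse(INPUT, f);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // Runs a check while the default locale is temporarily replaced, then restores it.
    // (Category-specific defaults, if customized, are not restored in this sketch.)
    static boolean underDefault(Locale temporary, BooleanSupplier check) {
        Locale original = Locale.getDefault();
        Locale.setDefault(temporary);
        try {
            return check.getAsBoolean();
        } finally {
            Locale.setDefault(original);
        }
    }

    public static void main(String[] args) {
        System.out.println(underDefault(Locale.US, DefaultLocaleDemo::parsesWithImplicitLocale));
        System.out.println(underDefault(Locale.FRANCE, DefaultLocaleDemo::parsesWithImplicitLocale));
        System.out.println(underDefault(Locale.FRANCE, DefaultLocaleDemo::parsesWithExplicitLocale));
    }
}
```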

Same advice for time zone. If omitted, the JVM’s current default time zone is applied automatically and silently. Again, any code in any thread of any app within that JVM can call TimeZone.setDefault at runtime, immediately affecting all other code in that JVM. Then your code throws an Exception or behaves unexpectedly.
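
The time zone side of the problem can be sketched the same way. The moment used here (09:30 UTC on 2015-10-11) and the helper names are assumptions for illustration. The wall-time derived through ZoneId.systemDefault() shifts whenever the default changes, while the one derived with an explicit ZoneId does not.

```java
import java.time.Instant;
import java.time.LocalTime;
import java.time.ZoneId;
import java.util.TimeZone;

public class DefaultZoneDemo {
    // An arbitrary moment for illustration, expressed in UTC.
    static final Instant MOMENT = Instant.parse("2015-10-11T09:30:00Z");

    // Depends on whatever the JVM's default zone happens to be right now.
    static LocalTime implicitWallTime() {
        return MOMENT.atZone(ZoneId.systemDefault()).toLocalTime();
    }

    // Always the same answer, regardless of the JVM default.
    static LocalTime explicitWallTime() {
        return MOMENT.atZone(ZoneId.of("Europe/Istanbul")).toLocalTime();
    }

    // Temporarily replaces the default zone, as unrelated code might, then restores it.
    static LocalTime implicitUnderDefault(String zoneName) {
        TimeZone original = TimeZone.getDefault();
        TimeZone.setDefault(TimeZone.getTimeZone(zoneName));
        try {
            return implicitWallTime();
        } finally {
            TimeZone.setDefault(original);
        }
    }

    public static void main(String[] args) {
        System.out.println(implicitUnderDefault("Asia/Kolkata"));
        System.out.println(implicitUnderDefault("America/Montreal"));
        System.out.println(explicitWallTime());
    }
}
```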

Surprise-changes-at-runtime should be reason enough to make a habit of always specifying both locale and time zone. But another benefit is that it also makes your code self-documenting. Also, consciously specifying locale and time zone while programming might alert you to incorrect or uncorroborated assumptions.

Example Scenario

Imagine a businessperson in Québec. She confirms with a customer in Turkey about his delivery, due in the noon hour. So she creates an object using wall-time where the delivery will be accepted, in Turkey.

ZoneId zoneIdIstanbul = ZoneId.of( "Europe/Istanbul" );
ZonedDateTime zdtIstanbul = ZonedDateTime.of( 2015, 10, 11, 12, 30, 0, 0, zoneIdIstanbul );  // Half-past noon in Turkey.

For the customer’s convenience she formats the text using Turkish language and customs. She defines a formatter object to handle the generation of the textual representation of the date-time value. We could also assign a time zone to the formatter to be applied while generating the textual representation. But the ZonedDateTime object already has a time zone assigned, so the formatter will pick up on that time zone.

Locale locale_tr_TR = new Locale( "tr", "TR" );
DateTimeFormatter formatter_tr_TR = DateTimeFormatter.ofLocalizedDateTime( FormatStyle.FULL ).withLocale( locale_tr_TR );
String outputTurkish = formatter_tr_TR.format( zdtIstanbul );

Our businessperson knows that the customer uses a logistics coordinator in Finland, so she prints the same date-time value in Finnish. So we have a Turkey time zone with a Finnish locale.

Locale locale_fi_FI = new Locale( "fi", "FI" );
DateTimeFormatter formatter_fi_FI = DateTimeFormatter.ofLocalizedDateTime( FormatStyle.FULL ).withLocale( locale_fi_FI );
String outputFinnish = formatter_fi_FI.format( zdtIstanbul );

For herself, she needs a string with the expected delivery in her own wall-time so she can set an alarm to remind her to check on successful completion. And she reads French natively, not Turkish.

So this next code is different in that we need to adjust the time zone and the locale. Same moment in the timeline, the date-time of the expected delivery, but represented differently in text. Notice how this time we add an extra call to withZone on the end of the chain creating a formatter, where we specify a time zone adjustment to override the ZonedDateTime object’s assigned zone.

Locale locale_fr_CA = Locale.CANADA_FRENCH;
ZoneId zoneId_Montréal = ZoneId.of( "America/Montreal" );
DateTimeFormatter formatter_fr_CA_Adjusted = DateTimeFormatter.ofLocalizedDateTime( FormatStyle.FULL ).withLocale( locale_fr_CA ).withZone( zoneId_Montréal );
String outputQuébec = formatter_fr_CA_Adjusted.format( zdtIstanbul );

Lastly, for the sake of our English-speaking readers of StackOverflow.com, let's do a version in English. But note that we recycle the Québec formatter, keeping the already-set time zone but replacing the locale with that of the United States. (Technically not recycling, but so to speak. Use of immutable objects means a new object is instantiated with values based on the old object.)

Locale locale_en_US = Locale.US;
DateTimeFormatter formatter_US_Unadjusted = formatter_fr_CA_Adjusted.withLocale( locale_en_US );
String output_US_Unadjusted = formatter_US_Unadjusted.format( zdtIstanbul );

Let's see the output of these values. Dump to console.

First we implicitly invoke the toString method on our ZonedDateTime object. This method by default uses one of the standard formats defined by ISO 8601. But java.time extends that format by appending the name of the time zone in square brackets, [Europe/Istanbul]. When exchanging data, use these unambiguous standard formats rather than any of the more human-friendly formats.

System.out.println( "zdtIstanbul : " + zdtIstanbul );
System.out.println( "outputTurkish : " + outputTurkish );
System.out.println( "outputFinnish : " + outputFinnish );
System.out.println( "outputQuébec : " + outputQuébec );
System.out.println( "output_US_Unadjusted : " + output_US_Unadjusted );

The output informs us that a lunchtime delivery in Turkey means a 5:30 AM alarm for our woman in Québec.

zdtIstanbul : 2015-10-11T12:30+03:00[Europe/Istanbul]
outputTurkish : 11 Ekim 2015 Pazar 12:30:00 EEST
outputFinnish : sunnuntai, 11. lokakuuta 2015 12.30.00 EEST
outputQuébec : dimanche 11 octobre 2015 5 h 30 EDT
output_US_Unadjusted : Sunday, October 11, 2015 5:30:00 AM EDT