How do I detect the user's locale to get the c

2019-02-23 03:26发布

问题:

I have a simple data conversion tool and one of the outputs it can produce is a csv file.

This works perfectly here in the UK but when I shipped it out to a German customer I had some issues. Specifally, they use a ',' to represent the decimal point in a floating point number and vice versa. This means that when they open their data file in excel, the result is rather messy to say the least :-)

Substituting the correct character is trivial, but how can I detect whether or not to apply this?

Edit:

So this:

a,b,c
1.1,1.2,1.3
"1.1",1,2,"1,3"
"this,is,multi-
-line",this should be column 2, row 4
a;b;c
"a;b","c"

..looks like this when loaded into excel in the UK:

+----------------+-----+-----+-----+
| a              | b   | c   |     |
+----------------+-----+-----+-----+
| 1.1            | 1.2 | 1.3 |     |
+----------------+-----+-----+-----+
| 1.1            | 1   | 2   | 1,3 |
+----------------+-----+-----+-----+
| this,is,multi- |     |     |     |
| -line          | 2   | 4   |     |
+----------------+-----+-----+-----+
| a;b;c          |     |     |     |
+----------------+-----+-----+-----+
| a;b            | c   |     |     |
+----------------+-----+-----+-----+

..but what happens in Germany?

回答1:

CSV files as the name suggest should be comma-seperated and are not local dependant. However what you could do to avoid this issue is double-quote the relevant decimal numbers within the CSV file as such: "10,20", "1,50", "This is another column". This should avoid the issue entirely for any decent CSV-parser (such as the FileHelpers library) which will read this as 10,20 and 1,50 and not as: 10, 20, 1, and 50.

See CSV:

More sophisticated CSV implementations permit commas and other special characters in a field value. Many implementations use " (double quote) characters around values that contain reserved characters (such as commas, double quotes, or newlines); embedded double quote characters may be represented by a pair of consecutive double quotes



回答2:

Use:

System.Globalization.CultureInfo.CurrentCulture.TextInfo.ListSeparator

Writing CSV: The "List separator" string should be used as the delimiters in CSV (see below on how to change this variable). Changing the value of the "List separator" is also reflected in Excel when saving as CSV.

Reading CSV: Determining the delimiter in CSV is another story and it is a bit more complex. In principle it is possible to use a "," as a CSV delimiter in one system and use a ";" or even a "*" or any ("string") as a delimiter on another system: This article provides some insights on how to detect CSV delimiters where reading cross-systems CSV files:

http://www.codeproject.com/Articles/231582/Auto-detect-CSV-separator.

Also you can perform some tests on your exporter by changing the "List separator" value in Windows as follows (might differ between with each Windows OS):

  • Open Region and Language dialog.
  • Select on the "Format" tab.
  • Click on "Additional Settings"
  • Edit the value of the "List separator"


回答3:

As others have mentioned CSV in general should be comma-separated and fields should be double-quoted. However there is also MS Excel specific behavior that causes a correct CSV file to be imported incorrectly. That is because MS Excel by default uses list separator set in Windows System in 'Regional and language options'. For US/UK locale it is comma but for such languages as German it is semicolon. So for MS Excel the option is to use different separator per locale.



回答4:

The CurrencyDecimalSeparator property contains the decimal separator for the given culture. This being said the CSV separator is not culture dependent. It is a property of the CSV file which you indicate to the parser. And talking about parsers I sincerely hope that you are not rolling your own CSV parser.



回答5:

As others recommended already, the format should not be locale sensitive. This is true for storage (in files like CSV or other formats) or communication protocols. You should worry about locale sensitivity for the presentation layer only. Otherwise it means that a file saved by an American user (for instance) cannot be loaded by a German one (and the other way around).

See here for more complete guidelines: http://mihai-nita.net/2005/10/25/data-internationalization/