How do I determine file encoding in OS X?

Posted 2020-01-26 12:42

I'm trying to enter some UTF-8 characters into a LaTeX file in TextMate (which says its default encoding is UTF-8), but LaTeX doesn't seem to understand them.

Running cat my_file.tex shows the characters properly in Terminal. Running ls -al shows something I've never seen before: an "@" by the file listing:

-rw-r--r--@  1 me      users      2021 Feb 11 18:05 my_file.tex

(And, yes, I'm using \usepackage[utf8]{inputenc} in the LaTeX.)

I've found iconv, but that doesn't seem to be able to tell me what the encoding is -- it'll only convert once I figure it out.

15 answers
我欲成王,谁敢阻挡
Reply #2 · 2020-01-26 13:06

A brute-force way to check the encoding is to open the file in a hex editor or similar (or write a program to do it) and look at the raw bytes. UTF-8 is fairly easy to recognize: all ASCII characters are single bytes with values below 128 (0x80), and multibyte sequences follow the pattern shown in the Wikipedia article on UTF-8, namely a lead byte of the form 110xxxxx, 1110xxxx, or 11110xxx followed by one, two, or three continuation bytes of the form 10xxxxxx.

If you can find a simpler way to get a program to verify the encoding for you, that's obviously a shortcut, but if all else fails, this would do the trick.
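If you'd rather not squint at a hex dump, here is a rough Python sketch of the "write a program to check" idea. The function name follows_utf8_pattern is made up for illustration, and it only checks the structural pattern (lead byte plus the right number of continuation bytes). It does not reject overlong forms or surrogates, so treat it as a sanity check rather than a full validator.

```python
def follows_utf8_pattern(data: bytes) -> bool:
    """Structural check only (hypothetical helper, not a full UTF-8 validator)."""
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:               # 0xxxxxxx: single-byte ASCII
            extra = 0
        elif byte >> 5 == 0b110:      # 110xxxxx: starts a 2-byte sequence
            extra = 1
        elif byte >> 4 == 0b1110:     # 1110xxxx: starts a 3-byte sequence
            extra = 2
        elif byte >> 3 == 0b11110:    # 11110xxx: starts a 4-byte sequence
            extra = 3
        else:                         # stray continuation byte or invalid lead
            return False
        tail = data[i + 1 : i + 1 + extra]
        # Every continuation byte must look like 10xxxxxx.
        if len(tail) != extra or any(t >> 6 != 0b10 for t in tail):
            return False
        i += 1 + extra
    return True
```

To try it on the file from the question, something like follows_utf8_pattern(open("my_file.tex", "rb").read()) should do. If you want strict validation, Python's built-in bytes.decode("utf-8") raises UnicodeDecodeError on anything that isn't well-formed.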

再贱就再见
Reply #3 · 2020-01-26 13:07

Just use:

file -I <filename>

That's it.

啃猪蹄的小仙女
Reply #4 · 2020-01-26 13:09

In Mac OS X the command file -I (capital i) will give you the proper character set so long as the file you are testing contains characters outside of the basic ASCII range.

For instance, go into Terminal and use vi to create a file (e.g. vi test.txt), insert some characters including an accented one (try ALT-e followed by e), then save the file.

Then type file -I test.txt and you should get a result like this:

test.txt: text/plain; charset=utf-8
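The reason for the "outside of the basic ASCII range" caveat is that bytes below 0x80 are identical in ASCII, UTF-8, and Latin-1, so an ASCII-only file gives a detector nothing to go on. Here's a small Python sketch of that reasoning. The function name guess_charset and the labels it returns are just illustrative assumptions, and this is not a description of how the file command is actually implemented.

```python
def guess_charset(data: bytes) -> str:
    """Rough three-way guess (hypothetical helper, labels are illustrative)."""
    # Only bytes below 0x80: the file is plain ASCII, which is also
    # valid UTF-8 and valid Latin-1, so nothing more specific can be said.
    if all(b < 0x80 for b in data):
        return "us-ascii"
    # High bytes present: see whether they form well-formed UTF-8.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown-8bit"
    return "utf-8"


# The same ASCII text produces identical bytes in both encodings.
assert "plain".encode("utf-8") == "plain".encode("latin-1")
```

So a file only starts to look like UTF-8 once it contains at least one multibyte character, which is why the accented character in the test above matters.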

一纸荒年 Trace。
Reply #5 · 2020-01-26 13:13

You can also convert a file from one character encoding to another using the following command:

iconv -f original_charset -t new_charset originalfile > newfile

e.g.

iconv -f utf-16le -t utf-8 file1.txt > file2.txt
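
If iconv isn't handy, roughly the same conversion can be sketched in Python. The helper names transcode_bytes and transcode_file are made up for this example, and it assumes the whole file fits in memory and that you already know the source encoding.

```python
from pathlib import Path


def transcode_bytes(data: bytes, from_charset: str, to_charset: str) -> bytes:
    """Decode with the source encoding, then re-encode with the target one."""
    return data.decode(from_charset).encode(to_charset)


def transcode_file(src: str, dst: str, from_charset: str, to_charset: str) -> None:
    """File-level wrapper (hypothetical helper): reads src, writes dst."""
    converted = transcode_bytes(Path(src).read_bytes(), from_charset, to_charset)
    Path(dst).write_bytes(converted)
```

For the example above that would be transcode_file("file1.txt", "file2.txt", "utf-16le", "utf-8"). If the source encoding is wrong, the decode step raises UnicodeDecodeError instead of silently writing garbage.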
手持菜刀,她持情操
Reply #6 · 2020-01-26 13:13

Using the file command with the --mime-encoding option (e.g. file --mime-encoding some_file.txt) instead of the -I option works on OS X and has the added benefit of omitting the MIME type, "text/plain", which you probably don't care about.

小情绪 Triste *
Reply #7 · 2020-01-26 13:16

You can try loading the file into a Firefox window and then going to View > Character Encoding. There should be a check mark next to the file's encoding.
