I'm trying to enter some UTF-8 characters into a LaTeX file in TextMate (which says its default encoding is UTF-8), but LaTeX doesn't seem to understand them.
Running cat my_file.tex
shows the characters properly in Terminal. Running ls -al
shows something I've never seen before: an "@" by the file listing:
-rw-r--r--@ 1 me users 2021 Feb 11 18:05 my_file.tex
(And, yes, I'm using \usepackage[utf8]{inputenc}
in the LaTeX.)
I've found iconv
, but that doesn't seem to be able to tell me what the encoding is -- it'll only convert once I figure it out.
A brute-force way to check the encoding is to inspect the file in a hex editor or similar (or write a program to check) and look at the binary data. UTF-8 is fairly easy to recognize: all ASCII characters are single bytes with values below 128 (0x80), and multibyte sequences follow the pattern shown in the Wikipedia article on UTF-8.
If you can find a simpler way to get a program to verify the encoding for you, that's obviously a shortcut, but if all else fails, this would do the trick.
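As a rough sketch of that byte-level check in Terminal (the sample content and the filename my_file.tex are just illustrations):

```shell
# Write a small UTF-8 sample file ("café"; the octal escapes \303\251
# are the two UTF-8 bytes of the accented character é).
printf 'caf\303\251\n' > my_file.tex

# Dump the raw bytes: plain ASCII stays below 0x80, while é shows up
# as the multibyte sequence c3 a9 -- the telltale UTF-8 pattern.
od -An -tx1 my_file.tex
```

If every byte is below 0x80, the file is plain ASCII and the question of UTF-8 vs. Latin-1 doesn't arise.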
Just use:
file -I my_file.tex
That's it.
In Mac OS X the command
file -I
(capital i) will give you the proper character set, so long as the file you are testing contains characters outside the basic ASCII range. For instance, if you go into Terminal and use vi to create a file, e.g.
vi test.txt
then insert some characters, including an accented character (try ALT-e followed by e), and save the file. Then type
file -I test.txt
and you should get a result like this:
test.txt: text/plain; charset=utf-8
You can also convert from one encoding to another with the
iconv
command.
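A minimal sketch of such a conversion, assuming the file turned out to be ISO-8859-1 (Latin-1); the filenames and the source encoding are placeholders to substitute with what you actually detected:

```shell
# Make a Latin-1 sample: in ISO-8859-1 the accented é is the
# single byte 0xE9 (octal \351).
printf 'caf\351\n' > latin1.tex

# Convert from ISO-8859-1 to UTF-8, writing the result to a new file
# (iconv reads the input encoding after -f and the output after -t).
iconv -f ISO-8859-1 -t UTF-8 latin1.tex > utf8.tex
```

Afterwards the converted file contains the two-byte UTF-8 sequence c3 a9 in place of the single Latin-1 byte.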
Using the
file
command with the
--mime-encoding
option (e.g.
file --mime-encoding some_file.txt
) instead of the -I option works on OS X and has the added benefit of omitting the MIME type, "text/plain", which you probably don't care about.

You can also try loading the file into a Firefox window, then go to View - Character Encoding. There should be a check mark next to the file's encoding type.