I'm trying to enter some UTF-8 characters into a LaTeX file in TextMate (which says its default encoding is UTF-8), but LaTeX doesn't seem to understand them.
Running cat my_file.tex shows the characters properly in Terminal. Running ls -al shows something I've never seen before: an "@" by the file listing:
-rw-r--r--@ 1 me users 2021 Feb 11 18:05 my_file.tex
(And, yes, I'm using \usepackage[utf8]{inputenc} in the LaTeX.)
I've found iconv, but that doesn't seem to be able to tell me what the encoding is; it'll only convert once I figure it out.
Classic 8-bit LaTeX is very restricted in which UTF8 characters it can use; it's highly dependent on the encoding of the font you're using and which glyphs that font has available.
Since you don't give a specific example, it's hard to know exactly where the problem is — whether you're attempting to use a glyph that your font doesn't have or whether you're not using the correct font encoding in the first place.
Here's a minimal example showing how a few UTF8 characters can be used in a LaTeX document:
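A minimal sketch of such a document, written out from the shell so that the bytes on disk are UTF-8 (assuming the script itself is saved as UTF-8). The file name utf8_test.tex, the T1 font encoding, and the sample words are illustrative assumptions, not taken from the question:

```shell
#!/bin/sh
# Write a small test document; utf8_test.tex is a hypothetical file name.
# The quoted 'EOF' stops the shell from interpreting backslashes or $ in the body.
cat > utf8_test.tex <<'EOF'
\documentclass{article}
\usepackage[T1]{fontenc}     % 8-bit font encoding that has the accented glyphs used below
\usepackage[utf8]{inputenc}  % read the source file as UTF-8
\begin{document}
café, naïve, Straße, smørrebrød, señor
\end{document}
EOF

# Sanity check: this round trip fails if the file is not valid UTF-8.
iconv -f UTF-8 -t UTF-8 utf8_test.tex > /dev/null && echo "utf8_test.tex is valid UTF-8"
```

If a TeX distribution is installed, pdflatex utf8_test.tex should compile this. Characters that the chosen font encoding does not provide are where errors typically start.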
You may have more luck with the [utf8x] encoding, but be slightly warned that it's no longer supported and has some idiosyncrasies compared with [utf8] (as far as I recall; it's been a while since I've looked at it). But if it does the trick, that's all that matters for you.
Synalyze It! lets you compare text or bytes in all the encodings the ICU library offers. Using that feature, you can usually see immediately which code page makes sense for your data.
The @ sign means the file has extended attributes. xattr file shows what attributes it has, and xattr -l file shows the attribute values too (which can be large sometimes; try e.g. xattr /System/Library/Fonts/HelveLTMM to see an old-style font that exists in the resource fork).
The @ means that the file has extended file attributes associated with it. You can query them using the getxattr() function.
There's no definite way to detect the encoding of a file. Read this answer, it explains why.
There's a command line tool, enca, that attempts to guess the encoding. You might want to check it out.
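Although the encoding can't be identified with certainty, one narrower question can be answered: whether the bytes are valid UTF-8 at all. Here is a minimal sketch using iconv, which exits with an error on an invalid input sequence. The helper name is_utf8 and the two throwaway sample files are hypothetical, made up for illustration:

```shell
#!/bin/sh
# is_utf8 FILE: succeed if FILE decodes cleanly as UTF-8 (hypothetical helper name).
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Two throwaway samples holding the word "café" in different encodings.
tmpdir=$(mktemp -d)
printf 'caf\303\251\n' > "$tmpdir/good.txt"   # é as the UTF-8 bytes C3 A9
printf 'caf\351\n'     > "$tmpdir/bad.txt"    # é as the Latin-1 byte E9, invalid as UTF-8

for f in "$tmpdir/good.txt" "$tmpdir/bad.txt"; do
    if is_utf8 "$f"; then
        echo "$f: valid UTF-8"
    else
        echo "$f: not valid UTF-8"
    fi
done
```

A file that passes is not guaranteed to have been intended as UTF-8 (plain ASCII passes too), but a file that fails is certainly in some other encoding, which narrows down what to feed to iconv.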
aliased somewhere in my bash configuration as
so I just type
On my vanilla OSX Yosemite, it yields more precise results than "file -I":
Typing file myfile.tex in a terminal can sometimes tell you the encoding and type of file using a series of algorithms and magic numbers. It's fairly useful, but don't rely on it providing concrete or reliable information. A Localizable.strings file (found in localised Mac OS X applications) is typically reported to be a UTF-16 C source file.
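To get only the encoding guess from file, rather than the full description, something like the following sketch may help. The helper name guess_encoding and the sample file are hypothetical, and the result is still a heuristic guess, as noted above:

```shell
#!/bin/sh
# guess_encoding FILE: print file's guess at the character encoding (hypothetical helper name).
# --mime-encoding prints only the charset part of what "file -I" reports on OS X;
# -b drops the file name from the output.
guess_encoding() {
    file -b --mime-encoding "$1"
}

sample=$(mktemp)
printf 'caf\303\251\n' > "$sample"   # "café" with é as the UTF-8 bytes C3 A9

if command -v file > /dev/null 2>&1; then
    echo "file guesses: $(guess_encoding "$sample")"
else
    echo "file is not installed"
fi
```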