Some of you may have seen my blog post on this topic, where I wrote the following code after wanting to help a friend produce half-filled circles as points on a graph:
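TestUnicode <- function(start="25a0", end="25ff", ...)
{
  # character arguments are read as hex code points, numeric ones as decimal
  nstart <- as.hexmode(start)
  nend <- as.hexmode(end)
  r <- nstart:nend
  s <- ceiling(sqrt(length(r)))  # side length of a roughly square grid
  par(pty="s")
  plot(c(-1,(s)), c(-1,(s)), type="n", xlab="", ylab="",
       xaxs="i", yaxs="i")
  grid(s+1, s+1, lty=1)
  for(i in seq_along(r)) {
    # a negative pch is interpreted as a Unicode code point
    try(points(i%%s, i%/%s, pch=-1*r[i],...))
  }
}
# decimal code points here: 9500 is U+251C, 9900 is U+26AC
TestUnicode(9500,9900)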
This works (i.e. produces a nearly-full grid of cool dingbatty symbols):
- on Ubuntu 10.04, in an X11 or PNG device
- on Mandriva Linux distribution, same devices, with locally built R, once pango-devel was installed
It fails to varying degrees (i.e. produces a grid partly or entirely filled with dots or empty rectangles), either silently or with warnings:
- on the same Ubuntu 10.04 machine in PDF or PostScript (tried setting font="NimbusSan" to use URW fonts, doesn't help)
- on Mac OS X 10.6 (quartz, X11, Cairo, PDF)
For example, trying all the available PDF font families:
flist <- c("AvantGarde", "Bookman", "Courier", "Helvetica", "Helvetica-Narrow",
           "NewCenturySchoolbook", "Palatino", "Times", "URWGothic",
           "URWBookman", "NimbusMon", "NimbusSan", "NimbusSanCond",
           "CenturySch", "URWPalladio", "NimbusRom")
for (f in flist) {
  fn <- paste("utest_", f, ".pdf", sep="")
  pdf(fn, family=f)
  TestUnicode()
  title(main=f)
  dev.off()
  embedFonts(fn)  # requires Ghostscript
}
On Ubuntu, none of these files contains the symbols.
It would be nice to get it to work on as many combinations as possible, but especially in some vector format and double-especially in PDF.
Any suggestions about font/graphics device configurations that would make this work would be welcomed.
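For the original half-filled-circle use case, a narrower check than the full grid may be handy. The sketch below relies on the same mechanism as TestUnicode: a negative integer pch is treated as a Unicode code point. U+25D0 to U+25D3 are the Unicode half-black circles. The helper name plot_half_circles is made up for illustration, and whether the glyphs actually render still depends on the device and fonts, as described above.

```r
# Unicode code points for the four half-black circles (U+25D0..U+25D3).
half_circles <- c(left = 0x25D0L, right = 0x25D1L,
                  lower = 0x25D2L, upper = 0x25D3L)

# The same characters as strings, handy for checking in the console
# that the code points are the intended ones.
half_circle_chars <- intToUtf8(half_circles, multiple = TRUE)

# Hypothetical helper: draw one point per symbol on the current device.
# A negative pch is interpreted as a Unicode code point.
plot_half_circles <- function(...) {
  n <- length(half_circles)
  plot(seq_len(n), rep(1, n), pch = -half_circles, cex = 3,
       xlab = "", ylab = "", xaxt = "n", yaxt = "n", ...)
  axis(1, at = seq_len(n), labels = names(half_circles))
}
```

Calling plot_half_circles() on a device that handles these glyphs should show four circles labelled by which half is filled; on a device that does not, expect the same dots or empty rectangles as with the full grid.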
I have found the cairo_pdf device to be completely insufficient: the output is markedly different from both pdf and on-screen rendering, and its plotmath support is sketchy.
However, there’s a rather simple workaround on OS X: use the “normal” quartz device and set its type to pdf. Unfortunately, on my computer this ignores the font family and always uses Helvetica (although the documentation claims that the default is Arial).
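A minimal sketch of this workaround, assuming a macOS build of R in which the quartz device is available. The output file name and the plotted code points are placeholders, not part of the original answer.

```r
# Placeholder output path; any writable location will do.
out <- file.path(tempdir(), "utest_quartz.pdf")

# quartz() is only a working device on macOS builds of R,
# so do nothing elsewhere.
if (isTRUE(unname(capabilities("aqua")))) {
  quartz(type = "pdf", file = out, width = 7, height = 7)
  # Negative pch values are Unicode code points (here U+25D0..U+25D3).
  plot(1:4, rep(1, 4), pch = -(0x25D0L + 0:3), cex = 3,
       xlab = "", ylab = "")
  dev.off()
}
```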
There are at least two other gotchas:
- pdf converts hyphens to minuses. This may not even always be what you want but it’s quite useful to properly typeset negative numbers. The linked thread describes workarounds for this.
(I realise that OP briefly mentions the Quartz device but this thread is frequently viewed and I think this solution needs more prominence.)
I think you are out of luck, Ben: according to some notes by Paul Murrell, pdf() can only handle single-byte encodings. Multi-byte encodings need to be converted to the single-byte equivalent, and therein lies the rub; by definition, single-byte encodings cannot contain all the glyphs that can be represented in a multi-byte encoding like UTF-8, say.
Paul's notes can be found here, wherein he suggests a couple of solutions using Cairo-based PDF devices: cairo_pdf() on suitably-endowed Linux and Mac OS systems, or the Cairo package under MS Windows.
Have you tried embedding a font in the PDF, or including one for Mac users that would work?
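The cairo_pdf() route mentioned above might look like the following. This is a sketch only: it assumes R was built with cairo support and that the font the system picks by default contains the glyphs. The file name and the chosen code points are illustrative.

```r
# Illustrative code points: U+25D0..U+25D4.
codes <- 0x25D0L + 0:4

# Illustrative output path.
out <- file.path(tempdir(), "utest_cairo.pdf")

# cairo_pdf() needs an R build with cairo support; skip otherwise.
if (isTRUE(unname(capabilities("cairo")))) {
  cairo_pdf(out, width = 7, height = 7)
  # Negative pch values are Unicode code points.
  plot(seq_along(codes), rep(1, length(codes)), pch = -codes, cex = 3,
       xlab = "", ylab = "")
  dev.off()
}
```

If the resulting file shows empty rectangles, the likely culprit is a missing font rather than the device itself.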
Another solution might be to use tikzDevice which can now use XeLaTeX with Unicode characters. The resulting tex file can then be compiled to produce a pdf. The problem is still that you must have a font on your system that contains the characters.
The first time, this will take a LONG time.