Plot as bitmap in PDF

Posted 2020-02-10 07:29

Question:

I am currently working on CGH array results, which involve several plots of tens of thousands of points, and I would like to combine the multiple-page feature of the PDF device with the small file size of the PNG image format.

The problem is that the PDF device stores the plots as vector drawings, so the PDF files are huge and take several minutes to open. Can R draw the plots as multiple bitmaps embedded in a single PDF file? I know the PDF format can handle that.

Here is a simple example. The PDF file is about 2 MB while the PNG files are about 10 KB each, so I'd like a PDF file of about 20 KB.

png("test%i.png")
plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6)
plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6)
dev.off()

pdf("test.pdf", onefile=TRUE)
plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6)
plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6)
dev.off()

Answer 1:

Here's a solution that gets you close (about 50 KB) to your desired file size (about 20 KB), without requiring you to install LaTeX or learn Sweave. (Not that either of those is undesirable in the long run!)

It uses the grid functions grid.cap() and grid.raster(). More details and ideas are in an R Journal article by Paul Murrell (warning: PDF).

library(grid)

# Make the plots on a screen device and capture each one as a raster
dev.new()  # Reducing the width and height of this device produces smaller rasters
plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6)
cap1 <- grid.cap()
plot(rnorm(2e4), rnorm(2e4), pch="+", cex=0.6, col="red")
cap2 <- grid.cap()
dev.off()

# Write the captured rasters to a PDF, one per page
pdf("test.pdf", onefile=TRUE)
grid.raster(cap1)
plot.new()  # start a second page
grid.raster(cap2)
dev.off()

The resulting PDF images appear to retain more detail than your files test1.png and test2.png, so you could get even closer to your goal by reducing the resolution of the captured rasters.



Answer 2:

Use the png driver to create a PNG file of an acceptable resolution. Make your plot to that. Close the png device.

Then use readPNG from the png package to read it in.

Next, open a PDF device, create a blank plot with no margins and bounds at (0,0) and (1,1), and draw the PNG onto it using rasterImage. Add extra pages by creating fresh plots. Close the PDF device.

That should give you a PDF with bitmapped versions of the plots. There are a few tricky bits in getting the plots set up right, and the PNG resolution is crucial, but I think the above has all the ingredients.

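> library(png) # provides readPNG
> png("plot.png")
> makeplot(100000) # simple user-defined function that plots 100k points
> dev.off()
X11cairo 
       2 
> plotPNG = readPNG("plot.png")
> pdf("plot.pdf")
> par(mai=c(0,0,0,0))
> # xaxs="i" and yaxs="i" make the plot region span exactly (0,0) to (1,1)
> plot(c(0,1),c(0,1),type="n",axes=FALSE,xlab="",ylab="",xaxs="i",yaxs="i")
> rasterImage(plotPNG,0,0,1,1)
> dev.off()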

Then check plot.pdf...
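
The same steps extend to several pages by repeating the blank plot and the rasterImage call once per image. Here is a minimal sketch, assuming the png package is installed. The helper name plots_to_bitmap_pdf and its arguments are made up for illustration and are not part of any package:

```r
library(png)  # provides readPNG()

# Hypothetical helper: render each plotting function to a temporary PNG,
# then place each PNG on its own page of a single PDF.
plots_to_bitmap_pdf <- function(plot_funs, pdf_file,
                                width = 7, height = 7, res = 100) {
  # Render every plot to its own PNG at the requested size and resolution
  png_files <- vapply(plot_funs, function(f) {
    tmp <- tempfile(fileext = ".png")
    png(tmp, width = width, height = height, units = "in", res = res)
    f()
    dev.off()
    tmp
  }, character(1))
  on.exit(unlink(png_files))  # remove the temporary PNGs when done

  pdf(pdf_file, width = width, height = height, onefile = TRUE)
  par(mai = c(0, 0, 0, 0))  # no margins around the image
  for (p in png_files) {
    img <- readPNG(p)
    # Blank page whose plot region spans exactly (0,0) to (1,1)
    plot(c(0, 1), c(0, 1), type = "n", axes = FALSE,
         xlab = "", ylab = "", xaxs = "i", yaxs = "i")
    rasterImage(img, 0, 0, 1, 1)
  }
  dev.off()
  invisible(pdf_file)
}

draw_points <- function() plot(rnorm(2e4), rnorm(2e4), pch = "+", cex = 0.6)
out <- plots_to_bitmap_pdf(list(draw_points, draw_points), "test.pdf")
```

Lowering res (or width and height) shrinks the embedded bitmaps at the cost of detail, so it is worth trying a few values against your own plots.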



Answer 3:

To include multiple plots in your pdf, set onefile = TRUE.

pdf("test.pdf", onefile = TRUE)
plot(1:5)
plot(6:10)
dev.off()

To make those plots PNGs rather than native PDF plots will require a tiny bit more effort. Create all your plots as PNGs, like so:

png("test%01d.png")
plot(1:5)
plot(6:10)
dev.off()

Then create a LaTeX document that includes those PNGs. You can do that from R by using Sweave (but how to do that is big enough to be its own question). There's a decent introductory example here.
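
Without going as far as Sweave, the wrapper document can also be written from R as plain text. This is only a sketch under some assumptions: the PNG files are named test1.png and test2.png as produced above, the output name plots.tex is arbitrary, and compiling it requires a LaTeX installation.

```r
# Assumes test1.png and test2.png already exist in the working directory,
# e.g. created with png("test%01d.png") as above.
png_files <- sprintf("test%d.png", 1:2)

# One \includegraphics per page; graphicx is the standard LaTeX package for this
tex_lines <- c(
  "\\documentclass{article}",
  "\\usepackage{graphicx}",
  "\\begin{document}",
  paste0("\\noindent\\includegraphics[width=\\textwidth]{",
         png_files, "}\\newpage"),
  "\\end{document}"
)
writeLines(tex_lines, "plots.tex")

# Compiling needs LaTeX; tools::texi2pdf() can call it from R:
# tools::texi2pdf("plots.tex")
```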



Answer 4:

How about a Sweave solution?

\documentclass[a4paper]{article}
\usepackage[OT1]{fontenc}
\usepackage{Sweave}
\SweaveOpts{pdf = FALSE, eps = FALSE}
\DeclareGraphicsExtensions{.png}

\begin{document}

\title{Highly imaginative title}
\author{romunov}

\maketitle

<<fig = TRUE, png = TRUE, echo = FALSE>>=
    plot(1:10, 1:10)
@

\end{document}
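
To turn a file like this into a PDF, one would typically run it through Sweave and then LaTeX. A sketch, assuming the document above is saved as plots.Rnw (a made-up name) and that a LaTeX installation is available:

```r
# Assumption: the Sweave source above has been saved as "plots.Rnw"
rnw <- "plots.Rnw"

if (file.exists(rnw)) {
  Sweave(rnw)                   # runs the code chunks and writes plots.tex plus the PNG figure
  tools::texi2pdf("plots.tex")  # compiles plots.tex to plots.pdf (needs LaTeX)
}
```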