\documentclass{article} %% $ R CMD pgfsweave pgfSweave-vignette-source.Rnw % \VignetteIndexEntry{The pgfSweave Package} % \VignetteDepends{pgfSweave} % \VignetteDepends{tikzDevice} \usepackage[nogin,noae]{Sweave} \usepackage[x11names]{xcolor} \usepackage{tikz} \usetikzlibrary{positioning,shapes.geometric,arrows} \usepackage[parfill]{parskip} \usepackage{fancyvrb} \usepackage[margin=1.1in]{geometry} \usepackage[colorlinks]{hyperref} \newcommand{\lang}{\textsf} \newcommand{\code}{\texttt} \newcommand{\pkg}{\textbf} \newcommand{\ques}[1]{\vspace{.5cm}\noindent{\bf\large#1}\vspace{.2cm}} \title{The \pkg{pgfSweave} Package} \author{Cameron Bracken and Charlie Sharpsteen} \begin{document} %% Cache all of the code chunks and generate external figures by default %% the pgfSweave defaults are pdf=FALSE and eps=FALSE and pgf=FALSE and tikz=TRUE. %% to get normal Sweave behavior set pgf=FALSE and external=FALSE \begin{center} {\Large The \pkg{pgfSweave} Package}\\ {\large Cameron Bracken and Charlie Sharpsteen \\ \today}\vspace{1cm} \end{center} The \pkg{pgfSweave} package is about {\color{SteelBlue1}speed} and {\color{Sienna1}style}. For {\color{SteelBlue1}speed}, the package provides capabilities for ``caching'' graphics generated with \pkg{Sweave} on top of the caching funcitonality of \pkg{cacheSweave}\footnote{\url{http://cran.r-project.org/web/packages/cacheSweave/index.html}}. For {\color{Sienna1}style} the \pkg{pgfSweave} package facilitates the integration of \lang{R} graphics with \LaTeX\ reports through the \pkg{tikzDevice}\footnote{\url{http://cran.r-project.org/web/packages/tikzDevice/index.html}} package and \pkg{eps2pgf}\footnote{\url{http://sourceforge.net/projects/eps2pgf/}} utility. With these tools, figure labels are converted to \LaTeX{} strings so they match the style of the document and the full range of \LaTeX{} math symbols/equations are available. The backbone of \pkg{pgfSweave} is a a new driver for \pkg{Sweave} (\code {pgfSweaveDriver}). The driver provides new chunk options \code{tikz}, \code{pgf} and \code {external} on top of the \code{cache} option provided by \pkg{cacheSweave}. This package started as a fork of \pkg{cacheSweave}. This document highlights the features and usage of \pkg{pgfSweave}. This document assumes familiarity with \pkg{Sweave}. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % \section{Motivation and Background} \pkg{Sweave} is a tool for generating ``reproducible research'' documents by embedding \lang{R} or \lang{S} ``code chunks'' directly into a \LaTeX{} document. For small projects, this approach works well. For large papers or projects, heavy data analysis or computation can cause document compilation times that are unacceptable. The problem of performing lengthy computations in Sweave documents is not a new one. Previous attempts to tackle this problem include the \pkg{cacheSweave} and \pkg{weaver}\footnote{\url{http://www.bioconductor.org/packages/2.3/bioc/html/weaver.html}} packages. These packages address the problem that code chunks with lengthy computations are executed every time a document is compiled. Both packages provide a \code{cache} option which saves R objects for quick access during successive compilations. The \pkg{cacheSweave} package stores results in a \pkg{filehash}\footnote{\url{http://cran.r-project.org/package=filehash}} databases while the \pkg{weaver} package stores RData files. The benefit of the \pkg{cacheSweave} method is lazy loading of objects. Both methods provide significant speedup for most \pkg{Sweave} documents, namely those which create objects in the global environment. The existing methods have some drawbacks: \begin{enumerate} \item Plots are not cached (since plots do not generally create objects in the global environment). If a plot takes a long time to generate, the same problem exists as when lengthy computations are present. Ideally we would like to reuse a plot if the code that generated it has not changed. \item Consistency in style (font, point size) in automatically generated graphics is difficult to achieve. The default font and point size in \lang{R} does not match \LaTeX{} very well and getting this to match precisely is tricky business. The previously mentioned tools, \pkg{tikzDevice} and \pkg{eps2pgf}, counter this but using them with \pkg{Sweave} manually can be cumbersome. \end{enumerate} The \pkg{pgfSweave} package addresses these drawbacks. The so called ``caching'' of plots is achieved with the help of three tools: the \TeX{} package \pkg{PGF}\footnote{\url{http://sourceforge.net/projects/pgf/}} and either the command line utility \pkg{eps2pgf} or the \lang{R} package \pkg{tikzDevice}. When we refer to the ``caching'' of a graphic we mean that if the code chunk which generated the graphic is unchanged, an image included from a file rather than regenerated from the code. The \TeX{} package \pkg{pgf} provides the ability to ``externalize graphics.'' The effect of externalization is that graphics get extracted and compiled separately, saving time on subsequent compilations. The externalization chapter in the \pkg {PGF/Ti\textit{k}Z} manual is extremely well written, and we refer the interested user there for more information. Externalization plus some clever checking on the part of \pkg{pgfSweave} makes up the caching mechanism. The plot style consistency drawback is addressed by the handy options \code{tikz} and \code {pgf} which allow for graphics to be output in these formats. Again, it is possible to do this manually but the chunk options make things easier. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{System Requirements} In general \pkg{pgfSweave} depends on: \begin{enumerate} \item A working \TeX{} distribution (such as TeXLive for linux and mac and MiKTex for Windows) \item The java command line interpreter (i.e. the \code{java} command). This is standard on most systems and is free to download otherwise. \item At least version 2.00 of the \pkg{PGF/Ti\textit{k}Z} package for \LaTeX{}. \end{enumerate} That should be it for any *nix or Mac OS X system. \subsection{Windows specific requirements} The \pkg{pgfSweave} package can work on Windows with some special care. First of all it is strongly recommended that R be installed in a location that does not have spaces in its path name such as \texttt{C:$\backslash$R}. This will save much grief when using \pkg{Sweave}. In addition, do the following in the order listed. \begin{enumerate} \item Install Java. \item Install MiK\TeX{}. \item Upgrade to or install PGF 2.0 if not already done. \item Install Rtools\footnote{\url{http://www.murdoch-sutherland.com/Rtools/}}. Make sure to allow the Rtools installer to modify your PATH. \end{enumerate} If everything is set up correctly, the commands \code{java} and \code{pdflatex} or \code{latex} should be available at the command prompt. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Usage} We assume a familiarity with the usage of \pkg{Sweave}, for more information see the \pkg{Sweave} manual.\footnote{\url{http://www.stat.uni-muenchen.de/~leisch/Sweave/Sweave-manual.pdf}} This section will explain the usage of the \code{tikz}, \code{pgf} and \code{external} options and then provide a complete example. \subsection{The \code{tikz} option} The first new code chunk option, \code{tikz}, acts the same as the \code{pdf} or \code{eps} options but instead of resulting in an \code{$\backslash$includegraphics\{\}} statement the esult is an \code{$\backslash$input\{\}} statement. Consider the following code: \begin{minipage}[!ht]{.5\linewidth} Input: \begin{Verbatim}[frame=single] \begin{figure}[ht] <>= x <- rnorm(100) plot(x) @ \caption{caption} \label{fig:tikz-option} \end{figure} \end{Verbatim} \end{minipage} \begin{minipage}[!ht]{.5\linewidth} Output: \begin{Verbatim}[frame=single] \begin{figure}[ht] \input{tikz-option.tikz} \caption{caption} \label{fig:tikz-option} \end{figure} \end{Verbatim} \end{minipage} \vspace{.5cm} The \code{.tikz} file is generated with the \pkg{tikzDevice} package. {\color{red} This is the default graphics output for \pkg{pgfSweave}, the \code{tikz} option is set to \code{TRUE} by default.} \subsection{The \code{pgf} option} The second new code chunk option \code{pgf}, acts the same as the tikz option in that the result is an \code{$\backslash$input\{\}} statement. Consider the following code: \begin{minipage}[!ht]{.5\linewidth} Input: \begin{Verbatim}[frame=single] \begin{figure}[ht] <>= x <- rnorm(100) plot(x) @ \caption{caption} \label{fig:pgf-option} \end{figure} \end{Verbatim} \end{minipage} \begin{minipage}[!ht]{.5\linewidth} Output: \begin{Verbatim}[frame=single] \begin{figure}[ht] \input{pgf-option.pgf} \caption{caption} \label{fig:pgf-option} \end{figure} \end{Verbatim} \end{minipage} \vspace{.5cm} The \code{.pgf} file is generated with the \pkg{eps2pgf} utility. The \code{postscript} graphics device is used first to generate a \code{.eps} file. Then the command \begin{verbatim}$ java -jar /path/to/eps2pgf.jar -m directcopy graphic.eps\end{verbatim} is run on every code chunk that has \code{fig=TRUE} and \code{pgf=TRUE}. We do not recommend using this option in favor of the \code{tikz} option. Using the \code{pgf} option involves two creation steps instead of one and it strips the \lang{R} text styles (such as boldface). \subsection{The \code{external} option} The external option is the interface to the graphics caching mechanism in \pkg{pgfSweave}. This option will wrap your graphics output in \code{$\backslash$beginpgfgraphicnamed} and \code{$\backslash$endpgfgraphicnamed}. \begin{minipage}[!ht]{.55\linewidth} Input: \begin{Verbatim}[frame=single] \begin{figure}[ht] <>= x <- rnorm(100) plot(x) @ \caption{caption} \label{fig:external-option} \end{figure} \end{Verbatim} \end{minipage} \begin{minipage}[!ht]{.45\linewidth} Output: \begin{Verbatim}[frame=single] \begin{figure}[ht] \beginpgfgraphicnamed{external} \input{external.tikz} \endpgfgraphicnamed \caption{caption} \label{fig:external} \end{figure} \end{Verbatim} \end{minipage} When a graphic is newly created or when it has changed, \pkg{pgfSweave} will generate a command for externalizing that graphic in the shell script \code{$<$filename$>$.sh}. This follows the process outlined in the externalization section of the pgf manual. After the \pkg{Sweave} process is done the externalization commands are run. This will create separate image files for each graphic. On later compilations this image file will simply be included instead of being regenerated. {\color{red}NOTE:} \code{$\backslash$pgfrealjobname\{$<$basefilename$>$\}} in the header of your document otherwise externalization will not work! For example if you document is \code{main.Rnw} then your header should contain the line \code{$\backslash$pgfrealjobname\{main\}}. Did we mention you should read the pgf manual? \subsubsection{The Externalization Driver} The option \code{tex.driver} not very well publicized, but it controls which engine (\LaTeX{},PDF\LaTeX{}, Xe\LaTeX{}) is used. This way, the externalization feature can be used to generate eps and pdf external files. \subsubsection{Compilation Time} The combination of \pkg{cacheSweave} code caching and \pkg{pgfSweave} figure caching can provide drastic decrease in compilation time. The time speedup is highly dependednt on what code you are executing but using \pkg{pgfSweave} effectivly reduces the compilation time of \pkg{Sweave} to the time it takes to compile the \LaTeX{} document. \subsection{A Complete Example} At this point we will provide a complete example. The example from the \pkg{Sweave} manual is used to highlight the differences. The two frame below show the input Sweave file \texttt{pgfSweave-example-Rnw.in} and the resulting tex file \texttt{pgfSweave-example-tex.in}. \VerbatimInput[frame=single,label={pgfSweave-example-Rnw.in},labelposition=all]{pgfSweave-example-Rnw.in} On the input file run: \begin{Verbatim} R> library(pgfSweave) R> pgfSweave('example.Rnw',pdf=T) \end{Verbatim} or \begin{Verbatim} $ R CMD pgfsweave example.Rnw \end{Verbatim} And we get: \VerbatimInput[frame=single,label={pgfSweave-example-tex.in},labelposition=all]{pgfSweave-example-tex.in} \begin{figure}[!hp] \framebox{\includegraphics[width=\textwidth]{pgfSweave-example.pdf}} \end{figure} \clearpage %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{The Process} The process that \pkg{pgfSweave} uses when caching and externalization are turned on is outlined in the flow chart below: \begin{figure}[!ht] \centering \begin{tikzpicture} [ node distance=4mm and 9mm, scale=.8, block/.style ={ rectangle, draw=gray!80, thick, top color=gray!20, bottom color=white, text badly centered, text width=7em }, decision/.style={ diamond, draw=gray!80, thick, top color=gray!20, bottom color=white, text width=5em, text centered, inner sep=0pt }, a/.style={ -stealth', draw=gray } ] \node (init)[block,text width=10em,rounded corners,] {Examine code chunk}; \node (past) [decision,below=of init] {Has the code chunk changed from a previous run?}; \node (run) [block,right=of past] {Run the chunk and cache the results}; \node (lazy) [block,below=of past] {Lazyload the results}; \node (plotting) [decision,below=of lazy] {Did the chunk do any plotting?}; \node (move) [block,left=of plotting] {Move on to next chunk}; \node (extern)[decision,below=of plotting] {Is the graphic non-existant or has the chunk changed?}; \node (doextern)[block,right=of extern] {Generate the graphic and the extenaliztion commands}; \node (out)[decision,below=of extern] {Out of chunks?}; \node (end)[block,text width=25em,rounded corners,below=of out] {End (Still need to run externalization commands)}; \draw[a] (init) edge (past); \draw[a] (past) edge node [above] {yes} (run); \draw[a] (past) edge node [left] {no} (lazy); \draw[a] (lazy) -- (plotting); \draw[a] (run) |- (plotting); \draw[a] (plotting) edge node [left] {yes} (extern); \draw[a] (plotting) -- node [above] {no} (move); \draw[a] (extern) edge node [above] {yes} (doextern); \draw[a] (extern) -| node [left] {yes} (move); \draw[a] (extern) edge node [left] {no} (out); \draw[a] (out) -| node [left] {no} (move); \draw[a] (doextern) |- (out); \draw[a] (out) -- (end); \draw[a] (move) |- (past); \end{tikzpicture} \caption{Flow chart of modeling procedure.}\label{flow} \end{figure} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Consistency in style between graphics and text} %% initial calculations In Figure \ref{normalSweave}, notice the inconsistency in font and size between the default \lang{R} output and the default \LaTeX{} output. Fonts and font sizes can be changed from \lang{R} but it is hard to be precise. What if you decide to change the font and and point size of your entire document? In Figure \ref{pgfSweave-hist} and \ref{pgfSweave-tikz-hist}the text is consistent with the rest of the document. \begin{figure}[!ht] \begin{minipage}{.45\linewidth} \centering \includegraphics{figs/fig-normalSweave} \caption{This is normal \pkg{Sweave}.}\label{normalSweave} \end{minipage} \begin{minipage}[!ht]{.45\linewidth} %% pgf file will get regenerated every time slowing down the whole compilation. %% even though cache=TRUE. \centering \beginpgfgraphicnamed{figs/fig-pgfSweave-hist} \input{figs/fig-pgfSweave-hist.pgf} \endpgfgraphicnamed \caption{This is from \pkg{pgfSweave} with the \code{pgf} option.}\label{pgfSweave-hist} \end{minipage} \end{figure} \begin{figure}[!ht] \centering %% pgf file will get regenerated every time slowing down the whole compilation. %% even though cache=TRUE. \centering \beginpgfgraphicnamed{figs/fig-pgfSweave-tikz-hist} \input{figs/fig-pgfSweave-tikz-hist.tikz} \endpgfgraphicnamed \caption{This is from \pkg{pgfSweave} with the \code{tikz} option.}\label{pgfSweave-tikz-hist} \end{figure} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Sweave graphic width defaults} The default in \code{Sweave.sty} is to fix the width of every image to 80\% of the text width by using \verb"\setkeys{Gin}{width=.8\textwidth}". Say you have a 7 in text width and code chunk where you set \code{width=4}. The original 4 inch wide graphic will have text size matching your document but when it is included in your document it will be scaled up to 7 inched wide and the text will get bigger! This default is quite contrary to the philosophy of \pkg{pgfSweave}. There are two ways around this before each code chunk you can set \verb"\setkeys{Gin}{width=}". Alternatively (and the recommended way) you can turn off this feature globally by using \verb"\usepackage[nogin]{Sweave}", that way the width and height of the figure are controlled by the arguments to the code chunk. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Command line interface}\label{commandline} In recent versions, \pkg{pgfSweave} got an \code{R CMD} command line interface. On Unix alikes (including Mac OS X) a symbolic link \code{\$R\_HOME/bin/pgfsweave} to \code{\$R\_HOME/library/pgfSweave/exec/pgfsweave-script.R}. On Windows a copy of the script is made instead. This script requires the \pkg{getopt} package. Here is a listing from \code{R CMD pgfsweave --help}: \begin{Verbatim}[frame=single] Usage: R CMD pgfsweave [options] file A simple front-end for pgfSweave() Options: -h, --help print short help message and exit -v, --version print pgfSweave version info and exit -d, --dvi dont use texi2dvi option pdf=T i.e. call latex (defalt is pdflatex) -s, --pgfsweave-only dont compile to pdf/dvi, only run Sweave -n, --graphics-only dont use the texi2dvi() funciton in R, compile graphics only ; ignored if --pgfsweave-only is used Package repositories: http://r-forge.r-project.org/projects/pgfsweave/ \end{Verbatim} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Frequently Asked Questions} %-------------------------------------- %-------------------------------------- \ques{Can \pkg{pgfSweave} be run from the command line?} Sure! See section \ref{commandline}. \begin{Verbatim} $ R CMD pgfsweave .Rnw \end{Verbatim} %-------------------------------------- %-------------------------------------- \ques{The changes to my code chunk are not being recognized.} Occasionally \pkg{pgfSweave} suffers from overzealous caching. In these cases it may be necessary to manually delete the cache or the figure files. This is something we need to improve. %-------------------------------------- %-------------------------------------- \ques{How do I set subdirectories for figures and caches?} This is straight out of the \pkg{Sweave} and \pkg{cacheSweave} manuals (nothing new here). For a figures subdirectory \footnote{make sure to create the directory first!} use the \code{prefix.string} option: \begin{verbatim}\SweaveOpts{prefix.string=figs/fig}\end{verbatim} For a caching subdirectory use a code chunk at the beginning or your document like: \begin{verbatim} <>= setCacheDir("cache") @ \end{verbatim} %-------------------------------------- %-------------------------------------- \ques{Why are the width and height options being ignored?} This is another one from \pkg{Sweave}. You must use the \code{nogin} option in \code{Sweave.sty} for the width and height parameters to actually affect the size of the image in the document: \begin{verbatim}\usepackage[nogin]{Sweave}\end{verbatim} %-------------------------------------- %-------------------------------------- \ques{\LaTeX{}/PDF\LaTeX{} is not found in R.app (Mac OS X) and [Possibly] R.exe (Windows)} Your latex program is not in the default search path. Put a line such as: \begin{verbatim}Sys.setenv("PATH" = paste(Sys.getenv("PATH"),"/usr/texbin",sep=":"))\end{verbatim} in your \verb".Rprofile" file. %-------------------------------------- %-------------------------------------- \ques{I get a bunch of ``Incompatible list can't be unboxed'' errors when compiling.} This is a problem with PGF. The workaround is to load the \pkg{atbegshi} package before PGF or TikZ: \begin{verbatim} \usepackage{atbegshi} \usepackage{pgf} \end{verbatim} or \begin{verbatim} \usepackage{atbegshi} \usepackage{tikz} \end{verbatim} %-------------------------------------- %-------------------------------------- \ques{The vignette in \texttt{/inst/doc/} does not contain any code chunks!} That is because the vignette in \texttt{/inst/doc/} is a ``fake'' vignette generated from the ``real'' vignette in \texttt{/inst/misc/vignette-src/}. The reason for this extra step is that package vignettes must be able to be compiled with \texttt{R CMD Sweave}, which is precisely what we don't want to use! To compile the vignette yourself, download the package source, unpack it and then do the following: \begin{verbatim} $ R CMD INSTALL pgfSweave $ cd pgfSweave/inst/misc/vignette-src/ $ make \end{verbatim} Which will create \code{pgfSweave-vignette-source.pdf} \end{document}