Generating Rd files

Roxygen process

There are three steps in the transformation from roxygen comments in your source file to human-readable documentation:

  1. You add roxygen comments to your source file.
  2. roxygen2::roxygenise() converts roxygen comments to .Rd files.
  3. R converts .Rd files to human-readable documentation.

The process starts when you add specially formatted roxygen comments to your source file. Roxygen comments start with #' so you can continue to use regular comments for other purposes.

#' Add together two numbers
#'
#' @param x A number
#' @param y A number
#' @return The sum of \code{x} and \code{y}
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}

For the example above, this will generate a man/add.Rd file that looks like:

% Generated by roxygen2 (3.2.0): do not edit by hand
\name{add}
\alias{add}
\title{Add together two numbers}
\usage{
add(x, y)
}
\arguments{
  \item{x}{A number}

  \item{y}{A number}
}
\value{
The sum of \code{x} and \code{y}
}
\description{
Add together two numbers
}
\examples{
add(1, 1)
add(10, 1)
}

Rd files are a special file format loosely based on LaTeX. You can read more about the Rd format in the R extensions manual. I'll avoid discussing Rd files as much as possible, focussing instead on what you need to know about roxygen2.

When you use ?x, help("x") or example("x"), R looks for an Rd file containing \alias{x}. It then parses the file, converts it into HTML and displays it.

All of these functions look for an Rd file in installed packages. This isn't very useful for package development, because you want to use the .Rd files in the source package. devtools provides two helpful functions for this scenario: dev_help() and dev_example(). They behave similarly to help() and example() but look in source packages you've loaded with load_all(), not installed packages you've loaded with library().
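
As a rough illustration of the parse-and-render step, the sketch below writes a cut-down copy of the add.Rd file shown earlier to a temporary file and renders it with the base tools package. It is only an approximation of what help() does: it assumes tools::parse_Rd() and tools::Rd2HTML() are a fair stand-in for R's own pipeline, and it skips the lookup of \alias{} entries in installed packages.

```r
# Cut-down copy of man/add.Rd, written to a temporary file.
rd_lines <- c(
  "\\name{add}",
  "\\alias{add}",
  "\\title{Add together two numbers}",
  "\\usage{",
  "add(x, y)",
  "}",
  "\\arguments{",
  "  \\item{x}{A number}",
  "",
  "  \\item{y}{A number}",
  "}",
  "\\value{",
  "The sum of \\code{x} and \\code{y}",
  "}",
  "\\description{",
  "Add together two numbers",
  "}"
)
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_lines, rd_file)

# Parse the Rd source, then render it to an HTML file.
rd <- tools::parse_Rd(rd_file)
html_file <- tempfile(fileext = ".html")
tools::Rd2HTML(rd, out = html_file)

# Read the rendered page back in so it can be inspected.
html <- readLines(html_file)
```

Opening html_file in a browser shows roughly what the help page for add() would look like.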

Basic documentation

Roxygen comments start with #' and include tags like @tag details. Tags break the documentation up into pieces, and the content of a tag extends from the end of the tag name to the start of the next tag (or the end of the block). Because @ has a special meaning in roxygen, you need to write @@ to add a literal @ to the documentation.
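
A minimal sketch of the @@ escape, using a hypothetical email_domain() function that is not part of any package. The doubled @@ in the comment should come out as a single @ in the generated help page.

```r
#' Extract the domain from an email address.
#'
#' @param address A string such as \code{user@@example.com}. The doubled
#'   at sign is needed because a single one would start a new tag.
#' @return The part of \code{address} after the at sign.
email_domain <- function(address) {
  # Drop everything up to and including the last "@".
  sub("^.*@", "", address)
}

email_domain("user@example.com")
```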

Each documentation block starts with some text. The first sentence becomes the title of the documentation. That's what you see when you look at help(package = mypackage) and is shown at the top of each help file. It should fit on one line, be written in sentence case, and end in a full stop. The second paragraph is the description: this comes first in the documentation and should briefly describe what the function does. The third and subsequent paragraphs go into the details: this is an (often long) section that comes after the argument description and should provide any other important details of how the function operates.

Here's an example showing what the documentation for sum() might look like if it had been written with roxygen:

#' Sum of vector elements.
#'
#' \code{sum} returns the sum of all the values present in its arguments.
#'
#' This is a generic function: methods can be defined for it directly
#' or via the \code{\link{Summary}} group generic. For this to work properly,
#' the arguments \code{...} should be unnamed, and dispatch is on the
#' first argument.
sum <- function(..., na.rm = FALSE) {}

\code{} and \link{} are .Rd formatting commands which you'll learn more about in formatting. Also notice the wrapping of the roxygen block. You should make sure that your comments are less than ~80 columns wide.

The following documentation produces the same help file as above, but uses explicit tags. You only need explicit tags if you want the title or description to span multiple paragraphs (a bad idea), or want to omit the description (in which case roxygen will use the title for the description, since it's a required documentation component).

#' @title Sum of vector elements.
#'
#' @description
#' \code{sum} returns the sum of all the values present in its arguments.
#'
#' @details
#' This is a generic function: methods can be defined for it directly
#' or via the \code{\link{Summary}} group generic. For this to work properly,
#' the arguments \code{...} should be unnamed, and dispatch is on the
#' first argument.
sum <- function(..., na.rm = FALSE) {}

All objects must have a title and description. Details are optional.

Enhancing navigation

There are two tags that make it easier for people to navigate around your documentation: @seealso and @family.

@seealso allows you to point to other useful resources, either on the web, with \url{http://www.r-project.org}, or in other documentation, with \code{\link{functionname}}.

If you have a family of related functions, you can use the @family <family> tag to automatically add appropriate lists and interlinks to the @seealso section. Because it will appear as “Other <family>:”, the @family name should be plural (e.g., “model building helpers” not “model building helper”). You can make a function a member of multiple families by repeating the @family tag for each additional family. These will then get separate headings in the seealso section.

For sum, these components might look like:

#' @family aggregate functions
#' @seealso \code{\link{prod}} for products, \code{\link{cumsum}} for
#'  cumulative sums, and \code{\link{colSums}}/\code{\link{rowSums}} for
#'  marginal sums over high-dimensional arrays.
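
A sketch of a function that belongs to two families at once. The function mean_no_na() and the family name “missing value helpers” are made up for illustration; following the description above, each @family tag should produce its own “Other ...:” heading.

```r
#' Mean of the non-missing elements.
#'
#' \code{mean_no_na} returns the mean of \code{x} after dropping missing
#' values.
#'
#' @param x A numeric vector.
#' @return A length-one numeric vector.
#' @family aggregate functions
#' @family missing value helpers
mean_no_na <- function(x) {
  mean(x, na.rm = TRUE)
}
```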

Three other tags make it easier for the user to find documentation:

You use other tags based on the type of object that you're documenting. The following sections describe the most commonly used tags for functions, S3, S4 and RC objects and data.

Documenting functions

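Functions are the most commonly documented objects. Most functions use three tags:

  * @param name description describes the function's inputs or parameters.
  * @examples provides executable R code showing how to use the function in practice.
  * @return description describes the output from the function.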

We could use these new tags to improve our documentation of sum() as follows:

#' Sum of vector elements.
#'
#' \code{sum} returns the sum of all the values present in its arguments.
#'
#' This is a generic function: methods can be defined for it directly
#' or via the \code{\link{Summary}} group generic. For this to work properly,
#' the arguments \code{...} should be unnamed, and dispatch is on the
#' first argument.
#'
#' @param ... Numeric, complex, or logical vectors.
#' @param na.rm A logical scalar. Should missing values (including NaN)
#'   be removed?
#' @return If all inputs are integer and logical, then the output
#'   will be an integer. If integer overflow
#'   \url{http://en.wikipedia.org/wiki/Integer_overflow} occurs, the output
#'   will be NA with a warning. Otherwise it will be a length-one numeric or
#'   complex vector.
#'
#'   Zero-length vectors have sum 0 by definition. See
#'   \url{http://en.wikipedia.org/wiki/Empty_sum} for more details.
#' @examples
#' sum(1:10)
#' sum(1:5, 6:10)
#' sum(F, F, F, T, T)
#'
#' sum(.Machine$integer.max, 1L)
#' sum(.Machine$integer.max, 1)
#'
#' \dontrun{
#' sum("a")
#' }
sum <- function(..., na.rm = FALSE) {}

Indent the second and subsequent lines of a tag so that, when scanning the documentation, it's easy to see where one tag ends and the next begins. Tags that always span multiple lines (like @examples) should start on a new line and don't need to be indented.

Documenting classes, generics and methods

Documenting classes, generics and methods is relatively straightforward, but there are some variations based on the object system. The following sections give the details for the S3, S4 and RC object systems.

S3

S3 generics are regular functions, so document them as such. S3 classes have no formal definition, so document the constructor function. It is your choice whether or not to document S3 methods. You don't need to document methods for simple generics like print(). If your method is more complicated, you should document it so people know what the parameters do. In base R, you can find documentation for more complex methods like predict.lm(), predict.glm(), and anova.glm().
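
A minimal sketch of that advice, using a hypothetical S3 class called account. The constructor carries the class documentation, and the summary() method is documented because it adds a digits parameter that users need to know about.

```r
#' Create a bank account.
#'
#' \code{new_account} constructs an object of S3 class \code{account}.
#'
#' @param balance A length-one numeric vector.
#' @return An object of class \code{account}.
new_account <- function(balance = 0) {
  structure(list(balance = balance), class = "account")
}

#' Summarise a bank account.
#'
#' Prints the balance rounded to a fixed number of decimal places.
#'
#' @param object An \code{account} object.
#' @param ... Ignored.
#' @param digits Number of decimal places to show.
#' @return \code{object}, invisibly.
summary.account <- function(object, ..., digits = 2) {
  cat("Balance:", format(round(object$balance, digits), nsmall = digits), "\n")
  invisible(object)
}
```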

Older versions of roxygen required explicit @method generic class tags for all S3 methods. From 3.0.0 this is no longer needed, as roxygen2 will figure it out automatically. If you are upgrading, make sure to remove these old tags. Automatic method detection will only fail if the generic and class are ambiguous. For example, is all.equal.data.frame() the equal.data.frame method for all, or the data.frame method for all.equal? If this happens, you can disambiguate with (e.g.) @method all.equal data.frame.
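
The disambiguation might look like the sketch below. The method body is a made-up placeholder that compares column names only; the point is the @method line, which names the generic (all.equal) and the class (data.frame) explicitly.

```r
#' Compare two data frames by column names.
#'
#' A deliberately simplified, hypothetical method used only to show the
#' tag.
#'
#' @param target,current Data frames to compare.
#' @param ... Ignored.
#' @return \code{TRUE}, or a string describing the difference.
#' @method all.equal data.frame
all.equal.data.frame <- function(target, current, ...) {
  if (identical(names(target), names(current))) {
    TRUE
  } else {
    "Column names differ"
  }
}
```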

S4

Older versions of roxygen2 required explicit @usage, @aliases and @docType to correctly document S4 objects, but from version 3.0.0 on roxygen2 generates correct metadata automatically. If you're upgrading from a previous version, make sure to remove these old tags.

S4 generics are also functions, so document them as such. Document S4 classes by adding a roxygen block before setClass(). Use @slot to document the slots of the class. Here's a simple example:

#' An S4 class to represent a bank account.
#'
#' @slot balance A length-one numeric vector
Account <- setClass("Account",
  slots = list(balance = "numeric")
)

S4 methods are a little more complicated. Unlike S3, all S4 methods must be documented. You can document them in three places:

Use either @rdname or @describeIn to control where method documentation goes. See the next section for more details.
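
As a sketch of one of these options, the block below documents an S4 method in the help file of its generic using @describeIn. The Savings class and the deposit() generic are hypothetical names chosen for illustration.

```r
library(methods)

#' An S4 class to represent a savings account.
#'
#' @slot balance A length-one numeric vector.
Savings <- setClass("Savings", slots = list(balance = "numeric"))

#' Deposit money into an account.
#'
#' @param account An account object.
#' @param amount A length-one numeric vector.
#' @return The updated account object.
setGeneric("deposit", function(account, amount) standardGeneric("deposit"))

#' @describeIn deposit Add \code{amount} to the balance of a
#'   \code{Savings} account.
setMethod("deposit", "Savings", function(account, amount) {
  account@balance <- account@balance + amount
  account
})
```

Replacing @describeIn deposit with @rdname deposit would merge the method into the same file without the generated bullet list.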

RC

RC is different to S3 and S4 because methods are associated with classes, not generics. RC also has a special convention for documenting methods: the docstring. This makes documenting RC simpler than S4 because you only need one roxygen block per class.

#' A Reference Class to represent a bank account.
#'
#' @field balance A length-one numeric vector
Account <- setRefClass("Account",
  fields = list(balance = "numeric"),
  methods = list(
    withdraw = function(x) {
      "Withdraw money from account. Allows overdrafts"
      balance <<- balance - x
    }
  )
)

Methods with docstrings will be included in the “Methods” section of the class documentation. Each documented method will be listed with an automatically generated usage statement and its docstring.

Documenting datasets

Datasets are usually stored as .rdata files in data/ and not as regular R objects in the package. This means you need to document them slightly differently: instead of documenting the data directly, you document NULL, and use @name to tell roxygen2 what dataset you're really documenting.

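There are two additional tags that are useful for documenting datasets:

  * @format, which gives an overview of the structure of the dataset.
  * @source, which records where the data came from, often a \url{}.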

To show how everything fits together, the example below is an excerpt from the roxygen block used to document the diamonds dataset in ggplot2.

#' Prices of 50,000 round cut diamonds.
#'
#' A dataset containing the prices and other attributes of almost 54,000
#' diamonds. The variables are as follows:
#'
#' \itemize{
#'   \item price. price in US dollars (\$326--\$18,823)
#'   \item carat. weight of the diamond (0.2--5.01)
#'   ...
#' }
#'
#' @format A data frame with 53940 rows and 10 variables
#' @source \url{http://www.diamondse.info/}
#' @name diamonds
NULL

Documenting packages

As well as documenting every exported object in the package, you should also document the package itself. Relatively few packages provide package documentation, but it's an extremely useful tool for users, because instead of just listing functions like help(package = pkgname) it organises them and shows the user where to get started.

Package documentation should describe the overall purpose of the package and point to the most important functions. It should not contain a verbatim list of functions or a copy of DESCRIPTION. This file is for human reading, so pick the most important elements of your package.

Package documentation should be placed in pkgname.R. Here's an example:

#' Generate R documentation from inline comments.
#'
#' Roxygen2 allows you to write documentation in comment blocks co-located
#' with code.
#'
#' The only function you're likely to need from \pkg{roxygen2} is
#' \code{\link{roxygenize}}. Otherwise refer to the vignettes to see
#' how to format the documentation.
#'
#' @docType package
#' @name roxygen2
NULL

Some notes:

Package documentation is a good place to list all options() that a package understands and to document their behaviour. Put them in a section called “Package options”, as described below.
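
A sketch of what such a section might look like. The package name mypkg, the option mypkg.verbose and the helper mypkg_verbose() are all invented for illustration.

```r
#' A hypothetical package.
#'
#' Short description of what the package is for.
#'
#' @section Package options:
#' \describe{
#'   \item{\code{mypkg.verbose}}{Logical. If \code{TRUE}, functions print
#'     progress messages. Defaults to \code{FALSE}.}
#' }
#'
#' @docType package
#' @name mypkg
NULL

# Hypothetical helper that reads the option documented above.
mypkg_verbose <- function() {
  isTRUE(getOption("mypkg.verbose", default = FALSE))
}
```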

Do repeat yourself

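There is a tension between the DRY (do not repeat yourself) principle of programming and the need for documentation to be self-contained. It's frustrating to have to navigate through multiple help files in order to pull together all the pieces you need. Roxygen2 provides three ways to avoid repeating yourself in code documentation, while assembling information from multiple places in one documentation file:

  * Create reusable templates with @template and @templateVar.
  * Inherit parameter descriptions from other functions with @inheritParams.
  * Document multiple functions in the same file with @describeIn or @rdname.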

Roxygen templates

Roxygen templates are R files containing only roxygen comments that live in the man-roxygen directory. Use @template file-name (without extension) to insert the contents of a template into the current documentation.

You can make templates more flexible by using template variables defined with @templateVar name value. Template files are run with brew, so you can retrieve values (or execute any other arbitrary R code) with <%= name %>.

Note that templates are parsed a little differently to regular blocks, so you'll need to explicitly set the title, description and details with @title, @description and @details.
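
A sketch of how the pieces might fit together, shown as two files in one block. The file names, the template variable arg and the function spread() are hypothetical, and placing a brew expression in the parameter-name position is an assumption about how the substitution is applied.

```r
# --- man-roxygen/vector-arg.R (hypothetical template file) ---
#' @param <%= arg %> A numeric vector. Missing values are dropped.

# --- R/spread.R (hypothetical source file that uses the template) ---
#' Spread of a numeric vector.
#'
#' Computes the difference between the largest and smallest values.
#'
#' @template vector-arg
#' @templateVar arg x
#' @return A length-one numeric vector.
spread <- function(x) {
  x <- x[!is.na(x)]
  max(x) - min(x)
}
```

With arg set to x, the template line should expand to a @param entry for x in the documentation of spread().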

Inheriting parameters from other functions

You can inherit parameter descriptions from other functions using @inheritParams source_function. This tag will bring in all documentation for parameters that are undocumented in the current function, but documented in the source function. The source can be a function in the current package, via @inheritParams function, or a function in another package, via @inheritParams package::function.

Note, however, that inheritance does not chain. In other words, the source_function must always be the function that defines the parameter using @param.
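
A minimal sketch, reusing the add() function from earlier alongside a hypothetical add_scaled(). Only the new parameter is documented in the second block; x and y should be filled in from add().

```r
#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of \code{x} and \code{y}.
add <- function(x, y) {
  x + y
}

#' Add together two numbers, then rescale.
#'
#' @inheritParams add
#' @param by A number to multiply the sum by.
#' @return The sum of \code{x} and \code{y}, multiplied by \code{by}.
add_scaled <- function(x, y, by = 1) {
  add(x, y) * by
}
```

Following the note above, a third function that also needs x and y would point at add, where they are defined with @param, rather than at add_scaled.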

Documenting multiple functions in the same file

You can document multiple functions in the same file by using either the @rdname or the @describeIn tag. It's a technique best used with caution: documenting too many functions in one place leads to confusing documentation. It's best used when all functions have the same (or very similar) arguments.

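@describeIn is designed for the most common cases:

  * documenting methods in a generic,
  * documenting methods in a class,
  * documenting functions with the same (or similar) arguments.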

It generates a new section, named either “Methods (by class)”, “Methods (by generic)” or “Functions”. The section contains a bulleted list describing each function, labelled so that you know what function or method it's talking about. Here's an example, documenting an imaginary new generic:

#' Foo bar generic
#'
#' @param x Object to foo.
foobar <- function(x) UseMethod("foobar")

#' @describeIn foobar Difference between the mean and the median
foobar.numeric <- function(x) abs(mean(x) - median(x))

#' @describeIn foobar First and last values pasted together in a string.
foobar.character <- function(x) paste0(x[1], "-", x[length(x)])

An alternative to @describeIn is @rdname. It overrides the default file name generated by roxygen and merges documentation for multiple objects into one file. This gives you complete freedom to combine documentation however you see fit. There are two ways to use @rdname. You can add documentation to an existing function:

#' Basic arithmetic
#'
#' @param x,y numeric vectors.
add <- function(x, y) x + y

#' @rdname add
times <- function(x, y) x * y

Or, you can create a dummy documentation file by documenting NULL and setting an informative @name.

#' Basic arithmetic
#'
#' @param x,y numeric vectors.
#' @name arith
NULL

#' @rdname arith
add <- function(x, y) x + y

#' @rdname arith
times <- function(x, y) x * y

Sections

You can add arbitrary sections to the documentation for any object with the @section tag. This is a useful way of breaking a long details section into multiple chunks with useful headings. Section titles should be in sentence case and must be followed by a colon. Titles may only take one line.

#' @section Warning:
#' Do not operate heavy machinery within 8 hours of using this function.

To add a subsection, you must use the Rd \subsection{} command, as follows:

#' @section Warning:
#' You must not call this function unless ...
#'
#' \subsection{Exceptions}{
#'    Apart from the following special cases...
#' }