---
title: Documenting functions
layout: default
output: bookdown::html_chapter
---
# Introduction to roxygen2
Documentation is one of the most important aspects of good code. Without it, users won't know how to use your package, and are unlikely to do so. Documentation is also useful for you in the future (so you remember what the heck you were thinking!), and for other developers working on your package. The goal of roxygen2 is to make documenting your code as easy as possible. R provides a standard way of documenting packages: you write `.Rd` files in the `man/` directory. These files use a custom syntax, loosely based on LaTeX. Roxygen2 provides a number of advantages over writing `.Rd` files by hand:
* Code and documentation are adjacent so when you modify your code, it's easy
to remember that you need to update the documentation.
* Roxygen2 dynamically inspects the objects that it's documenting, so it
can automatically add data that you'd otherwise have to write by hand.
* It abstracts over the differences in documenting S3 and S4 methods,
generics and classes so you need to learn fewer details.
As well as generating `.Rd` files, roxygen will also create a `NAMESPACE` for you, and will manage the `Collate` field in `DESCRIPTION`.
Roxygen is used in two other places:
* [Managing your `NAMESPACE`](#namespace) describes how to generate
a `NAMESPACE` file, how namespacing works in R, and how you can use Roxygen2 to be
specific about what your package needs and supplies.
* [Controlling collation order](#collate) describes how roxygen2
controls file loading order if you need to make sure one file is
loaded before another.
## Running roxygen
There are three main ways to run roxygen:
* `roxygen2::roxygenise()`, or
* `devtools::document()`, if you're using devtools, or
* `Ctrl + Shift + D`, if you're using RStudio.
As of version 4.0.0, roxygen2 will never overwrite a file it didn't create. It does this by labelling every file it creates with a comment: "Generated by roxygen2 (version): do not edit by hand".
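The ownership check described above can be sketched in a few lines of base R. This is only an illustration of the idea, not roxygen2's actual implementation: the helper name `safe_to_overwrite()` is hypothetical, and it assumes the marker comment sits on the first line of the file.

```r
# Hypothetical helper: decide whether a generated file may be replaced.
# Assumes the "Generated by roxygen2" marker is on the first line.
safe_to_overwrite <- function(path) {
  if (!file.exists(path)) {
    return(TRUE)  # nothing to clobber
  }
  first <- readLines(path, n = 1, warn = FALSE)
  length(first) == 1 && grepl("Generated by roxygen2", first, fixed = TRUE)
}

# A file that carries the marker, and one written by hand.
generated <- tempfile(fileext = ".Rd")
writeLines(c("% Generated by roxygen2 (version): do not edit by hand",
             "\\name{add}"), generated)

by_hand <- tempfile(fileext = ".Rd")
writeLines("\\name{add}", by_hand)

safe_to_overwrite(generated)
safe_to_overwrite(by_hand)
```

The first call returns `TRUE` and the second `FALSE`, so a hand-written `.Rd` file would be left alone.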
## Help
You've probably used help a lot, but you might not be aware of the more advanced features:
* `package?lubridate`
* `class?myclass`
* `methods?`
* `method?`
* `method?combo("numeric", "numeric")`
* `?combo(1:10, letters)`
Roxygen automatically takes care of generating the special aliases needed to make these lookups work.
How does help work? It finds the matching Rd file, and compiles it to either text or HTML output. This is complicated by the fact that binary packages don't include individual Rd files; they include a pre-parsed database of Rd files.
## Roxygen process
There are three steps in the transformation from roxygen comments in your source file to human readable documentation:
1. You add roxygen comments to your source file.
2. `roxygen2::roxygenise()` converts roxygen comments to `.Rd` files.
3. R converts `.Rd` files to human readable documentation.
The process starts when you add specially formatted roxygen comments to your source file. Roxygen comments start with `#'` so you can continue to use regular comments for other purposes.
```{r}
#' Add together two numbers
#'
#' @param x A number
#' @param y A number
#' @return The sum of \code{x} and \code{y}
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
```
For the example above, this will generate `man/add.Rd`, which looks like:
```
% Generated by roxygen2 (3.2.0): do not edit by hand
\name{add}
\alias{add}
\title{Add together two numbers}
\usage{
add(x, y)
}
\arguments{
\item{x}{A number}
\item{y}{A number}
}
\value{
The sum of \code{x} and \code{y}
}
\description{
Add together two numbers
}
\examples{
add(1, 1)
add(10, 1)
}
```
Rd files are a special file format loosely based on LaTeX. You can read more about the Rd format in the [R extensions](http://cran.r-project.org/doc/manuals/R-exts.html#Rd-format) manual. I'll avoid discussing Rd files as much as possible, focussing instead on what you need to know about roxygen2.
When you use `?x`, `help("x")` or `example("x")`, R looks for an Rd file containing `\alias{x}`. It then parses the file, converts it into HTML and displays it.
All of these functions look for an Rd file in _installed_ packages. This isn't very useful for package development, because you want to use the `.Rd` files in the _source_ package. `devtools` provides two helpful functions for this scenario: `dev_help()` and `dev_example()`. They behave similarly to `help()` and `example()` but look in source packages you've loaded with `load_all()`, not installed packages you've loaded with `library()`.
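To see the parse-and-convert step in isolation, you can run it yourself with the `tools` package that ships with R. The sketch below assumes nothing beyond base R: it writes the `add.Rd` content shown earlier to a temporary file, parses it, and renders it to both text and HTML. The temporary file names are arbitrary.

```r
# The Rd source for add(), as generated by roxygen2 in the example above.
rd_lines <- c(
  "\\name{add}",
  "\\alias{add}",
  "\\title{Add together two numbers}",
  "\\usage{",
  "add(x, y)",
  "}",
  "\\arguments{",
  "\\item{x}{A number}",
  "\\item{y}{A number}",
  "}",
  "\\value{",
  "The sum of \\code{x} and \\code{y}",
  "}",
  "\\description{",
  "Add together two numbers",
  "}",
  "\\examples{",
  "add(1, 1)",
  "add(10, 1)",
  "}"
)
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_lines, rd_file)

# Parse the Rd file into an R object.
rd <- tools::parse_Rd(rd_file)

# Render the parsed object to plain text and to HTML.
txt_file <- tempfile(fileext = ".txt")
html_file <- tempfile(fileext = ".html")
tools::Rd2txt(rd, out = txt_file)
tools::Rd2HTML(rd, out = html_file)

txt <- readLines(txt_file)
html <- readLines(html_file)
```

Printing `txt` with `writeLines(txt)` shows roughly what `?add` would display in a text-only console.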
## Basic documentation
Roxygen comments start with `#'` and include tags like `@tag details`. Tags break the documentation up into pieces, and the content of a tag extends from the end of the tag name to the start of the next tag (or the end of the block). Because `@` has a special meaning in roxygen, you need to write `@@` to add a literal `@` to the documentation.
Each documentation block starts with some text. The first sentence becomes the title of the documentation. That's what you see when you look at `help(package = mypackage)` and is shown at the top of each help file. It should fit on one line, be written in sentence case, and end in a full stop. The second paragraph is the description: this comes first in the documentation and should briefly describe what the function does. The third and subsequent paragraphs go into the details: this is an (often long) section that comes after the argument description and should provide any other important details of how the function operates.
Here's an example showing what the documentation for `sum()` might look like if it had been written with roxygen:
```{r}
#' Sum of vector elements.
#'
#' \code{sum} returns the sum of all the values present in its arguments.
#'
#' This is a generic function: methods can be defined for it directly
#' or via the \code{\link{Summary}} group generic. For this to work properly,
#' the arguments \code{...} should be unnamed, and dispatch is on the
#' first argument.
sum <- function(..., na.rm = FALSE) {}
```
`\code{}` and `\link{}` are `.Rd` formatting commands which you'll learn more about in [formatting](#text-formatting). Also notice the wrapping of the roxygen block. You should make sure that your comments are less than ~80 columns wide.
The following documentation produces the same help file as above, but uses explicit tags. You only need explicit tags if you want the title or description to span multiple paragraphs (a bad idea), or want to omit the description (in which case roxygen will use the title for the description, since it's a required documentation component).
```{r}
#' @title Sum of vector elements.
#'
#' @description
#' \code{sum} returns the sum of all the values present in its arguments.
#'
#' @details
#' This is a generic function: methods can be defined for it directly
#' or via the \code{\link{Summary}} group generic. For this to work properly,
#' the arguments \code{...} should be unnamed, and dispatch is on the
#' first argument.
sum <- function(..., na.rm = FALSE) {}
```
All objects must have a title and description. Details are optional.
There are two tags that make it easier for people to navigate around your documentation. `@seealso` allows you to point to other useful resources, either on the web with `\url{http://www.r-project.org}`, or to other documentation with `\code{\link{functionname}}`. If you have a family of related functions that all need to be interlinked, you can use the `@family` tag to automatically add the appropriate links to `@seealso`. The `@family` name should be plural.
For `sum()`, these components might look like:
```{r}
#' @family aggregate functions
#' @seealso \code{\link{prod}} for products, \code{\link{cumsum}} for
#'   cumulative sums, and \code{\link{colSums}}/\code{\link{rowSums}}
#'   for marginal sums over high-dimensional arrays.
```
Three other tags make it easier for the user to find documentation:
* `@aliases space separated aliases` to add additional aliases, through
which the user can find the documentation with `?`.
* `@concept` to add extra keywords that will be found with `help.search()`.
* `@keywords keyword1 keyword2 ...` to add standardised keywords. Keywords are
optional, but if present, must be taken from a predefined list. Keywords are
not very useful, except for `@keywords internal`. Using the internal keyword
removes all functions in the associated `.Rd` file from the documentation
index and disables some of their automated tests.
A common use case is to both export a function (using `@export`) and
mark it as internal. That way, advanced users can access a function that
new users would be confused about if they were to see it in the index.
You use other tags based on the type of object that you're documenting. The following sections describe the most commonly used tags for functions, S3, S4 and RC objects and data.
## Documenting functions
Functions are the most commonly documented objects. Most functions use three tags:
* `@param name description` describes the inputs to the function.
The description should provide a succinct summary of the type of the
parameter (e.g. a string, a numeric vector), and if not obvious from
the name, what the parameter does. The description should start with a
capital letter and end with a full stop. It can span multiple lines (or
even paragraphs) if necessary. All parameters must be documented.
You can document multiple arguments in one place by separating
the names with commas (no spaces). For example, to document both
`x` and `y`, you can say `@param x,y Numeric vectors`.
* `@examples` provides executable R code showing how to use the function in
practice. This is a very important part of the documentation because
many people look at the examples before reading anything. Example code
must work without errors as it is run automatically as part of `R CMD
check`.
However for the purpose of illustration, it's often useful to include code
that causes an error. `\dontrun{}` allows you to include code in the
example that is never run. There are two other special commands.
`\dontshow{}` is run, but not shown in the help page: this can
be useful for informal tests. `\donttest{}` is run in examples,
but not run automatically in `R CMD check`. This is useful if you
have examples that take a long time to run. The options are summarised
below.
Command | example | help | R CMD check
------------ | --------- | ------ | -----------
`\dontrun{}` | | x |
`\dontshow{}`| x | | x
`\donttest{}`| x | x |
Instead of including examples directly in the documentation, you can
put them in separate files and use `@example path/relative/to/package/root`
to insert them into the documentation.
* `@return description` describes the output from the function. This is
not always necessary, but is a good idea if you return different types
of outputs depending on the input, or you're returning an S3, S4 or RC
object.
We could use these new tags to improve our documentation of `sum()` as follows:
```{r}
#' Sum of vector elements.
#'
#' \code{sum} returns the sum of all the values present in its arguments.
#'
#' This is a generic function: methods can be defined for it directly
#' or via the \code{\link{Summary}} group generic. For this to work properly,
#' the arguments \code{...} should be unnamed, and dispatch is on the
#' first argument.
#'
#' @param ... Numeric, complex, or logical vectors.
#' @param na.rm A logical scalar. Should missing values (including NaN)
#' be removed?
#' @return If all inputs are integer and logical, then the output
#' will be an integer. If integer overflow
#' \url{http://en.wikipedia.org/wiki/Integer_overflow} occurs, the output
#' will be NA with a warning. Otherwise it will be a length-one numeric or
#' complex vector.
#'
#' Zero-length vectors have sum 0 by definition. See
#' \url{http://en.wikipedia.org/wiki/Empty_sum} for more details.
#' @examples
#' sum(1:10)
#' sum(1:5, 6:10)
#' sum(F, F, F, T, T)
#'
#' sum(.Machine$integer.max, 1L)
#' sum(.Machine$integer.max, 1)
#'
#' \dontrun{
#' sum("a")
#' }
sum <- function(..., na.rm = FALSE) {}
```
Indent the second and subsequent lines of a tag so that when scanning the documentation it's easy to see where one tag ends and the next begins. Tags that always span multiple lines (like `@examples`) should start on a new line and don't need to be indented.
## Documenting classes, generics and methods
Documenting classes, generics and methods is relatively straightforward, but there are some variations based on the object system. The following sections give the details for the S3, S4 and RC object systems.
### S3
S3 __generics__ are regular functions, so document them as such. S3 __classes__ have no formal definition, so document the constructor function. It is your choice whether or not to document S3 __methods__. You don't need to document methods for simple generics like `print()`. If your method is more complicated, you should document it so people know what the parameters do. In base R, you can find documentation for more complex methods like `predict.lm()`, `predict.glm()`, and `anova.glm()`.
Older versions of roxygen required explicit `@method generic class` tags for all S3 methods. From 3.0.0 this is no longer needed, as roxygen2 will figure it out automatically. If you are upgrading, make sure to remove these old tags. Automatic method detection will only fail if the generic and class are ambiguous. For example, is `all.equal.data.frame()` the `equal.data.frame` method for `all`, or the `data.frame` method for `all.equal`? If this happens, you can disambiguate with (e.g.) `@method all.equal data.frame`.
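As a sketch of these conventions, the block below documents the constructor of a hypothetical `account` class, leaves a simple `print()` method without help text, and puts an explicit `@method` tag on an `all.equal()` method, where the dots in the name make the generic/class split ambiguous. The class and its functions are invented for illustration.

```r
#' Create a bank account.
#'
#' @param balance A length-one numeric vector giving the opening balance.
#' @return An object of class \code{account}.
account <- function(balance = 0) {
  stopifnot(is.numeric(balance), length(balance) == 1)
  structure(list(balance = balance), class = "account")
}

# A simple print method like this needs no help text of its own.
print.account <- function(x, ...) {
  cat("<account> balance:", x$balance, "\n")
  invisible(x)
}

#' Compare two accounts.
#'
#' @param target,current Objects of class \code{account}.
#' @param ... Passed on to \code{\link{all.equal}} when comparing balances.
#' @method all.equal account
all.equal.account <- function(target, current, ...) {
  all.equal(target$balance, current$balance, ...)
}

acc <- account(100)
print(acc)
all.equal(acc, account(100))
```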
### S4
Older versions of roxygen2 required explicit `@usage`, `@aliases` and `@docType` tags to correctly document S4 objects, but from version 3.0.0 on roxygen2 generates correct metadata automatically. If you're upgrading from a previous version, make sure to remove these old tags.
S4 __generics__ are also functions, so document them as such. Document __S4 classes__ by adding a roxygen block before `setClass()`. Use `@slot` to document the slots of the class. Here's a simple example:
```{r}
#' An S4 class to represent a bank account.
#'
#' @slot balance A length-one numeric vector
Account <- setClass("Account",
  slots = list(balance = "numeric")
)
```
S4 __methods__ are a little more complicated. Unlike S3, all S4 methods must be documented. You can document them in three places:
* In the class. Most appropriate if the corresponding generic uses single
dispatch and you created the class.
* In the generic. Most appropriate if the generic uses multiple dispatch
and you control it.
* In its own file. Most appropriate if the method is complex, or if
neither of the other two options applies.
Use either `@rdname` or `@describeIn` to control where method documentation goes. See the next section for more details.
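For instance, here is a minimal sketch of the second option, documenting a method alongside its generic with `@describeIn`. The `deposit()` generic and its arguments are hypothetical, added to the `Account` class from above purely for illustration.

```r
library(methods)

#' An S4 class to represent a bank account.
#'
#' @slot balance A length-one numeric vector
Account <- setClass("Account",
  slots = list(balance = "numeric")
)

#' Deposit money.
#'
#' @param x An object to deposit into.
#' @param amount A length-one numeric vector.
#' @return The updated object.
setGeneric("deposit", function(x, amount) standardGeneric("deposit"))

# The method's description is added to the help file of the generic.
#' @describeIn deposit Add \code{amount} to the balance of an account.
setMethod("deposit", "Account", function(x, amount) {
  x@balance <- x@balance + amount
  x
})

acc <- deposit(Account(balance = 10), 5)
acc@balance
```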
### RC
RC is different to S3 and S4 because methods are associated with classes, not generics. RC also has a special convention for documenting methods: the docstring. This makes documenting RC simpler than S4 because you only need one roxygen block per class.
```{r}
#' A Reference Class to represent a bank account.
#'
#' @field balance A length-one numeric vector
Account <- setRefClass("Account",
  fields = list(balance = "numeric"),
  methods = list(
    withdraw = function(x) {
      "Withdraw money from account. Allows overdrafts"
      balance <<- balance - x
    }
  )
)
```
Methods with doc strings will be included in the "Methods" section of the class documentation. Each documented method will be listed with an automatically generated usage statement and its doc string.
## Documenting datasets {#documenting-data}
Datasets are usually stored as `.rdata` files in `data/` and not as regular R objects in the package. This means you need to document them slightly differently: instead of documenting the data directly, you document `NULL`, and use `@name` to tell roxygen2 what dataset you're really documenting.
There are two additional tags that are useful for documenting datasets:
* `@format`, which gives an overview of the structure of the dataset.
If you omit this, roxygen will automatically add something based on the
first line of `str()` output.
* `@source`, which says where you got the data from, often a `\url{}`.
To show how everything fits together, the example below is an excerpt from the roxygen block used to document the diamonds dataset in ggplot2.
```{r}
#' Prices of 50,000 round cut diamonds.
#'
#' A dataset containing the prices and other attributes of almost 54,000
#' diamonds. The variables are as follows:
#'
#' \itemize{
#' \item price. price in US dollars (\$326--\$18,823)
#' \item carat. weight of the diamond (0.2--5.01)
#' ...
#' }
#'
#' @format A data frame with 53940 rows and 10 variables
#' @source \url{http://www.diamondse.info/}
#' @name diamonds
NULL
```
## Documenting packages
As well as documenting every exported object in the package, you should also document the package itself. Relatively few packages provide package documentation, but it's an extremely useful tool for users, because instead of just listing functions like `help(package = pkgname)` it organises them and shows the user where to get started.
Package documentation should describe the overall purpose of the package and point to the most important functions. It should not contain a verbatim list of functions or copy of `DESCRIPTION`. This file is for human reading, so pick the most important elements of your package.
Package documentation should be placed in `pkgname.R`. Here's an example:
```{r}
#' Generate R documentation from inline comments.
#'
#' Roxygen2 allows you to write documentation in comment blocks co-located
#' with code.
#'
#' The only function you're likely to need from \pkg{roxygen2} is
#' \code{\link{roxygenize}}. Otherwise refer to the vignettes to see
#' how to format the documentation.
#'
#' @docType package
#' @name roxygen2
NULL
```
Some notes:
* As with datasets, there isn't an object that we can document directly, so
document `NULL` and use `@name` to say what we're actually documenting.
* `@docType package` indicates that the documentation is for the package.
This will automatically add the correct aliases so that both `?pkgname`
and `package?pkgname` will find the package help. If there's already
a function called `pkgname()`, use `@name pkgname-package`.
* Use `@references` to point to published material about the package that
users might find helpful.
Package documentation is a good place to list all `options()` that a package understands and to document their behaviour. Put them in a section called "Package options", as described below.
## Do repeat yourself
There is a tension between the DRY (do not repeat yourself) principle of programming and the need for documentation to be self-contained. It's frustrating to have to navigate through multiple help files in order to pull together all the pieces you need. Roxygen2 provides three ways to avoid repeating yourself in code documentation, while assembling information from multiple places in one documentation file:
* create reusable templates with `@template` and `@templateVar`
* reuse parameter documentation with `@inheritParams`
* document multiple functions in the same place with `@describeIn` or `@rdname`
### Roxygen templates
Roxygen templates are R files containing only roxygen comments that live in the `man-roxygen` directory. Use `@template file-name` (without extension) to insert the contents of a template into the current documentation.
You can make templates more flexible by using template variables defined with `@templateVar name value`. Template files are run with brew, so you can retrieve values (or execute any other arbitrary R code) with `<%= name %>`.
Note that templates are parsed a little differently to regular blocks, so you'll need to explicitly set the title, description and details with `@title`, `@description` and `@details`.
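A minimal sketch of how a template and its variables fit together is shown below. Everything in it is hypothetical: the package directory is a temporary folder standing in for a real package root, and the template name `arith` and the variable `op` are made up for the example.

```r
# A temporary directory stands in for the root of a real package.
pkg <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg, "man-roxygen"), recursive = TRUE,
           showWarnings = FALSE)

# man-roxygen/arith.R: a template containing only roxygen comments.
# <%= op %> is filled in from the @templateVar of each block that uses it.
writeLines(c(
  "#' @param x,y Numeric vectors.",
  "#' @return The <%= op %> of \\code{x} and \\code{y}."
), file.path(pkg, "man-roxygen", "arith.R"))

# In the package's R code, each block sets `op` and pulls in the template.

#' Add two vectors.
#'
#' @templateVar op sum
#' @template arith
add <- function(x, y) x + y

#' Multiply two vectors.
#'
#' @templateVar op product
#' @template arith
times <- function(x, y) x * y
```

Both functions end up with the same parameter documentation, and only the wording of the return value differs.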
### Inheriting parameters from other functions
You can inherit parameter descriptions from other functions using `@inheritParams source_function`. This tag will bring in all documentation for parameters that are undocumented in the current function, but documented in the source function. The source can be a function in the current package, `@inheritParams function`, or another package using `@inheritParams package::function`.
Note, however, that inheritance does not chain. In other words, the `source_function` must always be the function that defines the parameter using `@param`.
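As a small sketch, the hypothetical `add_round()` below reuses the descriptions of `x` and `y` from `add()` and only documents the one parameter that is new.

```r
#' Add together two numbers.
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of \code{x} and \code{y}.
add <- function(x, y) {
  x + y
}

#' Add together two numbers, then round the result.
#'
#' @inheritParams add
#' @param digits Number of decimal places to keep.
#' @return The rounded sum of \code{x} and \code{y}.
add_round <- function(x, y, digits = 0) {
  round(add(x, y), digits)
}

add_round(1.4, 1.2)
```

Because inheritance does not chain, a third function that wanted the same descriptions of `x` and `y` would also use `@inheritParams add`, not `@inheritParams add_round`.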
### Documenting multiple functions in the same file
You can document multiple functions in the same file by using either the `@rdname` or the `@describeIn` tag. It's a technique best used with caution: documenting too many functions in one place leads to confusing documentation. It's best used when all functions have the same (or very similar) arguments.
`@describeIn` is designed for the most common cases:
* documenting methods in a generic
* documenting methods in a class
* documenting functions with the same (or similar) arguments
It generates a new section, named either "Methods (by class)", "Methods (by generic)" or "Functions". The section contains a bulleted list describing each function, labelled so that you know what function or method it's talking about. Here's an example, documenting an imaginary new generic:
```{r}
#' Foo bar generic
#'
#' @param x Object to foo.
foobar <- function(x) UseMethod("foobar")
#' @describeIn foobar Difference between the mean and the median
foobar.numeric <- function(x) abs(mean(x) - median(x))
#' @describeIn foobar First and last values pasted together in a string.
foobar.character <- function(x) paste0(x[1], "-", x[length(x)])
```
An alternative to `@describeIn` is `@rdname`. It overrides the default file name generated by roxygen and merges documentation for multiple objects into one file. This gives you complete freedom to combine documentation however you see fit. There are two ways to use `@rdname`. You can add documentation to an existing function:
```{r}
#' Basic arithmetic
#'
#' @param x,y numeric vectors.
add <- function(x, y) x + y
#' @rdname add
times <- function(x, y) x * y
```
Or, you can create a dummy documentation file by documenting `NULL` and setting an informative `@name`.
```{r}
#' Basic arithmetic
#'
#' @param x,y numeric vectors.
#' @name arith
NULL
#' @rdname arith
add <- function(x, y) x + y
#' @rdname arith
times <- function(x, y) x * y
```
## Sections
You can add arbitrary sections to the documentation for any object with the `@section` tag. This is a useful way of breaking a long details section into multiple chunks with useful headings. Section titles should be in sentence case and must be followed by a colon. Titles may only take one line.
```{r}
#' @section Warning:
#' Do not operate heavy machinery within 8 hours of using this function.
```
To add a subsection, you must use the `Rd` `\subsection{}` command, as follows:
```{r}
#' @section Warning:
#' You must not call this function unless ...
#'
#' \subsection{Exceptions}{
#' Apart from the following special cases...
#' }
```
## Text formatting reference sheet {#text-formatting}
Within roxygen tags, you use `.Rd` syntax to format text. This vignette shows you examples of the most important commands. The full details are described in [R extensions](http://cran.r-project.org/doc/manuals/R-exts.html#Marking-text).
Note that `\` and `%` are special characters. To insert literals, escape with a backslash: `\\`, `\%`.
### Character formatting
* `\emph{italics}`
* `\strong{bold}`
* `\code{r_function_call(with = "arguments")}`, `\code{NULL}`, `\code{TRUE}`
* `\pkg{package_name}`
### Links
To other documentation:
* `\code{\link{function}}`: function in this package
* `\code{\link[package]{function}}`: function in another package
* `\link[=dest]{name}`: link to dest, but show name
* `\linkS4class{abc}`: link to an S4 class
To the web:
* `\url{http://rstudio.com}`
* `\href{http://rstudio.com}{RStudio}`
* `\email{hadley@@rstudio.com}` (note the doubled `@`)
### Lists
* Ordered (numbered) lists:
```{r}
#' \enumerate{
#' \item First item
#' \item Second item
#' }
```
* Unordered (bulleted) lists
```{r}
#' \itemize{
#' \item First item
#' \item Second item
#' }
```
* Definition (named) lists
```{r}
#' \describe{
#' \item{One}{First item}
#' \item{Two}{Second item}
#' }
```
### Mathematics
Standard LaTeX (with no extensions):
* `\eqn{a + b}`: inline equation
* `\deqn{a + b}`: display (block) equation
### Tables
Tables are created with `\tabular{}`. It has two arguments:
1. Column alignment, specified by letter for each column (`l` = left, `r` = right,
`c` = centre.)
2. Table contents, with columns separated by `\tab` and rows by `\cr`.
The following function turns an R data frame into the correct format. It ignores column and row names, but should get you started.
```{r}
tabular <- function(df, ...) {
  stopifnot(is.data.frame(df))
  align <- function(x) if (is.numeric(x)) "r" else "l"
  col_align <- vapply(df, align, character(1))
  cols <- lapply(df, format, ...)
  contents <- do.call("paste",
    c(cols, list(sep = " \\tab ", collapse = "\\cr\n ")))
  paste("\\tabular{", paste(col_align, collapse = ""), "}{\n ",
    contents, "\n}\n", sep = "")
}
cat(tabular(mtcars[1:5, 1:5]))
```