some more tips for R package creation guideline #17

Open
wants to merge 4 commits into
base: master
Changes from 1 commit
53 changes: 41 additions & 12 deletions README.md
Expand Up @@ -14,7 +14,6 @@ The purposes of this guide are:
* To try to make sure [Leek group](http://www.biostat.jhsph.edu/~jleek/) software has a consistent design.<sup>1</sup>



Why develop an `R` package?
--------------------

Expand Down Expand Up @@ -63,6 +62,7 @@ To start writing an `R` package you need:
* To read [Hadley's intro](http://adv-r.had.co.nz/Package-basics.html).
* To read [Karl's Github tutorial](http://kbroman.github.io/github_tutorial/).


Naming your package
---------------------

Expand Down Expand Up @@ -156,6 +156,7 @@ git push -u origin master

Once you're familiar with basic git and GitHub workflows, GitHub has some more advanced features that you can take advantage of. In particular, [github flow](http://scottchacon.com/2011/08/31/github-flow.html) is an excellent way to manage contributions, and [GitHub organizations](https://github.com/blog/674-introducing-organizations) can provide a central location for multiple people (e.g. in a lab) to collaborate on software.


The parts of an `R` package
--------------------

Expand All @@ -171,7 +172,6 @@ it would be called `leek-class.R`. If you are defining a new method for the clas
be named _newclass-methodname-method.R_. For example, a plotting method for the leek class would go in a `.R` file
called _leek-plot-method.R_.


### `DESCRIPTION`

The `DESCRIPTION` file is a plain text file that gets generated with the `devtools::create` command.
Expand Down Expand Up @@ -214,14 +214,44 @@ Description: A couple sentences that expand the title
License: Artistic-2.0
```

### `NAMESPACE`

The `NAMESPACE` file is a plain text file that lists all the functions, methods, and classes your package exports. It must also
list every function your package imports from other packages, and those packages should match what you declare in the `DESCRIPTION` file.

This file can be generated automatically with roxygen2, but sometimes you need to write it yourself. An example is:

```
S3method(as.character,expectation)
S3method(compare,character)
export(auto_test)
export(auto_test_package)
export(colourise)
export(context)
exportClasses(ListReporter)
exportClasses(MinimalReporter)
importFrom(methods,setRefClass)
useDynLib(testthat,duplicate_)
useDynLib(testthat,reassign_function)
```
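
Because `NAMESPACE` directives are valid R syntax, one way to check that the imports match the `DESCRIPTION` file is to parse both files and compare them. The sketch below is only an illustration: the package name `mypackage`, the field values, and the temporary files are made up, and it only looks at the `Imports` field and at the first package named in each `import()`/`importFrom()` directive.

```r
# Write a hypothetical DESCRIPTION and NAMESPACE to temporary files
desc_file <- tempfile()
writeLines(c(
    "Package: mypackage",
    "Imports: methods, stats"
), desc_file)

ns_file <- tempfile()
writeLines(c(
    "export(my_function)",
    "importFrom(methods,setRefClass)",
    "importFrom(utils,head)"
), ns_file)

# Packages declared in the Imports field of DESCRIPTION
imports_field <- read.dcf(desc_file, fields = "Imports")[1, "Imports"]
declared <- trimws(strsplit(unname(imports_field), ",")[[1]])

# Packages named in import() or importFrom() directives of NAMESPACE
directives <- as.list(parse(ns_file))
is_import <- vapply(directives, function(d) {
    as.character(d[[1]]) %in% c("import", "importFrom")
}, logical(1))
imported <- unique(vapply(directives[is_import], function(d) {
    as.character(d[[2]])
}, character(1)))

# Imported in NAMESPACE but not declared in DESCRIPTION
missing_from_description <- setdiff(imported, declared)
print(missing_from_description)
```

In this made-up case the result is `"utils"`, which is imported in `NAMESPACE` but never declared in `DESCRIPTION`.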

There is more information in [Hadley](http://had.co.nz/)'s [guide](http://r-pkgs.had.co.nz/namespace.html). In general,
it is good to export and import exactly what you need. That keeps your package as lightweight as possible and avoids
name collisions with functions/methods from other packages.

If you ever run into name collisions, this package can help you find out exactly which functions you are using from which package:
[codetoolsBioC](https://hedgehog.fhcrc.org/gentleman/bioconductor/trunk/madman/Rpacks/codetoolsBioC) (available via svn with username and password `readonly`).
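
As a small illustration of a name collision using only base R, the sketch below defines a function called `filter`, which shares its name with `stats::filter`. The local `filter` function and the input vector are made up for this example; the point is that the `::` operator and `environment()` tell you which version you are calling.

```r
# A user-defined function that happens to share its name with stats::filter
filter <- function(x, keep) {
    x[keep]
}

x <- c(1, 2, 3, 4)

# The unqualified call resolves to the definition above
kept <- filter(x, x > 2)

# The `::` operator always calls the version from the named package
smoothed <- stats::filter(x, rep(1 / 2, 2))

# environment() reports where a function was defined
source_of_filter <- environmentName(environment(stats::filter))
print(source_of_filter)
```

Inside a package, the equivalent safeguard is an explicit `importFrom()` directive or a `pkg::fun` call, so that the function you get does not depend on what the user has attached.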

Coding style requirements
---------------------

I will try to keep the stylistic requirements minimal because they would likely drive you nuts. For now they are:


1. Your indent should be 4 spaces on every line
2. Each line can be no more than 80 columns
3. Use `<-` for assignment. Even if R understands `=`, it is a community agrenment to use the other one
Collaborator

typo: agreement

Although note that Jeff would not say something like this. There is no universal agreement to use <- although I personally do.

Author

Thanks! That is true. It was feedback I got from Bioconductor. Maybe it should go in that section.

4. Some people like to name internal functions that are not exported with a `.` at the beginning
5. [lintr](https://github.com/jimhester/lintr) helps you with many of these. It is not perfect, but it will make your life easier by detecting coding style issues.
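
A minimal sketch of what code following the list above might look like: 4-space indents, lines under 80 columns, `<-` for assignment, and a leading `.` on an internal helper. The function names `.center` and `scale_to_unit` are hypothetical and exist only for this example.

```r
# Internal helper (not exported), so its name starts with a `.`
.center <- function(x) {
    x - mean(x)
}

# User-facing function: centers a numeric vector and scales it to unit
# standard deviation
scale_to_unit <- function(x) {
    centered <- .center(x)
    centered / sd(x)
}

scaled <- scale_to_unit(c(1, 2, 3))
print(scaled)
```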


You can set these as defaults (say in Sublime or RStudio) then you don't have to worry about them anymore. If you find
Expand All @@ -245,11 +275,11 @@ use it and you will have a positive impact on the world.

Documentation has two main components. The first component is help files, which go into the `man/` folder. The second
component is vignettes which will go in a folder called `vignettes/` which you will have to create. I'll tackle
each of these separately.
each of these separately. As a rule, the user should be able to use the package by reading only the man files, and the vignettes
may be a complete guide to, or a summary of, what you can do with the package.
Collaborator

A vignette is a crucial component of a Bioconductor package. That fact should not be understated.

Author

will remove that part then, and focus on the importance of the MAN pages.


### Help (man) files


These files document each of the functions/methods/classes you will expose to your users. The good news is that
you don't have to write these files yourself. You will use the [roxygen2](http://cran.r-project.org/web/packages/roxygen2/index.html) package to create the man files. To use
[roxygen2](http://cran.r-project.org/web/packages/roxygen2/index.html) you will need to document your functions in the `.R` files with comments formatted in a specific way. Right
Expand All @@ -272,6 +302,7 @@ the following way:
#'
#' @export
#'
#' @importFrom methods setClass setGeneric setMethod setRefClass
#' @examples
#' R code here showing how your function works

Expand All @@ -295,6 +326,8 @@ on the package folder. The package folder must be in the current working directo
Please read [Hadley](http://had.co.nz/)'s [guide](http://adv-r.had.co.nz/Documenting-functions.html) in its entirety to understand how to document packages and in particular, how [roxygen2](http://cran.r-project.org/web/packages/roxygen2/index.html)
deals with [collation](http://cran.r-project.org/doc/manuals/R-exts.html#The-DESCRIPTION-file) and [namespaces](http://cran.r-project.org/doc/manuals/R-exts.html#Package-namespaces).

Normally, only exported functions, classes, and methods are documented. Otherwise, users will see documentation for
functions that they cannot use.
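
As a rough illustration of that rule, the sketch below documents an exported function with a roxygen2 block and gives the internal helper only a plain comment. The functions `celsius_to_fahrenheit` and `.check_numeric` are hypothetical and are not part of any package mentioned here.

```r
#' Convert temperatures from Celsius to Fahrenheit
#'
#' @param celsius A numeric vector of temperatures in degrees Celsius.
#'
#' @return A numeric vector of temperatures in degrees Fahrenheit.
#'
#' @export
#'
#' @examples
#' celsius_to_fahrenheit(c(0, 100))
celsius_to_fahrenheit <- function(celsius) {
    .check_numeric(celsius)
    celsius * 9 / 5 + 32
}

# Internal helper: not exported, so it gets a plain comment and no man page
.check_numeric <- function(x) {
    if (!is.numeric(x)) {
        stop("'x' must be numeric")
    }
    invisible(TRUE)
}
```

Only `celsius_to_fahrenheit` carries `@export`, so it is the only function here that would get a help page and a `NAMESPACE` entry when roxygen2 is run.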

### Vignettes

Expand Down Expand Up @@ -328,13 +361,13 @@ See the [BiocStyle vignette](http://www.bioconductor.org/packages/devel/bioc/vig
commands that you can use when creating your vignette (e.g. `\Biocpkg{IRanges}` for referencing a [Bioconductor](http://www.bioconductor.org/) package).



Who should be an author?
---------------------

For our purposes anyone who wrote a function exposed to users in the package (a function that has an @export in the documentation)
will be listed as an author.


Who should be a maintainer
---------------------

Expand Down Expand Up @@ -363,6 +396,7 @@ actively updated (on [GitHub](https://github.com/)/[Bioconductor](http://www.bio
packages routinely maintain important packages (like [Hadley](http://had.co.nz/), [Yihui](http://yihui.name/), [Ramnath](https://github.com/ramnathv), [Martin Morgan](http://www.fhcrc.org/en/util/directory.html?q=martin+morgan&short=true#peopleresults), etc.). Keep in mind your
commitment (see below) when making decisions about whose functions to use.


Simple >>>> Complex
---------------------

Expand Down Expand Up @@ -416,7 +450,6 @@ This means that whenever you run `R CMD check` on your package, you will get an
is important, because it will be one way for me to check your code and to tell you if you have broken your code
with an update.


### An example unit test

Here is an example unit test, so you get the idea of what I'm looking for. Below is a function for performing
Expand Down Expand Up @@ -513,9 +546,6 @@ you should be testing:
* When you know what the answer should be - you get it





Dummy proofing
---------------------

Expand All @@ -535,7 +565,6 @@ with the `stop` function at the beginning of each function that throw errors if
class or would result in silly output (like the zero variance gene test in the unit testing example above).
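
A minimal sketch of this kind of input checking, assuming a hypothetical function `standardize_rows` that centers and scales each row of a numeric matrix. The function name and the error messages are made up for illustration; the pattern is simply to call `stop` before doing any real work.

```r
standardize_rows <- function(x) {
    # Fail early if the input is not of the expected class
    if (!is.matrix(x) || !is.numeric(x)) {
        stop("'x' must be a numeric matrix")
    }
    if (ncol(x) < 2) {
        stop("'x' must have at least two columns")
    }

    # Fail early if the result would be silly (division by zero)
    row_sds <- apply(x, 1, sd)
    if (any(row_sds == 0)) {
        stop("'x' contains rows with zero variance")
    }

    # Subtract each row's mean and divide by its standard deviation
    (x - rowMeans(x)) / row_sds
}

ok <- standardize_rows(matrix(c(1, 2, 3, 4, 6, 8), nrow = 2, byrow = TRUE))
print(ok)
```

Passing a data frame, or a matrix with a constant row, now stops with a message that names the problem instead of returning `NaN` values further down the line.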



Releasing to [Bioconductor](http://www.bioconductor.org/)
---------------------

Expand All @@ -555,7 +584,7 @@ library("devtools")
check_doc("packagename") ## Only for checking the documentation
system.time(check("packagename")) ## R CMD check with time information
```

* Check that your package passes [BiocCheck()](https://www.bioconductor.org/packages/release/bioc/html/BiocCheck.html)
* Update the version number and push to [GitHub](https://github.com/). In the commit comments, state it is the version being
pushed to [Bioconductor](http://www.bioconductor.org/).
* Send an email as described in the checklist stating that you want an account and want to submit a package.
Expand Down