
R Package Builds

SageBio R Package Build Host

VMs running on georgetown.fhcrc.org
Ubuntu Linux: sagebuild-lnx.fhcrc.org
Mac OS X 10.5 Leopard: sagebuild-osx.fhcrc.org
Windows 7 Home: sagebuild-win.fhcrc.org

All used the same credentials:
username: sagebio

Build R Packages for each Platform

Windows Package Builds

This must be done on a 64-bit Windows host. A 32-bit Windows host cannot correctly build for both 32-bit and 64-bit.

  1. Go to the AWS console and start the Windows Build Host
  2. Remote Desktop to the host
  3. If needed, use install.packages() from the R prompt to install any additional packages you need.
  4. Build the packages

    cd c:\Users\sagebio\Documents
    svn up rSynapseClient
    svn up predictiveModeling
    cd 2.15
    R CMD build ..\rSynapseClient
    R CMD build ..\predictiveModeling
    
    R CMD INSTALL --build --merge-multiarch synapseClient_#.#-#.tar.gz
    R CMD INSTALL --build --merge-multiarch predictiveModeling_#.#-#.tar.gz
  5. Look inside the resulting zip and make sure both 32bit and 64bit builds are there

    ./synapseClient/libs/i386
    ./synapseClient/libs/i386/synapseClient.dll
    ./synapseClient/libs/x64
    ./synapseClient/libs/x64/synapseClient.dll
  6. Repeat steps 4 and 5 using R 2.13 and R 2.14, directing the built packages to directories named "2.13" and "2.14", respectively. NOTE: since the synapseClient package contains C source code, a package can only be loaded on the version of R that was used to build it. So, a package built using R 2.13 can't be loaded in R 2.14 and vice versa.
  7. SCP the directories containing the zip files somewhere you can reach from that EC2 host OR upload them to your S3 bucket in your personal S3 account.
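
The check in step 5 can be scripted. The sketch below assumes a POSIX shell with `unzip` and `grep` available (for example via Cygwin; that availability is an assumption, not something this page sets up). `has_both_arches` is a hypothetical helper name: it reads an archive listing on standard input and succeeds only if the listing contains a DLL under both libs/i386 and libs/x64.

```shell
# Hypothetical helper: read an archive listing (e.g. from `unzip -l`) on stdin
# and succeed only if it names a DLL under both libs/i386 and libs/x64.
has_both_arches() (
    listing=$(cat)
    printf '%s\n' "$listing" | grep -q 'libs/i386/.*\.dll' &&
    printf '%s\n' "$listing" | grep -q 'libs/x64/.*\.dll'
)
```

For example: `unzip -l synapseClient_#.#-#.zip | has_both_arches && echo "both architectures present"`.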

See also: http://win-builder.r-project.org/

Mac OSX Package Builds

This must be done on a Mac OS X 10.5 Leopard host. Binaries built on Snow Leopard 10.6 do not work on 10.5. Xa has a Leopard machine. Ask him for an account.

  1. If needed, use install.packages() from the R prompt to install any additional packages you need.
  2. Build the packages
     

    svn up rSynapseClient
    svn up predictiveModeling
    R CMD INSTALL --build rSynapseClient
    R CMD INSTALL --build predictiveModeling
    
  3. Repeat step 2 for R 2.13 and R 2.14, directing the built packages to directories named "2.13" and "2.14", respectively. NOTE: since the synapseClient package contains C source code, a package can only be loaded on the version of R that was used to build it. So, a package built using R 2.13 can't be loaded in R 2.14 and vice versa.

Source Package Builds

You can do this on either the Mac or Windows; it doesn't matter. (You may be able to run it on Unix too.)

  1. If needed, use install.packages() from the R prompt to install any additional packages you need.
  2. Build the packages

    svn up rSynapseClient
    svn up predictiveModeling
    R CMD build rSynapseClient
    R CMD build predictiveModeling
    
  3. Make two copies of the tar.gz file, putting one into a directory called "2.13" and the other into a directory called "2.14"
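
Step 3 can be done with a small helper. This is a minimal sketch; `stage_source` is a hypothetical name, and the R versions are passed as arguments rather than hard-coded.

```shell
# Hypothetical helper: copy one source tarball into a directory per R version.
# Usage: stage_source TARBALL VERSION...
stage_source() (
    tarball=$1
    shift
    for ver in "$@"; do
        # Create the version directory if needed, then copy the tarball into it.
        mkdir -p "$ver" && cp "$tarball" "$ver/" || exit 1
    done
)
```

For example: `stage_source synapseClient_#.#-#.tar.gz 2.13 2.14`.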

Deploy packages to our CRAN server

Ask Brian Holt for an account on depot.sagebase.org with sudo permissions. Note that you might need to change your version naming scheme. If the version number has three components (e.g. 1.0.2), separate the second and third components with a '-' instead of a '.' (for example, snm_1.0-2.tar.gz instead of snm_1.0.2.tar.gz).
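
R CMD build takes the tarball name from the Version field of the package's DESCRIPTION file, so the change presumably belongs there rather than in a rename of the built file. As a minimal sketch of the conversion itself, the hypothetical helper `cran_style_version` below rewrites a three-component version string into the hyphenated form and prints anything else unchanged.

```shell
# Hypothetical helper: turn X.Y.Z into X.Y-Z; other version strings pass through.
cran_style_version() {
    printf '%s\n' "$1" | sed -E 's/^([0-9]+\.[0-9]+)\.([0-9]+)$/\1-\2/'
}
```

For example, `cran_style_version 1.0.2` prints `1.0-2`, while `cran_style_version 1.0-2` prints its input as is.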

  1. Combine the files from the source, Windows and Mac OS X builds together based on their distribution. You will now have two directories (one named "2.13" and the other named "2.14"), each with 3 files.
  2. scp the two directories to the appropriate stack directory
    • Staging Packages:

      scp -r 2.13 ndeflaux@depot.sagebase.org:/home/sagebio/sagebioRPackages/staging/2.13
      scp -r 2.14 ndeflaux@depot.sagebase.org:/home/sagebio/sagebioRPackages/staging/2.14
      
    • Prod Packages:

      scp -r 2.13 ndeflaux@depot.sagebase.org:/home/sagebio/sagebioRPackages/prod/2.13
      scp -r 2.14 ndeflaux@depot.sagebase.org:/home/sagebio/sagebioRPackages/prod/2.14
      
  3. ssh to depot.sagebase.org
  4. run the script to install them into our CRAN repository directory structure
    • Staging Deployment: sudo /home/sagebio/sagebioRPackages/rPackageBuildTools/deployAllPackages.sh staging
    • Prod Deployment: sudo /home/sagebio/sagebioRPackages/rPackageBuildTools/deployAllPackages.sh prod
  5. Then, from the R prompt, install your package(s) on your local machine (and other people's machines) as follows and check that everything works
    • Install Staging Packages

      source('http://depot.sagebase.org/CRAN.R'); pkgInstall(c("synapseClient", "predictiveModeling"), stack='staging')
      
    • Install Prod Packages

      source('http://depot.sagebase.org/CRAN.R'); pkgInstall(c("synapseClient", "predictiveModeling"))
      

You can find the code for these scripts in SVN here: http://sagebionetworks.jira.com/source/browse/~raw,r=HEAD/PLFM/trunk/tools/rPackageBuildTools/

First Time Setup

Setting up a Mac OSX R Package Build Server from scratch

  • This has to be done on a host running Mac OS X 10.5 Leopard if you want the builds to work on both Leopard and Snow Leopard
  • Install Xcode so that you have a C compiler
    • Be sure to select installation option "UNIX Development Support"
  • Install MacTeX

Setting up a Windows R Package Build Server from scratch

  • Choose a 64 bit EC2 Windows image
  • Save the administrator password in sodo:/work/platform/PasswordsAndCredentials/PlatformAWSCredentials
  • Install Chrome
  • Install SlikSVN
  • Install WinSCP
  • Install R
  • Install Rtools
  • Install MiKTeX
  • Turn on Quick Edit Mode in properties for cmd

    mkdir c:\SageBioRPackageSource
    cd c:\SageBioRPackageSource
    svn co https://sagebionetworks.jira.com/svn/PLFM/trunk/client/rSynapseClient
    svn co https://sagebionetworks.jira.com/svn/CLP/trunk/predictiveModeling
    

Set up a CRAN server for a new stack

We are currently using depot.sagebase.org which has a web server on it that is public-facing.

  1. Create the CRAN directory structure under the webserver root

    mkdir /srv/www/htdocs/Foswiki/CRAN/<theStackName>
    /home/sagebio/sagebioRPackages/rPackageBuildTools/deployAllPackages.sh <theStackName> create
    
  2. Create a directory in which to hold R packages to be deployed

    mkdir -p /home/sagebio/sagebioRPackages/<theStackName>
    
  3. If needed, update the R package build tools

    # working copy of https://sagebionetworks.jira.com/svn/PLFM/trunk/tools/rPackageBuildTools
    svn up /home/sagebio/sagebioRPackages/rPackageBuildTools
    

Notes from Dan Tenenbaum

A Scratch CRAN Repository

Hi,

I'll start at the end with a description of an R package repository.

First, here is the directory layout, not showing any files. You'll
need to replicate this layout somewhere in the document root of a web
server where you will be distributing your packages.

biocadmin@merlot2:/loc/www/bioconductor-test.fhcrc.org> tree -d scratch-repos/
scratch-repos/
|-- 2.12
|   |-- bin
|   |   |-- macosx
|   |   |   `-- leopard
|   |   |       `-- contrib
|   |   |           `-- 2.12
|   |   |-- windows
|   |   |   `-- contrib
|   |   |       `-- 2.12
|   |   `-- windows64 -> windows/
|   `-- src
|       `-- contrib
|-- 2.13
|   |-- bin
|   |   |-- macosx
|   |   |   `-- leopard
|   |   |       `-- contrib
|   |   |           `-- 2.13
|   |   |-- windows
|   |   |   `-- contrib
|   |   |       `-- 2.13
|   |   `-- windows64 -> windows/
|   `-- src
|       `-- contrib
`-- 2.14
    |-- bin
    |   |-- macosx
    |   |   `-- leopard
    |   |       `-- contrib
    |   |           `-- 2.14
    |   |-- windows
    |   |   `-- contrib
    |   |       `-- 2.14
    |   `-- windows64 -> windows/
    `-- src
        `-- contrib

36 directories

The top level, where it has directories like 2.12, 2.13, 2.14, is not,
strictly speaking, part of the standard CRAN repository layout. It is
a modification we have made to it. You will notice, looking further
down the tree, that the bin directories have a leaf directory
corresponding to an R version, but the src directory does not.

Without our modification, this would mean that different versions of
the same package, meant to be run with different versions of R, exist
in the same directory. This can cause problems. That's why we put a
top level directory named with the version of R.
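
A skeleton like the one above can be created with a few `mkdir` calls. This is a minimal sketch, not the script used on any of these servers; `make_repo_skeleton` and its arguments are hypothetical names.

```shell
# Hypothetical helper: create the versioned repository skeleton under ROOT.
# Usage: make_repo_skeleton ROOT R_VERSION
make_repo_skeleton() (
    root=$1
    ver=$2
    mkdir -p "$root/$ver/src/contrib" \
             "$root/$ver/bin/macosx/leopard/contrib/$ver" \
             "$root/$ver/bin/windows/contrib/$ver" || exit 1
    # windows64 is a relative symlink to windows, as in the tree listing.
    [ -e "$root/$ver/bin/windows64" ] || ln -s windows "$root/$ver/bin/windows64"
)
```

For example: `make_repo_skeleton scratch-repos 2.13`. Running it a second time leaves an existing skeleton untouched.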

Now here is the same tree command, also showing files. I'll remove
references to files that are not germane to this discussion:

biocadmin@merlot2:/loc/www/bioconductor-test.fhcrc.org> tree scratch-repos/
scratch-repos/
|-- 2.12
|   |-- bin
|   |   |-- macosx
|   |   |   `-- leopard
|   |   |       `-- contrib
|   |   |           `-- 2.12
|   |   |               |-- PACKAGES
|   |   |               |-- PACKAGES.gz
|   |   |               `-- StudentGWAS_0.0.4.tgz
|   |   |-- windows
|   |   |   `-- contrib
|   |   |       `-- 2.12
|   |   |           |-- PACKAGES
|   |   |           |-- PACKAGES.gz
|   |   |           `-- StudentGWAS_0.0.4.zip
|   |   `-- windows64 -> windows/
|   |-- src
|   |   `-- contrib
|   |       |-- PACKAGES
|   |       |-- PACKAGES.gz
|   |       `-- StudentGWAS_0.0.4.tar.gz
|   `-- update-repo.R -> /loc/www/bioconductor-test.fhcrc.org/course-packages/update-course-repo.R
|-- 2.13
|   |-- bin
|   |   |-- macosx
|   |   |   `-- leopard
|   |   |       `-- contrib
|   |   |           `-- 2.13
|   |   |               |-- Bioconductor_0.99.100.tgz
|   |   |               |-- PACKAGES
|   |   |               |-- PACKAGES.gz
|   |   |               |-- Rsamtools_1.3.27.tgz
|   |   |               |-- StudentGWAS_0.0.4.tgz
|   |   |               `-- biocLite_0.99.35.tgz
|   |   |-- windows
|   |   |   `-- contrib
|   |   |       `-- 2.13
|   |   |           |-- Bioconductor_0.99.100.zip
|   |   |           |-- PACKAGES
|   |   |           |-- PACKAGES.gz
|   |   |           |-- Rsamtools_1.3.27.zip
|   |   |           |-- StudentGWAS_0.0.4.zip
|   |   |           `-- biocLite_0.99.35.zip
|   |   `-- windows64 -> windows/
|   |-- src
|   |   `-- contrib
|   |       |-- Bioconductor_0.99.100.tar.gz
|   |       |-- PACKAGES
|   |       |-- PACKAGES.gz
|   |       |-- Rsamtools_1.3.27.tar.gz
|   |       |-- StudentGWAS_0.0.4.tar.gz
|   |       `-- biocLite_0.99.35.tar.gz
|   `-- update-repo.R -> /loc/www/bioconductor-test.fhcrc.org/course-packages/update-course-repo.R
|-- 2.14
|   |-- bin
|   |   |-- macosx
|   |   |   `-- leopard
|   |   |       `-- contrib
|   |   |           `-- 2.14
|   |   |               |-- Annotations_1.0.4.tgz
|   |   |               |-- BasicFlowWorkshop_1.0.tgz
|   |   |               |-- ChipSeq_1.0.0.tgz
|   |   |               |-- PACKAGES
|   |   |               |-- PACKAGES.gz
|   |   |               |-- RCytoscape_1.3.4.tgz
|   |   |               |-- StudentGWAS_0.0.4.tgz
|   |   |               |-- VariantAnnotation_0.99.7.tgz
|   |   |               `-- rcyTutorial_1.0.3.tgz
|   |   |-- windows
|   |   |   `-- contrib
|   |   |       `-- 2.14
|   |   |           |-- Annotations_1.0.4.zip
|   |   |           |-- BasicFlowWorkshop_1.0.zip
|   |   |           |-- ChipSeq_1.0.0.zip
|   |   |           |-- PACKAGES
|   |   |           |-- PACKAGES.gz
|   |   |           |-- RCytoscape_1.3.4.zip
|   |   |           |-- StudentGWAS_0.0.4.zip
|   |   |           |-- VariantAnnotation_0.99.7.zip
|   |   |           |-- graph_1.31.1.zip
|   |   |           `-- rcyTutorial_1.0.3.zip
|   |   `-- windows64 -> windows/
|   |-- src
|   |   `-- contrib
|   |       |-- Annotations_1.0.4.tar.gz
|   |       |-- BasicFlowWorkshop_1.0.tar.gz
|   |       |-- ChipSeq_1.0.0.tar.gz
|   |       |-- PACKAGES
|   |       |-- PACKAGES.gz
|   |       |-- RCytoscape_1.3.4.tar.gz
|   |       |-- StudentGWAS_0.0.4.tar.gz
|   |       |-- VariantAnnotation_0.99.7.tar.gz
|   |       `-- rcyTutorial_1.0.3.tar.gz
|   `-- update-repo.R
`-- biocLite.R

36 directories, 59 files
biocadmin@merlot2:/loc/www/bioconductor-test.fhcrc.org>

So as you can see, .tar.gz files (source packages) go in src/contrib,
.tgz (mac binary packages) go in bin/macosx/leopard/contrib/$R_VER
(let's just make $R_VER equal to 2.13 for the purposes of this
discussion), .zip files (windows binary packages) go in
bin/windows/contrib/2.13. bin/windows64 is a symlink to
bin/windows--unless you really want to give 64-bit windows users a
different package than 32 bit users.
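
The placement rules in the paragraph above reduce to a lookup on the file extension. A minimal sketch, with `repo_subdir` as a hypothetical helper name; the printed paths are relative to the top-level directory for the R version.

```shell
# Hypothetical helper: print the repository subdirectory for a package file.
# Usage: repo_subdir FILE R_VERSION
repo_subdir() {
    case "$1" in
        *.tar.gz) printf 'src/contrib\n' ;;                           # source package
        *.tgz)    printf 'bin/macosx/leopard/contrib/%s\n' "$2" ;;    # Mac binary
        *.zip)    printf 'bin/windows/contrib/%s\n' "$2" ;;           # Windows binary
        *)        return 1 ;;                                         # not a package file
    esac
}
```

For example, `repo_subdir synapseClient_0.9.1.zip 2.13` prints `bin/windows/contrib/2.13`.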

The PACKAGES and PACKAGES.gz files (identical except PACKAGES.gz is
gzipped) provide information to install.packages(). For example, if
you have several versions of the same package in a repository, the
PACKAGES file will point to the most recent one.

The contents of a PACKAGES file look like this:
http://bioconductor.org/packages/2.8/bioc/bin/windows/contrib/2.13/PACKAGES

You can generate these files from within R using the
tools:::write_PACKAGES() function.
See ?tools:::write_PACKAGES for help.
The only flags you really need to worry about are dir and type.
What I typically do is go to the ...src/contrib directory, start up R, and type:
tools:::write_PACKAGES(".", type="source")
then go to the bin/macosx/leopard/contrib/2.13 directory, start up R, and type:
tools:::write_PACKAGES(".", type="mac.binary")
then go to the bin/windows/contrib/2.13 directory, start up R, and type:
tools:::write_PACKAGES(".", type="win.binary")


Of course it is easy to simplify/automate this and reduce the number
of steps. You'll want to do this every time you post updated packages
to your repository. Otherwise the updated packages will not be seen by
install.packages().
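
One way to reduce this to a single step, sketched under assumptions: `Rscript` is on the PATH, the repository uses the versioned layout shown earlier, and `index_jobs` and `update_indexes` are hypothetical helper names. `write_PACKAGES` is exported from the tools package, so `tools::` is sufficient here.

```shell
# Hypothetical helper: print one "subdir type" pair per index to regenerate.
index_jobs() {
    printf 'src/contrib source\n'
    printf 'bin/macosx/leopard/contrib/%s mac.binary\n' "$1"
    printf 'bin/windows/contrib/%s win.binary\n' "$1"
}

# Hypothetical helper: regenerate all three indexes under ROOT/R_VERSION.
# Usage: update_indexes ROOT R_VERSION   (requires Rscript)
update_indexes() (
    root=$1
    ver=$2
    index_jobs "$ver" | while read -r subdir type; do
        # Run write_PACKAGES in each directory; stdin is redirected so that
        # Rscript cannot consume the remaining job lines.
        (cd "$root/$ver/$subdir" &&
            Rscript -e "tools::write_PACKAGES('.', type='$type')" </dev/null) || exit 1
    done
)
```

For example: `update_indexes scratch-repos 2.13` after copying new package files into place.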

How to allow the user to install packages from this
repository...create a file like:
http://bioconductor.org/scratch-repos/pkgInstall.R

and have the user source it. If you like, you can also use this to
ensure they are running an appropriate version of R.
You might want to modify this script a bit--right now, if a user is
running R-2.14, this script won't allow them to find your packages.
You might want to hardcode "2.13" instead of rversion in the line that
starts "scratchRepos <-".

Then they run pkgInstall, a wrapper around install.packages().

Now, backing up to package building, I was able to build a bi-arch
version of synapseClient on windows as follows:

C:\Users\dtenenba\Downloads>"\Program Files\R\R-2.13.1\bin"\R CMD
INSTALL --build synapseClient

I can confirm that it is bi-arch as follows:
C:\Users\dtenenba\Downloads>unzip  -l synapseClient_0.9.1.zip|grep dll
   10752  08/18/11 13:00   synapseClient/libs/i386/synapseClient.dll
   14336  08/18/11 13:00   synapseClient/libs/x64/synapseClient.dll

I was also able to "library(synapseClient)" with both
R --arch i386
and
R --arch x64

This was on a 64-bit Windows Server 2008 machine. If you get different
results on a 32-bit windows machine we can look at it together
tomorrow, but I might suggest playing with some of the flags shown by
R CMD INSTALL --help, especially --force-biarch or maybe
--merge-multiarch.

As for mac builds, I think you can ignore what I said on the phone, the command
R CMD INSTALL --build synapseClient
*appears* to build a bi-arch version of the package:

dhcp151060:sage dtenenba$ tar ztf synapseClient_0.9.1.tgz |grep "\.so\$"
synapseClient/libs/i386/synapseClient.so
synapseClient/libs/x86_64/synapseClient.so

Plus I was able to "library(synapseClient)" in both
R --arch i386
and
R --arch x86_64

Hope this helps...
Dan

one quick clarification--you won't need to set up the 2.12 and 2.14
directories, since everyone will be running R-2.13 or at least
installing packages built with R-2.13.
I do recommend putting a 2.13 directory at the top level, though,
since later you may give courses with different versions of R and you
don't want to run into the problem I describe below about the
non-versioned src/contrib directory. I can explain more clearly
tomorrow...
Dan

Unix Builds

On Linux and other Unices, there are some prerequisites, and these are
best installed by using the package manager that comes with the
distribution (apt-get, zypper, etc.) The names of the packages you
need vary from one distribution to the next.
There is a partial list for Ubuntu here, under Materials:
http://bioconductor.org/help/course-materials/2011/BioC2011/

For some Linux distros you can do a complete R install just using a
package manager; see the links under:
http://cran.fhcrc.org/bin/linux/

I recommend trying this out on your own to get a feel for it. If there
is a website associated with your class, post as much of this info as
you can. We tend to have 2 or 3 roving support people wandering around
the room trying to resolve issues that come up without forcing the
presenter/teacher to get derailed.

Leopard versus Snow Leopard

---------- Forwarded message ----------
From: Simon Urbanek <simon.urbanek@r-project.org>
Date: 2010/11/23
Subject: Re: building on Snow Leopard
To: Dan Tenenbaum <dtenenba@fhcrc.org>
Cc: Hervé Pagès <hpages@fhcrc.org>


Hi Dan,

sorry I missed that e-mail since I was traveling abroad (BioC devel
meeting in Heidelberg).

On Nov 18, 2010, at 4:50 PM, Dan Tenenbaum wrote:

> Hi Simon,
>
> First of all, let me introduce myself--I'm the newest member of the Bioconductor team and will be helping with the build system, among other things.
>

Great!


> I understand that you are the expert on building R packages for Mac, and I had a question for you.
>
> As you can see below, we are beginning to build bioconductor packages experimentally on a Snow Leopard machine. We are not going to distribute build products from this machine until we are totally sure they will work for all Mac users.
>
> Can you think of any pitfalls we should be aware of? Will packages built on Snow Leopard work without issues on Leopard? Does CRAN use Snow Leopard to build Mac binaries? If not, why not? We probably won't do it unless and until CRAN does it.
>

Unfortunately SL uses different load commands in binaries such that
those are not recognized on 10.5 and earlier. Also library versions
are higher in SL so the same issues that existed when building 10.4
binaries on 10.5 still apply as well.

Apple provides a way to cross-compile for older OS versions by using
corresponding SDKs which contain old libraries plus flags to enforce
backwards compatibility in binaries. So for example you can use
-sysroot /Developer/SDKs/MacOSX10.5.sdk -mmacosx-version-min=10.5
flags to compile and link for 10.5. So one way to do that would be to
modify all the CC, CXX, FC, etc. flags to use it. However, that won't
always work with external libraries, so that's why I used the extra
compiler driver as mentioned in my previous e-mail. (BTW don't forget
to symlink /usr/local inside the SDK otherwise it won't be found).

The real problem, though, is that you have no easy way to check the
result. You can't really check whether the resulting code works on
10.5 unless you really run it on 10.5. That's why I still have 10.5 on
the build machine as it makes my life much easier ;). The only reason
I was hacking the compilers for 10.6 was so that I can test the
production of 10.5 binaries on my 10.6 desktop.

As I said, it would be possible to provide a specific SL binary of R,
but for me so far the benefits don't outweigh the maintenance cost - I
was quite happy to drop 10.4 support when we had both 10.5 and 10.4
builds in parallel, so I'm not keen on re-introducing that feature ;).
However, I may possibly provide an unofficial binary on the
r.research.att.com site -- I'm not 100% sure about that yet, but it's
an option.

Cheers,
Simon



>
> ---------- Forwarded message ----------
> From: Dan Tenenbaum <dtenenba@fhcrc.org>
> Date: Thu, Nov 18, 2010 at 1:39 PM
> Subject: experimental Snow Leopard builds
> To: bioc-devel@stat.math.ethz.ch
>
>
> You may have noticed that there is a new machine listed in the build reports for the development version of Bioconductor:
>
> http://bioconductor.org/checkResults/devel/bioc-LATEST/
>
> The machine is called "petty" and is running Snow Leopard (Mac OS X 10.6.4).
>
> It is important to note that this machine is NOT building the Mac binary packages that you download from our website or with biocLite().
> Those are still produced by "pelham" (running Leopard, Mac OS X 10.5.8).
>
> We will not use "petty" to generate the Mac binary packages we distribute, until we are absolutely certain that it is safe to do so, that the packages build correctly and will run on all Macs running Leopard or Snow Leopard.
>
> You may notice in the build report that some packages build without errors on pelham but not on petty. This is likely due to configuration issues on petty and is probably not a problem with the package itself. We will be smoothing out these rough edges over the next few weeks.
>
> Please let us know if you have any questions or concerns.
>
> Thanks,
> Dan / Bioconductor team
>
>