
Gaggle Genome Browser

Genome Browser / R interoperability

Supporting code for integrating R and the Genome Browser is in the script genome.browser.support.R. A development version is available by entering the following command at the R prompt:

source("http://gaggle.systemsbiology.net/R/genome.browser.support.R")

Setup

The genome.browser.support.R script requires three libraries to be installed on your system: MeDiChI, RSQLite, and the Gaggle R Goose (the gaggle package). The script attempts to install RSQLite automatically; MeDiChI and gaggle must be installed manually.

Note: Make sure you have a recent version of the gaggle package from Bioconductor. For this demo, version 1.16 or later should work. See the instructions for installing the Gaggle package for R.
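
Before sourcing the script, it can help to check which of these packages are already present. The snippet below is only a sketch using base R calls; it is not part of genome.browser.support.R, and it assumes the packages are installed under the names given above.

```r
# Packages named in the Setup section; names assumed to match their installed names.
required <- c("MeDiChI", "RSQLite", "gaggle")

# requireNamespace() returns FALSE instead of raising an error when a package is absent.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing.pkgs <- required[!installed]

if (length(missing.pkgs) > 0) {
  message("Not installed: ", paste(missing.pkgs, collapse = ", "))
}

# The note above asks for gaggle 1.16 or later.
if (installed[["gaggle"]] && packageVersion("gaggle") < "1.16") {
  message("gaggle ", format(packageVersion("gaggle")), " is older than 1.16")
}
```

Anything reported as missing, other than RSQLite, has to be installed by hand as described above.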

Documentation

R docs here.

Example

Sending data to the genome browser

  1. For this demo, we need a running instance of the Gaggle Genome Browser (GGB) with a genome loaded. For example, you may want to use the E. coli K12 genome. To do that, start the genome browser by clicking here. Select File|New Dataset from the menu, start typing 'Escher...', then double-click Escherichia coli str. K12 substr. W3110 in the chooser. Click OK.

  2. Start R, load genome.browser.support.R, and connect the R goose to Gaggle. The script will try to load several libraries; see Setup above.

    source("http://gaggle.systemsbiology.net/R/genome.browser.support.R")
    gaggleInit()

  3. Send the dataset description from the GGB to R.

    In the GGB, open the Gaggle Toolbar. In the Gaggle data drop-down, select "Description of Dataset..." and click the broadcast button. In R, type:

    ds <- getDatasetDescription()

    This returns a nested list structure describing the dataset and its sequences and tracks. One of its fields, ds$filename, lets us connect directly from R to the SQLite database that holds the genome browser dataset.

  4. Extract a list of sequence names from the dataset description.

    getSequenceNames(ds)

  5. Get the length of the chromosome, which for E. coli is named chr. This is done by drilling down into the ds data structure.

    len <- ds$sequences$chr$length

  6. Generate some bogus data, or substitute your own data here.

    starts <- seq(1,len,100)
    track.fwd <- data.frame(sequence='chr',
                            strand='+',
                            start=starts,
                            end=starts+99,
                            value=sin(starts/1000.0))
    track.rev <- data.frame(sequence='chr',
                            strand='-',
                            start=starts+49,
                            end=starts+148,
                            value=sin(starts/900.0))
    track <- rbind(track.fwd, track.rev)

    Type head(track) to see a data.frame in the format required by addTrack().

  7. Create some attributes for our new track. This is optional as settings can be changed from inside the genome browser as well.

    attr <- list(color='0x804B0082',source='Finklestein, et al. 2009', top=0.20, height=0.15, viewer='Scaling', group='bogus data')

  8. Finally, we're ready to add our new track, which we're naming waves, to the genome browser.

    addTrack(ds, track, name='waves', attributes=attr)

    A pair of purple sine waves should appear, one for the forward strand and the other for the reverse strand.
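
Steps 5 and 6 can be tried without a running genome browser by substituting a hand-built stand-in for ds. The sketch below mocks only the one field those steps read; the real list returned by getDatasetDescription() contains more. The chromosome length is an illustrative value, not that of a real genome, and check.track is a hypothetical helper, not a function from genome.browser.support.R.

```r
# Hand-built stand-in for the list returned by getDatasetDescription().
# Only the field used in step 5 is mocked; the length is illustrative.
ds.mock <- list(
  sequences = list(chr = list(length = 10000))
)

# Step 5: drill down to the chromosome length.
len <- ds.mock$sequences$chr$length

# Step 6, forward strand only, to keep the example short.
starts <- seq(1, len, 100)
track <- data.frame(sequence = "chr",
                    strand   = "+",
                    start    = starts,
                    end      = starts + 99,
                    value    = sin(starts / 1000.0))

# Hypothetical helper: checks the column layout shown by head(track) in step 6.
# The strand check covers only the two values used in this walkthrough.
check.track <- function(track) {
  expected <- c("sequence", "strand", "start", "end", "value")
  stopifnot(is.data.frame(track),
            identical(names(track), expected),
            all(track$strand %in% c("+", "-")),
            all(track$start <= track$end),
            is.numeric(track$value))
  invisible(TRUE)
}

check.track(track)
nrow(track)   # 100 features of 100 bp each
```

With a live session, the same check could be run on the real track before it is passed to addTrack().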

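Step 3 notes that ds$filename points at the SQLite database behind the dataset. A minimal sketch of opening that file from R with DBI and RSQLite follows. The helper name list.dataset.tables is made up for this example, and the table layout of the file is not described here, so the sketch only lists table names.

```r
# Hypothetical helper (not part of genome.browser.support.R).
# Returns the table names in the SQLite file referenced by a dataset description,
# or character(0) if the file or the RSQLite package is unavailable.
list.dataset.tables <- function(ds) {
  path <- ds$filename
  if (is.null(path) || !file.exists(path) ||
      !requireNamespace("RSQLite", quietly = TRUE)) {
    return(character(0))
  }
  con <- DBI::dbConnect(RSQLite::SQLite(), dbname = path)
  on.exit(DBI::dbDisconnect(con))   # close the connection when the function exits
  DBI::dbListTables(con)
}

# With a live session: list.dataset.tables(ds)
# With a path that does not exist, the helper falls back to an empty result.
list.dataset.tables(list(filename = tempfile()))
```
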
© 2006, Institute for Systems Biology, All Rights Reserved