Meta Data Loader

The Meta Data Loader is a software application that, with user input, formats tabular scientific data into EMI-ML. It is designed as an entry point to the suite of software that is being developed within the Gaggle.


The Meta Data Loader software, including source code, is freely available under the LGPL. Following the Gaggle prerequisites is not required to run the Data Loader, but it is recommended to ensure complete functionality.

Software Download

The Data Loader software can be retrieved as an executable JAR file, launched through Java Web Start, or checked out directly from a Subversion repository.

JAR File: The executable file can be found here (right click to "Save Link As...").

Java Webstart: A JNLP version of the data loader can be found here.

Source Code: The source code for the Meta Data Loader is stored within the SVN repository for the Gaggle. Please click here for code access details.

Please contact Michael Johnson for any comments or concerns regarding the use of this software. Thanks!

Demonstration: Systems Analysis of Halobacterium NRC-1 responses to metal stress

This demonstration illustrates the initial phases of data loading and investigation using the Gaggle computational tools.

Laboratory Introduction

Note: these data were used in the Baliga lab publication, "A systems view of haloarchaeal strategies to withstand stress from transition metals".

The primary question being addressed is, "When the Halobacterium system is perturbed, which systems are affected and how?" To answer this, we chose to perform a series of microarray experiments and collect the output data from the microarray scanner. Within the laboratory, we kept the following items constant:

  1. All samples are collected using a benchtop fermentor.
  2. Halobacterium cells are grown in 1.5 L of media to mid-log phase (OD ~0.6 - 0.7).
  3. The pH is held at 7.0 (by automatically pumping in 0.5 N H2SO4 or 0.5 M NaOH).
  4. The temperature is set at 37.0 °C.
  5. Dissolved oxygen (dO2) is monitored with an O2 sensor.

For this particular example, we will vary the concentration of Fe2+ in the growth medium.
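
The design above (a set of fixed conditions plus one varied factor) can be sketched as a table of shared defaults with a per-sample override. This is only an illustration of the idea, not the Meta Data Loader's internal format: the key names, the `build_conditions` helper, and the sample names after `Fe_sample1` are assumptions. The Fe values are the ones entered later in this walkthrough, where no units are stated.

```python
# Conditions held constant across all samples (values from the list above).
# Key names are illustrative and not taken from the Meta Data Loader.
CONSTANTS = {
    "culture_volume_L": 1.5,
    "pH": 7.0,
    "temperature_C": 37.0,
}


def build_conditions(fe_levels, prefix="Fe_sample"):
    """Return one factor dict per sample: shared constants plus its own Fe level."""
    conditions = {}
    for index, fe in enumerate(fe_levels, start=1):
        factors = dict(CONSTANTS)  # copy so samples do not share one dict
        factors["Fe"] = fe
        conditions[f"{prefix}{index}"] = factors
    return conditions


# Fe values as entered later in the walkthrough (units not stated there).
conditions = build_conditions([2000, 4000, 6000, 7000])
for name, factors in conditions.items():
    print(name, factors)
```

Keeping the constants in one place and copying them per sample mirrors the idea of a standardized list of variables that is reused from one experiment to the next.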

Data Transformation

  1. Download the Meta Data Loader via Java Web Start or download the JAR file directly.
  2. Download iron response data:
    1. a data file of log 10 ratios of mRNA expression
    2. a data file of confidence values
  3. Download the experimental constants
  4. Start the Meta Data Loader

  5. Click on "Import File..." and select the metal response data from the file chooser
    • This is a tab-delimited file. Press "OK" when prompted to choose the delimiter.
  6. A snapshot of the data (first 50 lines) should now be visible. Click on the first column, which contains the row identifiers, and then click the "Set IDs" button. The ID column will be highlighted.
    • Note that the identifiers must be unique.
  7. Next, the list of experiment condition names is identified. Click on the second row, which contains "GENE", followed by experiment names. Click the "Conditions" button, at which point the Conditions row will be highlighted.

  8. At this point, we have a list of genes and a list of experiments, but we still need to assign data to each experiment-gene combination. To accomplish this, we need to identify the data type that is in our file. Drag your mouse over the four columns of numerical data (ratios) and then click the "+" button at the bottom of the data table. This indicates that you wish to add these data. A prompt will ask you to identify the data type. These are log 10 ratios. On the left panel of the Meta Data Loader, a list of experimental conditions should appear with "Log 10 Ratios" under each.
  9. Repeat steps 5 - 8 with the file of confidence values, but identify the data type as "Lambdas" (confidence values). Afterwards, each experimental condition will have "Log 10 Ratios" and "Lambdas".
    • The data have been identified, but now we need to add the experimental context to these experiments.
  10. Click "next>>" at the bottom-right corner.
    • A prompt will ask which data type is the "primary" data. Since lambda values are confidence scores assigned to the log 10 ratios, select "Log 10 Ratios".
  11. Next you will place the dataset in a catalog of experiments organized by experimental condition.
    • In the tree view, select environmental, metals.
    • Click the insert button.
    • Enter 'sample' at the prompt.

  12. If you would like to look at all of the data conditions that are available, press the "+" button at the bottom of the data tree. This will expand the tree.
  13. Click "next >>"
    • You will (intentionally) be notified that there is a conflict. Each experiment needs a unique name so that the data can be identified and integrated accurately in the Data Matrix Viewer and downstream software.
    • Type "Fe_sample1" and press Enter.

    Finally, add the factors.

  14. Click on "Load Factors..." and select the file of experimental constants that you downloaded in step 3. This will load a standardized list of variables that can be used from one experiment to the next.

  15. Click on the "Fe" cell for Fe_sample1 to highlight it. Then, click the "Change Factor Value..." button above the table of variables.
    • Enter "2000" into the dialog box.

  16. Click on the "Fe" cells for samples 2 - 4 and give them values of 4000, 6000, and 7000, respectively.
  17. Also, suppose a new variable needs to be added. For example, we would like to capture the rotation speed of the agitator within our fermentor. To accomplish this, click on the "Add New Factor..." button and enter the following information:

    • New Factor Name: Agitation
    • Factor Units: RPM
    • Default Value: 60

    Agitation now appears at the bottom of the list of variables with a default value of 60 RPM.

  18. Click "next >>". You will be asked to name your experiment. Enter 'sample'.
  19. Select a directory in which to save the output.
  20. Finished!
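
The file layout assumed by the steps above (identifiers in the first column, condition names in the second row, one file per data type) can be sketched in a few lines of Python. This is a minimal illustration of that layout and of the unique-identifier requirement, not the Meta Data Loader's own code: `parse_matrix`, `merge_types`, the gene names, the title line, and all numeric values are made up for the example.

```python
import csv
import io


def parse_matrix(text, id_column=0, condition_row=1):
    """Parse tab-delimited text into (condition names, {row id: {condition: value}})."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]
    header = rows[condition_row]
    conditions = [name for i, name in enumerate(header) if i != id_column]
    data = {}
    for row in rows[condition_row + 1:]:
        row_id = row[id_column]
        if row_id in data:
            # Identifiers must be unique, as noted for the "Set IDs" step.
            raise ValueError(f"duplicate identifier: {row_id}")
        values = [float(cell) for i, cell in enumerate(row) if i != id_column]
        data[row_id] = dict(zip(conditions, values))
    return conditions, data


def merge_types(primary, secondary, primary_name="Log 10 Ratios",
                secondary_name="Lambdas"):
    """Attach both data types to every gene-condition pair."""
    return {
        row_id: {
            cond: {primary_name: value, secondary_name: secondary[row_id][cond]}
            for cond, value in by_condition.items()
        }
        for row_id, by_condition in primary.items()
    }


# Made-up example files: a title line, then the "GENE" row, then data rows.
RATIOS = "example ratios\nGENE\tFe_sample1\tFe_sample2\ngene_a\t0.25\t-0.5\ngene_b\t-1.0\t0.75\n"
LAMBDAS = "example lambdas\nGENE\tFe_sample1\tFe_sample2\ngene_a\t12.0\t30.0\ngene_b\t45.0\t8.0\n"

conditions, ratios = parse_matrix(RATIOS)
_, lambdas = parse_matrix(LAMBDAS)
merged = merge_types(ratios, lambdas)
print(conditions)
print(merged["gene_a"]["Fe_sample2"])
```

Treating the log 10 ratios as the primary type and looking up lambdas by the same gene and condition reflects the choice made at the "primary" data prompt.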

Data Analysis using the Data Matrix Viewer

  1. Start the Gaggle Boss first, before any other Gaggle component.
  2. Start the Data Matrix Viewer (DMV), which gives a spreadsheet-like view of tabular data.
    • Load your microarray data:
      1. Choose File > Load > Open New Repository (Local File).
      2. Browse to the directory created in the data loader and click Open.
      3. The data values should now be visible in the DMV, allowing plotting, selection of differentially expressed genes, and broadcast to other Gaggle components for further analysis.
      4. For more information on using DMV go here.
  3. To start other gaggle components, visit the blank slate page.
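
As a rough illustration of what selecting differentially expressed genes can mean for these two data types, the sketch below keeps genes whose log 10 ratio is large in magnitude and whose lambda is high. It is not the DMV's implementation. The `select_genes` helper, the gene names, the values, and both thresholds are made up, and it assumes that a larger lambda indicates greater confidence.

```python
def select_genes(ratios, lambdas, min_abs_ratio, min_lambda):
    """Return IDs whose |log10 ratio| and lambda both meet the given thresholds."""
    return sorted(
        gene
        for gene, ratio in ratios.items()
        if abs(ratio) >= min_abs_ratio and lambdas[gene] >= min_lambda
    )


# Made-up values for a single condition; thresholds are arbitrary.
ratios = {"gene_a": 0.45, "gene_b": -0.05, "gene_c": -0.60, "gene_d": 0.50}
lambdas = {"gene_a": 40.0, "gene_b": 55.0, "gene_c": 22.0, "gene_d": 3.0}

selected = select_genes(ratios, lambdas, min_abs_ratio=0.3, min_lambda=15.0)
print(selected)
```

Requiring both conditions excludes genes with a confident but negligible change (gene_b here) as well as genes with a large but unreliable change (gene_d).
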
© 2006, Institute for Systems Biology, All Rights Reserved