JSON/Script Goose

What It Is

A Gaggle goose that can be driven from the command line or from scripts written in any language.

I refer to it interchangeably as the Script Goose and the JsonGoose.

This is a prototype that will probably not be developed further. We do intend to support communicating with Gaggle using JSON, but that support will likely take a different form.

How It Works

This goose is an ordinary goose written in Java which communicates with the Gaggle via the RMI protocol.

This goose contains an embedded HTTP server that exposes boss and goose methods via JSON-RPC, using the jabsorb implementation. JSON (pronounced “Jason”) stands for JavaScript Object Notation and is a lightweight text format for describing objects. It is the native format for object literals in JavaScript and ActionScript, and parsers exist for many languages, including R.

Any program can send requests to this embedded HTTP server; for convenience, we supply a client program written in Ruby that simplifies most operations and shields the user from needing to know anything about JSON. It accepts and produces Gaggle broadcasts using a very simple text format.

Disclaimer

This code is totally kludgy, hacky, and experimental. Many things are done in weird ways and are not guaranteed to work. In particular, it has not been tested with large broadcasts.

This project is a stopgap solution to fill some needs. We are considering adding additional capabilities to the Gaggle, along the lines of an Enterprise Service Bus (without breaking compatibility with the current Gaggle). This would provide a more robust solution to the issues addressed here.

Known Issues

These issues are being worked on at the present time.

* I am not sure which is the best text format to use for serializing networks with node and edge attributes. Until that is decided, raw JSON will be used. More about this below.
* Tuples are not supported at this time.

How To Use The Script Goose

First, start the Gaggle Boss by clicking on this link, or start it from the command line as follows:

javaws http://gaggle.systemsbiology.net/2007-04/boss.jnlp

Note: Normally, the Boss starts automatically. The script goose is still in the experimental phase so it is recommended to start the Boss manually to avoid any problems.

Next, start the script goose by clicking on this link, or start it from the command line as follows:

javaws http://gaggle.systemsbiology.net/2007-04/jsongoose/jsongoose.jnlp

Now, download the client software. The simplest way to do this is to right-click on this link and save the resulting file as client.rb.

On Mac and Unix systems, you can then make this file executable with the following command line:

chmod +x client.rb

Setting up your Ruby Environment

Before you can run this Ruby script, you must make sure you have Ruby and its JSON library installed properly.

If you are on a Mac or Linux/Unix system, Ruby is probably already installed. If not, install it from http://ruby-lang.org.

Note to Windows users at ISB: Install Ruby in some location underneath your home directory, otherwise IT intervention will be required for installation and every time you want to add a Ruby library.

Installing the JSON Library For Ruby

Users of all operating systems must complete this step. At a command line, issue this command (Mac/Unix users may need to prepend sudo to this command):

gem install json

If that does not work for any reason, try installing the pure Ruby variant, as follows:

gem install json_pure

More information is available at http://json.rubyforge.org/.

Running the Client Program

You can now run the client program by issuing the following command:

ruby client.rb

This produces the following help message:

JsonGoose command-line client.
Full documentation at:
http://gaggle.systemsbiology.net/wiki/doku.php?id=json_script_goose

command-line switches:
--name/-n name: specify a name for the broadcast
--species/-s species:    specify a species name for the broadcast
--target/-t target:    specify the name of a goose to broadcast to
--file/-f filename: gets input from filename or standard input if filename is "-"
--execute/-e:   executes arbitrary JSON-RPC code contained in file specified by --file/-f flag
--getgoosename: gets the name of this goose
--getgoosenames:    gets names of all geese, including this one
--hide: hides target goose (specified by --target/-t flag)
--show: shows target goose (specified by --target/-t flag)
--showjson shows JSON received
--quit: quits the Script Goose
--getmethodnames:   gets a list of methods which can be called via JSON-RPC
--broadcast:    broadcasts the payload specified by the --file/-f flag
--getbroadcast: gets latest broadcast received, or waits for broadcast
--suppressmetadata: suppresses metadata lines (starting with #) at beginning of broadcast output
--help: prints this help message

This should be fairly self-explanatory. For ease of exploration, fire up a Sample Goose by clicking here, or start it from the command line as follows:

javaws http://gaggle.systemsbiology.net/2007-04/sample.jnlp

To see what your Script Goose is called, issue this command:

ruby client.rb --getgoosename

To get a list of all geese currently in the Gaggle:

ruby client.rb --getgoosenames

Sending Broadcasts

There are two ways to use this client to send broadcasts to Gaggle. One is to put the broadcast data in a file, and the other is to pipe it to the client program. Here are examples of each. To send a namelist broadcast, simply create a file containing the names you want to broadcast, and call it names:

Example names file:

name1
name2
name3

Once you have this file, you can broadcast it by telling the client to open the file and broadcast it:

ruby client.rb --broadcast --file names

This should send a broadcast; check your sample goose to make sure. Note that the client program automatically figured out the broadcast type (namelist in this case).

The other way to send a broadcast is to pipe the broadcast text to the client program, specifying ‘-’ as the filename. (On Windows, use the type command instead of the cat command.) Example:

cat names | ruby client.rb --broadcast --file -

In this example we are simply piping the contents of a file to the client program; however, this shows how a broadcast could be generated directly by a script. Furthermore, the output of the client.rb script can in turn be piped to other scripts.
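
As a rough illustration of generating a broadcast directly from a script, the following Ruby sketch pipes a list of names to the client program. It uses only flags shown in the help message above. The helper names (broadcast_args, broadcast_names) are made up for this example, and it assumes client.rb is in the current directory and that the Boss and the Script Goose are already running.

```ruby
require 'open3'

# Path to the client script downloaded earlier (assumption: current directory).
CLIENT = 'client.rb'

# Build the argument list for a broadcast whose payload is read from
# standard input. Only flags listed in the client's help message are used.
def broadcast_args(target: nil, species: nil, name: nil)
  args = ['ruby', CLIENT, '--broadcast', '--file', '-']
  args += ['--target', target] if target
  args += ['--species', species] if species
  args += ['--name', name] if name
  args
end

# Hypothetical helper: send a name list, one name per line, on the
# client's standard input and return whatever the client prints.
def broadcast_names(names, **opts)
  payload = names.join("\n") + "\n"
  output, status = Open3.capture2(*broadcast_args(**opts), stdin_data: payload)
  raise "client exited with status #{status.exitstatus}" unless status.success?
  output
end

# Example call (needs a running Boss and Script Goose):
#   broadcast_names(%w[name1 name2 name3], target: 'Sample')
```

Passing the command as an argument list rather than a single shell string avoids quoting problems with species or broadcast names that contain spaces.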

You can further refine your broadcast command with the "--target", "--species", and "--name" flags.

Other Broadcast File Formats

In the same way that a name list is just a file with a list of names, one on every line, the client program accepts other simple text formats:

Example Cluster:

gene1   gene2   gene3   gene4
condition1      condition2

This is a two-line text file. The first line contains genes of interest, separated by tabs. The second line contains conditions in which the genes are interesting, separated by tabs.

Example Matrix:

GENE    c1      c2      c3
g1      1.0     2.1     3.2
g2      2.2     3.3     4.4
g3      6.6     7.7     8.8
g4      5.4     3.2     1.1

This is a fairly self-explanatory tab-delimited matrix file.
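
A script can produce these formats with a few lines of code. The Ruby sketch below is one possible way to do it; the function names and the hash-of-arrays layout for matrix rows are assumptions made for the example, and the "GENE" corner label is simply copied from the sample matrix above (the source does not say whether the client requires that exact label).

```ruby
# Cluster: genes on the first line, conditions on the second,
# each line tab-separated.
def cluster_text(genes, conditions)
  [genes.join("\t"), conditions.join("\t")].join("\n") + "\n"
end

# Matrix: a header row followed by one tab-separated row per gene.
# `rows` maps a gene name to its array of values (hypothetical layout).
def matrix_text(columns, rows, corner: 'GENE')
  lines = [[corner, *columns].join("\t")]
  rows.each do |gene, values|
    unless values.length == columns.length
      raise ArgumentError, "row #{gene} has #{values.length} values, expected #{columns.length}"
    end
    lines << [gene, *values].join("\t")
  end
  lines.join("\n") + "\n"
end

# Example:
#   print matrix_text(%w[c1 c2 c3], 'g1' => [1.0, 2.1, 3.2], 'g2' => [2.2, 3.3, 4.4])
```

The resulting text can be written to a file or piped to the client with --broadcast --file - as described under Sending Broadcasts.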

Networks

It is a bit more troublesome to come up with a simple text format for networks. SIF is very simple of course, but does not handle attributes, and NOA and EDA formats rely on having multiple files, one for each attribute, and contain a lot of redundant information. So being able to express a network, with attributes, in a single file is something I want to give more thought to. There are formats like GraphML and GML but it might be overkill to build in a parser for one of those. If you can think of a good, simple, non-bloated format for describing networks, let me know.

In the meantime, the script goose uses the raw JSON serialization of the class JsonSerializableNetwork. You can see what this looks like here. Note that the JSON is a bit bloated by Java class hinting; this is not necessary in most cases, but it is here because this snippet originated as a broadcast from Java. More information is available at http://jabsorb.org/Manual#head-11bc691cce47ec200722a1c436efe0f4376b6c1a.

Tuples

Tuples are not yet supported, but adding support ought to be fairly easy. JSON seems like a logical way to represent this kind of data structure. We’ll see if that makes sense in practice.

Receiving Broadcasts

Each time the Script Goose receives a broadcast from another Gaggle Goose, it stores the broadcast so that it can be downloaded by the command-line client. Only the most recent broadcast is kept: if you send two broadcasts to the Script Goose, the second replaces the first.

The way to receive a broadcast is as follows:

ruby client.rb --getbroadcast

This command will do one of two things. If the Script Goose has received a broadcast, this command will immediately print out the broadcast in one of the simple text formats described above. If the Script Goose has not received a broadcast, this command will wait until a broadcast is sent, and then immediately print out the broadcast.

Note that, as with sending broadcasts, the client program automatically determines the type of the broadcast.

Script Metadata

If you run the above command, and then send a namelist broadcast from the sample goose (pressing the ‘B’ button), the output looks like the following:

#name=
#species=Saccharomyces cerevisiae
#source=Sample
YFL036W
YFL037W
YLR212C
YLR213C
YML085C
YML086C
YML123C
YML124C

The actual broadcast data is preceded by metadata lines (starting with #) which tell you the broadcast name, species name, and source goose name.

If you do not want this metadata in your broadcast output, add the '--suppressmetadata' flag:

ruby client.rb --getbroadcast --suppressmetadata
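
If your script wants the metadata rather than discarding it, the leading # lines are easy to separate from the data. The following Ruby sketch shows one way to do that, based only on the sample output above; the function name parse_broadcast is made up for this example, and it assumes every metadata line has the form #key=value and appears before the data.

```ruby
# Split client output into a metadata hash and the remaining data lines.
def parse_broadcast(text)
  metadata = {}
  lines = text.lines.map(&:chomp)
  # Metadata lines come first; stop at the first line that is not "#key=value".
  while (line = lines.first) && (m = line.match(/\A#([^=]+)=(.*)\z/))
    metadata[m[1]] = m[2]
    lines.shift
  end
  [metadata, lines]
end

# Example, where `output` holds the text printed by --getbroadcast:
#   metadata, names = parse_broadcast(output)
#   metadata['species']
```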

Examples of Use

You could do something like the following (this assumes you have already started the Boss and the Script Goose as described at the top of this page).

  • Broadcast a matrix from the Data Matrix Viewer (DMV) to the Script Goose.
  • Using a script written in any language, retrieve the matrix by calling the command ruby client.rb --getbroadcast from your script.
  • Start up an MeV goose.
  • Your script can then perform some arbitrary processing on the matrix, and then pipe it to the client program in the tab-delimited format described above, telling the client to broadcast the processed matrix to MeV. This would be the equivalent of the following command:

cat matrix.txt | ruby client.rb --broadcast --file - --species "Halobacterium sp. NRC-1" --target "Multiple Array Viewer" --name "my processed matrix"

Your custom script could tie together many such commands to enable lots of functionality, such as automatic rebroadcasting.
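
The workflow above might look roughly like this in Ruby. This is only a sketch: the function names are invented for the example, the row-centering step stands in for whatever processing you actually need, and it assumes client.rb is in the current directory and that a matrix received with --getbroadcast arrives in the same tab-delimited layout shown under Other Broadcast File Formats.

```ruby
require 'open3'

# Parse tab-delimited matrix text into [header, rows], where each row is
# [gene, [values...]]. Blank lines and metadata lines (#...) are skipped.
def parse_matrix(text)
  lines = text.lines.map(&:chomp).reject { |l| l.empty? || l.start_with?('#') }
  header = lines.shift.split("\t")
  rows = lines.map do |line|
    gene, *values = line.split("\t")
    [gene, values.map { |v| Float(v) }]
  end
  [header, rows]
end

# Stand-in processing step: subtract each row's mean from its values.
def center_rows(rows)
  rows.map do |gene, values|
    mean = values.sum / values.length
    [gene, values.map { |v| (v - mean).round(4) }]
  end
end

# Turn the header and rows back into tab-delimited text.
def format_matrix(header, rows)
  ([header] + rows.map { |gene, values| [gene, *values] })
    .map { |fields| fields.join("\t") }.join("\n") + "\n"
end

# End to end: wait for a matrix, process it, and broadcast the result.
# Needs a running Boss, Script Goose, and target goose.
def rebroadcast(target:, name:)
  received, = Open3.capture2('ruby', 'client.rb', '--getbroadcast', '--suppressmetadata')
  header, rows = parse_matrix(received)
  processed = format_matrix(header, center_rows(rows))
  Open3.capture2('ruby', 'client.rb', '--broadcast', '--file', '-',
                 '--target', target, '--name', name, stdin_data: processed)
end

# Example:
#   rebroadcast(target: 'Multiple Array Viewer', name: 'my processed matrix')
```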

Bypassing the client script (Calling the goose with JSON-RPC directly)

You can call the JSON Goose directly by sending a POST request containing well-formed JSON-RPC commands to this url:

http://localhost:9010/JSON-RPC

The objects exposed are called “boss” and “goose”, and the methods you can call are defined in these Java classes: http://gaggle.systemsbiology.net/svn/gaggle/gaggle/trunk/src/main/org/systemsbiology/gaggle/core/Boss.java and http://gaggle.systemsbiology.net/svn/gaggle/jsongoose/trunk/src/main/java/org/systemsbiology/gaggle/geese/jsongoose/JsonGoose.java

Note that for broadcasting you should not call the boss.broadcast...() methods; call the goose.broadcast...() methods instead. For these you don’t need to specify the name of the source goose; that gets filled in on the Java side.

The format for a JSON-RPC function call is described in the JSON-RPC Specification.

(examples will be added here)
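
A minimal Ruby sketch of such a request follows. Several things here are assumptions rather than confirmed behavior of this goose: that the request uses the JSON-RPC 1.0 shape (id, method, params), that methods are addressed as "object.method" as jabsorb normally does, and that boss.getGooseNames exists and takes no arguments. Run the client with --getmethodnames to see which methods are actually available. The helper names rpc_request and rpc_call are made up for the example.

```ruby
require 'json'
require 'net/http'
require 'uri'

# URL given above for the Script Goose's embedded HTTP server.
ENDPOINT = URI('http://localhost:9010/JSON-RPC')

# Build a JSON-RPC request body.
# Assumption: methods are named "object.method", e.g. "boss.getGooseNames".
def rpc_request(object, method, params = [], id: 1)
  JSON.generate('id' => id, 'method' => "#{object}.#{method}", 'params' => params)
end

# POST the request and return the "result" member of the reply,
# raising if the reply carries an "error" member instead.
def rpc_call(object, method, params = [])
  response = Net::HTTP.post(ENDPOINT, rpc_request(object, method, params),
                            'Content-Type' => 'application/json')
  reply = JSON.parse(response.body)
  raise "JSON-RPC error: #{reply['error']}" if reply['error']
  reply['result']
end

# Example (needs a running Script Goose; the method name is an assumption):
#   p rpc_call('boss', 'getGooseNames')
```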

Source Code

You can check out the source code from the Subversion repository by using this command:

svn checkout http://gaggle.systemsbiology.net/svn/gaggle/jsongoose/trunk/ jsongoose

That will create a directory called ‘jsongoose’ and download the source code into that. Note: I need to write an Ant build script to automate the build. The Maven pom.xml file will probably not work.

You can also browse the source code by going to http://gaggle.systemsbiology.net/svn/gaggle/jsongoose/trunk/.

Comments and Suggestions

All types of feedback are most welcome. Send me email, edit this page, change the source code directly, do whatever you like (but let me know what you are doing).

Thanks.

Dan

dtenenbaum@systemsbiology.org

 
json_script_goose.txt · Last modified: 2009/08/11 14:08 by 10.10.2.164
 