In biology, there are many naming systems for ORFs, genes, and their products. The Translator attempts to manage some of that complexity by allowing relatively painless conversion between one naming system and another.
Different naming systems are often mutually inconsistent, so mapping between them is destined to be a lossy process. That's lossy as in lossy data compression, not loosey as in loosey-goosey.
That said, the software supports a loose definition of translation, encompassing scenarios like mapping peptides to genes or mapping across species via COG membership. These go beyond simply exchanging one naming system for another. Maintaining the desired degree of rigor is up to the user's judgement.
In lieu of consistency, the Translator gives the user control. There are options to configure how the translation is performed and it is easy to modify an existing translation or load up a completely new one. This involves editing a tab-delimited text file.
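As a rough illustration of what such a file might hold, here is a minimal Python sketch. It assumes a two-column layout (source term, a tab, target term) with optional '#' comment lines. That layout, the function name load_mapping, and the sample terms are assumptions made for illustration; the Translator's actual file format may differ.

```python
import csv
import io


def load_mapping(handle):
    """Read source<TAB>target pairs into a dict of source -> list of targets.

    Assumed layout (hypothetical): one pair per line, tab-separated,
    with blank lines and lines starting with '#' ignored.
    """
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comments, and rows without two columns.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        source, target = row[0].strip(), row[1].strip()
        targets = mapping.setdefault(source, [])
        # Keep file order and ignore exact duplicate pairs.
        if target not in targets:
            targets.append(target)
    return mapping


# Made-up terms, standing in for a real mapping file on disk.
example_file = io.StringIO("# source\ttarget\na\tx\na\ty\nb\tz\n")
mapping = load_mapping(example_file)
```

Keeping a list of targets per source term preserves any one-to-many relationships present in the file, so the decision about how to handle them can be made later, at translation time.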
If the Allow One-to-Many option is set, as it is by default, one term in the source namespace may map to more than one term in the target namespace. So, if you're translating a list of gene names, you may start with 10 names and end up with 11. Networks may expand during translation, and data matrices may grow extra rows.
If Allow One-to-Many is false, a term in the source namespace will map to at most one term in the target namespace. If a translates to x, y, and z, the software has no way of knowing which of x, y, or z is the preferred translation, so it makes the choice arbitrarily. If this is a concern, a hand-curated one-to-one mapping is probably better.
The Drop Untranslatable Terms option controls what happens when there is no translation for a given term. The term will either be dropped or simply translate to itself.
To get a strict one-to-one mapping, set both Allow One-to-Many and Drop Untranslatable Terms to false. Then every element in the source data structure will map to exactly one element in the translated data structure.
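The interplay of the two options can be sketched in a few lines of Python. This is an illustration of the behavior described above, not the Translator's own code: the function translate, its parameter names, and the sample mapping are hypothetical, and "arbitrary" is modeled here as "first target listed".

```python
def translate(terms, mapping, *, drop_untranslatable, allow_one_to_many=True):
    """Translate a list of terms using a dict of source -> list of targets."""
    result = []
    for term in terms:
        targets = mapping.get(term, [])
        if not targets:
            # No translation: drop the term, or let it translate to itself.
            if not drop_untranslatable:
                result.append(term)
        elif allow_one_to_many:
            # Every known target is emitted, so the list may grow.
            result.extend(targets)
        else:
            # Arbitrary choice among the targets (here: the first one).
            result.append(targets[0])
    return result


# Hypothetical mapping: a has three translations, c has none.
mapping = {"a": ["x", "y", "z"], "b": ["w"]}
terms = ["a", "b", "c"]

expanded = translate(terms, mapping, drop_untranslatable=False)
strict = translate(terms, mapping, drop_untranslatable=False,
                   allow_one_to_many=False)
dropped = translate(terms, mapping, drop_untranslatable=True)
```

With both options false, the output has the same length as the input, which is the strict one-to-one case. With one-to-many allowed, the three-term input expands to five terms.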
One goal of the Translator is to support mapping between species via orthology.
A quick demo
Use HomoloGene data to map orthologs among several model organisms. The data file contains 44449 sets of homologous gene IDs from 20 organisms.
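One way to picture this kind of mapping is as a lookup through shared group membership: two genes from different organisms are treated as translations of each other when they belong to the same homology set. The Python sketch below assumes each record is a (group ID, organism, gene ID) triple. The record layout, the function build_ortholog_map, and all identifiers are placeholders invented for illustration, not real HomoloGene entries or the Translator's implementation.

```python
from collections import defaultdict

# Placeholder records: (group_id, organism, gene_id). Not real data.
rows = [
    ("g1", "human", "H1"),
    ("g1", "mouse", "M1"),
    ("g1", "mouse", "M2"),
    ("g2", "human", "H2"),
    ("g2", "yeast", "Y1"),
]


def build_ortholog_map(rows, source_org, target_org):
    """Map each source-organism gene to the target-organism genes in its group."""
    # group_id -> organism -> list of gene ids
    groups = defaultdict(lambda: defaultdict(list))
    for group_id, organism, gene_id in rows:
        groups[group_id][organism].append(gene_id)

    mapping = {}
    for members in groups.values():
        targets = members.get(target_org, [])
        if not targets:
            continue  # this group has no gene from the target organism
        for source_gene in members.get(source_org, []):
            mapping.setdefault(source_gene, []).extend(targets)
    return mapping


human_to_mouse = build_ortholog_map(rows, "human", "mouse")
human_to_yeast = build_ortholog_map(rows, "human", "yeast")
```

In this toy data, H1 maps to two mouse genes, which is the one-to-many situation, and H2 has no mouse counterpart, which is the untranslatable situation.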
© 2006, Institute for Systems Biology, All Rights Reserved