Guide for Choosing and Connecting to a Mapping Service

Note: This is a guide for developers.

Which mapper to choose?

Mappers fall into three broad categories: web services, relational databases, and flat files. Within each category, you have more fine-grained options:

  1. Web services. With web services, the mapping information is fetched from a server over the internet. These are easy to install and maintain, because the owner of the server takes care of that. On the other hand, you have to rely on an external organization, servers can go down, and you have no control over which version of the data is used.
    1. BridgeWebservice - provided by us, simple to use, and also does metabolite identifier mapping
    2. BioMart - most commonly used, provides mappings from Ensembl for a wide range of species
    3. PICR - particularly good at protein identifier mapping
    4. Synergizer - relatively fast web service that provides gene mappings from both NCBI and Ensembl
    5. CRONOS - an alternative web service
  2. Relational databases. Local databases give you maximum control and are also highly efficient because of low network latency. On the other hand, they are hard to set up, and it is your own responsibility to keep them up to date.
    1. BridgeDerby - To help you out a bit, we provide single-file databases based on the Derby relational database system. You can download and copy these files; just point BridgeDb to a valid Derby database file and it should work.
    2. Any other JDBC driver - If you've got time to spare, you can go for any relational database system such as MySQL, Oracle, PostgreSQL, mSQL, etc. We've tested extensively on MySQL.
  3. Flat files. Flat files are easy to create, so they give you maximum control, and they are sometimes your only option when you're dealing with custom identifiers, e.g. from your own microarray design or database.
    1. Tab-delimited text files - These consist of two columns. One column contains identifiers from one data source, the other column the corresponding identifiers from the other data source. If one identifier maps to multiple other identifiers, you have to create a separate row for each mapping. You can create the file by hand, with a script, or by exporting an Excel file to tab-delimited text.
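
As a rough illustration of the one-row-per-mapping rule for tab-delimited files, the sketch below expands a one-to-many map into rows. It uses only the Java standard library. The class name, method name, column names, and identifiers are all hypothetical placeholders, and the header row naming each column's data source is an assumption; check the FlatFiles page for the exact format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of BridgeDb.
final class TabDelimitedSketch {

    // Expands a one-to-many mapping into rows of "source<TAB>target".
    // An identifier with several targets yields one row per target.
    static List<String> toRows(String sourceColumn, String targetColumn,
                               Map<String, List<String>> mappings) {
        List<String> rows = new ArrayList<String>();
        // Assumption: the first row names the data source of each column.
        rows.add(sourceColumn + "\t" + targetColumn);
        for (Map.Entry<String, List<String>> entry : mappings.entrySet()) {
            for (String target : entry.getValue()) {
                rows.add(entry.getKey() + "\t" + target);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Placeholder identifiers, for illustration only.
        Map<String, List<String>> mappings = new LinkedHashMap<String, List<String>>();
        mappings.put("probe_1", Arrays.asList("gene_A", "gene_B"));
        mappings.put("probe_2", Arrays.asList("gene_C"));
        for (String row : toRows("MyArray", "MyGenes", mappings)) {
            System.out.println(row);
        }
    }
}
```

Here "probe_1" maps to two identifiers, so it appears on two rows.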

Note that you can often combine two mappers in a stack to get the advantages of both.
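
To make the idea of a stack concrete, here is a purely conceptual sketch in plain Java. It does not use the BridgeDb API: the SimpleMapper interface, the in-memory tables, and the identifiers are hypothetical, and the rule that a stack returns the union of the results of its mappers is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conceptual sketch only: these types are hypothetical and are not the BridgeDb API.
final class MapperStackSketch {

    // Minimal stand-in for a mapper: one identifier in, a set of mapped identifiers out.
    interface SimpleMapper {
        Set<String> map(String id);
    }

    // A mapper backed by an in-memory lookup table.
    static SimpleMapper fromTable(Map<String, Set<String>> table) {
        return id -> table.getOrDefault(id, Collections.<String>emptySet());
    }

    // Assumed stacking rule: ask every mapper in order and return the union of the results.
    static SimpleMapper stack(List<SimpleMapper> mappers) {
        return id -> {
            Set<String> combined = new LinkedHashSet<String>();
            for (SimpleMapper mapper : mappers) {
                combined.addAll(mapper.map(id));
            }
            return combined;
        };
    }

    public static void main(String[] args) {
        Map<String, Set<String>> general = new HashMap<String, Set<String>>();
        general.put("id_1", Collections.singleton("target_A"));

        Map<String, Set<String>> custom = new HashMap<String, Set<String>>();
        custom.put("id_1", Collections.singleton("target_B"));
        custom.put("custom_9", Collections.singleton("target_C"));

        SimpleMapper both = stack(Arrays.asList(fromTable(general), fromTable(custom)));
        System.out.println(both.map("id_1"));     // results from both mappers
        System.out.println(both.map("custom_9")); // only the custom mapper knows this one
    }
}
```

In this sketch a general-purpose mapper and a custom one sit behind a single lookup, so custom identifiers resolve alongside the standard ones.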

Details about each mapper

After you've decided which mapper(s) you want to use, you have to know how to connect to them. For each mapper, you need to know three things:

  • the name of the driver class to load
  • which Java libraries to include in the classpath
  • how to form a properly formatted connection string

To use a particular mapper, you need the right jar files on your project's classpath. Then you can connect to the mapper using the following pattern:

Class.forName("driver class");
IDMapper mapper = BridgeDb.connect("protocol:base?param1=value1&param2=value2");

A connection string consists of the following parts:

  • a protocol, e.g. "idmapper-jdbc" or "idmapper-text"
  • a base, usually a URL or the path to a file
  • parameters, which are often optional
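
The three parts can be made visible with a small parser. This is a minimal sketch of the general "protocol:base?param1=value1&param2=value2" pattern using only the Java standard library; the class is hypothetical, it is not how BridgeDb itself parses connection strings, and strings that deviate from the pattern get no special handling.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of BridgeDb.
final class ConnectionStringSketch {
    final String protocol;
    final String base;
    final Map<String, String> parameters;

    private ConnectionStringSketch(String protocol, String base, Map<String, String> parameters) {
        this.protocol = protocol;
        this.base = base;
        this.parameters = parameters;
    }

    static ConnectionStringSketch parse(String connectionString) {
        // The protocol ends at the FIRST colon; the base itself may contain colons (e.g. "http://").
        int colon = connectionString.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Missing protocol: " + connectionString);
        }
        String protocol = connectionString.substring(0, colon);
        String rest = connectionString.substring(colon + 1);

        // Everything after the first '?' is treated as the parameter list.
        int question = rest.indexOf('?');
        String base = question < 0 ? rest : rest.substring(0, question);

        Map<String, String> parameters = new LinkedHashMap<String, String>();
        if (question >= 0) {
            for (String pair : rest.substring(question + 1).split("&")) {
                if (pair.isEmpty()) {
                    continue; // skip empty segments such as a trailing '&'
                }
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    parameters.put(pair, "");
                } else {
                    parameters.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
        return new ConnectionStringSketch(protocol, base, parameters);
    }

    public static void main(String[] args) {
        ConnectionStringSketch parsed = parse(
            "idmapper-biomart:http://www.biomart.org/biomart/martservice?mart=ensembl&dataset=hsapiens_gene_ensembl");
        System.out.println("protocol   = " + parsed.protocol);
        System.out.println("base       = " + parsed.base);
        System.out.println("parameters = " + parsed.parameters);
    }
}
```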

1.1 BridgeWebservice

Jar files
org.bridgedb.webservice.bridgerest.jar, org.bridgedb.jar
Driver
org.bridgedb.webservice.bridgerest.BridgeRest
Protocol
idmapper-bridgerest
Base
URL of the webservice. Note that in "http://webservice.bridgedb.org/Human", the "/Human" part is considered part of the webservice URL.
Parameters
None
Example
  Class.forName("org.bridgedb.webservice.bridgerest.BridgeRest");
  IDMapper mapper = BridgeDb.connect("idmapper-bridgerest:http://webservice.bridgedb.org/Human");

1.2 BioMart

Jar files
org.bridgedb.webservice.biomart.jar, org.bridgedb.jar
Driver
org.bridgedb.webservice.biomart.IDMapperBiomart
Protocol
idmapper-biomart
Base
URL of the webservice, for example http://www.biomart.org/biomart/martservice
Parameters
"mart", which indicates the mart to use, for example "ensembl". Required. "dataset", which indicates the dataset to use, for example "hsapiens_gene_ensembl". Required.
Example
  Class.forName("org.bridgedb.webservice.biomart.IDMapperBiomart");
  IDMapper mapper = BridgeDb.connect("idmapper-biomart:http://www.biomart.org/biomart/martservice?mart=ensembl&dataset=hsapiens_gene_ensembl");

1.3 PICR

Jar files
org.bridgedb.webservice.picr.jar, org.bridgedb.jar, activation, commons-collection, saaj, jaxb, jaxws, jsr, resolver, sjsxp, stax, streambuffer, xfire
Driver
org.bridgedb.webservice.picr.IDMapperPicr
Protocol
idmapper-picr
Base
None. The webservice URL is hard-coded.
Parameters
None
Example
  Class.forName("org.bridgedb.webservice.picr.IDMapperPicr");
  IDMapper mapper = BridgeDb.connect("idmapper-picr:");

1.4 Synergizer

Jar files
org.bridgedb.webservice.synergizer.jar, org.bridgedb.jar, synergizer-client.jar
Driver
org.bridgedb.webservice.synergizer.IDMapperSynergizer
Protocol
idmapper-synergizer
Base
None
Parameters
"authority", which can be either "ncbi" or "ensembl". "species", which takes the full scientific name, for example "Homo sapiens".
Examples
  Class.forName("org.bridgedb.webservice.synergizer.IDMapperSynergizer");
  // Using Ensembl as the authority
  IDMapper ensemblMapper = BridgeDb.connect("idmapper-synergizer:authority=ensembl&species=Homo sapiens");
  // Using NCBI as the authority
  IDMapper ncbiMapper = BridgeDb.connect("idmapper-synergizer:authority=ncbi&species=Homo sapiens");

1.5 CRONOS

Jar files
org.bridgedb.webservice.cronos.jar, org.bridgedb.jar, axis.jar, commons-logging, commons-discovery, jaxrpc, log4j, saaj, wsdl
Driver
org.bridgedb.webservice.cronos.IDMapperCronos
Protocol
idmapper-cronos
Base
3-letter code for species, such as "hsa" for Homo sapiens
Parameters
None
Example
  Class.forName("org.bridgedb.webservice.cronos.IDMapperCronos");
  IDMapper mapper = BridgeDb.connect("idmapper-cronos:hsa");

2.1 BridgeDerby

Jar files
org.bridgedb.rdb.jar, org.bridgedb.jar, derby.jar
Driver
org.bridgedb.rdb.IDMapperRdb
Protocol
idmapper-pgdb
Base
full path to *.bridge or *.pgdb file
Parameters
None
Example
  Class.forName("org.bridgedb.rdb.IDMapperRdb");
  IDMapper mapper = BridgeDb.connect("idmapper-pgdb:/home/martijn/PathVisio-Data/gene databases/Mm_Lite.pgdb");

2.2 Other relational databases

Jar files
org.bridgedb.rdb.jar, org.bridgedb.jar, JDBC driver of choice
Driver
org.bridgedb.rdb.IDMapperRdb
Protocol
idmapper-jdbc
Base
JDBC connection string, excluding the "jdbc" protocol part
Parameters
Passed on directly to the JDBC driver
Example
  Class.forName("org.bridgedb.rdb.IDMapperRdb");
  // Note: don't forget to load the regular JDBC driver class
  Class.forName("com.mysql.jdbc.Driver");
  IDMapper mapper = BridgeDb.connect("idmapper-jdbc:mysql://localhost/snp?user=bridgedb");
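
If you already have a regular JDBC URL, the rule for the base (the JDBC connection string without the "jdbc" protocol part) amounts to swapping the prefix. The following is a minimal sketch of that string manipulation only; the helper class and method are hypothetical and not part of BridgeDb.

```java
// Hypothetical helper, not part of BridgeDb.
final class JdbcConnectionStringSketch {
    private static final String JDBC_PREFIX = "jdbc:";
    private static final String BRIDGEDB_PROTOCOL = "idmapper-jdbc:";

    // Replaces the leading "jdbc:" of a JDBC URL with the "idmapper-jdbc:" protocol.
    static String fromJdbcUrl(String jdbcUrl) {
        if (!jdbcUrl.startsWith(JDBC_PREFIX)) {
            throw new IllegalArgumentException("Not a JDBC URL: " + jdbcUrl);
        }
        return BRIDGEDB_PROTOCOL + jdbcUrl.substring(JDBC_PREFIX.length());
    }

    public static void main(String[] args) {
        // Prints: idmapper-jdbc:mysql://localhost/snp?user=bridgedb
        System.out.println(fromJdbcUrl("jdbc:mysql://localhost/snp?user=bridgedb"));
    }
}
```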

3.1 Tab-delimited text file

Jar files
org.bridgedb.jar
Driver
org.bridgedb.file.IDMapperText
Protocol
idmapper-text
Base
Full URL of the file. In case of a local file, use "file://" plus the path of the file.
Parameters
None
Example
  Class.forName("org.bridgedb.file.IDMapperText");
  IDMapper mapper = BridgeDb.connect("idmapper-text:file:///home/martijn/array_annotation/agilent_to_ensembl57_blasted.txt");
  Xref src = new Xref("P07215", DataSource.getByFullName("UniProt/SwissProt Accession"));

For more information about formatting the text file, see: FlatFiles

Note that the DataSource is available by the column name (in this example "UniProt/SwissProt Accession") supplied by the local file.
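
For local files, the "file://" form described under Base can be derived from an ordinary path with the Java standard library. This is a sketch under assumptions: the helper class is hypothetical, and Path.toUri() percent-encodes characters such as spaces, which this guide does not confirm BridgeDb accepts, so test paths containing such characters before relying on it.

```java
import java.nio.file.Paths;

// Hypothetical helper, not part of BridgeDb.
final class TextMapperUrlSketch {

    // Builds an "idmapper-text:" connection string for a local file.
    // Path.toUri() resolves the path to an absolute one and returns a "file:" URI.
    static String forLocalFile(String localPath) {
        return "idmapper-text:" + Paths.get(localPath).toUri().toString();
    }

    public static void main(String[] args) {
        System.out.println(forLocalFile(
            "/home/martijn/array_annotation/agilent_to_ensembl57_blasted.txt"));
    }
}
```

The resulting string can then be passed to BridgeDb.connect as in the example above.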