Guide for Choosing and Connecting to a Mapping Service
Note: This is a guide for developers.
Which mapper to choose?
Mappers fall into three broad categories: web services, relational databases, and flat files. Within each category, you have more fine-grained options:
- web services
With web services, the mapping information is fetched from a server over the internet. They are easy to install and maintain, because the owner of the server takes care of that. On the other hand, you rely on an external organization, servers can go down, and you have no control over which version of the data is used.
- BridgeWebservice - provided by us, simple to use, and also does metabolite identifier mapping
- BioMart - most commonly used, provides mappings from Ensembl for a wide range of species
- PICR - particularly good at protein identifier mapping
- Synergizer - relatively fast web service that provides gene mappings from both NCBI and Ensembl
- CRONOS - an alternative webservice
- relational databases
Local databases give you maximum control and are also highly efficient, because of low network latency. On the other hand, they are hard to set up, and it is your own responsibility to keep them up to date.
- BridgeDerby To help you out a bit, we provide single-file databases based on the Derby relational database system. You can download and copy these files; just point BridgeDb to a valid Derby database file and it should work.
- Any other JDBC driver. If you've got time to spare, you can go for any relational database system such as MySQL, Oracle, PostgreSQL, mSQL, etc. We've tested extensively on MySQL.
- flat files
Flat files are easy to create, so these give you maximum control and sometimes are your only option when you're dealing with custom identifiers, e.g. from your own microarray design or database.
- Tab-delimited text files consist of two columns. One column should contain identifiers from one datasource, the other column identifiers from the other datasource. If one identifier maps to multiple other identifiers, you have to create a separate row for each pair. You can create the file by hand, with a script, or by exporting an Excel file to tab-delimited text.
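To make the one-row-per-pair layout concrete, here is a minimal sketch that parses such rows into an in-memory table. It is not part of BridgeDb: the class name `TabMappingSketch`, the method `parse`, and the identifiers are made up for illustration, and the sketch assumes there is no header row. The exact file format BridgeDb expects is documented on the FlatFiles page.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative helper, not part of BridgeDb: collects the rows of a
// two-column, tab-delimited mapping into a source -> targets table.
class TabMappingSketch {

    // Each line is "source<TAB>target". A source identifier that appears
    // on several rows accumulates all of its targets.
    static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> mappings = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) continue;   // skip blank lines
            String[] cols = line.split("\t");
            if (cols.length < 2) continue;         // skip rows without two columns
            mappings.computeIfAbsent(cols[0].trim(), k -> new LinkedHashSet<>())
                    .add(cols[1].trim());
        }
        return mappings;
    }

    public static void main(String[] args) {
        // Made-up identifiers: probe A_1 maps to two genes, so it gets two rows.
        List<String> rows = Arrays.asList(
                "A_1\tGENE0001",
                "A_1\tGENE0002",
                "A_2\tGENE0003");
        System.out.println(parse(rows));
    }
}
```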
Note that often you can combine two mappers in a stack, to give you the advantages of both.
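The idea of a stack can be sketched without any BridgeDb classes. The example below assumes one plausible behaviour, namely that a stack asks every mapper in turn and merges the answers. The `SimpleMapper` interface, the helper methods, and the identifiers are all hypothetical and only serve to illustrate the concept; they are not BridgeDb's API.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conceptual sketch only; none of these types belong to BridgeDb.
class MapperStackSketch {

    // Hypothetical minimal mapper: one identifier in, mapped identifiers out.
    interface SimpleMapper {
        Set<String> map(String id);
    }

    // Wraps an in-memory table (for example, loaded from a flat file) as a mapper.
    static SimpleMapper fromTable(Map<String, Set<String>> table) {
        return id -> table.getOrDefault(id, Collections.emptySet());
    }

    // Assumed stack behaviour: query each mapper in order and merge the results.
    static SimpleMapper stack(List<SimpleMapper> mappers) {
        return id -> {
            Set<String> merged = new LinkedHashSet<>();
            for (SimpleMapper mapper : mappers) {
                merged.addAll(mapper.map(id));
            }
            return merged;
        };
    }
}
```

In this picture, a flat file with custom identifiers could sit in the same stack as a larger database or web service, so that a lookup succeeds if any one of them knows the identifier.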
Details about each mapper
After you've decided which mapper(s) you want to use, you have to know how to connect to them. For each mapper, you need to know three things:
- the name of the driver class to load
- which Java libraries to include in the class path
- how to form a properly formatted connection string
To use a particular mapper, you need to have the right jar files in your project's classpath. Then you can connect to the mapper using the following pattern:
Class.forName("driver class");
IDMapper mapper = BridgeDb.connect("protocol:base?param1=value1&param2=value2");
A connection string consists of the following parts:
- a protocol, e.g. "idmapper-jdbc" or "idmapper-text"
- the base, this is usually a URL or path to a file.
- parameters, often optional.
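The three parts can be made visible with a small parser. This is only an illustration of the `protocol:base?param1=value1&param2=value2` pattern: BridgeDb does its own parsing inside `BridgeDb.connect`, and the class `ConnectionStringSketch` below is a hypothetical helper that assumes the protocol ends at the first colon and the parameters start at the first question mark.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: splits a connection string into protocol, base and
// parameters. This is not the parser that BridgeDb itself uses.
class ConnectionStringSketch {
    final String protocol;
    final String base;
    final Map<String, String> params = new LinkedHashMap<>();

    ConnectionStringSketch(String connectionString) {
        // Assumption: the protocol is everything before the first ':'.
        int colon = connectionString.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Missing protocol: " + connectionString);
        }
        protocol = connectionString.substring(0, colon);
        String rest = connectionString.substring(colon + 1);

        // Assumption: parameters, if any, follow the first '?'.
        int question = rest.indexOf('?');
        base = question < 0 ? rest : rest.substring(0, question);
        if (question >= 0) {
            for (String pair : rest.substring(question + 1).split("&")) {
                if (pair.isEmpty()) continue;
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    params.put(pair, "");
                } else {
                    params.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
    }
}
```

For example, `new ConnectionStringSketch("idmapper-text:file:///tmp/mapping.txt")` would yield the protocol `idmapper-text`, the base `file:///tmp/mapping.txt`, and no parameters (the file path here is made up).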
1.1 BridgeWebservice
- Jar files
- org.bridgedb.webservice.bridgerest.jar, org.bridgedb.jar
- Driver
- org.bridgedb.webservice.bridgerest.BridgeRest
- Protocol
- idmapper-bridgerest
- Base
- URL of the webservice. Note that in "http://webservice.bridgedb.org/Human", the "/Human" part is considered part of the webservice URL.
- Parameters
- None
- Example
-
Class.forName("org.bridgedb.webservice.bridgerest.BridgeRest");
IDMapper mapper = BridgeDb.connect("idmapper-bridgerest:http://webservice.bridgedb.org/Human");
1.2 BioMart
- Jar files
- org.bridgedb.webservice.biomart.jar, org.bridgedb.jar
- Driver
- org.bridgedb.webservice.biomart.IDMapperBiomart
- Protocol
- idmapper-biomart
- Base
- URL of the webservice, for example http://www.biomart.org/biomart/martservice
- Parameters
- "mart", which indicates the mart to use, for example "ensembl". Required.
- "dataset", which indicates the dataset to use, for example "hsapiens_gene_ensembl". Required.
- Example
-
Class.forName("org.bridgedb.webservice.biomart.IDMapperBiomart");
IDMapper mapper = BridgeDb.connect("idmapper-biomart:http://www.biomart.org/biomart/martservice?mart=ensembl&dataset=hsapiens_gene_ensembl");
1.3 PICR
- Jar files
- org.bridgedb.webservice.picr.jar, org.bridgedb.jar, activation, commons-collection, saaj, jaxb, jaxws, jsr, resolver, sjsxp, stax, streambuffer, xfire
- Driver
- org.bridgedb.webservice.picr.IDMapperPicr
- Protocol
- idmapper-picr
- Base
- None. The webservice URL is hard-coded.
- Parameters
- None
- Example
-
Class.forName("org.bridgedb.webservice.picr.IDMapperPicr");
IDMapper mapper = BridgeDb.connect("idmapper-picr:");
1.4 Synergizer
- Jar files
- org.bridgedb.webservice.synergizer.jar, org.bridgedb.jar, synergizer-client.jar
- Driver
- org.bridgedb.webservice.synergizer.IDMapperSynergizer
- Protocol
- idmapper-synergizer
- Base
- None
- Parameters
- "authority", which can be either "ncbi" or "ensembl".
- "species", the full scientific name, for example "Homo sapiens".
- Examples
-
Class.forName("org.bridgedb.webservice.synergizer.IDMapperSynergizer");
IDMapper ensemblMapper = BridgeDb.connect("idmapper-synergizer:authority=ensembl&species=Homo sapiens");
IDMapper ncbiMapper = BridgeDb.connect("idmapper-synergizer:authority=ncbi&species=Homo sapiens");
1.5 Cronos
- Jar files
- org.bridgedb.webservice.cronos.jar, org.bridgedb.jar, axis.jar, commons-logging, commons-discovery, jaxrpc, log4j, saaj, wsdl
- Driver
- org.bridgedb.webservice.cronos.IDMapperCronos
- Protocol
- idmapper-cronos
- Base
- 3-letter code for species, such as "hsa" for Homo sapiens
- Parameters
- None
- Example
-
Class.forName("org.bridgedb.webservice.cronos.IDMapperCronos");
IDMapper mapper = BridgeDb.connect("idmapper-cronos:hsa");
2.1 BridgeDerby
- Jar files
- org.bridgedb.rdb.jar, org.bridgedb.jar, derby.jar
- Driver
- org.bridgedb.rdb.IDMapperRdb
- Protocol
- idmapper-pgdb
- Base
- full path to *.bridge or *.pgdb file
- Parameters
- None
- Example
-
Class.forName("org.bridgedb.rdb.IDMapperRdb");
IDMapper mapper = BridgeDb.connect("idmapper-pgdb:/home/martijn/PathVisio-Data/gene databases/Mm_Lite.pgdb");
2.2 Other relational databases
- Jar files
- org.bridgedb.rdb.jar, org.bridgedb.jar, jdbc driver of choice
- Driver
- org.bridgedb.rdb.IDMapperRdb
- Protocol
- idmapper-jdbc
- Base
- JDBC connection string, excluding the "jdbc" protocol part
- Parameters
- passed on directly to JDBC driver
- Example
-
Class.forName("org.bridgedb.rdb.IDMapperRdb");
// Note: don't forget to load the regular JDBC driver class
Class.forName("com.mysql.jdbc.Driver");
IDMapper mapper = BridgeDb.connect("idmapper-jdbc:mysql://localhost/snp?user=bridgedb");
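If you already have a regular JDBC URL, the rule "JDBC connection string, excluding the jdbc protocol part" amounts to swapping the leading "jdbc:" for "idmapper-jdbc:". The helper below is a small sketch of that rule; the class and method names are hypothetical and not part of BridgeDb.

```java
// Illustrative helper, not part of BridgeDb: turns a regular JDBC URL into
// a connection string with the idmapper-jdbc protocol.
class JdbcConnectionStringSketch {

    static String fromJdbcUrl(String jdbcUrl) {
        String prefix = "jdbc:";
        if (!jdbcUrl.startsWith(prefix)) {
            throw new IllegalArgumentException("Not a JDBC URL: " + jdbcUrl);
        }
        // Drop "jdbc:" and keep the rest, including any driver parameters.
        return "idmapper-jdbc:" + jdbcUrl.substring(prefix.length());
    }

    public static void main(String[] args) {
        System.out.println(fromJdbcUrl("jdbc:mysql://localhost/snp?user=bridgedb"));
    }
}
```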
3.1 Tab-delimited text file
- Jar files
- org.bridgedb.jar
- Driver
- org.bridgedb.file.IDMapperText
- Protocol
- idmapper-text
- Base
- full URL of the file. In case of a local file, use "file://" plus the path of the file.
- Parameters
- None
- Example
-
Class.forName("org.bridgedb.file.IDMapperText");
IDMapper mapper = BridgeDb.connect("idmapper-text:file:///home/martijn/array_annotation/agilent_to_ensembl57_blasted.txt");
Xref src = new Xref("P07215", DataSource.getByFullName("UniProt/SwissProt Accession"));
For more information about formatting the text file, see: FlatFiles
Note that the DataSource is looked up by the column name supplied by the local file (in this example "UniProt/SwissProt Accession").
