Getting Started

The four most important classes of BridgeDb are:

Suppose that you have an identifier in Ensembl, for example ENSG00000171105.

The BioDataSource class enumerates a set of constant data sources. For Ensembl Human the constant value is BioDataSource.ENSEMBL_HUMAN. The Xref is the combination of BioDataSource.ENSEMBL_HUMAN plus the identifier itself as a String. Gdb is an interface for

In Java code:

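import org.bridgedb.Xref;
import org.bridgedb.bio.BioDataSource;

// combine the identifier with its data source
Xref ref = new Xref("ENSG00000171105", BioDataSource.ENSEMBL_HUMAN);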

System.out.println ("<a href=\"" + ref.getUrl() + "\">clicky</a>");

Mapping / Translating

To translate an identifier from one system to another, you need an IDMapper. An IDMapper is a connection to a database or webservice that knows how to translate identifiers. You can use one of the webservices (Ensembl BioMart, CRONOS, PICR, Synergizer), or local text files and local Derby databases for increased efficiency and control.

In Java code:

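import java.util.Set;
import org.bridgedb.BridgeDb;
import org.bridgedb.DataSource;
import org.bridgedb.IDMapper;
import org.bridgedb.Xref;
import org.bridgedb.bio.BioDataSource;

// first we have to load the driver
// and initialize information about DataSources
Class.forName("org.bridgedb.webservice.bridgerest.BridgeRest");
BioDataSource.init();

// now we connect to the driver and create an IDMapper instance.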
IDMapper mapper = BridgeDb.connect ("idmapper-bridgerest:http://webservice.bridgedb.org/Human");

// We create an Xref instance for the identifier that we want to look up.
// In this case we want to look up Entrez gene 3643.
Xref src = new Xref ("3643", BioDataSource.ENTREZ_GENE);

// let's see if there are cross-references to Ensembl Human
Set<Xref> dests = mapper.mapID(src, DataSource.getBySystemCode("EnHs"));

// and print the results.
// with getURN we obtain valid MIRIAM URNs if possible.
System.out.println (src.getURN() + " maps to:");
for (Xref dest : dests)
        System.out.println("  " + dest.getURN());

This produces the following output:

urn:miriam:entrez.gene:3643 maps to:
  urn:bridgedb:ensembl.human:ENSG00000171105

If you're not particular about the type of identifier you get back, you can simply leave off the second argument of mapper.mapID:

Set<Xref> dests = mapper.mapID(src);

If you use the above line in place of the original, you get dozens of different identifiers as a result.
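When a lookup returns identifiers from many systems at once, it can help to group them by data source before showing them. The following is a minimal sketch, not part of BridgeDb: it works on plain URN strings of the shape urn:<authority>:<namespace>:<id>, as seen in the outputs on this page, and uses only the JDK. The class name UrnGrouping and its methods are hypothetical, and the sample URNs are borrowed from outputs shown elsewhere on this page rather than from a real unrestricted mapID call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not a BridgeDb class.
class UrnGrouping {

    // Returns the namespace part of a URN shaped like urn:<authority>:<namespace>:<id>.
    static String namespaceOf(String urn) {
        // Limit the split to 4 parts so that any colons inside the id are kept.
        String[] parts = urn.split(":", 4);
        if (parts.length < 4 || !parts[0].equals("urn")) {
            throw new IllegalArgumentException("Unexpected URN shape: " + urn);
        }
        return parts[2];
    }

    // Groups URNs by namespace; a TreeMap keeps the namespaces sorted.
    static Map<String, List<String>> groupByNamespace(List<String> urns) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String urn : urns) {
            groups.computeIfAbsent(namespaceOf(urn), k -> new ArrayList<>()).add(urn);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Sample URNs copied from outputs on this page (an assumption for illustration).
        List<String> urns = Arrays.asList(
                "urn:bridgedb:ensembl.human:ENSG00000171105",
                "urn:miriam:entrez.gene:3643",
                "urn:bridgedb:affymetrix:3643427",
                "urn:bridgedb:affymetrix:3463643",
                "urn:miriam:refseq:NP_036430");

        for (Map.Entry<String, List<String>> entry : groupByNamespace(urns).entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue().size());
        }
    }
}
```

In real code you would probably group the Xref objects themselves by their data source instead of parsing URN strings; the string version is used here only so that the sketch runs without the library.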

Searching

In the example above, you had to specify the DataSource of the input to mapper.mapID. This way there is no ambiguity about the type of identifier that you want to map.

What if you're given an identifier as a string and you don't know the input type? In that case you can use free search. Note that not all mappers support free search, but idmapper-bridgerest does.

After the same setup as the previous example, we can use the freeSearch method to do a query for an identifier string without specifying the id type.

String query = "3643";
        
// let's do a free search without specifying the input type:
Set<Xref> hits = mapper.freeSearch(query, 100);

// Now print the results.
// with getURN we obtain valid MIRIAM URNs if possible.
System.out.println (query + " search results:");
for (Xref hit : hits)
        System.out.println("  " + hit.getURN());

Here is a sample of the results. Free search also returns identifiers that merely contain the query string, so if you want to narrow the results further you have to add your own code for that.

3643 search results:
  urn:bridgedb:affymetrix:3643427
  urn:bridgedb:affymetrix:3463643
  urn:miriam:entrez.gene:283643
  urn:bridgedb:affymetrix:3643367
  urn:bridgedb:illumina:GI_21536438-A
  urn:bridgedb:affymetrix:3364396
  urn:miriam:refseq:NP_036430

... more results omitted ...
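One simple way to narrow such a result list is to keep only the hits whose identifier equals the query exactly. The sketch below is an assumption about what such filtering code might look like, not a BridgeDb feature: it operates on URN strings with the JDK only, and the class ExactHitFilter and its methods are hypothetical names. The last sample URN is taken from the mapping example's output; whether it appears among the omitted search hits is assumed for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not a BridgeDb class.
class ExactHitFilter {

    // Returns the id part of a URN shaped like urn:<authority>:<namespace>:<id>.
    static String identifierOf(String urn) {
        String[] parts = urn.split(":", 4);
        if (parts.length < 4) {
            throw new IllegalArgumentException("Unexpected URN shape: " + urn);
        }
        return parts[3];
    }

    // Keeps only the hits whose identifier is exactly the query string.
    static List<String> exactMatches(List<String> hitUrns, String query) {
        List<String> kept = new ArrayList<>();
        for (String urn : hitUrns) {
            if (identifierOf(urn).equals(query)) {
                kept.add(urn);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList(
                "urn:bridgedb:affymetrix:3643427",
                "urn:miriam:entrez.gene:283643",
                "urn:miriam:refseq:NP_036430",
                "urn:miriam:entrez.gene:3643"); // assumed to be among the omitted hits

        for (String urn : exactMatches(hits, "3643")) {
            System.out.println("  " + urn);
        }
    }
}
```

An exact-match filter discards partial hits such as 3643427, which is usually what you want for identifiers but not for name or symbol searches.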

Guessing

It's possible to guess the identifier type based on a predefined set of regular expression patterns (these patterns are based on, among other sources, the MIRIAM registry). Of course this is not a foolproof way to determine the type of an identifier, but it can be helpful nonetheless. Identifiers like "ENSG00000171105" can be recognized without problem as coming from Ensembl. Identifiers that are just integers, like Entrez Gene IDs, are more ambiguous. Run through the example below, a numeric identifier would give a list of possible results.

In practice, if you need to know the type of identifier, we recommend that you let the end user select from a combo box, but use the guessing mechanism to set the default value of the combo box. That way the user has full control, but when the guessing mechanism gets it right you'll have saved them from scrolling through a long list of possibilities, which makes the application more user-friendly.

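import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.bridgedb.DataSource;
import org.bridgedb.DataSourcePatterns;
import org.bridgedb.bio.BioDataSource;

// We have to initialize DataSource information,
// but we don't need a driver
BioDataSource.init();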

String query = "NP_036430";
System.out.println ("Which patterns match " + query + "?");

// DataSourcePatterns holds a registry of patterns
Map<DataSource, Pattern> patterns = DataSourcePatterns.getPatterns();

// loop over all patterns
for (DataSource key : patterns.keySet())
{
        // create a matcher for this pattern
        Matcher matcher = patterns.get(key).matcher(query);
        
        // see if the input matches, and print a message
        if (matcher.matches())
        {
                System.out.println (key.getFullName() + " matches!");
        }
}               

This produces the following output:

Which patterns match NP_036430?
RefSeq matches!
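The combo-box recommendation above can be sketched as follows: collect every pattern that matches, preselect the first one, and fall back to a fixed default when nothing matches. This is an illustration under stated assumptions, using only the JDK. The class TypeGuesser is hypothetical, and its four patterns are simplified stand-ins invented for this sketch; they are not the patterns held by DataSourcePatterns.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not a BridgeDb class.
class TypeGuesser {

    // Simplified, made-up patterns; insertion order decides which match wins.
    static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("Ensembl", Pattern.compile("ENSG\\d{11}"));
        PATTERNS.put("RefSeq", Pattern.compile("[NX][MP]_\\d+"));
        PATTERNS.put("Entrez Gene", Pattern.compile("\\d+"));
        PATTERNS.put("Numeric probe ID (hypothetical)", Pattern.compile("\\d{4,8}"));
    }

    // Returns the names of all systems whose pattern matches the whole identifier.
    static List<String> candidates(String id) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Pattern> entry : PATTERNS.entrySet()) {
            if (entry.getValue().matcher(id).matches()) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    // Picks the value to preselect in the combo box; the user can still change it.
    static String defaultChoice(String id, String fallback) {
        List<String> names = candidates(id);
        return names.isEmpty() ? fallback : names.get(0);
    }

    public static void main(String[] args) {
        for (String id : new String[] {"ENSG00000171105", "NP_036430", "3643"}) {
            System.out.println(id + " -> candidates " + candidates(id)
                    + ", default " + defaultChoice(id, "Ensembl"));
        }
    }
}
```

With these stand-in patterns the purely numeric identifier matches more than one system, which is why the guess should only set the default and never replace the user's choice.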

More

Free search sample

This blog post contains an advanced example in Groovy that uses free attribute search to match gene names with identifiers.