Tips of the Day.

Integrating web PSI-BLAST with command line PSI-BLAST using the PssmWithParameters format

Thu, 06 Sep 2007 15:00:00 EST

This format of the PSSM can be used directly with other stand-alone BLAST software tools, in particular as an input checkpoint file for blastpgp. The actual matrix elements can be seen in the "scores" field of the PssmWithParameters structure, which is a one-dimensional representation of the matrix. To use the PssmWithParameters structure with blastpgp, save it as a plain-text ASCII file (e.g. PSSM.txt) and add the following command-line options to your blastpgp call: -R PSSM.txt -q 1
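As a minimal sketch of the step above, the following Python helper appends the two options to an existing blastpgp argument list. The helper name and the example base call (`blastpgp -d nr`) are illustrative assumptions; only `-R PSSM.txt -q 1` comes from the tip itself, and the rest of the call should be whatever your own search already uses.

```python
import shlex


def with_pssm_restart(base_args, pssm_file="PSSM.txt"):
    """Return a copy of base_args with the PSSM restart options appended.

    base_args is the blastpgp call you already use, given as a list of
    strings. -R names the saved PssmWithParameters file and -q 1 marks it
    as a plain-text ASCII file, as described in the tip.
    """
    return list(base_args) + ["-R", pssm_file, "-q", "1"]


# Hypothetical base call; replace it with your own blastpgp options.
args = with_pssm_restart(["blastpgp", "-d", "nr"])
print(shlex.join(args))
```

The resulting list can be passed to `subprocess.run(args)` on a machine where blastpgp and the chosen database are installed.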

Using Tree View to Examine Relationships Between Sequences.

Fri, 13 Apr 2007 15:00:00 EST

The new Tree View option on the NCBI Web BLAST service presents a dendrogram, or tree display, that clusters sequences according to their distances from the query sequence. This display is helpful for recognizing aberrant or unusual sequences, or potentially natural groupings of related sequences, such as members of gene families or homologs from other species, in the BLAST output.

Tree view

The figure above shows a radial tree display generated by searching against the 'refseq_genomic' database with the woolly mammoth complete mitochondrial genome (RefSeq accession NC_007596). The RefSeq genomic database was limited to the mammalian taxon 'afrotheria'. This tree reconstructs the accepted taxonomic groupings of these mammals and reinforces the proposition that the woolly mammoth is most closely related to the African and Asian elephants.

How to save custom search pages.

Fri, 13 Apr 2007 10:00:00 EST

So you have made a few BLAST searches and, after adjusting the database, organism limits and maybe a few Algorithm Parameters, you arrive at what you think is a good search strategy. Do you want to have to fiddle with pull-down menus or remember all the changes you made the next time you want to run a similar search? Now you can use "Saved Strategies" to always have a saved search template.

Using the MyNCBI service you can register for a private account and each time you log in you can have access to custom searches for BLAST as well as PubMed and other Entrez services.

My NCBI sign in box

Once you are signed into MyNCBI you will see a link at the top of the BLAST results page which says "Save Search Strategies".

Save Search Strategies link

When this link is clicked, the search page that generated the results you are viewing is saved with your custom values, including the sequences entered for the search, and can be accessed any time you return.

Each time you return to the BLAST pages you can select the Saved Strategies tab at the top of any BLAST page and see a list of your saved strategies.

List of your saved strategies

You can see the program, database and title information in the list, and these columns can be used to sort your list. When you wish to remove an entry, simply click the check box to the far right of the row.


Use Genomic BLAST to see the genomic context

Fri, 13 Apr 2007 07:00:00 EST

If you are interested in the evolution of a particular gene or gene family, it is often interesting to examine the intron-exon structure, even across species. Often, the only data available is the mRNA sequence from a cDNA or a curated database such as RefSeq. It is possible, however, to see how the mRNA aligns to genomic sequence using BLAST and thus arrive at an idea of its possible intron-exon structure.

BLAST Assembled Genomes list

Genomic BLAST pages are helpful because they allow the genomic context of a BLAST search to be displayed in the Map Viewer. For example, using discontiguous MegaBLAST (cross-species MegaBLAST), the human RefSeq transcript for albumin (NM_000477) can be used to identify the homolog in the rat genome.

BLAST RAT Sequences page

After formatting the results, the hits to the genomic contigs can be seen.

Hits to the genomic contigs

Select the "GenomeView" button to see the hits arranged on the rat genome map.

Rat genome map demonstration  

How to Search Custom Databases in Web-Blast Using Entrez Queries.

Fri, 13 Apr 2007 07:00:00 EST

A powerful feature of the BLAST Web interface is the ability to limit BLAST searches to a subset of any database using a standard Entrez query. Skillful use of Entrez queries allows the equivalent of on-the-fly construction of databases of exact composition. Entrez queries are entered into a box in the "Choose Search Set" section of the BLAST search pages.

Choose Search Set demonstration

Note that there is a separate and similar box for limiting searches to Organisms. This becomes available when searching databases that are not species-specific. A nice feature is that the Organism limits box is "auto-complete" and will attempt to create a list of matching species as you type.

Choose Search Set demonstration with Organism box

It may be helpful to first construct the query from within Entrez and verify that it returns the desired subset of sequences before attempting to use it with BLAST. A successful Entrez text query may be pasted into the Entrez Limitation box on the BLAST page. Be sure that the database chosen is compatible with the Entrez limitation used.

For example, an Entrez query that picks up ESTs will be of no use when searching the nr database since nr contains no EST sequences.

In the simplest case, one might desire to search the nr protein database for matches to a particular type of protein from a particular class of organisms. The following Entrez query defines a search of only viral helicase proteins:

viruses[orgn] AND helicase [protein name]

Note that the limitation above will pick up annotated helicase proteins, but not unannotated proteins. It will, however, ensure that your results contain nothing but viral proteins annotated as helicase or helicase-related proteins.
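The query above can also be assembled and checked programmatically before it is pasted into BLAST. The sketch below builds the same limitation string and a URL that asks the NCBI E-utilities `esearch` service for the number of matching protein records. Using E-utilities in place of the Entrez web page is an assumption of this example, and the function names are hypothetical; nothing is fetched, the URL is only constructed.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def entrez_limit(organism, protein_name):
    """Build an Entrez limitation in the same form as the example above."""
    return f"{organism}[orgn] AND {protein_name} [protein name]"


def count_url(term, db="protein"):
    """Build an esearch URL that returns only the number of matching records."""
    params = {"db": db, "term": term, "rettype": "count"}
    return ESEARCH_URL + "?" + urlencode(params)


term = entrez_limit("viruses", "helicase")
print(term)             # the string to paste into the BLAST Entrez box
print(count_url(term))  # open in a browser to see how many records match
```

A count of zero for the database you intend to search suggests the limitation is incompatible with that database, as in the EST example above.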

How to do Batch BLAST jobs.

Fri, 13 Apr 2007 05:00:00 EST

BLAST makes it easy to examine a large group of potential gene candidates. Most likely these are isolated as amplified products from a library of some sort. There is no need to manually cut and paste 100 sequences into the BLAST web pages. Using the BLAST web pages it is possible to input "batches" of sequences into one form and retrieve the results. There are two methods to do batch BLAST jobs. The first is through the web interface and the second is using the standalone BLAST binaries and downloaded NCBI databases. More information on the binaries is located here and help with the installation is available through


If you are going to submit a batch BLAST search on the web we recommend that you do not submit a file of more than 50 sequences.
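If your input file holds more than 50 sequences, one way to follow this recommendation is to split it into smaller batches before submitting. The Python sketch below is a minimal example under simple assumptions: the input is plain FASTA text in which every record starts with a `>` header line, and the function names are illustrative.

```python
def read_fasta_records(text):
    """Split FASTA text into records, each beginning with a '>' header line."""
    records, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if current:
                records.append("\n".join(current))
            current = [line]
        elif line and current:
            # Sequence line belonging to the record currently being read.
            current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def split_into_batches(records, size=50):
    """Group records into FASTA-formatted strings of at most `size` sequences."""
    return [
        "\n".join(records[i:i + size]) + "\n"
        for i in range(0, len(records), size)
    ]
```

Each returned string can be pasted into the input box, or written to its own file and uploaded, as a separate search.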

Select a BLAST search page from the main BLAST home page. Next, paste multiple FASTA sequences from a text file into the main input box.

Main input box on the BLAST search page

Alternatively, you can use the browse button to import a local file from your computer.

One additional tip: Since BLAST results can often be large in their default HTML format, you may want to use the more concise Hit Table output.

After you submit a BLAST search, the Formatting Results page shows a link, "Formatting options". [This link is also available from any BLAST results page as "Reformat these Results".]

On the formatting screen, use the Alignment View pull-down menu to select "Hit Table".

BLAST Format Request example
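Because the Hit Table is plain tab-separated text, it is easy to process with a short script. The sketch below assumes the standard 12-column tabular layout (query id, subject id, % identity, alignment length, mismatches, gap opens, query start and end, subject start and end, E-value, bit score) with comment lines starting with `#`. Check the field list in your own output before relying on this column order. The field names and the sample row are made up for illustration.

```python
HIT_TABLE_FIELDS = [
    "query_id", "subject_id", "percent_identity", "alignment_length",
    "mismatches", "gap_opens", "q_start", "q_end", "s_start", "s_end",
    "evalue", "bit_score",
]
FLOAT_FIELDS = ("percent_identity", "evalue", "bit_score")
INT_FIELDS = ("alignment_length", "mismatches", "gap_opens",
              "q_start", "q_end", "s_start", "s_end")


def parse_hit_table(text):
    """Parse Hit Table text into a list of dicts, one per hit."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        values = line.split("\t")
        if len(values) != len(HIT_TABLE_FIELDS):
            continue  # skip rows that do not match the assumed layout
        hit = dict(zip(HIT_TABLE_FIELDS, values))
        for name in FLOAT_FIELDS:
            hit[name] = float(hit[name])
        for name in INT_FIELDS:
            hit[name] = int(hit[name])
        hits.append(hit)
    return hits


# Made-up sample row, for illustration only.
SAMPLE = "# Fields: ...\nquery1\tsubject1\t98.50\t200\t3\t0\t1\t200\t101\t300\t1e-100\t365\n"
for hit in parse_hit_table(SAMPLE):
    print(hit["query_id"], hit["subject_id"], hit["evalue"])
```

For a batch search, grouping the parsed hits by `query_id` gives one list of hits per submitted sequence.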

Using Genomic BLAST.

Fri, 13 Apr 2007 04:00:00 EST

Genomic BLAST pages are helpful because they allow the genomic context of a BLAST search to be displayed in the Map Viewer. For example, a discontiguous (cross-species) MegaBLAST search with the human RefSeq transcript for albumin (NM_000477) as the query can be used to identify the homolog in the rat genome.