2019 BLAST NEWS

Mon, 23 Dec 2019

BLAST+ 2.10.0 is released - Improved Composition-based statistics.

We have updated the BLAST process to improve the stability of BLAST results against changes in the number of results requested. We have also added an experimental option that increases the likelihood of finding novel results. To enable it, set the environment variable ADAPTIVE_CBS to 1. Your feedback on this option is welcome.
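As a minimal sketch of enabling the experimental option from a script, the Python snippet below sets ADAPTIVE_CBS to 1 only for the child BLAST process. The choice of blastp and the file and database names in the usage note are illustrative assumptions, not part of the release announcement.

```python
import os
import subprocess


def adaptive_cbs_env(base_env=None):
    """Return a copy of the environment with ADAPTIVE_CBS set to "1"."""
    env = dict(os.environ if base_env is None else base_env)
    env["ADAPTIVE_CBS"] = "1"  # variable name and value come from the announcement
    return env


def blastp_command(query, db, out):
    """Build a plain blastp command line (blastp is an assumed example program)."""
    return ["blastp", "-query", query, "-db", db, "-out", out]


def run_blastp_adaptive(query, db, out):
    """Run blastp with the experimental option enabled for this process only."""
    return subprocess.run(blastp_command(query, db, out),
                          env=adaptive_cbs_env(), check=True)
```

For example, run_blastp_adaptive("query.fa", "nr", "results.txt") would run one search with the option enabled, leaving the calling shell's environment untouched.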

In addition, the new version fixes several bugs. See the release notes for more details at: https://www.ncbi.nlm.nih.gov/books/NBK131777/
New version 5 databases are at https://ftp.ncbi.nlm.nih.gov/blast/db/v5
Read more about the version 5 database at https://ftp.ncbi.nlm.nih.gov/blast/db/v5/blastdbv5.pdf
As a reminder, we will discontinue updates to the older BLAST database format (BLASTDBv4) in early 2020.

Tue, 10 Dec 2019

We have added 3 new fungal targeted loci databases to help you identify organisms.

For initial searches, the 16S and targeted loci databases contain the data that most people need to identify these organisms.

Using these databases will speed up your searches and provide you the results that you are most likely looking for. To search these databases, follow these steps:
  1. On the BLAST home page select the Nucleotide BLAST suite.

  2. Select rRNA/ITS as the Database set.

  3. Select the appropriate database for your query.

Fri, 27 Sep 2019

End of updates for BLAST+ version 4 databases (dbV4)

Start moving to the new version 5 databases!

We recently updated the version 5 BLAST protein and nucleotide databases (dbV5) on our FTP site to be accession-based. As we described in a previous post, this means they now contain the gi-less proteins from the NCBI Pathogen Project and other high-throughput projects. The v5 databases are also compatible with proteins from PDB structures with multi-character chain identifiers and will include these as they become available in our other protein systems. Only the latest version of BLAST+ (2.9.0) will work with the updated v5 databases and allow you to access all of the most recent protein and nucleotide data. In the winter of 2019, we will stop updating the version 4 BLAST databases and offer the v5 databases as the default for download.

Beginning with BLAST 2.10.0, due out in October 2019, the program makeblastdb will produce dbV5 databases by default.
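For readers scripting database builds, here is a hedged Python sketch that assembles a makeblastdb command line. It relies on the -blastdb_version option of makeblastdb to request a database version explicitly, which matters mainly for BLAST+ releases where version 5 is not yet the default. The function name and the file names in the usage note are hypothetical.

```python
def makeblastdb_command(fasta, dbtype, out, db_version=None):
    """Build a makeblastdb command line.

    db_version=None leaves the choice to the installed makeblastdb;
    pass 5 (or 4) to request a specific database version explicitly.
    """
    if dbtype not in ("nucl", "prot"):
        raise ValueError("dbtype must be 'nucl' or 'prot'")
    cmd = ["makeblastdb", "-in", fasta, "-dbtype", dbtype,
           "-parse_seqids", "-out", out]
    if db_version is not None:
        cmd += ["-blastdb_version", str(db_version)]
    return cmd
```

For example, makeblastdb_command("proteins.fa", "prot", "my_proteins", db_version=5) yields an argument list that can be passed to subprocess.run.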

For more information on the new database version and BLAST+ (2.9.0), see the previous NCBI Insights article and the recording of our recent webinar.

Wed, 28 Aug 2019

A new version of Magic-BLAST (1.5.0) is here.

The BLAST tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome.

Magic-BLAST, the BLAST tool that aligns next-generation sequencing reads, has just been released with new user-driven enhancements:

  • Aligns nanopore sequences

  • Improved multithreading performance

  • Supports the new BLAST database version (BLASTDBv5) that allows you to limit your search by taxonomy (more information about database version 5 here https://ftp.ncbi.nlm.nih.gov/blast/db/v5/blastdbv5.pdf )

  • More reliable placements of reads
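To illustrate how a multithreaded Magic-BLAST run might be launched from a script, the sketch below builds a magicblast command line in Python. The function name, the default thread count, and the file names in the usage note are illustrative assumptions; check the Magic-BLAST documentation for the options that suit your data.

```python
def magicblast_command(reads, db, out, threads=4, fastq=True):
    """Build a magicblast command line for mapping reads to a BLAST database.

    threads is passed to -num_threads; fastq=True adds -infmt fastq
    for FASTQ input (otherwise the default input format is used).
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    cmd = ["magicblast", "-query", reads, "-db", db,
           "-num_threads", str(threads), "-out", out]
    if fastq:
        cmd += ["-infmt", "fastq"]
    return cmd
```

For example, magicblast_command("reads.fastq", "my_genome", "aligned.sam", threads=8) produces an argument list suitable for subprocess.run.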

A new paper, published in BMC Bioinformatics (July, 2019), describes Magic-BLAST and compares it to other popular aligners. It can be viewed at https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2996-x

The release notes are available at https://ncbi.github.io/magicblast/release/release.html

The new executables are available on the NCBI FTP site at https://ftp.ncbi.nlm.nih.gov/blast/executables/magicblast/LATEST

Read more about Magic-BLAST at https://ncbi.github.io/magicblast

Thu, 15 Aug 2019

New nr database available with fewer redundant titles

We have made changes to the nr version 5 database (nr_v5) to facilitate better search results and improved performance.

We have reduced the number of redundant titles in the nr_v5 database used by webBLAST, which is also available for BLAST+ users.

  • The changes in nr preserve the taxonomic diversity of the entries in the database while reducing the number of titles for identical sequences. GenPept accessions are still accessible via www.ncbi.nlm.nih.gov/protein/$GENBANK_ACCESSION or the IPG website https://www.ncbi.nlm.nih.gov/ipg/. The “Identical Proteins” link in the alignments section of the webBLAST results takes you to a full list of all accessions associated with a sequence.

  • For BLAST+ users downloading nr_v5 the database is now approximately 50% smaller, resulting in faster downloads and BLAST searches, and smaller disk space requirements. The database is downloadable at https://ftp.ncbi.nlm.nih.gov/blast/db/v5/

  • For BLAST+ there is a cleanup script to help you manage the transition to this smaller database. The script, which removes unused database volumes, is at https://ftp.ncbi.nlm.nih.gov/blast/temp/cleanup-blastdb-volumes.py

Here are the new rules on how we keep titles in nr_v5:

  • We keep all refseq, swissprot, pir and PDB titles.

  • We keep any GenPept titles with a TAXID that has not already been seen in the record.

  • We keep at least five GenPept titles, regardless of whether their TAXIDs have already been seen in the record.
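The rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not NCBI's implementation: the Title type and source labels are hypothetical, every title is assumed to be either from one of the four always-kept sources or from GenPept, TAXIDs of always-kept titles are assumed to count as "seen", and ties are resolved by original order.

```python
from typing import NamedTuple

ALWAYS_KEPT = {"refseq", "swissprot", "pir", "pdb"}
MIN_GENPEPT = 5


class Title(NamedTuple):
    accession: str
    source: str  # hypothetical labels: "refseq", "swissprot", "pir", "pdb", "genpept"
    taxid: int


def filter_titles(titles):
    """Apply the three retention rules to the titles of one record."""
    keep = [False] * len(titles)
    seen_taxids = set()
    genpept_kept = 0

    # Rule 1: keep every title from the always-kept sources.
    for i, t in enumerate(titles):
        if t.source in ALWAYS_KEPT:
            keep[i] = True
            seen_taxids.add(t.taxid)  # assumption: these TAXIDs count as seen

    # Rule 2: keep GenPept titles that contribute a new TAXID.
    for i, t in enumerate(titles):
        if t.source not in ALWAYS_KEPT and t.taxid not in seen_taxids:
            keep[i] = True
            seen_taxids.add(t.taxid)
            genpept_kept += 1

    # Rule 3: top up to at least five GenPept titles, in original order.
    for i, t in enumerate(titles):
        if genpept_kept >= MIN_GENPEPT:
            break
        if t.source not in ALWAYS_KEPT and not keep[i]:
            keep[i] = True
            genpept_kept += 1

    return [t for t, k in zip(titles, keep) if k]
```

With one RefSeq title and seven GenPept titles, six of which share the RefSeq title's TAXID, this sketch keeps the RefSeq title, the one GenPept title with a new TAXID, and four more GenPept titles to reach the minimum of five.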

Wed, 31 Jul 2019

New BLAST Results is now the default view

Based on user feedback, the New Results page is an overwhelming success.

The new BLAST results page that has been available for testing since April is now the default results page. Thank you for your comments and feedback on this new output. We have made several changes to the page to address the issues you pointed out, and we are working on adding several features that you suggested in future releases. We will still provide access to the old results for some time to allow people who have workflows or teaching materials to adjust to the new display.

Thu, 27 Jun 2019

The BLAST programs and databases are now cloud ready

NCBI now provides a dockerized version of BLAST that you can use on the cloud.

BLAST workloads often come in bursts. You may want to search a large number of sequences all at once and need the results as soon as possible to enable further analysis. Often the number of sequences and the rapid turnaround needed preclude using a web service. In many situations like this you also don’t have a continuous need that would justify an investment in your own dedicated server. Using BLAST in a cloud environment is an ideal solution to this dilemma. The BLAST databases have also been moved to the cloud, allowing you to run computations close to where the data is and eliminating the time and resources needed to download large data files to your local network. In the cloud, you are not restricted by your local compute resources or limits on public web services, and you don’t have to buy compute power that sits idle most of the time.

This implementation of BLAST has been tested on the Google Cloud environment; however, because it uses open and de facto standards such as Docker and Linux commands, it should be easy to port to other cloud platforms and operating environments. Instructions are available for getting started with BLAST in the cloud and for database information.
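As a rough sketch of what a containerized search can look like, the Python function below assembles a docker run command. The image name ncbi/blast, the container paths under /blast, and all file and directory names are assumptions made for illustration; follow the official instructions for the supported setup.

```python
def docker_blast_command(program, query_file, db, out_file,
                         db_dir, query_dir, results_dir,
                         image="ncbi/blast"):
    """Build a `docker run` command that executes one BLAST search.

    Host directories are mounted into the container: databases and
    queries read-only, results read-write. The container-side paths
    are assumed conventions, not confirmed by the announcement.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{db_dir}:/blast/blastdb:ro",
        "-v", f"{query_dir}:/blast/queries:ro",
        "-v", f"{results_dir}:/blast/results:rw",
        image,
        program,
        "-query", f"/blast/queries/{query_file}",
        "-db", db,
        "-out", f"/blast/results/{out_file}",
    ]
```

For example, docker_blast_command("blastp", "query.fa", "my_db", "out.txt", "/data/blastdb", "/data/queries", "/data/results") returns an argument list that can be run with subprocess.run on a machine with Docker installed.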

Thu, 30 May 2019

New BLAST Results to become default

To help instructors integrate the new design into their lesson plans, we are making the change before the fall semester.

The new BLAST results page that has been available for testing since April will become the default results page for everyone on Aug 1, 2019. Thank you for your comments and feedback on this new output. We have made several changes to the page to address the issues you pointed out, and we are working on adding several features that you suggested in future releases. We will still provide access to the old results for some time to allow people who have workflows or teaching materials to adjust to the new display.

Wed, 15 May 2019

A new version of IgBLAST (1.14.0) is here.

We’ve released a new version of IgBLAST with three new improvements.

The new version of IgBLAST is now available with three new features:
  • Implementation of AIRR format is more consistent with AIRR specs including changing undefined type (NON, N/A) to empty string, not appending “reversed” to seqid when query is in reversed orientation, using standard locus names such as IGH, TRB instead of traditional VH, VB etc.

  • Improved logic for showing CDR3 end.

  • Restoring seqid for no result case.
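To show what the AIRR conventions mentioned above mean for downstream scripts, here is a small Python sketch that reads AIRR-format TSV output and tallies sequences by locus, treating the empty string as "undefined". The sample rows are hypothetical and exist only to make the example self-contained; real files contain many more columns.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in AIRR TSV layout (illustrative values only).
SAMPLE_AIRR_TSV = (
    "sequence_id\tlocus\tv_call\tproductive\n"
    "read1\tIGH\tIGHV1-2*02\tT\n"
    "read2\tTRB\tTRBV7-9*01\tF\n"
    "read3\t\t\t\n"  # no result: sequence_id is kept, other fields are empty
)


def read_airr(handle):
    """Read AIRR rearrangement rows from a tab-separated file handle."""
    return list(csv.DictReader(handle, delimiter="\t"))


def locus_counts(rows):
    """Count rows per locus; an empty locus field is reported as 'undefined'."""
    return Counter(row["locus"] or "undefined" for row in rows)


if __name__ == "__main__":
    rows = read_airr(io.StringIO(SAMPLE_AIRR_TSV))
    for locus, count in sorted(locus_counts(rows).items()):
        print(locus, count)
```

Because undefined values are empty strings rather than tokens such as N/A, a plain truthiness check is enough to detect them.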

The new release is available on https://ftp.ncbi.nlm.nih.gov/blast/executables/igblast/release/LATEST

The new manual is on GitHub at https://ncbi.github.io/igblast/

IgBLAST facilitates the analysis of immunoglobulin and T cell receptor variable domain sequences. Read more here: https://ncbi.github.io/igblast/ and here https://www.ncbi.nlm.nih.gov/pubmed/23671333

Tue, 23 Apr 2019

New BLAST Results Page in Beta

A user-driven experiment to improve the BLAST solution.

The design of this new Results page is based on feedback and interviews. The goal is better usability and better exposure of desired features that were mostly hidden from users.

This new Results page may eventually be incorporated in BLAST, in its current or a revised form, based on your input. Please try this experiment and let us know what you think via the feedback button.

To access the new Results page, select it from the BLAST Search page or the current Results page.

Tue, 02 Apr 2019

BLAST+ 2.9.0 is here.

BLAST+ 2.9.0 is released - enhanced support for the new database format.

This version enhances support for the new BLAST database version (BLASTDBv5). This includes:

  1. Support for the RCSB Protein Data Bank (PDB) changes. Specifically, the PDB now permits individual biopolymer chains to have identifiers up to four characters long. Formerly, only single-character chain identifiers could be assigned. Additional details about NCBI’s 3D structure resources and how they can be used are available at https://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml .

  2. Bug fixes. See the release notes at https://www.ncbi.nlm.nih.gov/books/NBK131777/
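Scripts that parse PDB-derived sequence identifiers may need updating for the longer chain identifiers. The sketch below assumes accessions of the form <four-character PDB ID>_<chain>, for example 1ABC_A, which is an assumption about the identifier layout rather than something stated in this announcement; the example identifiers are hypothetical.

```python
import re

# Assumed layout: 4-character PDB ID, underscore, chain ID of 1 to 4 characters.
PDB_ACCESSION = re.compile(r"^([0-9][A-Za-z0-9]{3})_([A-Za-z0-9]{1,4})$")


def split_pdb_accession(accession):
    """Split an accession such as '1ABC_A' into (pdb_id, chain_id).

    Chain identifiers are case-sensitive and are returned unchanged;
    the PDB ID is upper-cased for consistency.
    """
    match = PDB_ACCESSION.match(accession)
    if match is None:
        raise ValueError(f"not a PDB chain accession: {accession!r}")
    return match.group(1).upper(), match.group(2)
```

A parser that previously took only the final character as the chain would misread multi-character chains; splitting on the separator, as above, handles both the old and the new form.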

The new executables are at https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST

New version 5 databases are at https://ftp.ncbi.nlm.nih.gov/blast/db/v5

Read more about the version 5 database at https://ftp.ncbi.nlm.nih.gov/blast/db/v5/blastdbv5.pdf

We will continue to update the BLAST databases in their current version (BLASTDBv4) until September 2019.

Fri, 08 Mar 2019

A new version of IgBLAST (1.13) is here.

Your tool to facilitate the analysis of immunoglobulin and T cell receptor variable domain sequences.

The new version of IgBLAST is now available with three new features:

  • Determining the V gene reading frame from the end of FWR3 region instead of end of V gene. This is to allow proper determination of the frames for rearrangements that have insertions or deletions near the V gene end.

  • The packaging of the IgBlast standalone program and files has been modified to make it easier for users to install.

  • Increasing the allowed distance between the V gene end and the J gene start to 225 bp to allow detection of ultra-long D/N regions.

The release notes are available at https://ncbi.github.io/igblast/rel/Release-notes.html

The new executables are available on the NCBI FTP site at https://ftp.ncbi.nlm.nih.gov/blast/executables/igblast/release/LATEST

IgBLAST facilitates the analysis of immunoglobulin and T cell receptor variable domain sequences. Read more here: https://ncbi.github.io/igblast/ and here: https://www.ncbi.nlm.nih.gov/pubmed/23671333

Fri, 22 Feb 2019

Are you identifying organisms? The 16S database may be your best choice.

For initial searches, the 16S database contains the data that most people need to identify organisms.

Using the 16S database will speed up your searches and provide you the results that you are most likely looking for. Please go to our new “How To” video to get more information (external site).

Mon, 28 Jan 2019

Understanding BLAST+ parameters

Having a basic understanding of BLAST+ parameters is essential to getting the results that meet your needs.

A recent Bioinformatics letter clears up some confusion and misunderstanding about how BLAST+ works. More information on this topic can be found in the BLAST+ documentation.