BLAST programs

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J. (1990) “Basic local alignment search tool.” J. Mol. Biol. 215:403-410. PubMed

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J. (1997) “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.” Nucleic Acids Res. 25:3389-3402. PubMed

Zhang Z., Schwartz S., Wagner L., Miller W. (2000) “A greedy algorithm for aligning DNA sequences.” J. Comput. Biol. 7(1-2):203-214. PubMed

Morgulis A., Coulouris G., Raytselis Y., Madden T.L., Agarwala R., Schaffer A.A. (2008) “Database indexing for production MegaBLAST searches.” Bioinformatics 24(16):1757-1764. PubMed

Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T.L. (2009) “BLAST+: architecture and applications.” BMC Bioinformatics 10:421. PubMed

Boratyn GM, Schaffer AA, Agarwala R, Altschul SF, Lipman DJ, Madden T.L. (2012) “Domain enhanced lookup time accelerated BLAST.” Biol Direct. 2012 Apr 17;7:12. PubMed

Boratyn GM, Thierry-Mieg J, Thierry-Mieg D, Busby B, Madden T.L. (2019) “Magic-BLAST, an accurate RNA-seq aligner for long and short reads.” BMC Bioinformatics. 2019 Jul 25;20(1):405. PubMed

Camacho C, Boratyn GM, Joukov V, Vera Alvarez R, Madden TL. (2023) “ElasticBLAST: accelerating sequence search via cloud computing.” BMC Bioinformatics. 2023 Mar 26;24(1):117. doi: 10.1186/s12859-023-05245-9. PMID: 36967390

Reviews, improvements and useful introductions

Altschul, S.F., Boguski, M.S., Gish, W., Wootton, J.C. (1994) “Issues in searching molecular sequence databases.” Nature Genet. 6:119-129. PubMed

McGinnis S., Madden T.L. (2004) “BLAST: at the core of a powerful and diverse set of sequence analysis tools.” Nucleic Acids Res. 32:W20-W25. PubMed

Ye J., McGinnis S., Madden T.L. (2006) “BLAST: improvements for better sequence analysis.” Nucleic Acids Res. 34:W6-W9. PubMed

Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden T.L. (2008) “NCBI BLAST: a better web interface.” Nucleic Acids Res. 36:W5-W9. PubMed

Boratyn GM, Camacho C, Cooper PS, Coulouris G, Fong A, Ma N, Madden TL, Matten WT, McGinnis SD, Merezhuk Y, Raytselis Y, Sayers EW, Tao T, Ye J, Zaretskaya I. (2013) “BLAST: a more efficient report with usability improvements.” Nucleic Acids Res. 41:W29-W33. PubMed

Shiryev SA, Papadopoulos JS, Schaffer AA, Agarwala R. (2007) “Improved BLAST searches using longer words for protein seeding.” Bioinformatics 23(21):2949-51. PubMed

Madden, T.L., Busby B., Ye J. (2018) “Reply to the paper: Misunderstood parameters of NCBI BLAST impacts the correctness of bioinformatics workflows.” Bioinformatics. DOI: 10.1093/bioinformatics/bty1026. PubMed

Gish, W., States, D.J. (1993) “Identification of protein coding regions by database similarity search.” Nature Genet. 3:266-272. PubMed

Sequence filtering

Wootton, J.C., Federhen, S. (1996) “Analysis of compositionally biased regions in sequence databases.” Meth. Enzymol. 266:554-571. PubMed

Wootton, J.C., Federhen, S. (1993) “Statistics of local complexity in amino acid sequences and sequence databases.” Comput. Chem. 17:149-163.

Alignment scoring systems

Dayhoff, M.O., Schwartz, R.M., Orcutt, B.C. (1978) “A model of evolutionary change in proteins.” In “Atlas of Protein Sequence and Structure, vol. 5, suppl. 3.” M.O. Dayhoff (ed.), pp. 345-352, Natl. Biomed. Res. Found., Washington, DC.

Schwartz, R.M., Dayhoff, M.O. (1978) “Matrices for detecting distant relationships.” In “Atlas of Protein Sequence and Structure, vol. 5, suppl. 3.” M.O. Dayhoff (ed.), pp. 353-358, Natl. Biomed. Res. Found., Washington, DC.

Altschul, S.F. (1991) “Amino acid substitution matrices from an information theoretic perspective.” J. Mol. Biol. 219:555-565. PubMed

States, D.J., Gish, W., Altschul, S.F. (1991) “Improved sensitivity of nucleic acid database searches using application-specific scoring matrices.” Methods 3:66-70.

Henikoff, S., Henikoff, J.G. (1992) “Amino acid substitution matrices from protein blocks.” Proc. Natl. Acad. Sci. USA 89:10915-10919. PubMed

Altschul, S.F. (1993) “A protein alignment scoring system sensitive at all evolutionary distances.” J. Mol. Evol. 36:290-300. PubMed

Alignment statistics

Altschul, S.F., Gish, W. (1996) “Local alignment statistics.” Meth. Enzymol. 266:460-480. PubMed

Karlin, S., Altschul, S.F. (1990) “Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes.” Proc. Natl. Acad. Sci. USA 87:2264-2268. PubMed

Karlin, S., Altschul, S.F. (1993) “Applications and statistics for multiple high-scoring segments in molecular sequences.” Proc. Natl. Acad. Sci. USA 90:5873-5877. PubMed

Dembo, A., Karlin, S., Zeitouni, O. (1994) “Limit distribution of maximal non-aligned two-sequence segmental score.” Ann. Prob. 22:2022-2039.

Altschul, S.F. (1997) “Evaluating the statistical significance of multiple distinct local alignments.” In “Theoretical and Computational Methods in Genome Research.” (S. Suhai, ed.), pp. 1-14, Plenum, New York.

Schaffer AA, Aravind L, Madden TL, Shavirin S, Spouge JL, Wolf YI, Koonin EV, Altschul SF. (2001) “Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements.” Nucleic Acids Res. 2001 Jul 15;29(14):2994-3005. PubMed

Park Y, Sheetlin S, Ma N, Madden TL, Spouge JL. (2012) “New finite-size correction for local alignment score distributions.” BMC Res Notes. 2012 Jun 12;5:286. PubMed

Programs that use BLAST

Ye J, Coulouris G, Zaretskaya I, Cutcutache I, Rozen S, Madden TL. (2012) “Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction.” BMC Bioinformatics 13:134. PubMed

Ye J, Ma N, Madden TL, Ostell JM. (2013) “IgBLAST: an immunoglobulin variable domain sequence analysis tool.” Nucleic Acids Res. 2013 Jul;41:W34-W40. PubMed

Papadopoulos JS, Agarwala R. (2007) “COBALT: constraint-based alignment tool for multiple protein sequences.” Bioinformatics 23(9):1073-9. PubMed