BLAST programs
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J. (1990) “Basic local alignment search tool.” J. Mol. Biol. 215:403-410. PubMed
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J. (1997) “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.” Nucleic Acids Res. 25:3389-3402. PubMed
Zhang Z., Schwartz S., Wagner L., Miller W. (2000) “A greedy algorithm for aligning DNA sequences.” J. Comput. Biol. 7(1-2):203-214. PubMed
Morgulis A., Coulouris G., Raytselis Y., Madden T.L., Agarwala R., Schaffer A.A. (2008) “Database indexing for production MegaBLAST searches.” Bioinformatics 24(16):1757-1764. PubMed
Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T.L. (2009) “BLAST+: architecture and applications.” BMC Bioinformatics 10:421. PubMed
Boratyn GM, Schaffer AA, Agarwala R, Altschul SF, Lipman DJ, Madden T.L. (2012) “Domain enhanced lookup time accelerated BLAST.” Biol Direct. 2012 Apr 17;7:12. PubMed
Boratyn GM, Thierry-Mieg J, Thierry-Mieg D, Busby B, Madden T.L. (2019) “Magic-BLAST, an accurate RNA-seq aligner for long and short reads.” BMC Bioinformatics. 2019 Jul 25;20(1):405. PubMed
Camacho C, Boratyn GM, Joukov V, Vera Alvarez R, Madden TL. (2023) “ElasticBLAST: accelerating sequence search via cloud computing.” BMC Bioinformatics. 2023 Mar 26;24(1):117. doi: 10.1186/s12859-023-05245-9. PMID: 36967390
Reviews, improvements and useful introductions
Altschul, S.F., Boguski, M.S., Gish, W., Wootton, J.C. (1994) “Issues in searching molecular sequence databases.” Nature Genet. 6:119-129. PubMed
McGinnis S., Madden T.L. (2004) “BLAST: at the core of a powerful and diverse set of sequence analysis tools.” Nucleic Acids Res. 32:W20-W25. PubMed
Ye J., McGinnis S, Madden T.L. (2006) “BLAST: improvements for better sequence analysis.” Nucleic Acids Res. 34:W6-W9. PubMed
Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden T.L. (2008) “NCBI BLAST: a better web interface.” Nucleic Acids Res. 36:W5-W9. PubMed
Boratyn GM, Camacho C, Cooper PS, Coulouris G, Fong A, Ma N, Madden TL, Matten WT, McGinnis SD, Merezhuk Y, Raytselis Y, Sayers EW, Tao T, Ye J, Zaretskaya I. (2013) “BLAST: a more efficient report with usability improvements.” Nucleic Acids Res. 41:W29-W33. PubMed
Shiryev SA, Papadopoulos JS, Schaffer AA, Agarwala R. (2007) “Improved BLAST searches using longer words for protein seeding.” Bioinformatics 23(21):2949-51. PubMed
Madden, T.L., Busby B., Ye J. (2018) “Reply to the paper: Misunderstood parameters of NCBI BLAST impacts the correctness of bioinformatics workflows.” Bioinformatics. DOI: 10.1093/bioinformatics/bty1026. PubMed
Gish, W., States, D.J. (1993) “Identification of protein coding regions by database similarity search.” Nature Genet. 3:266-272. PubMed
Sequence filtering
Wootton, J.C., Federhen, S. (1996) “Analysis of compositionally biased regions in sequence databases.” Meth. Enzymol. 266:554-571. PubMed
Wootton, J.C., Federhen, S. (1993) “Statistics of local complexity in amino acid sequences and sequence databases.” Comput. Chem. 17:149-163.
Alignment scoring systems
Dayhoff, M.O., Schwartz, R.M., Orcutt, B.C. (1978) “A model of evolutionary change in proteins.” In “Atlas of Protein Sequence and Structure, vol. 5, suppl. 3.” M.O. Dayhoff (ed.), pp. 345-352, Natl. Biomed. Res. Found., Washington, DC.
Schwartz, R.M., Dayhoff, M.O. (1978) “Matrices for detecting distant relationships.” In “Atlas of Protein Sequence and Structure, vol. 5, suppl. 3.” M.O. Dayhoff (ed.), pp. 353-358, Natl. Biomed. Res. Found., Washington, DC.
Altschul, S.F. (1991) “Amino acid substitution matrices from an information theoretic perspective.” J. Mol. Biol. 219:555-565. PubMed
States, D.J., Gish, W., Altschul, S.F. (1991) “Improved sensitivity of nucleic acid database searches using application-specific scoring matrices.” Methods 3:66-70.
Henikoff, S., Henikoff, J.G. (1992) “Amino acid substitution matrices from protein blocks.” Proc. Natl. Acad. Sci. USA 89:10915-10919. PubMed
Altschul, S.F. (1993) “A protein alignment scoring system sensitive at all evolutionary distances.” J. Mol. Evol. 36:290-300. PubMed
Alignment statistics
Altschul, S.F., Gish, W. (1996) “Local alignment statistics.” Meth. Enzymol. 266:460-480. PubMed
Karlin, S., Altschul, S.F. (1990) “Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes.” Proc. Natl. Acad. Sci. USA 87:2264-2268. PubMed
Karlin, S., Altschul, S.F. (1993) “Applications and statistics for multiple high-scoring segments in molecular sequences.” Proc. Natl. Acad. Sci. USA 90:5873-5877. PubMed
Dembo, A., Karlin, S., Zeitouni, O. (1994) “Limit distribution of maximal non-aligned two-sequence segmental score.” Ann. Prob. 22:2022-2039.
Altschul, S.F. (1997) “Evaluating the statistical significance of multiple distinct local alignments.” In “Theoretical and Computational Methods in Genome Research.” (S. Suhai, ed.), pp. 1-14, Plenum, New York.
Schaffer AA, Aravind L, Madden TL, Shavirin S, Spouge JL, Wolf YI, Koonin EV, Altschul SF. (2001) “Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements.” Nucleic Acids Res. 2001 Jul 15;29(14):2994-3005. PubMed
Park Y, Sheetlin S, Ma N, Madden TL, Spouge JL. (2012) “New finite-size correction for local alignment score distributions.” BMC Res Notes. 2012 Jun 12;5:286. PubMed
Programs that use BLAST
Ye J, Coulouris G, Zaretskaya I, Cutcutache I, Rozen S, Madden TL. (2012) “Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction.” BMC Bioinformatics 13:134. PubMed
Ye J, Ma N, Madden TL, Ostell JM. (2013) “IgBLAST: an immunoglobulin variable domain sequence analysis tool.” Nucleic Acids Res. 2013 Jul;41:W34-W40. PubMed
Papadopoulos JS, Agarwala R. (2007) “COBALT: constraint-based alignment tool for multiple protein sequences.” Bioinformatics 23(9):1073-9. PubMed