Tutorials

This section presents some ElasticBLAST searches that you can try out. It assumes that you have read the Overview and completed either the Quickstart for GCP or the Quickstart for AWS.

You can run these examples on either AWS or GCP. Most of them require a configuration file. The sections below describe how to write one and how the files differ between GCP and AWS.

View these examples as suggestions. Once you are confident you understand how ElasticBLAST works, you can start modifying the examples. You can use a local FASTA file, change the database or change the formatting options.

In the first part of this section, we provide information that will help you to complete the examples listed here.

Environment

You can run these examples in the Cloud Shell, as in the GCP and AWS quickstarts. Alternatively, you can use your own hardware or a cloud instance that you have started. Either option offers more disk space and processing power, which makes it easier to use ElasticBLAST as part of a pipeline. If you take this route, review the Requirements and the section below on Providing Credentials.

Configuration Files

Below is the configuration file used in the AWS quickstart (its GCP equivalent is similar). This file contains three sections: cloud-provider, cluster, and blast. Each section contains a number of configuration variables (key/value pairs). These are defined in Configuration variables. Here are some changes you may need or want to make:

cloud-provider

You can reuse the configuration variables from the quickstart, provided you are using the same account. If you change any part of this section, refer to GCP Configuration or AWS Configuration.

cluster

You do not need to change the configuration variables in this section, though you may want to change the number of machines (num-nodes). You can also add a use-preemptible = yes key/value pair to request less expensive preemptible (GCP) or spot (AWS) instances. See Use preemptible nodes for details. ElasticBLAST selects a machine type with sufficient memory for your database. You may override this behavior by specifying a machine-type in this section, but that is not recommended. See Machine type for information on the default machine types and how to select a different one.
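As a minimal sketch of the cluster changes described above, the following Python snippet uses the standard-library configparser module to adjust a [cluster] section. The helper name with_cluster_overrides is hypothetical and not part of ElasticBLAST, and the snippet assumes the configuration file can be read as ordinary INI text.

```python
import configparser
import io

# The [cluster] section as it appears in the quickstart configuration.
BASE = """\
[cluster]
num-nodes = 2
"""


def with_cluster_overrides(cfg_text, num_nodes=None, use_preemptible=False):
    """Return configuration text with the [cluster] section adjusted.

    Hypothetical helper for illustration; not part of ElasticBLAST.
    """
    cfg = configparser.ConfigParser()
    cfg.read_string(cfg_text)
    if num_nodes is not None:
        cfg["cluster"]["num-nodes"] = str(num_nodes)
    if use_preemptible:
        # Requests preemptible (GCP) or spot (AWS) instances.
        cfg["cluster"]["use-preemptible"] = "yes"
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


updated = with_cluster_overrides(BASE, num_nodes=4, use_preemptible=True)
print(updated)
```

Editing the file by hand works just as well; a script like this is mainly useful when ElasticBLAST runs as one step in a pipeline.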

blast

Edit the configuration variables in this section to define your search. Here you provide BLAST+ parameters such as the program, the database, and other BLAST+ command-line options. See BLAST Configuration Options for details.

[cloud-provider]
aws-region = us-east-1

[cluster]
num-nodes = 2

[blast]
program = blastp
db = refseq_protein
queries = s3://elasticblast-test/queries/BDQA01.1.fsa_aa
results = s3://elasticblast-YOURNAME/results/BDQA
options = -task blastp-fast -evalue 0.01 -outfmt "7 std sskingdoms ssciname"
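Before submitting a search, it can help to sanity-check an edited file. The Python sketch below parses the configuration shown above with the standard-library configparser module, confirms that the three sections are present, and flags a results bucket that still contains the YOURNAME placeholder. The function check_config and the list of keys it treats as required are assumptions made for illustration. They are not ElasticBLAST's own validation rules.

```python
import configparser
import shlex

# The AWS quickstart configuration from this section, unchanged.
CONFIG = """\
[cloud-provider]
aws-region = us-east-1

[cluster]
num-nodes = 2

[blast]
program = blastp
db = refseq_protein
queries = s3://elasticblast-test/queries/BDQA01.1.fsa_aa
results = s3://elasticblast-YOURNAME/results/BDQA
options = -task blastp-fast -evalue 0.01 -outfmt "7 std sskingdoms ssciname"
"""

REQUIRED_SECTIONS = ("cloud-provider", "cluster", "blast")
# Assumption: keys this sketch expects in [blast], based on the example above.
EXPECTED_BLAST_KEYS = ("program", "db", "queries", "results")


def check_config(text):
    """Parse INI text and return (parser, list of problems found).

    Hypothetical helper for illustration; not part of ElasticBLAST.
    """
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    problems = []
    for section in REQUIRED_SECTIONS:
        if not cfg.has_section(section):
            problems.append(f"missing section [{section}]")
    if cfg.has_section("blast"):
        for key in EXPECTED_BLAST_KEYS:
            if key not in cfg["blast"]:
                problems.append(f"missing key '{key}' in [blast]")
        if "YOURNAME" in cfg["blast"].get("results", ""):
            problems.append("results still contains the YOURNAME placeholder")
    return cfg, problems


cfg, problems = check_config(CONFIG)
for problem in problems:
    print("problem:", problem)

# Split the BLAST+ options the way a shell would, keeping the quoted
# -outfmt value together as a single argument.
blast_args = shlex.split(cfg["blast"]["options"])
print(blast_args)
```

Run against the unmodified example, the check reports only the placeholder bucket name, a reminder to replace YOURNAME with a bucket you own before submitting.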

Providing Credentials

If you are not using the Cloud Shell and have not already provided credentials, you will need to do so.

For GCP, see the documentation on providing credentials.

For AWS, you can configure access via any of the ways listed in the AWS CLI configuration documentation. If working with an AWS EC2 instance, you can also use AWS IAM roles to grant permissions (see also Required IAM Permissions).

Tutorials