Tips for GCP
How to easily try elastic-blast on GCP?
To try elastic-blast with a relatively small input (fewer than 10k
residues or fewer than 100k bases), run
elastic-blast from the GCP Cloud Shell.
You can access it in your web browser at https://console.cloud.google.com/?cloudshell=true
or from a terminal with the following command:
gcloud alpha cloud-shell ssh --ssh-flag=-A
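To decide whether an input falls under the small-input thresholds mentioned above, you can count the sequence letters in the query file. The following is a minimal sketch, not part of elastic-blast: the function names `count_sequence_letters` and `is_small_input` are hypothetical, and it assumes an uncompressed FASTA input in which header lines start with `>`.

```python
import io

# Thresholds taken from the text above: fewer than 10k residues (protein)
# or fewer than 100k bases (nucleotide).
MAX_RESIDUES = 10_000
MAX_BASES = 100_000


def count_sequence_letters(handle):
    """Count residues or bases in a FASTA stream, skipping header and blank lines."""
    total = 0
    for line in handle:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        total += len(line)
    return total


def is_small_input(handle, is_protein):
    """Return True if the input is below the small-input threshold for its type."""
    limit = MAX_RESIDUES if is_protein else MAX_BASES
    return count_sequence_letters(handle) < limit


# Tiny in-memory example; for a real file, pass open("queries.fa") instead
# (the file name is a placeholder).
example = io.StringIO(">seq1\nMKTAYIAK\nQRQISF\n>seq2\nMSDNE\n")
print(is_small_input(example, is_protein=True))
```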
How to install dependencies on Debian/Ubuntu machines?
If you are working on a Debian or Ubuntu Linux distribution and have
root permissions, you can install kubectl and python3-distutils as follows:
sudo apt-get -yqm update
sudo apt-get install -yq kubectl python3-distutils
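After installing, you may want to confirm that the dependencies are visible from your environment. The sketch below is one way to do that with the Python standard library; the helper names `missing_tools` and `has_distutils` are hypothetical and not part of elastic-blast. Note that on recent Python versions `distutils` may be absent from the standard library regardless of which system packages are installed.

```python
import importlib.util
import shutil


def missing_tools(tools=("kubectl",)):
    """Return the names of the given executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def has_distutils():
    """Return True if the distutils module can be located by this interpreter."""
    return importlib.util.find_spec("distutils") is not None


print("Missing executables:", missing_tools())
print("distutils available:", has_distutils())
```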
Using the Free Trial at GCP
GCP has a Free Trial for new users (https://cloud.google.com/free). The Free Trial comes with some restrictions that are important for ElasticBLAST users: you can run at most eight cores concurrently, and the persistent disk size is limited to 250G (https://cloud.google.com/terms/free-trial). By default, ElasticBLAST uses more than eight cores at a time and a persistent disk size of 3000G.
You should be able to run ElasticBLAST under the Free Trial following the instructions at Quickstart for GCP, but you will need to modify the configuration file to use fewer resources. You may not be able to use the cloud shell and the instance suggested below as that may exceed the quota on cores allowed at one time. In that case, you will need to submit your ElasticBLAST search from your own computer.
For additional details about GCP’s free tier (duration, products included, etc.), please visit https://cloud.google.com/free/docs/gcp-free-tier .
Below is a configuration file that should work under the Free Trial as of January 2022. This file has been modified from the one in Quickstart for GCP in the following ways:
- The machine type n1-highmem-8, with 8 CPUs, has been specified. Normally, ElasticBLAST automatically sets the machine type based on the size of the database and the program.
- A persistent disk size (pd-size) of 200G has been specified.
[cloud-provider]
gcp-region = us-east4
gcp-zone = us-east4-b

[cluster]
num-nodes = 1
labels = owner=USER
machine-type = n1-highmem-8
pd-size = 200G

[blast]
program = blastp
db = swissprot
queries = gs://elastic-blast-samples/queries/protein/BDQA01.1.fsa_aa
results = gs://elasticblast-USER/results/BDQA
options = -task blastp-fast -evalue 0.01 -outfmt "7 std sskingdoms ssciname"
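Before submitting a search, it can help to check that the [cluster] settings stay within the Free Trial limits described above (eight concurrent cores, 250G of persistent disk). The following is a minimal sketch using Python's standard configparser; it is not part of elastic-blast. The helper names are hypothetical, and it assumes that the trailing number in a predefined machine type such as n1-highmem-8 is its vCPU count and that pd-size is given in gigabytes with a "G" suffix.

```python
import configparser
import re

# Limits taken from the Free Trial restrictions described in the text.
FREE_TRIAL_MAX_CORES = 8
FREE_TRIAL_MAX_DISK_GB = 250


def vcpus_from_machine_type(machine_type):
    """Assumption: the trailing number (e.g. the 8 in n1-highmem-8) is the vCPU count."""
    match = re.search(r"-(\d+)$", machine_type)
    if match is None:
        raise ValueError(f"cannot infer vCPU count from {machine_type!r}")
    return int(match.group(1))


def free_trial_problems(cfg_text):
    """Return a list of messages for settings that exceed the Free Trial limits."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    cluster = parser["cluster"]
    cores = cluster.getint("num-nodes") * vcpus_from_machine_type(cluster["machine-type"])
    disk_gb = int(cluster["pd-size"].rstrip("G"))  # assumes a value such as "200G"
    problems = []
    if cores > FREE_TRIAL_MAX_CORES:
        problems.append(f"{cores} cores requested, limit is {FREE_TRIAL_MAX_CORES}")
    if disk_gb > FREE_TRIAL_MAX_DISK_GB:
        problems.append(f"{disk_gb}G disk requested, limit is {FREE_TRIAL_MAX_DISK_GB}G")
    return problems


# Only the [cluster] section is needed for this check.
SAMPLE = """
[cluster]
num-nodes = 1
machine-type = n1-highmem-8
pd-size = 200G
"""

print(free_trial_problems(SAMPLE))
```

An empty list means both settings are within the limits; raising num-nodes above 1 with this machine type, or pd-size above 250G, would produce a message for the corresponding setting.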