BLAST is a tool for searching nucleotide or amino acid databases using nucleotide or amino acid sequences as a query. BLAST comes in different flavors (e.g. BLASTP, TBLASTX), allowing us to search different databases (nucleotide or amino acid) using different query types (nucleotide or amino acid). BLAST is available as a web service at most sequence repositories, including NCBI and EBI.
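As a quick reference, the pairing of query type and database type for the five standard BLAST programs can be written as a small lookup table. This is a minimal sketch in Python; the names `BLAST_PROGRAMS` and `programs_for` are illustrative only and are not part of any BLAST distribution.

```python
# Standard BLAST programs mapped to (query type, database type, comparison).
# "translated" means the nucleotide sequence is translated in all six
# reading frames before the comparison is made at the protein level.
BLAST_PROGRAMS = {
    "blastn":  ("nucleotide", "nucleotide", "nucleotide vs nucleotide"),
    "blastp":  ("protein",    "protein",    "protein vs protein"),
    "blastx":  ("nucleotide", "protein",    "translated query vs protein"),
    "tblastn": ("protein",    "nucleotide", "protein vs translated database"),
    "tblastx": ("nucleotide", "nucleotide", "translated query vs translated database"),
}


def programs_for(query_type, db_type):
    """Return the BLAST programs that accept this query/database pairing."""
    return sorted(
        name
        for name, (query, database, _comparison) in BLAST_PROGRAMS.items()
        if query == query_type and database == db_type
    )


if __name__ == "__main__":
    # A protein query against a nucleotide database calls for tblastn.
    print(programs_for("protein", "nucleotide"))
```

Note that two programs accept a nucleotide query against a nucleotide database: BLASTN compares the nucleotides directly, while TBLASTX compares their translations.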
In the context of Gene Ontology annotation, BLAST is used primarily to identify sequences similar to the sequence we wish to annotate (the query). Once we have validated that a database hit meets the provided criteria for establishing homology with the query, annotations on the database sequence that are supported by experimental evidence codes can be transferred to the query, citing the CACAO GO_REF as the source and using evidence code ISS, ISA or ISO.
Several criteria are used to validate that the alignments returned by BLAST indicate sequence homology and can therefore be used to transfer existing GO annotations from a database hit to our sequence of interest. These criteria cover the overall BLAST result statistics (e.g. E-value), the quality and extent of the generated alignments, and whether an existing annotation is transferable to the biological context of the sequence of interest. The database sequence that carries the preexisting GO annotation must be referenced in the WITH field using an appropriate identifier.
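The statistical part of this screening can be sketched in a few lines of Python. The example below assumes BLAST results saved in the default 12-column tabular format (`-outfmt 6` in BLAST+). The function names and the cutoff values are hypothetical placeholders, not the CACAO criteria, which are defined in the pages linked below. Passing such a filter only shortlists a hit; the alignment and the biological context still need to be inspected by the annotator.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    subject_id: str          # identifier to cite in the WITH field
    percent_identity: float
    evalue: float
    query_coverage: float    # fraction of the query covered by this alignment


def parse_hits(tabular_text, query_length):
    """Parse default BLAST+ tabular output (-outfmt 6) into Hit records.

    Columns: qseqid sseqid pident length mismatch gapopen
             qstart qend sstart send evalue bitscore
    Coverage is computed per alignment (HSP), not per subject sequence.
    """
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qstart, qend = int(fields[6]), int(fields[7])
        coverage = (abs(qend - qstart) + 1) / query_length
        hits.append(Hit(fields[1], float(fields[2]), float(fields[10]), coverage))
    return hits


def candidate_hits(hits, max_evalue, min_coverage):
    """Keep hits that pass both cutoffs (cutoffs are supplied by the caller)."""
    return [
        hit for hit in hits
        if hit.evalue <= max_evalue and hit.query_coverage >= min_coverage
    ]


if __name__ == "__main__":
    # Made-up example rows for a 200-residue query; not real BLAST output.
    sample = (
        "query1\tsubjA\t85.0\t200\t30\t0\t1\t200\t5\t204\t1e-50\t300\n"
        "query1\tsubjB\t40.0\t50\t30\t0\t10\t59\t1\t50\t0.5\t30\n"
    )
    hits = parse_hits(sample, query_length=200)
    # Placeholder cutoffs chosen only for illustration.
    for hit in candidate_hits(hits, max_evalue=1e-5, min_coverage=0.7):
        print(hit.subject_id, hit.evalue, round(hit.query_coverage, 2))
```

Keeping the cutoffs as arguments rather than constants is deliberate, since the appropriate values depend on the BLAST flavor and the evidence code being used.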
The following links define the criteria to be used for transferring annotations under evidence codes ISS, ISA and ISO using different BLAST flavors: