evidence codes

From GONUTS

This page is to help CACAO students select the correct evidence code & then use that evidence code properly.

Consult the GO consortium evidence guide for the official documentation. This contains the most detailed information on ALL evidence codes, even the ones not permitted in CACAO.

Evidence Code Overview

  1. Experimental
  2. Computational
  3. Author statement
  4. Curator assigned
  5. Automatically assigned (IEA: Inferred from Electronic Annotation)

Experimental

  1. IDA: Inferred from Direct Assay
  2. IMP: Inferred from Mutant Phenotype
  3. IGI: Inferred from Genetic Interaction
  4. IEP: Inferred from Expression Pattern

Note: The "Inferred" in these codes indicates that the authors came to a conclusion based on the evidence in the paper; it is not something the reader infers by combining the paper with outside information. This rule has some exceptions, such as gleaning GO:0005624 membrane fraction from a cell fractionation via centrifugation, but most of the conclusions you will annotate will be directly stated by the authors.

IDA vs IMP

IMP and regulation

Changes in the level or activity of protein Y when protein X is mutated can be used to infer that protein X regulates some aspect of protein Y. Note, however:

For example: "this shows that X regulates..." is definitive.
"this suggests that X could regulate..." is not definitive.

IGI

This paper is about how the authors needed to knock out both the fabA & fadD genes in E. coli to disrupt its fatty acid biosynthesis.
On the E. coli fabA protein page, this is what the final annotation looks like:
fabA.jpg


IEP

Process annotations from IEP evidence are a form of "guilt by association". A gene is implicated in a biological process when its expression pattern is similar to the pattern seen for other genes known to participate in the process. Because there are many indirect ways to affect levels of gene expression, ideally the pattern of co-expression should be robust under multiple conditions. An example of good inference of process from IEP evidence can be seen in the classic paper from Hughes et al. [1] where the patterns of expression of a set of yeast genes under many conditions were clustered.
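
The "guilt by association" idea can be illustrated with a small sketch. The gene names, expression values, and the 0.9 threshold below are all invented for illustration, and Pearson correlation is only one simple way to compare expression patterns; it is not the clustering method used by Hughes et al.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def coexpressed(profile_a, profile_b, threshold=0.9):
    """True when two profiles are strongly correlated (threshold is an assumption)."""
    return pearson(profile_a, profile_b) >= threshold

# Hypothetical expression values for three genes under the same five conditions.
known_process_gene = [2.0, 1.5, -0.5, -1.0, 0.2]   # already known to act in the process
candidate_gene = [1.8, 1.4, -0.4, -1.1, 0.1]       # tracks the known gene closely
unrelated_gene = [-0.3, 0.9, 1.2, -0.2, 0.4]       # does not track the known gene

print(coexpressed(known_process_gene, candidate_gene))
print(coexpressed(known_process_gene, unrelated_gene))
```

The more conditions the profiles cover, the less likely it is that a high correlation comes from an indirect effect on expression, which is why co-expression under multiple conditions makes a stronger case.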

The other common conclusion is a "response to..." term, such as GO:0006979 response to oxidative stress.

IEP vs IMP

If a mutation in gene X, such as a transcriptional repressor or activator, changes the expression pattern of other genes, do not annotate gene X using IEP. Use IMP. IEP is for evidence about the expression of gene X itself.

IEP and regulation

Because IEP is used for the expression pattern of the gene being annotated, it should not be used when a mutation in that gene changes the regulation of expression of other genes. Use IMP instead. See IMP and regulation, above.
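
The IEP vs IMP rule can be written down as a small decision helper. This is only a sketch of the rule as stated on this page; the function name and its parameters are hypothetical, and cases it does not cover return None rather than guessing.

```python
def expression_evidence_code(annotated_gene, gene_with_changed_expression,
                             mutated_gene=None):
    """Pick IEP or IMP for an expression-based observation.

    annotated_gene: the gene whose page will receive the annotation.
    gene_with_changed_expression: the gene whose expression was measured.
    mutated_gene: the gene that was mutated in the experiment, if any.
    """
    # A mutation in the annotated gene changes expression of OTHER genes: IMP.
    if (mutated_gene == annotated_gene
            and gene_with_changed_expression != annotated_gene):
        return "IMP"
    # The evidence is the expression pattern of the annotated gene itself: IEP.
    if gene_with_changed_expression == annotated_gene and mutated_gene is None:
        return "IEP"
    # Anything else is outside this rule; consult the GO evidence guide.
    return None

# geneX and geneY are placeholder names.
print(expression_evidence_code("geneX", "geneY", mutated_gene="geneX"))
print(expression_evidence_code("geneX", "geneX"))
```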

Computational

Your choices for CACAO:

  1. ISA: Inferred from Sequence Alignment
  2. ISO: Inferred from Sequence Orthology
  3. ISM: Inferred from Sequence Model
  4. IGC: Inferred from Genomic Context

ISA

Example-Using ISA in a GO annotation:

BUT the other protein(s) must have an experiment-coded (IDA, IMP, IGI, IEP, etc., NOT IEA) annotation to cysteine protease activity (this IS the hard part).

Here is an actual example taken from Chua et al. (1988), where the authors are aligning DerP1 from the European Dust Mite to other known cysteine proteases:


ISA example.jpg

If you want to make an ISA annotation on the DerP1 page, you have to find out if any of these other proteins have an annotation to the EXACT TERM based on an experiment. Happy hunting - here are step-by-step instructions:

 Step 1: Make sure your paper has a sequence alignment where the authors explicitly state there is a strong alignment.
 Step 2: Go to UniProt & look up the accessions for each of the other proteins the authors state your gene is similar to:
            rat cathepsin H = P00786
            no accession for Chinese Gooseberry actinidin
            Papaya papain = P00784
            human cathepsin B = P07858
 Step 3: Click on each accession & scroll down to the section called "Ontologies".
 Step 4: Click on the link that says "Complete GO annotation..." for each protein.
            (FYI, this is what GONUTS gets from UniProt when you make a page for a protein on GONUTS.)
 Step 5: Examine ALL entries on the gene page.
   * If you see one that has the EXACT term you intend to annotate to, and has an acceptable evidence code (again, NOT IEA),
     then you can use that ONE UniProt accession number in the with/from field.
   * If more than one of the aligned genes has acceptable (remember, EXACT and NOT IEA) annotations, you can use those too.
     Put all the acceptable, aligned genes in the same annotation.

So
   * Rat has no molecular function terms annotated with experimental evidence codes except an IPI annotation to a binding term,
      which CACAO students cannot use.  NO to using the rat protein.
   * Chinese Gooseberry - no accession, thus no GO annotations.  NO to using the Chinese Gooseberry protein.
   * Papaya has no annotations except with IEA.  NO to using the papaya protein.
   * Human cathepsin B has an annotation to GO:0008234 cysteine-type peptidase activity with IDA as the evidence code
     & the reference PMID:7890620.
   WE CAN USE THIS ONE!!!  THE UNIPROT ACCESSION FOR THIS PROTEIN (P07858) WILL GO IN THE WITH/FROM FOR OUR ANNOTATION ON THE DerP1 PAGE.
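
The filtering in Step 5 can be sketched in a few lines of Python. The annotation lists below are a hand-typed summary of the worked example, not data fetched from UniProt. The set of acceptable codes is an assumption based on the experimental codes listed on this page. The rat "binding term" is a placeholder because its GO ID is not given here, and the papaya entry is illustrative (the page only says its annotations are all IEA). Chinese Gooseberry actinidin is left out because it has no accession.

```python
TARGET_TERM = "GO:0008234"  # cysteine-type peptidase activity

# Assumption: the experimental codes CACAO accepts in with/from lookups.
ACCEPTABLE_CODES = {"IDA", "IMP", "IGI", "IEP"}

# accession -> list of (GO term, evidence code) pairs seen on the protein's page.
aligned_proteins = {
    "P00786": [("binding term", "IPI")],  # rat cathepsin H (placeholder term)
    "P00784": [(TARGET_TERM, "IEA")],     # papaya papain (illustrative entry)
    "P07858": [(TARGET_TERM, "IDA")],     # human cathepsin B, PMID:7890620
}

def usable_accessions(proteins, target_term, acceptable_codes):
    """Return accessions with an annotation to the EXACT term and an acceptable code."""
    return [
        accession
        for accession, annotations in proteins.items()
        if any(term == target_term and code in acceptable_codes
               for term, code in annotations)
    ]

with_from = usable_accessions(aligned_proteins, TARGET_TERM, ACCEPTABLE_CODES)
print(with_from)  # ['P07858']
```

Both conditions must hold for the same annotation: an exact-term annotation with IEA does not count, and neither does an experimental annotation to a different term.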

Untitled.jpg


ISO

ISO.jpg



ISM

Here is an example of correct usage - the annotation is on the E. coli MraY protein page
ISM.jpg


IGC

The GO consortium describes four situations where genomic context can be used to infer annotations

These require use of the with/from field, but note that the usage is different depending on the type of IGC evidence. For operons, pathways, or genome scale process analysis, the idea is that you can infer the function of the gene from which other genes are coexpressed or coinherited through evolution.

Thus, use one or more identifiers to specify the other genes in the operon, pathway, or process.
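
As a rough sketch, collecting the with/from identifiers for an operon-based IGC annotation could look like this. The gene identifiers are placeholders, the function name is hypothetical, and the pipe separator is an assumption borrowed from GAF-style annotation files; the GONUTS form may expect the identifiers to be entered differently.

```python
def igc_with_from(annotated_gene, context_genes, separator="|"):
    """List the OTHER genes in the operon, pathway, or process as one with/from string."""
    others = [gene for gene in context_genes if gene != annotated_gene]
    return separator.join(others)

# Hypothetical three-gene operon; the annotation is being made on geneB.
operon = ["geneA", "geneB", "geneC"]
print(igc_with_from("geneB", operon))  # geneA|geneC
```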

Useful Handouts

If you have not yet visited the helpful handouts for students page, these links may be helpful:

References

  1. Hughes TR et al. (2000) Functional discovery via a compendium of expression profiles. Cell 102: 109-26.