- 1 Picking the wrong kind of paper or experiment
- 2 Using the wrong protein
- 3 Using the wrong GO term
- 4 Reference errors
- 5 Using the wrong evidence
- 6 Bad notes
- 7 Bad challenges
Picking the wrong kind of paper or experiment
Experiments that use the existing annotations
- Experiments that identify protein interactions or genes that are expressed under some condition often list the proteins they found and give the known functions for those proteins. This is NOT experimental evidence showing that those proteins have the functions listed!
- Computer models that assume the function of a protein for doing a simulation are not providing evidence that the protein has that function.
Beware of expression results
CACAO normally does not allow annotations with the IEP (Inferred from Expression Pattern) evidence code. This is because being expressed while a process is happening is not the same as participating in the process.
Beware of disease papers
Remember that GO annotation is about recording inferences about the normal function of a protein. Papers about disease states can be problematic to the extent that studying the role of a gene in a disease may or may not provide information about the normal function of the gene.
Using the wrong protein
- Make sure you have the right species
- Do NOT annotate the MOD gene pages: these have names that start with
- These are examples of how to annotate that come from specific model organism databases. Your annotations via UniProt should eventually make it back to those databases, but we should let their biocurators handle that. We plan to block modification of those tables, but haven't done it yet.
- Sometimes searching for the gene name will miss a UniProt record that is there. If you can find an accession, either in the paper you are curating or by going back into the references, you can be more sure that you have the right record.
- If there are UniProt records for the same species but not the specific substrain, see Help:CACAO Choosing UniProt records for a species
- Beware of fragment records! Some UniProt records are based on incomplete sequences that are in other databases.
Using the wrong GO term
We are gradually building up usage notes for GO terms, so be sure to see if the term you are considering has a usage note.
GO terms that should not be used for annotation
In addition to the terms that are not valid for CACAO, there are terms that should not be used at all:
- Terms in Category:GO:gocheck do not annotate are there to have umbrella parent terms for groups of terms that can be used for annotation.
- Terms in Category:GO:Obsolete are terms that have been deprecated in GO. We keep GONUTS pages for them so you can see the suggested alternatives.
We plan to block these automatically, but that isn't implemented yet.
Component vs Process
- Children of GO:0008104 ! protein localization are for gene products that actively help proteins to the appropriate location. For example, the KDEL receptor recognizes proteins with the KDEL tag and retains them in the endoplasmic reticulum. KDEL receptor should be annotated to a child of GO:0070972 ! protein localization to endoplasmic reticulum. Proteins containing the KDEL tag that are held in the ER by the KDEL receptor should be annotated to the component term GO:0005783 ! endoplasmic reticulum.
Processes and Regulation of Processes
- If knocking something out decreases a process, but the process still happens, consider a regulation term
- If you use a regulation term, be specific if possible! Consider whether the experiment shows positive or negative regulation.
- When annotating a process involving multiple organisms, be sure to use the appropriate multiorganism process term. For example, a protein affecting the ability of bacterial cells to stick to host cells would use GO:0044650 ! adhesion of symbiont to host cell or one of its children, instead of GO:0016337 ! cell-cell adhesion or its single-organism children.
Reference errors
- Be sure you know the difference between a PubMed ID (PMID) and a PMC ID. One way to check: when you enter your annotation on the Gene page, your annotation should have a citation link to the references section. If the link doesn't appear, or the listed paper doesn't match what you are working on, you need to fix the reference.
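One way to catch a mixed-up identifier before saving is to look at its shape: a PMC ID is the letters PMC followed by digits, while a PMID is digits only. The sketch below is illustrative and is not part of GONUTS; the function name `classify_reference_id` and the sample identifiers are made up for the example.

```python
import re


def classify_reference_id(raw: str) -> str:
    """Return 'PMID', 'PMCID', or 'unknown' for a reference identifier string.

    Hypothetical helper: it only checks the shape of the identifier,
    not whether the paper actually exists.
    """
    s = raw.strip().upper()
    # Drop an optional "PMID" or "PMID:" label in front of the number.
    s = re.sub(r"^PMID:?\s*", "", s)
    if re.fullmatch(r"PMC\d+", s):
        return "PMCID"
    if re.fullmatch(r"[1-9]\d*", s):
        return "PMID"
    return "unknown"


# Placeholder identifiers, not real citations.
print(classify_reference_id("PMC1234567"))     # PMCID
print(classify_reference_id("PMID:12345678"))  # PMID
print(classify_reference_id("10.1000/xyz"))    # unknown
```

A shape check like this only tells you which kind of identifier you have. Confirming that the citation link on the Gene page matches your paper is still the real test.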
Using the wrong evidence
- Sometimes choosing between IDA and IMP is not obvious. Easy things to miss:
- Using inhibitors is IMP. The idea is that you are making a phenocopy of a knockout mutation.
- Chemicals/drugs that activate are IDA
- Overexpressing a protein to show its normal function is IDA
- Overexpressing a protein to give abnormal function is IMP. This can happen, for example, when overexpression causes a multi-protein complex to form improperly.
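The rules of thumb above can be written as a small lookup table. This is only a sketch of the four cases listed here, not a CACAO tool; the names `Manipulation` and `suggest_evidence_code` are hypothetical, and a real annotation still needs the curator's judgment about what the experiment shows.

```python
from enum import Enum


class Manipulation(Enum):
    """Hypothetical labels for the four cases listed above."""
    INHIBITOR = "inhibitor used to block the protein"
    ACTIVATOR = "chemical or drug that activates the protein"
    OVEREXPRESSION_NORMAL = "overexpression showing normal function"
    OVEREXPRESSION_ABNORMAL = "overexpression causing abnormal function"


# Mapping taken directly from the rules of thumb in this section.
SUGGESTED_CODE = {
    Manipulation.INHIBITOR: "IMP",               # phenocopy of a knockout
    Manipulation.ACTIVATOR: "IDA",
    Manipulation.OVEREXPRESSION_NORMAL: "IDA",
    Manipulation.OVEREXPRESSION_ABNORMAL: "IMP",  # e.g. a complex forms improperly
}


def suggest_evidence_code(manipulation: Manipulation) -> str:
    """Return the evidence code suggested by the rules of thumb above."""
    return SUGGESTED_CODE[manipulation]


for m in Manipulation:
    print(f"{m.value}: {suggest_evidence_code(m)}")
```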
Bad notes
- The note must be sufficient to explain why you are annotating this gene product to this GO term using the evidence you chose. Sometimes just pointing to a figure or table is enough, but other times it isn't. If we can't figure it out very quickly, we will mark it as incorrect.
- If you copy and paste text from the papers, put it inside quotation marks.
Bad challenges
The CACAO judges often mark annotations while the annotation and challenge innings are going on. If we have already put an x by one of the criteria, you are wasting our time if you submit a challenge that just says that the original annotation had a problem with that part of the annotation. To get points, you need to either find something we haven't already seen, explain how to fix the annotation, or explain why it is not fixable.