Compilation of GO term documentation
Contents
- 1 Places to look for information
- 2 Pages with GO term information excerpted from the above sources
- 3 Topics that need more information in GO Documentation
- 4 GODocumentation Meeting Minutes
Places to look for information
Gene Ontology Documentation
<http://geneontology.org/GO.contents.doc.shtml>
- Introduction
- Annotation Guide
- Evidence Code Guide
- Component Ontology
- Function Ontology
- Process Ontology
- File Format Guide
- GO Database Guide
- GO Slim Guide
Excerpts from the Gene Ontology Documentation with GO term information are here: [1].
GO Consortium Meeting Minutes
<http://geneontology.org/GO.meetings.shtml>
Available GO Documentation
1. The minutes of various GO meetings: meetings.
April 2008
- Description of "regulates", Definition of the start and end of TCA cycle and glycolysis, Proteases, Response to drug, IEA/ISS & contributes_to
Princeton, Sept 23, 2007
- Taxon & Sensu, is_a and part_of, contributes_to, Discussion of experimental evidence code hierarchy, GO term for Lifecycle, Extensive discussion of evidence terms, How to use IGI & IMP
Cambridge, Jan 2007
- GO-CL cross-products, is_a, cellular_component & cell_part, Why unknowns were removed (Chris & Karen C.), WITH column for IPI evidence code, ISS & IEA discussion
- Midori points out that there is a SourceForge tracker item that contains a number of useful examples to incorporate into documentation. See SourceForge with examples.
- Action Item (Everyone): Use the word cellular to refer to things that are related to a cell but are not necessarily cells or things contained within a cell. Use the word cell to refer to cells themselves or things contained within a cell. (Change cell signaling to cellular signaling, for example.)
- Action Item: TAS is no longer considered a useful evidence code and will not be used in any consistency measures of reference genome annotation. Since part of the idea of the reference genomes is to provide a source of IEA annotations for other groups, we strongly encourage reference genome annotators to not use TAS, and instead use experimental evidence codes whenever possible. The GO documentation should also state this in a clear fashion. Inferred By Curator is usually a better way of encoding common-knowledge from a textbook than TAS.
- ISS & With/From field: Update the documentation on using the ISS evidence code to emphasize that annotators must enter something in the WITH field. In the case of gene products, there must be an experimental evidence code for that gene product which supports the annotation, i.e., we don't want to have circular ISS annotations. UniProt IDs, RefSeq IDs, or individual MOD gene IDs would be okay to use in the WITH column. Old ISS annotations that don't have an entry in the WITH column will not need to be retrofitted immediately.
- Different strains (same species) used for evidence? - Add to the documentation that it is okay to use experimental evidence codes for perfectly identical gene products from different strains of the same species.
- With/from field discussion: The ‘with’ column is just a union of all the things that were interacted with in that experiment. We should remove the special meaning of pipes as indicating complexes. If two genes are piped together it means they interact; if you use a comma, the interaction is ternary.
This issue elicited continued discussion and was unresolved. However, it became clear that we were in fact discussing several different problems, these being: a) What to put in the WITH field, all the pipe and comma discussion. (No longer an action item, see below) b) What is interaction data and is it something for the annotation file. c) The current use and definition of the "protein binding" function term.
- ISS, IEA & TAS - It's safe to not require a ‘with’ column with ISS with a new QC check: If you use ISS, you need to use a GO_REF. If you use a GO_REF, you need to put something in the ‘with’ column.
Possibilities: Leave the WITH column blank if the ISS refers to an RNA, OR use a GO_REF ID, OR add a new evidence code, OR put the program name in the WITH column, OR just use RCA for these cases. Resolved: Always use a WITH column for IEA and ISS, containing a program name if necessary. For example, make a ref to tRNAscan. If an author says that they used BLAST, but does not provide the accession for the match, then use the TAS code, not ISS.
- "NOT" - When annotating NOT, if no positive annotation can be made, the gene product should ALSO be annotated to the root.
- Cellular - Use the word cellular to refer to things that are related to a cell but are not necessarily cells or things contained within a cell. Use the word cell to refer to cells themselves or things contained within a cell. (Change cell signaling to cellular signaling, for example.)
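The resolution above (always fill the WITH column for ISS and IEA) lends itself to an automated QC check. The following is a minimal sketch, not the Consortium's actual checking script: the `Annotation` record, the `missing_with` function, and all identifiers in the sample data are hypothetical and exist only for illustration.

```python
from typing import List, NamedTuple

# Evidence codes that, per the resolution recorded in the minutes,
# must always carry an entry in the WITH column.
CODES_REQUIRING_WITH = {"ISS", "IEA"}


class Annotation(NamedTuple):
    """A simplified annotation record (hypothetical, for illustration only)."""
    gene_product: str
    go_id: str
    evidence: str
    with_from: str


def missing_with(annotations: List[Annotation]) -> List[Annotation]:
    """Return annotations whose evidence code needs a WITH entry but has none."""
    return [
        a for a in annotations
        if a.evidence in CODES_REQUIRING_WITH and not a.with_from.strip()
    ]


# Placeholder data: the gene product IDs and WITH values are invented.
sample = [
    Annotation("EXAMPLE:gp1", "GO:0005515", "ISS", "EXAMPLE:match1"),
    Annotation("EXAMPLE:gp2", "GO:0005515", "ISS", ""),          # flagged
    Annotation("EXAMPLE:gp3", "GO:0005515", "IDA", ""),          # IDA needs no WITH here
    Annotation("EXAMPLE:gp4", "GO:0005515", "IEA", "tRNAscan"),  # program name is acceptable
]

for a in missing_with(sample):
    print(f"{a.gene_product}: {a.evidence} annotation to {a.go_id} has an empty WITH column")
```

Because the minutes say that old ISS annotations need not be retrofitted immediately, a real check would probably report these rows as warnings rather than reject them.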
Salt Lake City, Utah, USA, April 22 - 23, 2008
- If regulation of X was part_of X then all that need to be changed was to convert to regulates. BP terms reviewed, some added as part_of relations (many new definitions and synonyms also added). Reasoner – used to identify inconsistencies between regulate terms and BP terms, missing terms.
- MF can regulate MF, BP can regulate BP, BUT ALSO MF can regulate BP (and vice versa). Will change MF from being a strict is_a ontology.
- how to make relationship between ‘transcription regulator activity’ in MF to either ‘transcription’ as process with regulations
- Metabolism – intermediate regulation parents filled in manually in one section; moving towards computational analysis to find these terms for larger areas of the ontology followed by biocurator review before adding new terms.
- Signal transduction
- ‘ultimately effecting a change in the functioning of a cell’ – sounds like regulation activity to Tanya…could make a regulation process…
- can say that signal transduction is_a regulation of cell communication. Will be starting a push to revamp signal transduction: need definitions with a start point and an end point. Beginning of signal transduction is ligand-receptor binding then the rest is a cellular process.
- Considerations on Glycolysis and TCA Cycle
- First thing is to define start and end points for glycolysis. Made the stop point at pyruvate so that we didn’t have to consider aerobic/anaerobic processes. Same process of defining start and end points for the Krebs cycle.
- Examine terms & definitions within the defined start & end of the process - find discrepancies, possibly missing terms. Looking at definitions and refining/adding. Looking at the comparable Reactome data.
- reaction can be slightly different depending on the outcome. Should we look at the pathway as the common element for all processes or should we look at these as separate processes that have separate purposes (with overlap). The latter will become enormously complex as we start considering diverse species.
- Historically, BPs are defined by their objective, therefore it makes sense to take the latter view. Will require parent terms to collect these processes. IF different gene products are involved, it should be defined as a different process.
- Electron Transport Cross-products
- Looking at electron transport region and representing BP & MF cross-products.
Developed lists of MFs, their BPs, and the taxonomic group each applies to. Used these to make has_part relationships: the BP can’t exist without the MF in this taxon. The has_part relationship puts the BP term under the MF term (at the moment there are difficulties visualizing this). Taxon issue: how do we represent that these relationships should be qualified by taxon? It is possible to have a general parent term with several child terms representing the different taxa. To create links between BP and MF we are going to need the new relationship, has_part. At the moment, we need to come up with a way of adding this relationship to the graphs (it violates expectations: it makes syntactical sense but is harder for biologists to understand).
- Cross Products between GO BP & CC
- To capture what is happening with the BP, need a number of new relationships:
- Results_in_structural_change_to
- Results_in_formation_of
- Results_in_connection_of
- Proteases
- reorganization of MF relating to protease activity? Proposal on wiki for review.
- NOTE: many protease terms will become obsolete on the basis that they are gene products rather than functions.
- Proteases can be distinguished generically by structure (1) active sites (4 different types) and (2) by where in the target peptide they cut (3 types). This would give us 12 MF terms. Would this cover all proteases?
- When do MF terms stop representing classes and start representing individual genes? Specific gene products should not be represented.
- There are some MF terms that are clearly differentiated from other MF terms. These terms should be used.
- Localization
- localization has two parts:
- Establishment of localization
- Maintenance of localization: placing this under regulation of biological quality doesn’t mean maintaining the process of moving something to a location, but rather maintenance of location
- change to maintenance of location and add parentage under 'regulation of biological quality'
- response to drug
- SF 1242405 and 1494526 and 'response to toxin' SF1658374. Are they normal biological processes? Response to X (drug, toxin) Response to Chemical (see also Use of Response To Terms in Annotation, for a related issue)(David and Tanya)
- We keep ‘response to drug’, add response to ‘XX’ response by role chemical is playing would be in ontology with co annotation to chemical. We keep stub of ‘response to drug’, no children. Roles are basically stub terms in ontology
- Transport of drug
- Degradation of drug
- Sometimes things are toxins, sometimes not; same as drug. Similar to the situation in PAMGO with ‘pathogenesis’, which is not always deleterious. Drugs would be put in the chemical organization.
- Annotate to chemical term ‘response to cocaine’, co-annotate with chemical term for now, then later when available, put GO ID for ‘response to drug’ in column 16 (or separate IC annotation).
- relationship between pigment metabolic process and pigmentation
- contributes_to with IDA.
- Only for IDA?
- Look into adding to annotation checking script to flag contributes_to.
- Val will circulate draft doc on how contributes_to can & can't be used; will include: "Would this annotation make sense if this subunit was" ... [thought not finished; might be something like "viewed by itself"]
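The proposal above to have the annotation checking script flag contributes_to could look roughly like the sketch below. It assumes tab-delimited, GAF-style rows in which the qualifier is the fourth column and the evidence code is the seventh; the function name `flag_contributes_to` and every identifier in the sample rows are hypothetical. Since "Only for IDA?" was left as an open question, the sketch reports the evidence code instead of enforcing a policy.

```python
from typing import Iterable, List, Tuple


def flag_contributes_to(gaf_lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (object ID, GO ID, evidence code) for rows qualified with contributes_to.

    Assumes GAF-style columns: 2 = object ID, 4 = qualifier, 5 = GO ID, 7 = evidence.
    """
    flagged = []
    for line in gaf_lines:
        # Skip blank lines and header/comment lines, which start with "!".
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # too short to hold an evidence code; ignored in this sketch
        # Several qualifiers may be pipe-separated, e.g. "NOT|contributes_to".
        if "contributes_to" in cols[3].split("|"):
            flagged.append((cols[1], cols[4], cols[6]))
    return flagged


# Placeholder rows: database, IDs, symbols and references are invented.
rows = [
    "!example header line",
    "\t".join(["EX", "gp1", "sym1", "contributes_to", "GO:0003674", "EX_REF:1", "IDA"]),
    "\t".join(["EX", "gp2", "sym2", "", "GO:0003674", "EX_REF:2", "IDA"]),
    "\t".join(["EX", "gp3", "sym3", "NOT|contributes_to", "GO:0003674", "EX_REF:3", "IMP"]),
]

for obj_id, go_id, evidence in flag_contributes_to(rows):
    note = "" if evidence == "IDA" else " (non-IDA evidence, review)"
    print(f"{obj_id}\t{go_id}\t{evidence}{note}")
```

Returning the evidence code alongside each flagged row lets curators apply whatever rule is eventually agreed in the draft document on contributes_to usage.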
Princeton, NJ, USA, September 23 - 24, 2007
- See Is_a summary
- For original notes, See Is_a general notes
Jesus College, Cambridge, UK, January 8 - 10, 2007
St. Croix, USVI, March 31 - April 2, 2006
- What does function mean? To say that an entity has a biological function means that it's part of an organism and has a propensity to act reliably to contribute to survival. A better definition would be: function means it's part of an organism and has a disposition to act reliably in such a way as to contribute to the organism's canonical life plan.
Does this exclude the idea of abnormal? What is canonical vs. variance vs. pathological? There are biological functions and there are molecular functions? Are all molecular functions biological functions?
- Pasadena, CA, April 8-9, 2005
- TEXT
- Progress Reports
- Chicago, IL, October 15-17, 2004
- TEXT
- Progress Reports
- Stanford, CA, January 15-17, 2004
- TEXT
- Progress Reports
- Bar Harbor, ME, September 26-27, 2003
- TEXT
- Progress Reports
- TIGR, US, June 3-4, 2003
- TEXT
- St Croix, US Virgin Islands, January 25-26, 2003
- TEXT
- Cambridge, UK, September 10-11, 2002
- TEXT
- Cold Spring Harbor Laboratory, NY, May 12-13, 2002
- TEXT
- Tucson, AZ, February 2-3, 2002
- TEXT
- Chicago, IL, October 13-14, 2001
- TEXT
- Bar Harbor, ME, July 14-15, 2001
- TEXT
- Carnegie, Stanford, CA, March 4-5, 2001
- TEXT
- GO and annotation of human genes; CSHL, December 10-12, 2000
- TEXT
- Lawrence Berkeley National Laboratory, November 5-6, 2000
- Note: The minutes are also available as a single text file containing the minutes from all of the meetings, joined end-to-end, in date order. This file is provided to allow quicker searching of the total set.
SourceForge
Mailing List
RefGenome Annotation Project on GO Public wiki
Quiz Midori
Pages with GO term information excerpted from the above sources
- Excerpts from the Gene Ontology Documentation with GO term information are here: [2].
- Membrane terms as of 2008-11-20 are here: [3]
- Is_a General Notes are here: Is_a Summary Page
- Commonly misused terms from the RefGenome Project are here: [4]
- Page with obsoleted GO terms: [5]
Topics that need more information in GO Documentation
- Relationship terms
- is_a and part_of
- "Maintaining complete is_a and part_of trees in cellular component" This section needs more explanation about the relationship terms. I suggest adding a link to a page on relationship terms and their usage.
- Regulates