Have any questions? Please email us at ecoliwiki@gmail.com

Help:Frequently Asked Questions about Protein Pages


FAQs about the Protein Pages

Why does the same protein sometimes have many identifiers (for example, in GO or InterPro)?


One likely reason is that multiple groups submitted the same protein to the databases, and the databases, including GONUTS, are trying to consolidate the entries by listing all of the accessions for that protein.

What does "Aspect" mean again?


Aspect refers to which sub-ontology the term belongs to.

F = Molecular Function.

P = Biological Process.

C = Cellular Component.
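
If you work with annotation rows in a script, the three aspect codes above can be expanded with a small lookup. This is only a sketch; the names `ASPECTS` and `aspect_name` are made up for illustration and are not part of GONUTS.

```python
# One-letter aspect codes and the sub-ontology each one stands for.
ASPECTS = {
    "F": "Molecular Function",
    "P": "Biological Process",
    "C": "Cellular Component",
}


def aspect_name(code: str) -> str:
    """Return the sub-ontology name for a one-letter aspect code."""
    key = code.strip().upper()
    if key not in ASPECTS:
        raise ValueError(f"Unknown aspect code: {code!r}")
    return ASPECTS[key]
```

For example, `aspect_name("P")` returns `"Biological Process"`.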

Why are some rows in the Annotation table green while others have a white background?


The difference is the evidence code listed in the fifth column. All rows with the evidence code IEA: Inferred from Electronic Annotation have a green background. This evidence code is "used for annotations that depend directly on computation or automated transfer of annotations from a database, particularly when the analysis is performed internally and not published". These annotations have not been verified by a human. Think of them as "hints", but remember that they are not always correct. Sometimes programs predict GO terms that DO NOT apply, based on some amount of similarity to another protein. Biocurators have to weed these out!

The rows with a white background have any evidence code other than IEA: Inferred from Electronic Annotation. The GO Consortium uses a number of evidence codes, which are documented here, but CACAO biocurators may use only a subset of these, which is listed here.
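
The coloring rule described above comes down to a single check on the evidence code. Here is a minimal sketch of that rule; the function name `row_background` is hypothetical, and this is not the actual GONUTS code.

```python
def row_background(evidence_code: str) -> str:
    """Return the background color for an Annotation table row.

    Accepts either the bare code ("IEA") or the code followed by its
    description ("IEA: Inferred from Electronic Annotation").
    """
    code = evidence_code.split(":", 1)[0].strip().upper()
    # Only electronically inferred annotations are highlighted in green.
    return "green" if code == "IEA" else "white"
```

Any code other than IEA, such as IDA, gives `"white"`.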

Some GO IDs are listed twice with two references, what causes this? Does it add to the weight of the assignment?


No, it doesn't really add weight. I treat IEAs only as a "suggestion" or "hint", although they sometimes point to useful GO terms. I think several groups contribute IEAs, so I assume that is why some are duplicated. Some evidence codes support multiple references, but those evidence codes are not permitted for annotations made by CACAO students. We only want a single PMID in the References field. If students find two papers that show exactly the same thing, they can make two separate annotations. The GO Consortium has some unusual conventions for formatting the References field, and a single reference per annotation is one of them for all of the evidence codes students are permitted to use.

Several of the annotations have "GO_REF:0000004" or "GO_REF:0000023" in the "References" field. What do these mean?


These references are specific to the GO Consortium and designate where the annotation comes from. For example, "GO_REF:0000004" means that the annotation was made based on keywords found in UniProt. There are a variety of GO_REFs, and there is more documentation at the GO Consortium's site. For annotations made from a peer-reviewed paper, you don't need to worry about these at all. Fill in the PMID for the paper in the References field.
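
If you ever need to sort references in a script, the two kinds can be told apart by their prefix. The sketch below assumes GO_REF identifiers have seven digits, as in the IDs quoted in the question, and that PubMed references are written as `PMID:` followed by digits. The function name `reference_kind` is made up for illustration.

```python
import re

# Assumption: GO_REF IDs are "GO_REF:" plus seven digits, as in the question above.
GO_REF_PATTERN = re.compile(r"^GO_REF:\d{7}$")
# Assumption: PubMed references are written as "PMID:" plus digits.
PMID_PATTERN = re.compile(r"^PMID:\d+$")


def reference_kind(reference: str) -> str:
    """Classify a References entry as 'GO_REF', 'PMID', or 'other'."""
    ref = reference.strip()
    if GO_REF_PATTERN.match(ref):
        return "GO_REF"
    if PMID_PATTERN.match(ref):
        return "PMID"
    return "other"
```

With this, `reference_kind("GO_REF:0000023")` gives `"GO_REF"`, while a paper-based annotation such as `"PMID:12345678"` (a placeholder number) gives `"PMID"`.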

When an annotation is listed with a "status" of complete, does that mean it can no longer be annotated or challenged?


The status field (the last column of the Annotation table) only tells users whether the fields that the GO Consortium requires for a complete annotation have been filled in. This means the annotation has a valid GO term, a reference, and an evidence code (and the with/from field for some evidence codes). For CACAO biocurators, its main use is that it makes it easy to find "incomplete" annotations that are missing one or more of these fields.
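
The completeness rule above can be sketched as a simple check. Everything here is illustrative: the `Annotation` class and function names are hypothetical, the GO term check only tests the `GO:` plus seven digits format rather than looking the term up, and the set of evidence codes that need a with/from value is an assumption, not the official list.

```python
import re
from dataclasses import dataclass

GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")

# ASSUMPTION: an illustrative set of evidence codes that need with/from.
# Check the GO Consortium and CACAO guidelines for the real list.
NEEDS_WITH_FROM = {"ISS", "ISO", "ISA", "IPI", "IGI", "IC"}


@dataclass
class Annotation:
    go_id: str
    reference: str
    evidence_code: str
    with_from: str = ""


def missing_fields(a: Annotation) -> list[str]:
    """List the required fields that are empty or malformed."""
    missing = []
    if not GO_ID_PATTERN.match(a.go_id.strip()):
        missing.append("go_id")
    if not a.reference.strip():
        missing.append("reference")
    code = a.evidence_code.strip().upper()
    if not code:
        missing.append("evidence_code")
    elif code in NEEDS_WITH_FROM and not a.with_from.strip():
        missing.append("with_from")
    return missing


def status(a: Annotation) -> str:
    """Return 'complete' when nothing is missing, else 'incomplete'."""
    return "incomplete" if missing_fields(a) else "complete"
```

Note that this only checks whether the fields are present. It says nothing about whether the annotation is correct.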