The Gene Ontology (GO) project was established to describe the functional information of gene products in a standardized way and to help biologists answer specific queries about gene function. Its formal vocabulary for gene functions and products promotes a high level of integration of annotations from various databases.
A plethora of tools that link to the Gene Ontology knowledgebase exist to annotate and visualize gene function.
Introduction: WEGO (Web Gene Ontology Annotation Plot), created in 2006, is a simple but useful tool for visualizing, comparing and plotting GO (Gene Ontology) annotation results. WEGO uses the GO annotation results as input. Based on GO's standardized DAG (Directed Acyclic Graph) structured vocabulary system, the number of genes corresponding to each GO ID is calculated and shown in a graphical format. WEGO supports the analysis of multiple datasets. It also includes reference datasets for nine model species that can be used as baselines in comparative genomic analyses.
Web server: WEGO v2.0.
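The counting step described above can be sketched in a few lines of base R. This is only an illustration, not WEGO's implementation: the gene names and annotation table are made up, and the sketch counts direct annotations only, without walking up the DAG to parent terms.

```r
# Hypothetical GO annotation results: one row per (gene, GO ID) pair.
annot <- data.frame(
  gene  = c("geneA", "geneA", "geneB", "geneC", "geneC", "geneC"),
  go_id = c("GO:0005515", "GO:0008152", "GO:0005515",
            "GO:0005515", "GO:0008152", "GO:0008152"),
  stringsAsFactors = FALSE
)

# Drop repeated (gene, GO ID) pairs so each gene is counted once per term.
annot <- unique(annot)

# Number of distinct genes annotated to each GO ID.
gene_counts <- table(annot$go_id)

# Express each count as a percentage of all annotated genes.
total_genes <- length(unique(annot$gene))
result <- data.frame(
  go_id   = names(gene_counts),
  n_genes = as.integer(gene_counts),
  percent = 100 * as.integer(gene_counts) / total_genes
)
print(result)

# Simple bar chart of gene counts per GO ID.
barplot(gene_counts, las = 2, ylab = "Number of genes")
```

With several datasets, the same count could be computed per dataset and the bars drawn side by side for comparison.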
Introduction: g:GOSt (part of g:Profiler) performs functional enrichment analysis, also known as over-representation analysis (ORA) or gene set enrichment analysis, on an input gene list. It maps genes to known functional information sources and detects statistically significantly enriched terms. g:Profiler regularly retrieves data from the Ensembl database and the fungi-, plant- or metazoa-specific versions of Ensembl Genomes, and parasite-specific data from WormBase ParaSite. In addition to Gene Ontology, it includes pathways from KEGG, Reactome and WikiPathways; miRNA targets from miRTarBase and regulatory motif matches from TRANSFAC; tissue specificity from Human Protein Atlas; protein complexes from CORUM; and human disease phenotypes from Human Phenotype Ontology. g:GOSt supports close to 500 organisms and accepts hundreds of identifier types.
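Installation: To install the gprofiler2 R package, start R and enter:
# Install gprofiler2 from CRAN (only needed once), then load it
install.packages("gprofiler2")
library(gprofiler2)
# The query may mix a chromosomal interval, a SNP rs ID, a GO term ID, an Ensembl gene ID and a gene symbol
gostres <- gost(query = c("X:1000:1000000", "rs17396340", "GO:0005005", "ENSG00000156103", "NLRP1"),
                organism = "hsapiens")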
# The result is a named list where “result” is a data.frame with the enrichment analysis results
# and “meta” containing a named list with all the metadata for the query.
head(gostres$result)
## query significant p_value term_size query_size intersection_size
## 1 query_1 TRUE 4.995845e-02 1 3 1
## 2 query_1 TRUE 1.810899e-35 52 22 16
## 3 query_1 TRUE 5.386811e-24 239 22 16
## 4 query_1 TRUE 5.770021e-24 240 22 16
## 5 query_1 TRUE 1.211639e-19 442 22 16
## 6 query_1 TRUE 6.395424e-19 490 22 16
## precision recall term_id source term_name
## 1 0.3333333 1.00000000 CORUM:6180 CORUM PPP2R1A-PPP2R3B complex
## 2 0.7272727 0.30769231 GO:0048013 GO:BP ephrin receptor signaling pathway
## 3 0.7272727 0.06694561 GO:0007411 GO:BP axon guidance
## 4 0.7272727 0.06666667 GO:0097485 GO:BP neuron projection guidance
## 5 0.7272727 0.03619910 GO:0007409 GO:BP axonogenesis
## 6 0.7272727 0.03265306 GO:0061564 GO:BP axon development
## effective_domain_size source_order parents
## 1 3385 1952 CORUM:0000000
## 2 21092 14472 GO:0007169
## 3 21092 3281 GO:0007409, GO:0097485
## 4 21092 21828 GO:0006935, GO:0031175, GO:0048812
## 5 21092 3280 GO:0048667, GO:0048812, GO:0061564
## 6 21092 18235 GO:0031175
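The precision and recall columns follow directly from the size columns: precision is intersection_size / query_size and recall is intersection_size / term_size. The sketch below reproduces them for row 2 of the output above (GO:0048013) and also computes an uncorrected hypergeometric tail probability from the same numbers. The raw probability is not expected to match the reported p_value, because gost applies a multiple-testing correction by default (g:SCS); treating the four size columns as the parameters of a hypergeometric test is a simplifying assumption for illustration.

```r
# Values copied from row 2 of the gost() output (GO:0048013).
term_size             <- 52     # genes annotated to the term
query_size            <- 22     # query genes considered for this source
intersection_size     <- 16     # query genes annotated to the term
effective_domain_size <- 21092  # size of the background gene set

# Share of the query covered by the term, and share of the term covered by the query.
precision <- intersection_size / query_size
recall    <- intersection_size / term_size
print(c(precision = precision, recall = recall))

# Uncorrected one-sided hypergeometric tail: P(X >= intersection_size)
# when drawing query_size genes from the background.
raw_p <- phyper(intersection_size - 1,
                term_size,
                effective_domain_size - term_size,
                query_size,
                lower.tail = FALSE)
print(raw_p)
```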
The package can be used to plot the enrichment results.
p <- gostplot(gostres, capped = FALSE, interactive = FALSE)
p
Web server: g:GOSt.
Introduction: agriGO v2.0 is a web-based tool and database for gene ontology analyses. It focuses specifically on agricultural species, is user-friendly, and is designed to give the agricultural community in-depth support for ontology analyses. New advantages and features of agriGO v2.0 are as follows:
1) agriGO v2.0 focuses on agricultural species in particular and supports multiple species and datatypes.
2) A new species classification system, single-species analysis and reference datatype priorities help users perform fast and accurate analyses.
3) The analysis tools Singular Enrichment Analysis (SEA), Parametric Analysis of Gene set Enrichment (PAGE), BLAST4ID (Transfer IDs by BLAST) and SEACOMPARE (Cross comparison of SEA) were retained. These tools give users the means for data mining and systematic result exploration, allowing better data analysis and interpretation.
4) Custom analysis tools, including a custom directed acyclic graph (DAG) tree and Scatter Plot, were developed. These tools increase input flexibility.
5) A Batch SEA tool for multiple inputs, such as time-course samples, is provided, as well as the distributions of the p-values (PVD) of randomly generated significant GO terms.
Category | Classification | Species counts |
---|---|---|
Plant | Brassicaceae | 12 |
Plant | Poaceae | 29 |
Plant | Malvaceae | 6 |
Plant | Fabaceae | 16 |
Plant | Solanaceae | 12 |
Plant | Tree | 29 |
Plant | Algae | 18 |
Animal | Fish | 20 |
Animal | Aves | 11 |
Animal | Amphibia | 3 |
Animal | Insecta | 56 |
Animal | Mammalia | 58 |
Fungi | Sordariomycetes | 5 |
Web server: agriGO.
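As a rough illustration of the idea behind PAGE, the sketch below computes a parametric z-score for one gene set, assuming the commonly cited form Z = (S_m - mu) * sqrt(m) / sigma, where mu and sigma are the mean and standard deviation of all fold changes, S_m is the mean fold change of the gene set, and m is the gene set size. All data here are simulated, the gene names are hypothetical, and whether agriGO computes the score in exactly this way is an assumption.

```r
set.seed(1)

# Hypothetical log2 fold changes for 1000 genes.
fold_change <- rnorm(1000, mean = 0, sd = 1)
names(fold_change) <- paste0("gene", seq_along(fold_change))

# Hypothetical GO term gene set; shift its values to simulate up-regulation.
gene_set <- paste0("gene", 1:25)
fold_change[gene_set] <- fold_change[gene_set] + 1

mu    <- mean(fold_change)            # mean over all genes
sigma <- sd(fold_change)              # standard deviation over all genes
m     <- length(gene_set)             # gene set size
s_m   <- mean(fold_change[gene_set])  # mean over the gene set

# Parametric z-score and two-sided p-value from the normal distribution.
z <- (s_m - mu) * sqrt(m) / sigma
p_value <- 2 * pnorm(-abs(z))
print(c(z = z, p_value = p_value))
```

Unlike SEA, which only needs a list of gene identifiers, this kind of analysis takes a numeric value per gene as input.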
Introduction: The UniProt Knowledgebase (UniProtKB) is the central hub for the collection of functional information on proteins, with accurate, consistent and rich annotation. In addition to capturing the core data mandatory for each UniProtKB entry (mainly, the amino acid sequence, protein name or description, taxonomic data and citation information), as much annotation information as possible is added. This includes widely accepted biological ontologies, classifications and cross-references, and clear indications of the quality of annotation in the form of evidence attribution of experimental and computational data.
The UniProt Knowledgebase consists of two sections: a section containing manually-annotated records with information extracted from literature and curator-evaluated computational analysis, and a section with computationally analyzed records that await full manual annotation. For the sake of continuity and name recognition, the two sections are referred to as “UniProtKB/Swiss-Prot” (reviewed, manually annotated) and “UniProtKB/TrEMBL” (unreviewed, automatically annotated), respectively.
SPECIES: 250
Web server: UniProtKB.
Introduction: The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System was designed to classify proteins (and their genes) in order to facilitate high-throughput analysis. The core of PANTHER is a comprehensive, annotated "library" of gene family phylogenetic trees. All nodes in the tree have persistent identifiers that are maintained between versions of PANTHER, providing a stable substrate for annotations of protein properties like subfamily and function. Each phylogenetic tree is used to annotate each protein member of the family by its: family and protein class (a supergrouping of protein families); subfamily (a subgroup within the family phylogenetic tree); orthologs (genes in other organisms that derive from the same gene in the most recent common ancestor, MRCA); paralogs (genes in the same organism that are related by gene duplication); function (using GO terms annotated on the trees by the GO Phylogenetic Annotation Project); and pathways (curated by PANTHER and by Reactome).
Type | Number |
---|---|
Species | 143 |
Pathways | 177 |
Ontologies | 3361 terms (2267 biological process, 544 cellular component, 550 molecular function) |
Web server: PANTHER.
Introduction: GOrilla is a web-based application that identifies enriched GO terms in ranked lists of genes, without requiring the user to provide explicit target and background sets. This is particularly useful in the many typical cases where genomic data can be naturally represented as a ranked list of genes (e.g. by level of expression or of differential expression). GOrilla employs a flexible-threshold statistical approach to discover GO terms that are significantly enriched at the top of a ranked gene list.
Web server: GOrilla.
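The flexible-threshold idea can be sketched as a minimum hypergeometric score: for every possible cutoff in the ranked list, compute the hypergeometric tail probability of seeing at least the observed number of annotated genes above the cutoff, then keep the smallest value. The ranked list below is made up, and the resulting score is not itself a p-value, since taking the minimum over many cutoffs needs its own correction, which this sketch does not attempt.

```r
# Hypothetical ranked list: 1 = gene annotated to the GO term, 0 = not.
# Genes are ordered from the top of the ranking to the bottom.
labels <- c(1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

N <- length(labels)  # total number of genes
B <- sum(labels)     # genes annotated to the term

# For each cutoff n, count annotated genes in the top n and compute the
# hypergeometric tail P(X >= b) for drawing n genes out of N.
cutoffs <- seq_len(N)
b <- cumsum(labels)
tails <- phyper(b - 1, B, N - B, cutoffs, lower.tail = FALSE)

# The cutoff with the strongest enrichment and its score.
best_cutoff <- which.min(tails)
mhg_score   <- tails[best_cutoff]
print(c(best_cutoff = best_cutoff, mhg_score = mhg_score))
```

Because the cutoff is chosen from the data, the user does not have to decide in advance how many top-ranked genes form the target set.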
Introduction: WEGO uses the GO annotation results as input. Based on GO's standardized DAG (Directed Acyclic Graph) structured vocabulary system, the number of genes corresponding to each GO ID is calculated and shown in a graphical format. The WEGO 2.0 updates targeted four aspects, aiming to provide a more efficient and up-to-date approach for comparative genomic analyses.
Web server: WEGO 2.0.
Introduction: Gene Ontology Enrichment Analysis Software Toolkit (GOEAST) is an easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets. Compared with other available GO analysis tools, GOEAST has the following improved features: (i) GOEAST displays enriched GO terms in graphical format according to their relationships in the hierarchical tree of each GO category (biological process, molecular function and cellular component), and therefore provides a better understanding of the correlations among enriched GO terms; (ii) GOEAST supports the analysis of data from various sources (probe or probe set IDs of Affymetrix, Illumina, Agilent or customized microarrays, as well as different gene identifiers) and multiple species (about 60 prokaryote and eukaryote species); (iii) a unique feature of GOEAST is that it allows cross comparison of the GO enrichment status of multiple experiments to identify functional correlations among them.
Web server: GOEAST.
Introduction: ShinyGO is based on a large annotation database derived from Ensembl and STRING-db for 59 plant, 256 animal, 115 archaeal and 1678 bacterial species. ShinyGO's novel features include graphical visualization of enrichment results and gene characteristics, and application program interface access to KEGG and STRING for the retrieval of pathway diagrams and protein–protein interaction networks.
Web server: ShinyGO.
Introduction: GOTermFinder comprises a set of object-oriented Perl modules for accessing Gene Ontology (GO) information and for evaluating and visualizing the collective annotation of a list of genes to GO terms. It can be used to draw conclusions from microarray and other biological data by calculating the statistical significance of each annotation. GO::TermFinder can be used on any system on which Perl can be run, either as a command-line application, in single or batch mode, or as a web-based CGI script.
Web server: GOTermFinder.