Pathways are biological networks defining how biomolecules cooperate to accomplish cellular tasks in different conditions. Pathways are assembled from physically interacting molecules such as proteins and thus, it is particularly important to the annotation of those proteins in the pathway.
Breaking: Thirty pathway academics databases are summarized in this document.
Introduction: KEGG PATHWAY is a collection of manually drawn pathway maps representing the current knowledge of the molecular interaction, reaction and relation networks for: 1). Metabolism 2). Genetic Information Processing 3). Environmental Information Processing 4). Cellular Processes 5). Organismal Systems 6). Human Diseases 7). Drug Development.
Type | Number |
---|---|
species | 8504(813 eukaryotes,7291 bacteria, 400 archaea) |
pathway | 559 (979,624 entries) |
orthology | 25,479 entries |
genome | 20,348 entries |
genes | 44,226,835 entries |
compound | 19,007 entries |
glycan | 11,104 entries |
reaction | 11,841 entries |
rclass | 3,180 entries |
enzyme | 8,012 entries |
network | 1,427 entries |
variant | 738 entries |
disease | 2,599 entries |
drug | 11,993 entries |
Web server: KEGG.
Introduction: The goal of the Pathway Ontology is to cover all types of biological pathways, including altered and disease pathways (2643 pathways), and to capture the relationships between them within the hierarchical structure of a Directed Acyclic Graph (DAG). The five nodes of the ontology are: classic metabolic, regulatory, signaling, drug and disease pathways.
Web server: Pathway ontology.
Introduction: BioCarta is a database of gene interaction models. The database contains high-quality images of several cellular signaling and interaction pathways, and each diagram is fully hyperlinked to products and information pages about individual genes. Users can access product sales pages for selected elements of each pathway. state: 1396 genes 254 pathways 4417 gene-pathway associations
Web server: Biocarta.
Introduction: REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database. The cornerstone of Reactome is a freely available, open source relational database of signaling and metabolic molecules and their relations organized into biological pathways and processes. The core unit of the Reactome data model is the reaction. Entities (nucleic acids, proteins, complexes, vaccines, anti-cancer therapeutics and small molecules) participating in reactions form a network of biological interactions and are grouped into pathways. Examples of biological pathways in Reactome include classical intermediary metabolism, signaling, transcriptional regulation, apoptosis and disease. The Reactome curation process for a pathway is similar to the editing of a scientific review. An external domain expert provides his or her expertise, a curator formalizes it into the database structure, and an external domain expert reviews the representation. A system of evidence tracking ensures that all assertions are backed up by the primary literature. Reactome pathway, reaction and molecules pages extensively cross-reference to over 100 different online bioinformatics resources, including NCBI Gene, Ensembl and UniProt databases, the UCSC Genome Browser, ChEBI small molecule databases, and the PubMed literature database.
SPECIES | PROTEINS | COMPLEXES | REACTIONS | PATHWAYS |
---|---|---|---|---|
S. pombe | 1690 | 1805 | 1486 | 819 |
S. cerevisiae | 1913 | 1827 | 1566 | 812 |
D. rerio | 8633 | 8452 | 7383 | 1676 |
X. tropicalis | 7046 | 7321 | 6159 | 1580 |
G. gallus | 7296 | 7931 | 6859 | 1706 |
S. scrofa | 8407 | 8825 | 7548 | 1660 |
B. taurus | 8841 | 9182 | 8048 | 1696 |
C. familiaris | 8162 | 8725 | 7455 | 1657 |
R. norvegicus | 8808 | 9505 | 8356 | 1702 |
M. musculus | 9537 | 10620 | 9456 | 1715 |
H. sapiens | 11097 | 14084 | 14398 | 2601 |
D. melanogaster | 4755 | 5402 | 4596 | 1477 |
C. elegans | 4468 | 4403 | 3700 | 1304 |
D. discoideum | 2681 | 2502 | 2313 | 982 |
P. falciparum | 1051 | 1007 | 861 | 599 |
Web server: Biocarta.
Introduction: The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System was designed to classify proteins (and their genes) in order to facilitate high-throughput analysis. The core of PANTHER is a comprehensive, annotated “library” of gene family phylogenetic trees. All nodes in the tree have persistent identifiers that are maintained between versions of PANTHER, providing a stable substrate for annotations of protein properties like subfamily and function. Each phylogenetic tree is used to annotate each protein member of the family by its: Family and Protein Class (supergrouping of protein families) Subfamily (subgroup within the family phylogenetic tree) Orthologs (genes in other organisms that derive from the same gene in the MRCA) Paralogs (genes in the same organism that are related by gene duplication) Function (using GO terms annotated on the trees by the GO Phylogenetic Annotation Project) Pathways (curated by PANTHER and by Reactome)
Type | Number |
---|---|
species | 143 |
pathways | 177 |
Ontologies | 3361 terms |
2267 biological process terms | |
544 cellular component terms | |
550 molecular function terms |
Web server: PANTHER.
Introduction: The BioCyc collection of Pathway/Genome Databases (PGDBs) provides electronic reference sources on the pathways and genomes of many organisms. BioCyc databases describe organisms with sequenced genomes. BioCyc is primarily microbial. In addition, BioCyc contains databases for humans; for important model organisms such as yeast, fly, and mouse; and for other eukaryotes whose PGDBs have been curated. BioCyc is a collection of 20,025 Pathway/Genome Databases (PGDBs) for model eukaryotes and for thousands of microbes, plus software tools for exploring them. BioCyc is an encyclopedic reference that contains curated data from 130,000 publications.
Curated Pathway/Genome Databases for many organisms have been created using our Pathway Tools software by a variety of institutions and are available from the following Web sites.
Database | Web site |
---|---|
EcoCyc | EcoCyc.org |
MetaCyc | MetaCyc.org |
HumanCyc | HumanCyc.org |
PlantCyc | PlantCyc.org |
GutCyc | GutCyc.org |
MouseCyc | mousecyc |
AraCyc | aracyc |
YeastCyc | yeast.biocyc.org |
LeishCyc | leishcyc |
Web server: BioCyc.
Introduction: INOH is a pathway database providing 857 pathways from model organisms including Drosophila, Homo sapiens, Mus musculus, and Rattus norvegicus. In INOH, the term pathway refers to higher order functional knowledge such as relationships among multiple bio-molecules that constitute signal transduction pathways or biological events in general.
Web server: INOH.
Introduction: EHMN (Edinburgh Human Metabolic Network) present a high-quality human metabolic network manually reconstructed by integrating genome annotation information from different databases and metabolic reaction information from literature. The network contains nearly 3000 metabolic reactions, which were reorganized into about 70 human-specific metabolic pathways according to their functional relationships. By analysis of the functional connectivity of the metabolites in the network, the bow-tie structure, which was found previously by structure analysis, is reconfirmed.
Web server: EHMN.
Introduction: WikiPathways is a database of biological pathways maintained by and for the scientific community. WikiPathways is an open, collaborative platform dedicated to the curation of biological pathways. WikiPathways thus presents a new model for pathway databases that enhances and complements ongoing efforts, such as KEGG, Reactome and Pathway Commons. Building on the same MediaWiki software that powers Wikipedia, a custom graphical pathway editing tool and integrated databases were added covering major gene, protein, and small-molecule systems.
Type | Number |
---|---|
species | 33 |
pathways | 3091 |
Installation: To install this package, start R (version “4.2”) and enter:
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("rWikiPathways")
other installation version:
Web server: WikiPathways.
Introduction: The Pathway Interaction Database (PID) is a free biomedical database of human cellular signaling pathways. The database contains information about the molecular interactions and reactions that take place in cells, with a particular focus on processes that might be relevant to cancer research and treatment.
Type | Number |
---|---|
pathways | 254 |
Web server: PID.
Introduction: Pathway Commons provides biologists with (i) tools to search this comprehensive resource, (ii) a download site offering integrated bulk sets of pathway data (e.g. tables of interactions and gene sets), (iii) reusable software libraries for working with pathway information in several programming languages (Java, R, Python and Javascript) and (iv) a web service for programmatically querying the entire dataset. Visualization of pathways is supported using the Systems Biological Graphical Notation (SBGN). Pathway Commons currently contains data from 22 databases with 4794 detailed human biochemical processes (i.e. pathways) and ∼2.3 million interactions.
Web server: Pathway Commons.
Introduction: SMPDB (The Small Molecule Pathway Database) is an interactive, visual database containing more than 30 000 small molecule pathways found in humans only. The majority of these pathways are not found in any other pathway database. SMPDB is designed specifically to support pathway elucidation and pathway discovery in metabolomics, transcriptomics, proteomics and systems biology. It is able to do so, in part, by providing exquisitely detailed, fully searchable, hyperlinked diagrams of human metabolic pathways, metabolic disease pathways, metabolite signaling pathways and drug-action pathways. All SMPDB pathways include information on the relevant organs, subcellular compartments, protein_complex cofactors, protein_complex locations, metabolite locations, chemical structures and protein_complex quaternary structures. Each small molecule is hyperlinked to detailed descriptions contained in the HMDB or DrugBank and each protein_complex or enzyme complex is hyperlinked to UniProt. All SMPDB pathways are accompanied with detailed descriptions and references, providing an overview of the pathway, condition or processes depicted in each diagram. The database is easily browsed and supports full text, sequence and chemical structure searching. Users may query SMPDB with lists of metabolite names, drug names, genes/protein_complex names, SwissProt IDs, GenBank IDs, Affymetrix IDs or Agilent microarray IDs. These queries will produce lists of matching pathways and highlight the matching molecules on each of the pathway diagrams. Gene, metabolite and protein_complex concentration data can also be visualized through SMPDB’s mapping interface. All of SMPDB’s images, image maps, descriptions and tables are downloadable.
Type | Number |
---|---|
Total Pathways | 48690 |
Normal Metabolic Pathways | 27876 |
Drug Action Pathways | 404 |
Drug Metabolism Pathways | 64 |
Disease Pathways | 20251 |
Signaling Pathways | 24 |
Protein Pathways | 63 |
Physiological Pathways | 8 |
Drugs | 696 |
Metabolites | 55700 |
Proteins | 1451 |
Enzymes | 791 |
Transporters | 137 |
Reactions | 57402 |
Transportations | 294 |
Reaction-Coupled Transportations | 73 |
Interactions | 691 |
Web server: SMPDB.
Introduction:SignaLink is an integrated resource to analyze signaling pathway cross-talks, transcription factors, miRNAs and regulatory enzymes. Main features: A signaling network resource with known and predicted information for human and model organisms. Manually curated dataset of major signaling pathways including curated data from resources such as ACSN, InnateDB, Reactome and Signor. Extends pathways with integrated regulatory resources to contain pathway-specific. transcription factors, miRNA, scaffolds and post translational modifying enzymes. Proteins are classified by pathway position (core/non-core) and function (ligand, receptor, mediator, etc.). Signaling interactions are directed and labeled with PubMed IDs of the publications of experimental evidence. A multi-layered network structure allows the selection of user-specific details. Allows filtering based on tissue or sub-cellular localization. Supporting downloads in csv or psimi tab formats.
Web server: Signalink.
Introduction: NetPath is a manually curated resource of human signal transduction pathways. It is a joint effort between Pandey Lab at the Johns Hopkins University and the Institute of Bioinformatics (IOB), Bangalore, India, and is also worked on by other parties. NetPath hosts 45 signaling pathways, including 10 pathways with a major role in the regulation of immune system and 10 pathways with relevance to regulation of cancer.
Web server: NetPath.
Introduction: Integrated Pathway Resources, Analysis and Visualization System (iPAVS): provides a collection of highly-structured manually curated human pathway data, it also integrates biological pathway information from several public databases and provides several tools to manipulate,filter, browse, search, analyze, visualize and compare the integrated pathway resources.
Web server: iPAVS.
Introduction: PharmGKB pathways are evidence-based diagrams depicting the pharmacokinetics (PK) and/or pharmacodynamics (PD) of a drug with relevant (or potential) pharmacogenetic (PGx) associations. Drugs featured in PharmGKB pathways are chosen through extensive review of a variety of sources, including, but not limited to, the U.S. Food and Drug Administration (FDA) biomarker list and Clinical Pharmacogenetics Implementation Consortium (CPIC) nominations.
Web server: ParmGKB.
Introduction: PathCards is an integrated database of human biological pathways and their annotations. Human pathways were clustered into SuperPaths based on gene content similarity. Each PathCard provides information on one SuperPath which represents one or more human pathways. It includes 1570 SuperPath entries, consolidated from 11 sources.
Web server: PathCards.
Introduction: ACSN is a resource of cancer signalling knowledge, comprehensive map of molecular interactions in cancer based on the latest scientific literature.
Feature | Content |
---|---|
Maps of biological processes | 5 |
Functional modules | 52 |
Chemical species | 5975 |
Reactions | 4826 |
Proteins | 2371 |
Metabolites | 595 |
Genes | 159 |
References | 2919 |
Web server: ACSN.
Introduction: The NDEx Project provides an open-source framework where scientists and organizations can store, share, manipulate, and publish biological network knowledge. One of the goals of the project is to create a home for models that are currently available only as figures, tables, or supplementary information, such as networks produced via systematic mining and integration of large-scale molecular data. The NDEx project does not compete with existing pathway and interaction databases, such as Pathway Commons, KEGG, or Reactome; instead, NDEx provides a novel, common distribution channel for these efforts, preserving their identity and attribution rather than subsuming them.
Type | Number |
---|---|
Pathways | 972 (WikiPathways:675, signor_database:90 and NCIsysbio’s PID v2.0:207) |
IndraSysBio-assembled GO term networks | 6295 |
Gene sets | 32263 |
Installation:The NDEx app offers a cytoscape command for scripting purposes.
Web server: NDEx.
Introduction: Signor is a resource that annotates experimental evidence about causal interactions between proteins and other entities of biological relevance: stimuli, phenotypes, enzyme inhibitors, complexes, protein families etc. Each entry points to the experimental evidence supporting the interaction and is enriched by additional relevant metadata such as the effect of the interaction on the activity of the target entity, the molecular mechanism underlying this effect, etc… The curated data can be displayed as signed directed graphs by a graph drawing tool.
Web server: SIGNOR.
Introduction: Plant Reactome knowledgebase, a conceptual plant pathway network, is built by biocuration and integrating (bio)chemical entities, gene products, and macromolecular interactions. It provides manually curated pathways for the reference species Oryza sativa (rice) and gene orthology-based projections that extend pathway knowledge to 120 plant species. Currently, it hosts 30,000 reference pathways for plant metabolism, hormone signaling, transport, genetic regulation, plant organ development and differentiation, and biotic and abiotic stress responses. In addition to the pathway browsing and search functions, the Plant Reactome provides the analysis tools for pathway comparison between reference and projected species, pathway enrichment in gene expression data, and overlay of gene–gene interaction data on pathways. Web server: Plant Reactome.
Introduction: Human Autophagy Modulator Database (HAMdb, http://hamdb.scbdd.com), to provide researchers related pathway and disease information as many as possible. HAMdb contains 796 proteins, 841 chemicals and 132 microRNAs. Their specific effects on autophagy, physicochemical information, biological information and disease information were manually collected and compiled. Additionally, lots of external links were available for more information covering extensive biomedical knowledge.
Web server: HAMdb.
Introduction: An ecosystem that supports curation of pathway mappings between databases and fosters the exploration of pathway knowledge through several novel visualizations.ComPath can generate new biological insights by identifying pathway modules, clusters, and cross-talks with these mappings. Resources from KEGG, Reactome, WikiPathways, and MSigDB have been used to build this package and application.
Resource | Pathways | Genes |
---|---|---|
WikiPathways | 438 | 6015 |
Reactome | 2195 | 10633 |
KEGG | 330 | 7425 |
Installation:The NDEx app offers the Python and docker command for scripting purposes.
Web server: ComPath.
Introduction: BIOchemical PathwaY DataBase is developed as a manually curated, readily updatable, dynamic resource of human cell specific pathway information along with integrated computational platform to perform various pathway analyses. Presently, it comprises of 46 pathways, 3189 molecules, 5742 reactions and 6897 different types of diseases linked with pathway proteins, which are referred by 520 literatures and 17 other pathway databases.
Web server: BIOPYDB.
Introduction: A unifying interpretation of functional interaction between pathways by systematically quantifying coexpression between 1,330 canonical pathways from the Molecular Signatures Database (MSigDB) established the Pathway Coexpression Network (PCxN). A curated collection of 3,207 microarrays from 72 normal human tissues were estimated the correlation between canonical pathways valid in a broad context. PCxN accounts for shared genes between annotations to estimate significant correlations between pathways with related functions rather than with similar annotations. PCxN complements the results of gene set enrichment methods by revealing relationships between enriched pathways, and by identifying additional highly correlated pathways.
Installation: To install this package, start R (version “4.2”) and enter:
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("pcxn")
Web server: PCxN.
Introduction: a Python package that transforms pathway knowledge from three major pathway databases (KEGG, reactome and wikiPathways) into a unified abstraction using Biological Expression Language as the pivotal, integrative schema. PathMe can generate new biological insights by identifying pathway modules and cross-talks similar to ComPath.
Installation:The PathMe app offers the Python command for scripting purposes.
Web server: PathMe.
Introduction: An extended pathway annotations and enrichment analysis resource for human, model organisms and domesticated species. Search Genes Search miRNAs Search Pathways API Download Statistics Documentation Publications Team Contact. pathDIP is an annotated database of signaling cascades in human and non-human organisms, comprising core pathways from major curated pathways databases, and pathways predicted based on orthology, and by using physical protein interactions. Data integration and predictions increase coverage of pathway annotations for human proteome to 92%. pathDIP annotates 122,131 unique proteins in 6,401 pathways in 17 organisms (including 18,454 human proteins), annotates 36,216 pathway orphans (including 5,366 human proteins), and provides multiple query, analysis and output options.
Web server: pathDIP.
Introduction: A comprehensive pathway database for model organisms. Quantitative metabolomics services for biomarker discovery and validation. Specializing in ready to use metabolomics kits. Your source for quantitative metabolomics technologies and bioinformatics. PathBank is an interactive, visual database containing more than 100 000 machine-readable pathways found in model organisms such as humans, mice, E. coli, yeast, and Arabidopsis thaliana. The majority of these pathways are not found in any other pathway database. All PathBank pathways include information on the relevant organelles, subcellular compartments, protein complex cofactors, protein complex locations, metabolite locations, chemical structures, and protein complex quaternary structures. Each small molecule is hyperlinked to detailed descriptions contained in the HMDB or DrugBank and each protein complex or enzyme complex is hyperlinked to UniProt. All PathBank pathways are accompanied with detailed descriptions and references, providing an overview of the pathway, condition, or processes depicted in each diagram.
Web server: PathBank.
Introduction: An Adverse Outcome Pathway (AOP) is a model that identifies a sequence of molecular and celluar events that may lead to adverse health effects in individuals and populations. An AOP maps out a sequence of biological events following an exposure that may result in illness or injury. By understanding the individual, key biological events in the organism, researchers can gain a better understanding of stressor-induced health outcomes. The Adverse Outcome Pathway Database (AOP-DB) is an online database that combines different data types (AOP, gene, chemical, disease, and pathway) to identify the impacts of chemicals on human health and the environment. EPA developed the AOP-DB to better characterize adverse outcomes of toxicological interest that are relevant to human health and the environment.
Biological Category | Data Source |
---|---|
Gene | NCBI Gene |
STRING | |
Taxonomy & Orthology | NCBI Taxonomy |
Homologene | |
KEGG Orthology | |
metaPhOrs | |
AOP | AOP-wiki |
Chemical | CTD |
AOP-wiki | |
ToxCast | |
Pathway | KEGG Pathway |
Reactome | |
ConcensusPathDB | |
Disease | DisGeNET |
Ontology | NCBI Gene |
Tissues | HumanBase |
Haplotypes | 1000 Genomes |
GTEx | |
Ensemble | |
GWAS Catalog |
Web server: AOP.
Introduction: A novel web application for network-based pathway analysis, based on the recently published ANUBIX algorithm which has been shown to be more accurate than previous network-based methods. The PathBIX website performs pathway annotation for 21 species, and utilizes prefetched and preprocessed network data from FunCoup 5.0 networks and pathway data from three databases: KEGG, Reactome, and WikiPathways.
Web server: PathBIX.