ABOUT | MYSQL | MORE
dcGOdb_202301.sql.gz contains 9 tables, with the schema outlined below:
|
interpro_info
A table about InterPro family domains Table_interpro_info.txt.gz
mysql> DESC interpro_info;
+-------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+-------+
| id | varchar(20) | NO | PRI | NULL | |
| description | varchar(255) | NO | | NULL | |
| level | enum('interpro') | NO | MUL | NULL | |
+-------------+------------------+------+-----+---------+-------+
- The id column is the InterPro domain ID
- The description column is the description for InterPro family domains
- The level column is the domain level ('interpro')
mapping_interpro
A table about ontology annotations for InterPro family domains (that is, term-domain mappings) Table_mapping_interpro.txt.gz
mysql> DESC mapping_interpro;
+-----------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+----------------+
| obo | char(8) | NO | PRI | NULL | |
| term_id | varchar(50) | NO | PRI | NULL | |
| domain_id | varchar(20) | NO | PRI | NULL | |
| ascore | double | YES | | NULL | |
| auto | int | NO | UNI | NULL | auto_increment |
+-----------+-------------+------+-----+---------+----------------+
- The obo column is the ontology
- The term_id column is the ontology term ID
- The domain_id column is the InterPro domain ID
- The ascore column is the annotation score [1-100]
- The auto column is an increasing sequence of unique numbers, collectively acting as a primary key
mapping_pfam
A table about ontology annotations for Pfam family domains (that is, term-domain mappings) Table_mapping_pfam.txt.gz
mysql> DESC mapping_pfam;
+-----------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+----------------+
| obo | char(8) | NO | PRI | NULL | |
| term_id | varchar(50) | NO | PRI | NULL | |
| domain_id | varchar(20) | NO | PRI | NULL | |
| ascore | double | YES | | NULL | |
| auto | int | NO | UNI | NULL | auto_increment |
+-----------+-------------+------+-----+---------+----------------+
- The obo column is the ontology
- The term_id column is the ontology term ID
- The domain_id column is the Pfam domain ID
- The ascore column is the annotation score [1-100]
- The auto column is an increasing sequence of unique numbers, collectively acting as a primary key
mapping_scop
A table about ontology annotations for SCOP superfamily and family domains (that is, term-domain mappings) Table_mapping_scop.txt.gz
mysql> DESC mapping_scop;
+-----------+--------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------------+------+-----+---------+----------------+
| obo | char(8) | NO | PRI | NULL | |
| term_id | varchar(50) | NO | PRI | NULL | |
| domain_id | mediumint unsigned | NO | PRI | NULL | |
| ascore | double | YES | | NULL | |
| auto | int | NO | UNI | NULL | auto_increment |
+-----------+--------------------+------+-----+---------+----------------+
- The obo column is the ontology
- The term_id column is the ontology term ID
- The domain_id column is the SCOP domain ID
- The ascore column is the annotation score [1-100]
- The auto column is an increasing sequence of unique numbers, collectively acting as a primary key
pfam_info
A table about Pfam family domains Table_pfam_info.txt.gz
mysql> DESC pfam_info;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| id | varchar(20) | NO | PRI | NULL | |
| description | varchar(255) | NO | | NULL | |
| level | enum('pfam') | NO | MUL | NULL | |
+-------------+--------------+------+-----+---------+-------+
- The id column is the Pfam domain ID
- The description column is the description for Pfam family domains
- The level column is the domain level ('pfam')
scop_hie
A table about SCOP hierarchy Table_scop_hie.txt.gz
mysql> DESC scop_hie;
+--------+--------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+--------------------+------+-----+---------+----------------+
| parent | mediumint unsigned | NO | PRI | NULL | |
| child | mediumint unsigned | NO | PRI | NULL | |
| auto | int | NO | UNI | NULL | auto_increment |
+--------+--------------------+------+-----+---------+----------------+
- The parent column is the parent SCOP domain ID (i.e. superfamily)
- The child column is the children SCOP domain ID (i.e. family)
- The auto column is an increasing sequence of unique numbers, collectively acting as a primary key
scop_info
A table about SCOP superfamily and family domains Table_scop_info.txt.gz
mysql> DESC scop_info;
+-------------+--------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------------+------+-----+---------+-------+
| id | mediumint unsigned | NO | PRI | NULL | |
| description | varchar(255) | NO | | NULL | |
| level | enum('sf','fa') | NO | MUL | NULL | |
+-------------+--------------------+------+-----+---------+-------+
- The id column is the SCOP domain ID
- The description column is the description for SCOP domains
- The level column is the domain level (i.e., either 'sf' for superfamily or 'fa' for family)
term_hie
A table about ontology hierarchy Table_term_hie.txt.gz
mysql> DESC term_hie;
+----------+------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+------------------------+------+-----+---------+----------------+
| obo | char(8) | NO | PRI | NULL | |
| parent | varchar(20) | NO | PRI | NULL | |
| child | varchar(20) | NO | PRI | NULL | |
| relation | enum('is_a','part_of') | NO | | NULL | |
| root | enum('yes','no') | NO | | NULL | |
| auto | int | NO | UNI | NULL | auto_increment |
+----------+------------------------+------+-----+---------+----------------+
- The obo column is the ontology
- The parent column is the parent ontology term ID
- The child column is the children ontology term ID
- The relation column is the relation between terms (i.e., either 'is_a' or 'part_of')
- The root column is to whether or not the parent terms are at root (i.e., either 'yes' or 'no')
- The auto column is an increasing sequence of unique numbers, collectively acting as a primary key
term_info
A table about ontology terms Table_term_info.txt.gz
mysql> DESC term_info;
+----------+--------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+--------------------+------+-----+---------+----------------+
| id | varchar(50) | NO | MUL | NULL | |
| name | varchar(255) | NO | PRI | NULL | |
| obo | char(8) | NO | PRI | NULL | |
| distance | mediumint unsigned | NO | | NULL | |
| auto | int | NO | UNI | NULL | auto_increment |
+----------+--------------------+------+-----+---------+----------------+
- The id column is the ontology term ID
- The name column is the ontology term name
- The obo column is the ontology
- The distance column is the distance to the root term
- The auto column is an increasing sequence of unique numbers, collectively acting as a primary key
USE
Restore the dcGO MySQL database
mysql -u root -p -e "create database dcGOdb;" gunzip < dcGOdb_202301.sql.gz | mysql -u root -p dcGOdb
Show tables
mysql> USE dcGOdb; mysql> SHOW tables; +------------------+ | interpro_info | | mapping_interpro | | mapping_pfam | | mapping_scop | | pfam_info | | scop_hie | | scop_info | | term_hie | | term_info | +------------------+
Extract Pfam domains annotated by Experimental Factor Ontology (EFO) term 'EFO:0000540' ('immune system disease')
mysql> SELECT b.id, b.description FROM mapping_pfam AS a, pfam_info AS b WHERE a.domain_id=b.id AND a.obo='EFO' AND a.term_id='EFO:0000540' ORDER BY b.description ASC; +---------+-----------------------------------------------------------+ | id | description | +---------+-----------------------------------------------------------+ | PF00001 | 7 transmembrane receptor (rhodopsin family) | | PF00023 | Ankyrin repeat | | PF07716 | Basic region leucine zipper | | PF00170 | bZIP transcription factor | | PF00656 | Caspase domain | | PF00619 | Caspase recruitment domain | | PF00129 | Class I Histocompatibility antigen, domains alpha 1 and 2 | | PF00993 | Class II histocompatibility antigen, alpha domain | | PF00969 | Class II histocompatibility antigen, beta domain | | PF00173 | Cytochrome b5-like Heme/Steroid binding domain | | PF00008 | EGF-like domain | | PF00178 | Ets-domain | | PF07654 | Immunoglobulin C1-set domain | | PF07686 | Immunoglobulin V-set domain | | PF03770 | Inositol polyphosphate kinase | | PF00605 | Interferon regulatory factor transcription factor | | PF09294 | Interferon-alpha/beta receptor, fibronectin type III | | PF10401 | Interferon-regulatory factor 3 | | PF00169 | PH domain | | PF00130 | Phorbol esters/diacylglycerol binding domain (C1 domain) | | PF07714 | Protein tyrosine and serine/threonine kinase | | PF01023 | S-100/ICaBP type calcium binding domain | | PF00017 | SH2 domain | | PF00018 | SH3 domain | | PF00048 | Small cytokines (intecrine/chemokine), interleukin-8 like | | PF01017 | STAT protein, all-alpha domain | | PF02864 | STAT protein, DNA binding domain | | PF02865 | STAT protein, protein interaction domain | | PF02198 | Sterile alpha motif (SAM)/Pointed domain | | PF01582 | TIR domain | | PF01108 | Tissue factor | | PF00229 | TNF(Tumour Necrosis Factor) family | | PF00020 | TNFR/NGFR cysteine-rich region | +---------+-----------------------------------------------------------+
VERSIONS
January 2023 release dcGOdb_202301.sql.gz
July 2022 release dcGOdb_202207.sql.gz
September 2021 release dcGOdb_202109.sql.gz
April 2021 release dcGOdb_202104.sql.gz
Hai FANG, PhD
Professor of Bioinformatics
Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China
|