Author: slowkow

Description: Human transcription factor target genes.
Language: R
Repository: git://github.com/slowkow/tftargets.git
Created: 2015-03-02T16:55:17Z
Project page: https://github.com/slowkow/tftargets


tftargets

Transcription factors (TFs) activate and repress target genes. This R package
lets you look up a human TF and retrieve its target genes. The data are
collected from six published databases, each described below.

Credit: © KENNETH EWARD/BIOGRAFX/PHOTO RESEARCHERS, INC

Citation

For now, please cite this package by linking to its GitHub repository:

https://github.com/slowkow/tftargets

Usage

You may install this package with devtools:

    devtools::install_github("slowkow/tftargets")
    library(tftargets)
    length(TRED)
    # [1] 133

Alternatively, you can download just the RData file:

    # Download the file:
    # install.packages("RCurl")
    library(RCurl)
    download.file(
      url = "https://raw.githubusercontent.com/slowkow/tftargets/master/data/tftargets.rda",
      destfile = "tftargets.rda",
      method = "curl"
    )
    # Load the file:
    load("tftargets.rda")
    # View the variables stored in the file:
    ls()
    # [1] "ENCODE" "ITFP" "Marbach2016"
    # [4] "Neph2012" "TRED" "TRRUST"
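
Calling load() without an envir argument restores every object into the calling environment, where it can overwrite existing variables of the same name. The following is a minimal sketch of loading into a dedicated environment instead. The toy object and the temporary file are stand-ins (assumptions) for the real tftargets.rda, so the snippet runs without a download:

```r
# Stand-in for tftargets.rda: a tiny .rda file holding one toy list.
# (The real file holds ENCODE, ITFP, Marbach2016, Neph2012, TRED and TRRUST.)
TRED_toy <- list(STAT3 = c(2L, 332L, 355L))
rda_path <- tempfile(fileext = ".rda")
save(TRED_toy, file = rda_path)
rm(TRED_toy)

# Load into a separate environment rather than the global one.
tf_env <- new.env()
loaded_names <- load(rda_path, envir = tf_env)

# load() (invisibly) returns the names of the objects it restored.
print(loaded_names)

# Retrieve a dataset from the environment by name.
tred <- get("TRED_toy", envir = tf_env)
print(tred[["STAT3"]])
```

With the real file, the same pattern would apply with "tftargets.rda" as the path and "TRED" (or any other dataset name) passed to get().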

Data

This package contains the following datasets:

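| Dataset     | Structure    | Gene Identifier   | # TFs | # TF-gene associations | Reference          |
|-------------|--------------|-------------------|-------|------------------------|--------------------|
| TRED        | list         | ENTREZ            | 133   | 7,066                  | TRED (2007)        |
| ITFP        | list         | HGNC Symbol/Alias | 1974  | 67,154                 | ITFP (2008)        |
| ENCODE      | list         | ENTREZ            | 157   | 20,428                 | ENCODE (2012)      |
| Neph2012    | nested list* | HGNC Symbol/Alias | 536   | 16,484                 | Neph2012 (2012)    |
| TRRUST      | list         | HGNC Symbol/Alias | 748   | 8,215                  | TRRUST (2015)      |
| Marbach2016 | list         | HGNC Symbol/Alias | 643   | 1,305,782              | Marbach2016 (2016) |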

* Note: Neph2012 is organized as a nested list. The top-level keys are
tissue types (e.g. "fBrain-DS11872"), and each tissue holds its own list of
TFs and their targets.

See data-raw/make_rdata.R for the script that converts the raw
data into lists of gene sets.
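
Each non-nested dataset is a named list: the names are TFs and each element is a vector of target genes. As a rough sketch, the two count columns in the table can be computed from such a list with length() and lengths(). This assumes an association is counted as one element of a TF's vector, which the README does not state. The toy list below is a hypothetical subset, not the real data:

```r
# Toy dataset in the same shape as TRED or TRRUST:
# names are TFs, elements are vectors of target genes.
toy <- list(
  AATF = c("BAK1", "BAX", "BBC3"),
  ABL1 = c("BAX", "BCL2")
)

# Number of TFs: one list element per TF.
n_tfs <- length(toy)

# Number of TF-gene associations: total number of targets over all TFs.
n_assoc <- sum(lengths(toy))

# Long format: one row per TF-gene pair, convenient for joins and filtering.
edges <- stack(toy)
names(edges) <- c("target", "tf")

print(n_tfs)
print(n_assoc)
print(edges)
```

For the real data, replacing toy with TRED, ITFP, ENCODE, TRRUST or Marbach2016 should work unchanged; Neph2012 needs one extra level of indexing because it is nested.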


TRED

Citation

Jiang, C., Xuan, Z., Zhao, F. & Zhang, M. Q. TRED: a transcriptional
regulatory element database, new entries and other development. Nucleic
Acids Res. 35, D137–40 (2007).
PubMed

Source

https://cb.utdallas.edu/cgi-bin/TRED/tred.cgi?process=home

Description

Predicted and known human transcription factor targets.

For example, TRED lists 59 target genes for STAT3:

    # Entrez Gene IDs.
    TRED[["STAT3"]]
    # [1] 2 332 355 595 596 598 896 943 958 1026 1051
    # [12] 1401 1588 1962 2194 2209 2353 3082 3162 3320 3326 3479
    # [23] 3559 3572 3586 3659 3718 3725 3929 4170 4582 4585 4609
    # [34] 4843 5008 5021 5292 5551 5967 6095 6347 6654 7076 7078
    # [45] 7097 7124 7200 7422 7432 8651 8996 9021 11336 23514 26229
    # [56] 27151 55893 117153 201254
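
The lists are keyed by TF, so the reverse question (which TFs target a given gene?) needs a scan over all elements. The following is a minimal sketch. The helper name regulators_of is hypothetical and not part of the package, and tred_toy is a made-up two-TF list in the same shape as TRED. Its STAT3 IDs come from the output above, while TOY_TF is invented for illustration:

```r
# Hypothetical subset in the same shape as TRED (Entrez Gene IDs).
tred_toy <- list(
  STAT3  = c(2L, 332L, 355L, 4609L),  # IDs taken from the STAT3 output above
  TOY_TF = c(332L, 7422L)             # invented entry, not real data
)

# Return the names of all TFs whose target vector contains `gene`.
regulators_of <- function(tf_list, gene) {
  hits <- vapply(tf_list, function(targets) gene %in% targets, logical(1))
  names(tf_list)[hits]
}

print(regulators_of(tred_toy, 332L))   # both STAT3 and TOY_TF list gene 332
print(regulators_of(tred_toy, 4609L))  # only STAT3 lists gene 4609
```

The identifier passed as gene has to match the dataset's identifier type: an Entrez Gene ID for TRED and ENCODE, a gene symbol for the symbol-based datasets.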



ITFP

Citation

Zheng, G., Tu, K., Yang, Q., Xiong, Y., Wei, C., Xie, L., Zhu, Y. & Li, Y.
ITFP: an integrated platform of mammalian transcription factors.
Bioinformatics 24, 2416–2417 (2008).
PubMed

Source

http://itfp.biosino.org/itfp

Description

Predicted human transcription factor targets.

    # Gene symbols used on the ITFP website.
    ITFP[["STAT3"]]
    # [1] "FIGNL1" "NCOR1" "SUV420H1"



ENCODE

Citation

ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the
human genome. Nature 489, 57–74 (2012).
PubMed

Source

http://hgdownload.cse.ucsc.edu/goldenpath/hg19/encodeDCC/wgEncodeRegTfbsClustered

Description

Putative human transcription factor targets based on ChIP-seq data from
the Encyclopedia of DNA Elements (ENCODE) Project.

    # Entrez Gene IDs.
    head(ENCODE[["STAT3"]], 100)
    # [1] 23 31 35 40 81 90 93 98 100 104 105 111 114 118 119 135 147 150 159
    # [20] 160 161 174 178 210 224 238 257 259 267 272 273 286 287 307 313 320 321 323
    # [39] 328 333 351 368 369 378 402 408 412 419 421 432 444 463 467 472 473 482 491
    # [58] 495 529 534 550 571 577 581 586 593 596 597 598 602 622 627 631 636 637 640
    # [77] 651 658 667 669 687 694 695 714 740 752 753 770 773 779 780 781 783 788 800
    # [96] 805 811 814 817 821



Neph2012

Citation

Neph, S., Stergachis, A. B., Reynolds, A., Sandstrom, R., Borenstein, E.
& Stamatoyannopoulos, J. A. Circuitry and dynamics of human transcription
factor regulatory networks. Cell 150, 1274–1286 (2012).
PubMed

Source

http://www.regulatorynetworks.org

Description

Transcription factor targets discovered by DNaseI footprinting and TF
recognition sequences. Targets include only transcription factors and not
other genes.

    # Entrez Gene IDs.
    Neph2012[["AG10803-DS12374"]][["STAT3"]]
    # [1] 466 1386 467 468 22809 22926 11016 1385 9586 1390 10664
    # [12] 1958 1959 1960 1961 2735 2736 2737 148979 2969 8462 9314
    # [23] 4149 4150 4609 4800 4801 4802 2494 5076 5080 5453 5454
    # [34] 6667 6668 6670 6671 6774 7020 7021 7022 29842 7490 7494
    # [45] 51043 7707 10127
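
Because Neph2012 is nested by sample, querying one TF across all samples takes one extra step. The following is a rough sketch using a toy nested list. The sample names are the two that appear in this README, but the values are illustrative and mostly invented, not the real data:

```r
# Toy nested list in the same shape as Neph2012:
# top level = sample, second level = TF, values = target gene IDs.
neph_toy <- list(
  "AG10803-DS12374" = list(STAT3 = c(466L, 1386L, 467L), AHR = 1385L),
  "fBrain-DS11872"  = list(STAT3 = c(466L, 4609L))  # invented values
)

# Pull the STAT3 targets out of every sample.
targets_by_sample <- lapply(neph_toy, function(tfs) tfs[["STAT3"]])

# Drop samples in which the TF is absent (their entry is NULL).
targets_by_sample <- Filter(Negate(is.null), targets_by_sample)

# Union: targets seen in at least one sample.
all_targets <- unique(unlist(targets_by_sample, use.names = FALSE))

# Intersection: targets seen in every sample that has the TF.
shared_targets <- Reduce(intersect, targets_by_sample)

print(all_targets)
print(shared_targets)
```

With the real data, replacing neph_toy with Neph2012 should give the per-sample, pooled and shared STAT3 targets in the same way.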

Raw Data

    bzcat data-raw/Neph2012/human_2013-09-16/AG10803-DS12374/genes.regulate.genes.bz2 | head
    AHR BHLHE41
    AHR CNOT3
    AHR CREB1
    AHR CREB5
    AHR CTCF
    AHR EGR1
    AHR EGR2
    AHR EGR3
    AHR EGR4
    AHR EPAS1



TRRUST

Citation

Han, H., Shim, H., Shin, D., Shim, J. E., Ko, Y., Shin, J., Kim, H., Cho,
A., Kim, E., Lee, T., Kim, H., Kim, K., Yang, S., Bae, D., Yun, A., Kim, S.,
Kim, C. Y., Cho, H. J., Kang, B., Shin, S. & Lee, I. TRRUST: a reference
database of human transcriptional regulatory interactions. Sci. Rep. 5,
11432 (2015).
PubMed

Source

http://www.grnpedia.org/trrust

Description

TRRUST is a manually curated database of human transcriptional regulatory
networks.

The current version of TRRUST contains 8,015 transcriptional regulatory
relationships between 748 human transcription factors (TFs) and 1,975 non-TF
genes, derived from 6,175 PubMed articles that describe small-scale
experimental studies of transcriptional regulation. To search efficiently
for regulatory relationships in over 20 million PubMed articles, the authors
used a sentence-based text mining approach.

The TRRUST database also provides the mode of regulation (activation or
repression). Currently, the mode of regulation is known for 4,861 (60.6%) of
the regulatory relationships.

    head(TRRUST[["STAT3"]], 100)
    # [1] "A2M" "AKAP12" "AKT1" "BCL2" "BCL2" "BCL2L1" "BCL2L1" "BCL6" "BIRC5" "BST2" "CCL11" "CCL20"
    # [13] "CCND1" "CCND1" "CCND2" "CCND3" "CD46" "CDH1" "CDK4" "CDKN1A" "CDKN1B" "CFB" "CFLAR" "CHI3L1"
    # [25] "CISH" "COPS5" "CRP" "CSRP1" "CTGF" "CXCL8" "CYP19A1" "CYR61" "DDIT3" "DNMT1" "EGFR" "ESR2"
    # [37] "ETV6" "F2R" "FAAH" "FAS" "FAS" "FGF1" "FGF2" "FGG" "FGL1" "FLT3" "FOS" "GAST"
    # [49] "GFAP" "HAMP" "HGF" "HIF1A" "HMOX1" "HP" "HSPA4" "HSPB1" "ICAM1" "IFNAR1" "IFNG" "IKBKE"
    # [61] "IL10" "IL11" "IL1RN" "IL2" "IL21" "IL2RA" "IL6" "IL6" "IRF1" "JAK2" "JAK3" "JUNB"
    # [73] "KLF11" "KRT17" "LCAT" "LEP" "LGALS3BP" "LTBP1" "MCL1" "MCL1" "MDC1" "MICA" "MMP1" "MMP14"
    # [85] "MMP2" "MMP2" "MMP3" "MMP7" "MMP7" "MMP9" "MMP9" "MUC1" "MUC4" "MYC" "MYC" "NANOG"
    # [97] "NDUFA13" "NME1" "NOSTRIN" "NOX5"

Raw Data

    zcat data-raw/TRRUST/trrust_rawdata.txt.gz | head | column -t
    AATF BAK1 Unknown 22983126
    AATF BAX Repression 22909821
    AATF BBC3 Unknown 22983126
    AATF CDKN1A Unknown 17157788
    AATF MYC Activation 20549547
    AATF TP53 Unknown 17157788
    ABL1 BAX Activation 11753601
    ABL1 BCL2 Repression 11753601
    ABL1 BCL6 Repression 15509806
    ABL1 CCND2 Activation 15509806
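
The raw rows above carry four fields: TF, target, mode of regulation, and a PubMed ID. One plausible way to turn such a table into the list-of-gene-sets shape used by this package is split(). This is only a sketch and not necessarily what data-raw/make_rdata.R does. The column names are assumptions, and the ten rows are the ones printed above:

```r
# The ten raw TRRUST rows shown above, parsed as whitespace-separated text.
# Column names are assumed; the raw file has no header in the output shown.
raw <- read.table(
  text = "
AATF BAK1 Unknown 22983126
AATF BAX Repression 22909821
AATF BBC3 Unknown 22983126
AATF CDKN1A Unknown 17157788
AATF MYC Activation 20549547
AATF TP53 Unknown 17157788
ABL1 BAX Activation 11753601
ABL1 BCL2 Repression 11753601
ABL1 BCL6 Repression 15509806
ABL1 CCND2 Activation 15509806
",
  col.names = c("tf", "target", "mode", "pmid"),
  stringsAsFactors = FALSE
)

# One vector of targets per TF, like TRRUST[["AATF"]].
targets <- split(raw$target, raw$tf)

# Keep only pairs annotated as activation before splitting.
is_activation <- raw$mode == "Activation"
activated <- split(raw$target[is_activation], raw$tf[is_activation])

print(targets)
print(activated)
```

Because the packaged TRRUST list keeps only the target names, the mode of regulation is available only by going back to the raw table in this way.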



Marbach2016

Citation

Marbach, D., Lamparter, D., Quon, G., Kellis, M., Kutalik, Z. & Bergmann, S.
Tissue-specific regulatory circuits reveal variable modular perturbations
across complex diseases. Nat. Methods 13, 366–370 (2016).
PubMed

Source

http://regulatorycircuits.org

Description

We developed a comprehensive resource of close to 400 cell type- and
tissue-specific gene regulatory networks for human. Our study shows that
disease-associated genetic variants often perturb regulatory modules in cell
types or tissues that are highly specific to that disease.

    head(Marbach2016[["STAT3"]], 100)
    # [1] "SURF1" "ZNF230" "EIF5" "ATG4C" "LYSMD4" "ZWILCH" "TFB1M" "SLC12A7" "DNAL1" "PPP1R8" "SEPT9" "SDCCAG8"
    # [13] "CMTR1" "GSAP" "PPIA" "CLCN6" "ZFP69" "ZFP64" "RNPC3" "BRPF1" "ZKSCAN5" "ZNF410" "ASF1B" "PES1"
    # [25] "TMEM41B" "F2RL1" "DARS" "ZNF24" "RPL4" "SYF2" "AGTPBP1" "NANOS1" "ZNF140" "SEC14L1" "CHAC1" "CDC42SE2"
    # [37] "LIPG" "PROS1" "MIIP" "DENND1A" "ADAMTSL2" "TBC1D22B" "PHACTR4" "TNFAIP2" "SLC35C1" "ZNF284" "NCCRP1" "ZFYVE16"
    # [49] "TBL1XR1" "UNC45A" "TIMM50" "PRRT1" "RNF215" "PAF1" "SPINT1" "RABL2B" "DMWD" "RIN3" "PAK2" "NOTCH4"
    # [61] "INPP5F" "PSMA8" "MX2" "TBC1D7" "CCDC135" "ATP2B4" "HLA-DQA2" "IPO8" "EID2B" "OGDH" "ZFYVE21" "DDB1"
    # [73] "SEC31A" "SURF6" "EXD2" "KIF3A" "RPUSD3" "SYMPK" "ASB13" "CASC5" "RLF" "LIN54" "TNXB" "TRABD"
    # [85] "PHTF2" "COPS4" "FAM32A" "PDLIM4" "CPSF7" "ZNF720" "RBFOX2" "COA4" "ATP10A" "MTMR1" "TNRC6C" "TMED4"
    # [97] "BUD31" "GADD45B" "MTMR3" "CDC42EP4"

Raw Data

Columns:

  1. Transcription factor.
  2. Target gene.
  3. Edge weight.

    zcat data-raw/regulatorycircuits/FANTOM5_individual_networks/394_individual_networks/synoviocyte.txt.gz | head
    RAX PPP2R2A 1.79016453E-3
    MYCN RHOA 1.81311653E-2
    TFAP2 RRM1 7.13096624E-3
    PRDM4 KPNA2 1.61069158E-2
    FOXB1 SCARF2 1.78696733E-3
    ATF4 NDUFA11 1.53625527E-3
    SPIC C9orf69 8.60099271E-4
    FLI1 CENPU 6.72504942E-3
    HNF4A LHFPL2 1.47391413E-2
    STAT3 SURF1 3.14614561E-3
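
The third column is an edge weight, which the packaged Marbach2016 list does not retain. The following is a minimal sketch of reading the three columns and keeping only the stronger edges. The column names and the 0.005 cutoff are arbitrary assumptions for illustration, not values recommended by the source. The ten rows are the ones printed above:

```r
# The ten raw rows shown above: TF, target gene, edge weight.
edges <- read.table(
  text = "
RAX PPP2R2A 1.79016453E-3
MYCN RHOA 1.81311653E-2
TFAP2 RRM1 7.13096624E-3
PRDM4 KPNA2 1.61069158E-2
FOXB1 SCARF2 1.78696733E-3
ATF4 NDUFA11 1.53625527E-3
SPIC C9orf69 8.60099271E-4
FLI1 CENPU 6.72504942E-3
HNF4A LHFPL2 1.47391413E-2
STAT3 SURF1 3.14614561E-3
",
  col.names = c("tf", "target", "weight"),
  stringsAsFactors = FALSE
)

# Arbitrary illustrative cutoff (an assumption, not from the source).
min_weight <- 0.005

# Keep edges at or above the cutoff, strongest first.
strong <- edges[edges$weight >= min_weight, ]
strong <- strong[order(-strong$weight), ]

print(strong)
```

A filter like this has to be applied to the raw network files, since Marbach2016[["STAT3"]] returns target names without weights.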
