项目作者: kkrismer

项目描述 :
Irreproducible discovery rate for genomic interaction data
高级语言: R
项目地址: git://github.com/kkrismer/idr2d.git
创建时间: 2019-01-22T15:00:12Z
项目社区:https://github.com/kkrismer/idr2d

开源协议:Other

下载


IDR2D: Irreproducible Discovery Rate for Genomic Interactions

license: MIT DOI BioC platforms Coverage Status

https://idr2d.mit.edu

Chromatin interaction data from protocols such as ChIA-PET and HiChIP provide valuable insights into genome organization and gene regulation, but can include spurious interactions that do not reflect underlying genome biology. We introduce a generalization of the Irreproducible Discovery Rate (IDR) method called IDR2D that identifies replicable interactions shared by experiments. IDR2D provides a principled set of interactions and eliminates artifacts from single experiments.

Installation

The idr2d package is part of Bioconductor since release 3.10. To install it on your system, enter:

  1. if (!requireNamespace("BiocManager", quietly = TRUE)) {
  2. install.packages("BiocManager")
  3. }
  4. BiocManager::install("idr2d")

Alternatively, the development version can be installed directly from this repository:

  1. if (!requireNamespace("remotes", quietly = TRUE)) {
  2. install.packages("remotes")
  3. }
  4. remotes::install_github("kkrismer/idr2d")

R 3.6 (or higher) and Bioconductor 3.10 (or higher) is required in both cases. Additionally, the 64-bit version of Python 3.5 (or higher) and the Python package hic-straw are required for Hi-C analysis from Juicer .hic files.

Usage

There are two vignettes available on Bioconductor, focusing on idr2d and ChIA-PET data and idr2d and ChIP-seq data.

The reference manual might also be helpful if you know what you are looking for.

Example code for ChiP-seq, ChIA-PET and Hi-C experiments

Analyzing results from replicate ChIP-seq experiments
(stored in tab-delimited files chip-seq-rep1.txt and chip-seq-rep2.txt):

  1. library(idr2d)
  2. rep1_df <- read.table("chip-seq-rep1.txt", header = TRUE, sep = "\t",
  3. stringsAsFactors = FALSE)
  4. rep2_df <- read.table("chip-seq-rep2.txt", header = TRUE, sep = "\t",
  5. stringsAsFactors = FALSE)
  6. idr_results <- estimate_idr1d(rep1_df, rep2_df,
  7. value_transformation = "identity")
  8. summary(idr_results)
  9. rep1_idr_df <- idr_results$rep1_df
  10. draw_idr_distribution_histogram(rep1_idr_df)
  11. draw_rank_idr_scatterplot(rep1_idr_df)
  12. draw_value_idr_scatterplot(rep1_idr_df)

Analyzing results from replicate ChIA-PET experiments
(stored in tab-delimited files chia-pet-rep1.txt and chia-pet-rep2.txt):

  1. library(idr2d)
  2. rep1_df <- read.table("chia-pet-rep1.txt", header = TRUE, sep = "\t",
  3. stringsAsFactors = FALSE)
  4. rep2_df <- read.table("chia-pet-rep2.txt", header = TRUE, sep = "\t",
  5. stringsAsFactors = FALSE)
  6. idr_results <- estimate_idr2d(rep1_df, rep2_df,
  7. value_transformation = "identity")
  8. summary(idr_results)
  9. rep1_idr_df <- idr_results$rep1_df
  10. draw_idr_distribution_histogram(rep1_idr_df)
  11. draw_rank_idr_scatterplot(rep1_idr_df)
  12. draw_value_idr_scatterplot(rep1_idr_df)

Analyzing chromosome 1 results in 1 Mbp resolution from replicate Hi-C experiments
(stored in Juicer .hic files hic-rep1.hic and hic-rep2.hic):

  1. library(idr2d)
  2. rep1_df <- parse_juicer_matrix("hic-rep1.hic", resolution = 1e+06, chromosome = "chr1")
  3. rep2_df <- parse_juicer_matrix("hic-rep2.hic", resolution = 1e+06, chromosome = "chr1")
  4. idr_results_df <- estimate_idr2d_hic(rep1_df, rep2_df)
  5. summary(idr_results_df)
  6. draw_idr_distribution_histogram(idr_results_df)
  7. draw_rank_idr_scatterplot(idr_results_df)
  8. draw_value_idr_scatterplot(idr_results_df)
  9. draw_hic_contact_map(idr_results_df, idr_cutoff = 0.05, chromosome = "chr1")

Analyzing chromosome 1 results in 1 Mbp resolution from replicate Hi-C experiments
(stored in ICE normalized HiC-Pro .matrix and .bed files rep1_1000000_iced.matrix, rep1_1000000_abs.bed and rep2_1000000_iced.matrix, rep2_1000000_abs.bed):

  1. library(idr2d)
  2. rep1_df <- parse_hic_pro_matrix("rep1_1000000_iced.matrix", "rep1_1000000_abs.bed", chromosome = "chr1")
  3. rep2_df <- parse_hic_pro_matrix("rep2_1000000_iced.matrix", "rep2_1000000_abs.bed", chromosome = "chr1")
  4. idr_results_df <- estimate_idr2d_hic(rep1_df, rep2_df)
  5. summary(idr_results_df)
  6. draw_idr_distribution_histogram(idr_results_df)
  7. draw_rank_idr_scatterplot(idr_results_df)
  8. draw_value_idr_scatterplot(idr_results_df)
  9. draw_hic_contact_map(idr_results, idr_cutoff = 0.05, chromosome = "chr1")

Build status

Platform Status
Travis CI Travis build status
Bioconductor 3.18 (release) BioC release
Bioconductor 3.19 (devel) BioC devel

Citation

If you use IDR2D in your research, please cite:

IDR2D identifies reproducible genomic interactions
Konstantin Krismer, Yuchun Guo, and David K. Gifford
Nucleic Acids Research, Volume 48, Issue 6, 06 April 2020, Page e31; DOI: https://doi.org/10.1093/nar/gkaa030

Funding

The development of this method was supported by National Institutes of Health (NIH) grants 1R01HG008363 and 1R01NS078097, and the MIT Presidential Fellowship.