Project author: kkdey

Project description:
R package for Enrichment Depletion Logos (EDLogos) and String Logos
Primary language: R
Project URL: git://github.com/kkdey/Logolas.git
Created: 2016-10-07T04:33:22Z
Project community: https://github.com/kkdey/Logolas

License: GNU General Public License v3.0



Logolas

Logolas is an R package for Enrichment Depletion Logo (EDLogo) plots with
string symbols. EDLogo plots highlight both enrichment and depletion of symbols,
whereas standard logo plots, such as those from the seqLogo package,
are biased towards highlighting enrichments. Logolas also generalizes logo
plots to use both characters and strings.

If you find a bug, please create an
issue.

This code has been tested in …


License

Copyright (c) 2018-2019, Kushal Dey.

All source code and software in this repository are made available
under the terms of the GNU General Public License. See the
LICENSE file for the full text of the license.

Citing this work

If you find this R package useful for your work, please cite
our paper, published in BMC Bioinformatics:

Dey, K.K., Xie, D. and Stephens, M., 2018. A new sequence logo plot
to highlight enrichment and depletion. BMC Bioinformatics. 19:473
https://doi.org/10.1186/s12859-018-2489-3.

Quick Start

The most recent version of Logolas is available from GitHub and can be installed with the devtools R package.
First, install the following Bioconductor packages:

  source("https://bioconductor.org/biocLite.R")
  biocLite(c("Biostrings", "BiocStyle", "Biobase", "seqLogo", "ggseqlogo"))
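
On recent R and Bioconductor releases, `biocLite()` has been superseded by the BiocManager package, so the commands above may fail there. The following is a minimal sketch of an equivalent step, not part of the original instructions. The `interactive()` guard is an assumption of this sketch, added so that nothing is installed when the code runs inside a script.

```r
# Packages listed in the Quick Start above
bioc_pkgs <- c("Biostrings", "BiocStyle", "Biobase", "seqLogo", "ggseqlogo")

# Work out which of them are not installed yet
is_installed <- vapply(bioc_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- bioc_pkgs[!is_installed]

if (length(missing_pkgs) > 0 && interactive()) {
  # BiocManager is the current Bioconductor installer
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(missing_pkgs)
} else if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}
```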

Then install Logolas as follows:

  library(devtools)
  install_github("kkdey/Logolas", build_vignettes = TRUE)

Once you have installed the package, load the package in R by entering

  library(Logolas)

To get an overview of the package, enter

  help(package = "Logolas")

Next, try creating a few plots using the logomaker function:

Create a standard logo plot in Logolas, analogous to those produced by the seqLogo and
ggseqlogo R packages.

  sequence <- c("CTATTGT", "CTCTTAT", "CTATTAA", "CTATTTA", "CTATTAT", "CTTGAAT",
                "CTTAGAT", "CTATTAA", "CTATTTA", "CTATTAT", "CTTTTAT", "CTATAGT",
                "CTATTTT", "CTTATAT", "CTATATT", "CTCATTT", "CTTATTT", "CAATAGT",
                "CATTTGA", "CTCTTAT", "CTATTAT", "CTTTTAT", "CTATAAT", "CTTAGGT",
                "CTATTGT", "CTCATGT", "CTATAGT", "CTCGTTA", "CTAGAAT", "CAATGGT")
  logomaker(sequence, type = "Logo")
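
To see what a logo of these sequences summarises, it can help to tabulate how often each base occurs at each position. The following base-R sketch is an illustration only and does not use Logolas; the names `chars`, `counts` and `pfm` are made up for this example.

```r
# Sequences copied from the Quick Start example
sequence <- c("CTATTGT", "CTCTTAT", "CTATTAA", "CTATTTA", "CTATTAT", "CTTGAAT",
              "CTTAGAT", "CTATTAA", "CTATTTA", "CTATTAT", "CTTTTAT", "CTATAGT",
              "CTATTTT", "CTTATAT", "CTATATT", "CTCATTT", "CTTATTT", "CAATAGT",
              "CATTTGA", "CTCTTAT", "CTATTAT", "CTTTTAT", "CTATAAT", "CTTAGGT",
              "CTATTGT", "CTCATGT", "CTATAGT", "CTCGTTA", "CTAGAAT", "CAATGGT")

# One row per sequence, one column per position
chars <- do.call(rbind, strsplit(sequence, ""))

# Count each base at each position, with a fixed row order A, C, G, T
bases <- c("A", "C", "G", "T")
counts <- vapply(seq_len(ncol(chars)), function(j) {
  as.integer(table(factor(chars[, j], levels = bases)))
}, integer(length(bases)))
dimnames(counts) <- list(bases, seq_len(ncol(chars)))

# Convert counts to per-position frequencies (each column sums to 1)
pfm <- sweep(counts, 2, colSums(counts), "/")
round(pfm, 2)
```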


The corresponding EDLogo plot highlights the depletion of T in the middle, which is
not visually clear in the standard logo plot.

  logomaker(sequence, type = "EDLogo")
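
Why a depletion can be hard to see in a standard logo can be illustrated with a single position. The sketch below is a simplification and not the Logolas implementation: it assumes hypothetical frequencies `p`, a uniform background `q`, and an enrichment/depletion score defined as the log-ratio against background, centred at its median.

```r
# Hypothetical frequencies at one position: T is strongly depleted,
# and no base is clearly enriched
p <- c(A = 0.33, C = 0.33, G = 0.33, T = 0.01)

# Uniform background over the four bases (an assumption of this sketch)
q <- rep(0.25, 4)

# Standard logo: symbol height = frequency * information content (bits)
ic <- log2(length(p)) + sum(p * log2(p))
standard_heights <- p * ic

# Enrichment/depletion-style score: log-ratio against the background,
# centred at the median so that typical symbols sit near zero
log_ratio <- log2(p / q)
ed_scores <- log_ratio - median(log_ratio)

round(standard_heights, 3)
round(ed_scores, 3)
```

In the standard heights every symbol is small, so the position looks uninformative. In the centred scores A, C and G sit at zero while T receives a large negative value, which is the kind of signal an EDLogo plot draws below the axis.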


EDLogo can also be applied to amino acid motifs, whose symbols go beyond the letters A, C, G and T
used in DNA motifs.

We create an EDLogo plot of the amino acid sequences at N-glycosylation sites, with a user-specified
background bg chosen to be the median positional weight of each amino acid in the context around the
glycosylation site [data from UniProtKB].

  data("N_Glycosyl_sequences")
  bg <- apply(N_Glycosyl_sequences, 1, function(x) return(median(x)))
  bg <- bg/sum(bg)
  logomaker(N_Glycosyl_sequences, type = "EDLogo", bg = bg)
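
The background computation above takes, for each symbol (row), the median weight across positions (columns) and then rescales the result to sum to 1. A minimal sketch on a hypothetical toy matrix, invented for this example and much smaller than the real data, shows the two steps:

```r
# Hypothetical weights: rows are symbols, columns are positions
toy <- matrix(c(0.1, 0.8, 0.1, 0.2,
                0.3, 0.1, 0.5, 0.4,
                0.6, 0.1, 0.4, 0.4),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("N", "S", "P"), c("1", "2", "3", "4")))

# Median weight of each symbol across positions (apply over rows)
bg <- apply(toy, 1, median)

# The medians need not sum to 1, so rescale them into a probability vector
bg <- bg / sum(bg)
bg
```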


The EDLogo plot highlights the Asn (N)-X-Ser (S)/Thr (T)-X motif at the center, where the X positions are depleted for the amino acid Pro (P).

Logolas allows the symbols in the logo plot to be a combination of strings and characters, or to be purely strings. Examples of both are shown below.

EDLogo plot for a mutation signature (mismatch type at the center with flanking bases; data from Shiraishi et al. 2015):

  data(mutation_sig)
  logomaker(mutation_sig, type = "EDLogo", color_type = "per_symbol", color_seed = 2000)
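
To make the idea of mixing string and character symbols concrete, the sketch below builds a small weight table with symbols as row names and positions as column names. All values and labels are hypothetical and invented for this illustration; the bundled `mutation_sig` data may be organised differently.

```r
# Hypothetical symbols: strings for the central mismatch, characters for flanks
symbols <- c("C>T", "C>A", "A", "C", "G", "T")
positions <- c("left", "mismatch", "right")

# Start from an all-zero table with symbols as rows and positions as columns
mat <- matrix(0, nrow = length(symbols), ncol = length(positions),
              dimnames = list(symbols, positions))

# Flanking positions carry single-character base symbols
mat[c("A", "C", "G", "T"), "left"] <- c(0.4, 0.2, 0.1, 0.3)
mat[c("A", "C", "G", "T"), "right"] <- c(0.25, 0.25, 0.4, 0.1)

# The central position carries string symbols for the mismatch type
mat[c("C>T", "C>A"), "mismatch"] <- c(0.8, 0.2)

mat
```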


EDLogo plot for the enrichment and depletion of histone marks in different parts of the genome (data from Koch et al. 2007):

  data(histone_marks)
  logomaker(histone_marks$mat, bg = histone_marks$bgmat, type = "EDLogo")


Finally, please walk through some more detailed examples in the
vignette:

  vignette("Logolas")

Developer notes

This was the R command used to generate the vignette PDF file from the
R Markdown source:

  # render() comes from the rmarkdown package
  rmarkdown::render("Logolas.Rmd", output_format = "pdf_document")

Credits

This software was developed by Kushal Dey, Dongyue Xie and
Matthew Stephens at the University of Chicago. For any questions
or comments, please contact Kushal Dey at kkdey@uchicago.edu.

The authors would like to acknowledge Oliver Bembom, the author of the
seqLogo package, which served as an inspiration and starting point for this
software. The authors also thank Peter Carbonetto, Edward Wallace and John Blischak
for helpful discussions and feedback.