Project author: Zsailer

Project description:
Pandas DataFrames for phylogenetics
Language: Python
Repository: git://github.com/Zsailer/phylopandas.git
Created: 2017-10-24T19:38:59Z
Project community: https://github.com/Zsailer/phylopandas

License: BSD 3-Clause "New" or "Revised" License

Bringing the Pandas DataFrame to phylogenetics.

PhyloPandas provides a Pandas-like interface for reading sequence and phylogenetic tree data into pandas DataFrames. This enables easy manipulation of phylogenetic data using familiar Python/Pandas functions. Finally, phylogenetics for humans!

How does it work?

Don’t worry, we didn’t reinvent the wheel. PhyloPandas is simply a DataFrame interface (great for human-accessible data storage) on top of Biopython (great for parsing/writing sequence data) and DendroPy (great for reading tree data).
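
To illustrate the layering described above, here is a minimal sketch of parsing FASTA text with Biopython and collecting the records into a DataFrame. It is not PhyloPandas' actual implementation; the column names `id`, `description`, and `sequence` and the sample data are assumptions chosen for this example.

```python
from io import StringIO

import pandas as pd
from Bio import SeqIO

# Sample FASTA text (made up for this sketch).
fasta_text = ">seq1 first record\nACGT\n>seq2 second record\nTTGA\n"

# Biopython does the parsing: each item is a SeqRecord.
records = SeqIO.parse(StringIO(fasta_text), "fasta")

# Flatten each record into a plain dict; the column names are assumptions.
rows = [
    {"id": rec.id, "description": rec.description, "sequence": str(rec.seq)}
    for rec in records
]

# The DataFrame is only the storage layer on top of the parsed records.
df = pd.DataFrame(rows)
print(df)
```

Once the records are in a DataFrame, the usual Pandas tools (filtering, sorting, merging) apply without any format-specific code.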

PhyloPandas does two things:

  1. It offers new read functions to read sequence/tree data directly into a DataFrame.
  2. It attaches a new phylo accessor to the Pandas DataFrame. This accessor provides writing methods for sequence/tree data (powered by Biopython and DendroPy).
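
The accessor idea in the second point can be sketched with pandas' own `register_dataframe_accessor` API. The accessor name `fasta_demo`, the method `to_fasta_string`, and the column names `id` and `sequence` are hypothetical and exist only for this example; PhyloPandas' real `phylo` accessor delegates writing to Biopython and DendroPy rather than formatting strings by hand.

```python
import pandas as pd


# Register a hypothetical accessor; it becomes available as df.fasta_demo.
@pd.api.extensions.register_dataframe_accessor("fasta_demo")
class FastaDemoAccessor:
    """Illustrative accessor only, not part of PhyloPandas."""

    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is attached to.
        self._obj = pandas_obj

    def to_fasta_string(self, id_col="id", sequence_col="sequence"):
        # Format each row as a two-line FASTA record.
        records = [
            f">{row[id_col]}\n{row[sequence_col]}"
            for _, row in self._obj.iterrows()
        ]
        return "\n".join(records) + "\n"


df = pd.DataFrame({"id": ["seq1", "seq2"], "sequence": ["ACGT", "TTGA"]})
fasta_text = df.fasta_demo.to_fasta_string()
print(fasta_text)
```

Registering an accessor keeps the new methods in their own namespace, so they cannot collide with existing DataFrame methods.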

Basic Usage

Sequence data:

Read in a sequence file.

    import phylopandas as ph
    df1 = ph.read_fasta('sequences.fasta')
    df2 = ph.read_phylip('sequences.phy')

Write to various sequence file formats.

    df1.phylo.to_clustal('sequences.clustal')

Convert between formats.

    # Read a format.
    df = ph.read_fasta('sequences.fasta')
    # Write to a different format.
    df.phylo.to_phylip('sequences.phy')
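
Because the object between the read and write steps is an ordinary DataFrame, rows can be filtered or summarized before writing. The sketch below uses a hand-built DataFrame as a stand-in for the result of `ph.read_fasta`; the column names `id` and `sequence`, the sample data, and the length threshold are assumptions for illustration.

```python
import pandas as pd

# Stand-in for a DataFrame returned by a read function; the column
# names 'id' and 'sequence' are assumptions made for this sketch.
df = pd.DataFrame({
    "id": ["seq1", "seq2", "seq3"],
    "sequence": ["ACGTACGT", "ACG", "ACGTAC--"],
})

# Keep only sequences with at least 6 characters.
long_enough = df[df["sequence"].str.len() >= 6]

# Count gap characters per remaining sequence.
gap_counts = long_enough["sequence"].str.count("-")

print(long_enough["id"].tolist())
print(gap_counts.tolist())
```

The filtered DataFrame could then be passed to one of the `phylo` writing methods in the same way as the unfiltered one.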

Tree data:

Read Newick tree data.

    df = ph.read_newick('tree.newick')

Visualize the phylogenetic data (powered by phylovega).

    df.phylo.display(
        height=500,
    )

Contributing

If you have ideas for the project, please share them on the project’s Gitter chat.

It’s easy to create new read/write functions and methods for PhyloPandas. If you
have a format you’d like to add, please submit PRs! There are many more formats
in Biopython that I haven’t had the time to add myself, so please don’t be afraid
to add them! I thank you ahead of time!

Testing

PhyloPandas includes a small pytest suite. Run the tests from the base directory.

    $ cd phylopandas
    $ pytest

Install

Install from PyPI:

    pip install phylopandas

Install from source:

    git clone https://github.com/Zsailer/phylopandas
    cd phylopandas
    pip install -e .

Dependencies

  • Biopython: Library for managing and manipulating biological data.
  • DendroPy: Library for phylogenetic scripting, simulation, data processing and manipulation.
  • Pandas: Flexible and powerful data analysis / manipulation library for Python.
  • pandas_flavor: Flavor pandas objects with new accessors using pandas’ new register API (with backwards compatibility).