Author: amaurypm

Description:
Extract chains' and polypeptides' sequences from PDB or mmCIF files.
Language: Python
Repository: git://github.com/amaurypm/struct2seq.git
Created: 2018-03-21T15:58:30Z
Project page: https://github.com/amaurypm/struct2seq

License: GNU General Public License v3.0


struct2seq

Extract chains’ and polypeptides’ sequences from PDB or mmCIF files.

Only residues with structural information are saved. Residues missing from the structure leave gaps, so a single chain may contain more than one polypeptide.
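The effect of missing residues can be illustrated with a minimal sketch. It assumes, purely for illustration, that a break is detected wherever the residue numbering is not consecutive; the script itself may detect breaks differently (for example through Biopython's polypeptide builders). The function name `split_into_peptides` and the sample data are hypothetical.

```python
def split_into_peptides(residues):
    """Split (residue_number, one_letter_code) pairs into contiguous peptides.

    Assumption for this sketch: a gap in the numbering marks residues
    that are missing from the structure.
    """
    peptides = []
    current = []
    previous_number = None
    for number, letter in residues:
        # Start a new peptide whenever the numbering is not consecutive.
        if previous_number is not None and number != previous_number + 1:
            peptides.append("".join(current))
            current = []
        current.append(letter)
        previous_number = number
    if current:
        peptides.append("".join(current))
    return peptides


# Hypothetical chain in which residues 4 to 6 have no coordinates.
observed = [(1, "M"), (2, "K"), (3, "T"), (7, "A"), (8, "G")]
peptides = split_into_peptides(observed)
print(peptides)  # ['MKT', 'AG']
```

Here one chain yields two polypeptides because of the gap between residues 3 and 7.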

Two files are generated per input structure: one with each individual polypeptide sequence, and another with whole-chain sequences, in which the peptide sequences within each chain are concatenated.

Output files are in FASTA format, with an ID composed of . for chain sequences and .. for peptides.
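The relationship between the two outputs can be sketched as follows. This is not the script's own code: the helper `fasta_record`, the 60-character line width, and the record IDs (`peptide_1`, `chain`) are placeholders chosen for illustration and do not reproduce the ID scheme the tool writes.

```python
def fasta_record(record_id, sequence, width=60):
    """Format one FASTA record, wrapping the sequence at `width` characters."""
    body = "\n".join(
        sequence[i:i + width] for i in range(0, len(sequence), width)
    )
    return f">{record_id}\n{body}\n"


# Hypothetical peptides found in a single chain.
peptides = ["MKT", "AG"]

# Peptide output: one record per polypeptide.
peptide_fasta = "".join(
    fasta_record(f"peptide_{n}", seq) for n, seq in enumerate(peptides, start=1)
)

# Chain output: the peptides of the chain concatenated into one record.
chain_sequence = "".join(peptides)
chain_fasta = fasta_record("chain", chain_sequence)

print(peptide_fasta, end="")
print(chain_fasta, end="")
```

Note that the concatenated chain sequence carries no marker for the gap between peptides, so it represents only the residues present in the structure.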

The chain sequence files are particularly useful when you need to align target sequences against structural templates for molecular similarity modeling (the reason I wrote this program in the first place).

Usage

struct2seq [-h] [-v] structfile [structfile ...]
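For readers curious how a command line of this shape is typically declared, here is a minimal argparse sketch. It is an assumption that `-v` prints the version (it could equally mean something else in the actual script), and the description, help text and version string are placeholders.

```python
import argparse


def build_parser():
    """Build a parser matching: struct2seq [-h] [-v] structfile [structfile ...]"""
    parser = argparse.ArgumentParser(
        prog="struct2seq",
        description="Extract chain and polypeptide sequences from PDB or mmCIF files.",
    )
    # Assumption: -v reports the version; the version string is a placeholder.
    parser.add_argument(
        "-v", "--version", action="version", version="%(prog)s (placeholder version)"
    )
    # nargs="+" requires at least one structure file and accepts several.
    parser.add_argument(
        "structfile", nargs="+", help="input structure file (PDB or mmCIF)"
    )
    return parser


args = build_parser().parse_args(["5zcs.cif", "3WKV.pdb"])
print(args.structfile)  # ['5zcs.cif', '3WKV.pdb']
```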

Installation

This is a Python script, so you can run the struct2seq script directly or place a symbolic link to it in any directory on your PATH (e.g. /usr/local/bin). The second option is recommended.

Dependencies

  • Python3
  • Biopython
  • argparse (part of the standard library since Python 3.2)

Examples

struct2seq 3WKV.pdb

struct2seq 5zcs.cif 3WKV.pdb 5zd5.cif