Project author: vpc-ccg

Project description:
Sensitive and Fast Alignment Search Tool for Long Read sequencing Data.
Language: C
Project URL: git://github.com/vpc-ccg/lordfast.git
Created: 2017-05-04T01:45:55Z
Project community: https://github.com/vpc-ccg/lordfast

License: GNU General Public License v3.0


lordFAST: sensitive and Fast Alignment Search Tool for LOng noisy Read sequencing Data

lordFAST is a sensitive tool for mapping long reads with high error rates.
lordFAST is designed primarily for aligning reads from PacBio sequencing technology, but it lets the user change alignment parameters to suit the reads and the application.

How to install?

Requirements

  • GCC ≥ 4.4.7
  • zlib

Using conda

lordFAST can be installed with the conda package manager from the bioconda channel:

  $ conda install -c bioconda lordfast

From source code

To build lordFAST, download the latest release from https://github.com/vpc-ccg/lordfast/releases or clone the repository by running the following command:

  $ git clone https://github.com/vpc-ccg/lordfast.git

The code can then be compiled by running make, which builds the lordfast binary.

  $ cd lordfast
  $ make
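
Before running make, it can help to confirm that the installed compiler meets the GCC ≥ 4.4.7 requirement listed above. The following is a minimal sketch for a POSIX shell, not part of lordFAST; the helper name version_ge is hypothetical. It compares dotted version strings component by component.

```shell
# version_ge A B: succeed if dotted version A >= B, comparing each
# component numerically. Missing components count as 0.
# (Hypothetical helper, not shipped with lordFAST.)
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}; y=${b%%.*}          # leading component of each version
        x=${x:-0};  y=${y:-0}
        [ "$x" -gt "$y" ] && return 0
        [ "$x" -lt "$y" ] && return 1
        # drop the leading component and continue with the rest
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# Possible usage (assumes gcc is on PATH; newer GCC releases may print
# only the major version for -dumpversion, which this comparison handles):
#   version_ge "$(gcc -dumpversion)" 4.4.7 && make
```

The check covers GCC only; whether zlib and its headers are installed still has to be verified separately.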

How to run?

SYNOPSIS

  lordfast --index FILE [OPTIONS]
  lordfast --search FILE --seq FILE [OPTIONS]

OPTIONS

Run lordfast -h or man ./HELP.man to see available options.

Indexing options

  -I, --index STR
      Path to the reference genome file in FASTA format to be indexed. [required]

Mapping options

  -S, --search STR
      Path to the reference genome file in FASTA format. [required]
  -s, --seq STR
      Path to the file containing read sequences in FASTA/FASTQ format. [required]
  -o, --out STR
      Write output to STR file rather than standard output. [stdout]
  -t, --threads INT
      Use INT number of CPU cores. Pass 0 to use all the available cores. [1]
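
The mapping options above can be combined to process several read files against one reference. The following is a sketch under stated assumptions, not a script shipped with lordFAST: the function name map_all and the LORDFAST variable are hypothetical, read files are assumed to end in .fastq, and the reference is assumed to have been indexed beforehand with --index.

```shell
# Path to the lordfast binary; override by setting LORDFAST beforehand.
LORDFAST=${LORDFAST:-./lordfast}

# map_all REF DIR [THREADS]: map every DIR/*.fastq file against REF and
# write one SAM file next to each read file (hypothetical helper).
map_all() {
    ref=$1 dir=$2 threads=${3:-1}
    for reads in "$dir"/*.fastq; do
        [ -e "$reads" ] || continue      # glob matched nothing
        out=${reads%.fastq}.sam          # reads.fastq -> reads.sam
        "$LORDFAST" --search "$ref" --seq "$reads" \
                    --threads "$threads" --out "$out"
    done
}

# Possible usage:
#   map_all refgen.fa ./reads 4
```

Using --out rather than shell redirection keeps each output file name inside the function, so the loop body stays a single command.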

Advanced options

  -k, --minAnchorLen INT
      Minimum required length of anchors to be considered. [14]
  -n, --numMap INT
      Perform alignment for at most INT candidates. [10]
  -l, --minReadLen INT
      Do not try to map any read shorter than INT bp and report them as unmapped. [1000]
  -c, --anchorCount INT
      Consider INT anchoring positions on the long read. [1000]
  -m, --maxRefHit INT
      Ignore anchoring positions with more than INT reference hits. [1000]
  -R, --readGroup STR
      SAM read group line in a format like '@RG\tID:foo\tSM:bar'. []
  -a, --chainAlg STR
      Chaining algorithm to use. Options are "dp-n2" and "clasp". [dp-n2]
  --noSamHeader
      Do not print the SAM header in the output.
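
The read group example for -R is written in single quotes, so the shell passes the two-character sequence \t through unchanged. The sketch below builds such a string from an ID and a sample name. It assumes that lordFAST expects the literal \t form shown in the help text (as BWA's -R option does); the function name rg_line is hypothetical.

```shell
# rg_line ID SAMPLE: print a read group string with literal "\t"
# separators, matching the format shown for -R (hypothetical helper).
rg_line() {
    printf '@RG\\tID:%s\\tSM:%s\n' "$1" "$2"
}

# Possible usage:
#   ./lordfast --search refgen.fa --seq reads.fastq \
#              --readGroup "$(rg_line foo bar)" > map.sam
```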

Other options

  -h, --help
      Print this help message.
  -v, --version
      Print the version of the software.

EXAMPLES

Indexing reference genome:

  $ ./lordfast --index refgen.fa

Mapping to the reference genome:

  $ ./lordfast --search refgen.fa --seq reads.fastq > map.sam
  $ ./lordfast --search refgen.fa --seq reads.fastq --threads 4 > map.sam
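
A quick sanity check on the resulting SAM file is to count mapped and unmapped records. The following is a minimal sketch that relies only on the SAM format itself (header lines start with @, and bit 0x4 of the FLAG field marks an unmapped record); the function name sam_summary is hypothetical. It counts alignment records, not reads, so a read reported on several lines is counted once per line.

```shell
# sam_summary: read SAM on stdin and print how many alignment records
# are mapped and unmapped (hypothetical helper).
sam_summary() {
    awk -F '\t' '
        /^@/ { next }                          # skip SAM header lines
        {
            # FLAG is column 2; bit 0x4 set means the record is unmapped
            if (int($2 / 4) % 2 == 1) unmapped++
            else mapped++
        }
        END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
    '
}

# Possible usage:
#   sam_summary < map.sam
```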

Publication

Haghshenas E., Sahinalp S.C. and Hach F., “lordFAST: sensitive and Fast Alignment Search Tool for LOng noisy Read sequencing Data” Bioinformatics (2018) DOI: 10.1093/bioinformatics/bty544

Bugs

Please report bugs through lordFAST’s issues page at https://github.com/vpc-ccg/lordfast/issues.

Contact

Ehsan Haghshenas (ehaghshe AT sfu DOT ca)

This software is released under the GNU General Public License (v3.0)
Copyright (c) 2018 Simon Fraser University

  • BWA (used for the BWT-based index) is developed by Heng Li and is licensed under GPL
  • Edlib (used for global alignment) is developed by Martin Sosic and is licensed under MIT
  • ksw (used for alignment extension) is licensed under MIT
  • clasp (can be used for chaining) is developed and copyrighted by Christian Otto