Project author: r9y9

Project description:
PyTorch implementation of Tacotron speech synthesis model.
Primary language: Jupyter Notebook
Repository URL: git://github.com/r9y9/tacotron_pytorch.git
Created: 2017-09-15T11:42:38Z
Project page: https://github.com/r9y9/tacotron_pytorch

License: Other


tacotron_pytorch

PyTorch implementation of Tacotron speech synthesis model.

Inspired by keithito/tacotron. The speech quality is currently not as good as what keithito/tacotron can generate, but the model appears to be basically working. Generated speech examples from a model trained on the LJ Speech Dataset can be found here.

If you are comfortable working with TensorFlow, I recommend trying
https://github.com/keithito/tacotron instead. I rewrote it in PyTorch because, at least for me, it is easier to debug and extend (multi-speaker architecture, etc.).

Requirements

  • PyTorch
  • TensorFlow (needed only to run the training script; this dependency could be made optional, but it is required for now)

Installation

  1. git clone --recursive https://github.com/r9y9/tacotron_pytorch
  2. pip install -e . # or python setup.py develop

If you want to run the training script, then you need to install additional dependencies.

  1. pip install -e ".[train]"
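
To confirm which of the requirements above are importable before running anything, a small check along these lines may help. It is a sketch, not part of the project: the helper name `check_modules` is hypothetical, and it assumes the standard import names `torch` and `tensorflow`.

```python
import importlib.util


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # `torch` is always needed; `tensorflow` only for the training script.
    for name, found in check_modules(["torch", "tensorflow"]).items():
        print("{}: {}".format(name, "found" if found else "missing"))
```

The check only looks the modules up; it does not import them, so it stays fast even when TensorFlow is installed.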

Training

The package relies on keithito/tacotron (added as a submodule) for text processing, audio preprocessing and audio reconstruction. Please follow the quick start section at https://github.com/keithito/tacotron and prepare your dataset accordingly.

Once your data is prepared, and assuming it is in "~/tacotron/training" (the default location), you can train your model with:

  1. python train.py
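
Before starting a long run, it can be worth checking that the prepared data is where the training script expects it. The sketch below assumes the layout that the keithito/tacotron preprocessing step is expected to produce: a `train.txt` metadata file in the data directory, with `|`-separated fields of the form `spec_file|mel_file|n_frames|text`. Both the file layout and the helper name `summarize_dataset` are assumptions for illustration; adjust them if your preprocessing output differs.

```python
import os


def summarize_dataset(data_root="~/tacotron/training", metadata_name="train.txt"):
    """Count the utterances and frames listed in the metadata file.

    Assumes each non-empty line looks like: spec_file|mel_file|n_frames|text
    """
    path = os.path.join(os.path.expanduser(data_root), metadata_name)
    if not os.path.isfile(path):
        raise FileNotFoundError("metadata file not found: {}".format(path))

    n_utterances = 0
    total_frames = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            fields = line.split("|")
            total_frames += int(fields[2])  # third field: number of frames (assumed)
            n_utterances += 1
    return n_utterances, total_frames
```

Calling `summarize_dataset()` with no arguments checks the default location; a `FileNotFoundError` suggests that preprocessing has not been run or that the data lives elsewhere.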

The alignment, predicted spectrogram, target spectrogram, predicted waveform and checkpoint (model and optimizer states) are saved every 1000 global steps in the checkpoints directory. Training progress can be monitored with:

  1. tensorboard --logdir=log
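
The periodic saving described above can be sketched as follows. This is not the project's training loop: `maybe_save_checkpoint`, the `checkpoint_step{N}.pth` file name, and the `save_fn` parameter are hypothetical. With PyTorch, `save_fn` would typically be `torch.save` and `state` a dict holding `model.state_dict()` and `optimizer.state_dict()`; the `pickle`-based stand-in here only keeps the example free of dependencies.

```python
import os
import pickle


def pickle_save(obj, path):
    """Stand-in for torch.save so the sketch runs without PyTorch."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def maybe_save_checkpoint(state, global_step, checkpoint_dir="checkpoints",
                          interval=1000, save_fn=pickle_save):
    """Save `state` when `global_step` is a positive multiple of `interval`.

    Returns the checkpoint path if a file was written, otherwise None.
    """
    if global_step <= 0 or global_step % interval != 0:
        return None
    os.makedirs(checkpoint_dir, exist_ok=True)
    path = os.path.join(checkpoint_dir, "checkpoint_step{}.pth".format(global_step))
    save_fn(state, path)
    return path
```

Keeping the optimizer state alongside the model weights is what allows training to resume from a checkpoint rather than only running inference from it.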

Testing model

Open the notebook in the notebooks directory and change checkpoint_path to point to your model.