Project author: EN10

Project description:
Speech to Text
Primary language: Python
Project URL: git://github.com/EN10/Speech-to-Text-WaveNet.git
Created: 2017-03-06T10:39:33Z
Project community: https://github.com/EN10/Speech-to-Text-WaveNet


Speech-to-Text-WaveNet

Based on: https://github.com/buriburisuri/speech-to-text-wavenet
I have included the asset folder with the pre-trained model, which is not included in the original repository.

The pre-trained model is from here:
https://github.com/buriburisuri/speech-to-text-wavenet#pre-trained-models
The model was trained on the CSTR VCTK Corpus:
http://homepages.inf.ed.ac.uk/jyamagis/page3/page58/page58.html

Dependencies
-
The dependencies listed in the original repository are not entirely correct:
https://github.com/buriburisuri/speech-to-text-wavenet#dependencies
The code seems to break with newer versions of tensorflow or sugartensor.

My updated dependencies file: https://github.com/EN10/STT/blob/master/requirements.txt

Working Dependencies
-
Works with:
pandas 0.19.2 (latest at the time of writing)
librosa 0.5.0 (latest at the time of writing)
tqdm 4.11.2 (latest at the time of writing)
tensorflow 0.11.0 only; 1.0.0, 0.12.1 and 0.12.0 do not work.
sugartensor 0.0.1.9 only; versions newer than 0.0.1.9 do not work.
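
The two strict version constraints above can also be checked from Python. This is a minimal sketch and not part of the repository: the `PINNED` table and the `installed_version` and `find_mismatches` helpers are hypothetical names, and only the tensorflow and sugartensor pins are taken from the list above. It tries `importlib.metadata` first and falls back to `pkg_resources` on older interpreters such as Python 2.7.

```python
try:
    # Python 3.8+
    from importlib.metadata import version as _dist_version
    from importlib.metadata import PackageNotFoundError as _NotFound
except ImportError:
    # Older interpreters (e.g. Python 2.7) with setuptools installed
    import pkg_resources
    _NotFound = pkg_resources.DistributionNotFound

    def _dist_version(name):
        return pkg_resources.get_distribution(name).version

# Strict pins from the list above (hypothetical table name).
PINNED = {"tensorflow": "0.11.0", "sugartensor": "0.0.1.9"}


def installed_version(name):
    """Return the installed version string, or None if not installed."""
    try:
        return _dist_version(name)
    except _NotFound:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in sorted(pins.items()):
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in sorted(find_mismatches(PINNED).items()):
        print("%s: need %s, found %s" % (name, wanted, found))
```

An empty result from `find_mismatches(PINNED)` means both pinned packages are at the versions reported to work; a `found` value of `None` means the package is not installed.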

Changing Dependencies
-
To see which versions are installed, use:

    pip freeze
    pip show tensorflow

If a newer version of sugartensor is installed, uninstall it:

    sudo pip uninstall sugartensor

Then install the correct version:

    sudo pip install sugartensor==0.0.1.9

To install the correct version of tensorflow (this wheel is the CPU build for Python 2.7 on 64-bit Linux):

    sudo pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.11.0-cp27-none-linux_x86_64.whl

Run
-

Run recognition on the test file:

    python recognize.py --file test.wav
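
To transcribe several files, the same command can be driven from a short script. This is a sketch under the assumption that `recognize.py` is in the current directory and takes one `--file` argument per invocation, as in the command above; `build_command` and `recognize_all` are hypothetical helper names, not part of the repository.

```python
import glob
import os
import subprocess
import sys


def build_command(wav_path, script="recognize.py"):
    """Build the argv list equivalent to: python recognize.py --file <wav_path>."""
    # sys.executable is the interpreter running this script, so the same
    # environment (and the same pinned packages) is used for recognize.py.
    return [sys.executable, script, "--file", wav_path]


def recognize_all(directory):
    """Run recognize.py once per .wav file in `directory`, in sorted order."""
    for wav_path in sorted(glob.glob(os.path.join(directory, "*.wav"))):
        # check_call raises CalledProcessError if the script exits non-zero.
        subprocess.check_call(build_command(wav_path))
```

Calling `recognize_all(".")` would then process every `.wav` file in the current directory, stopping at the first file for which `recognize.py` fails.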

Other Issues
-

If you get ImportError: No module named <module>, install the missing module with:

    sudo -H pip install <module>

To convert AAC/M4A audio files to WAV, see:
http://superuser.com/questions/23930/how-to-decode-aac-m4a-audio-files-into-wav
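
As one possible way to script the conversion, the sketch below shells out to `ffmpeg`. It assumes `ffmpeg` is installed and on the `PATH`; `ffmpeg_command` and `convert_to_wav` are hypothetical helper names. No sample rate or channel options are set, because this README does not state what `recognize.py` expects.

```python
import os
import subprocess


def ffmpeg_command(src_path, dst_path=None):
    """Build an ffmpeg argv list that decodes src_path into a WAV file."""
    if dst_path is None:
        # Default: same name as the source, with a .wav extension.
        dst_path = os.path.splitext(src_path)[0] + ".wav"
    # -y overwrites an existing output file; -i names the input file.
    # ffmpeg picks the WAV container from the .wav extension.
    return ["ffmpeg", "-y", "-i", src_path, dst_path]


def convert_to_wav(src_path, dst_path=None):
    """Run ffmpeg and return the path of the WAV file it wrote."""
    cmd = ffmpeg_command(src_path, dst_path)
    # check_call raises CalledProcessError if ffmpeg exits non-zero.
    subprocess.check_call(cmd)
    return cmd[-1]
```

For example, `convert_to_wav("clip.m4a")` would write `clip.wav` next to the input, which can then be passed to `recognize.py` with `--file`.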