Lightweight PyTorch implementation of QuartzNet (https://arxiv.org/pdf/1910.10261.pdf).
Clone the repositorygit clone https://github.com/ivankunyankin/quartznet-asr.git
cd quartznet-asr
Create an environment
and install the dependenciespython3 -m venv env
source env/bin/activate
pip3 install -r requirements.txt
This guide shows training QuartzNet5x5 model using a part of LibriTTS dataset.
If you encounter an error, give the execute permission to your script and rerun the above command:
assets/data.sh
chmod +x assets/data.sh
The script will download and unzip the following subsets of LibriTTS: train-clean-360 (for training), dev-clean (for validation), test-clean (for testing)
Warning. This subset of the dataset requires around 30 Gb of storage space.
python3 train.py
Add --from_checkpoint
flag to continue training from a checkpoint.
python3 test.py
The code will test the trained model on test-clean subset of LibriTTS.
It will print the resulting WER (word error rate) and CTC loss values as well as save intermediate logs in the logs directory
tensorboard --logdir logs
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change
I found inspiration for TextTransform class and Greedy decoder in this post.