We supply importer scripts for various publicly available speech datasets. You can find these scripts in the bin/ directory
of the STT repo. Note that not all of these datasets are free to use. See
bin/import_librivox.py for an example of how to import and preprocess a large, free dataset for training with 🐸STT.
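
To give a rough idea of what an importer produces, here is a minimal sketch. It is not the code from bin/import_librivox.py. It assumes a hypothetical directory of .wav clips, each with a same-named .txt transcript, and it assumes the training CSV uses the columns wav_filename, wav_filesize, and transcript; check the real importers for the exact layout and for the audio conversion steps this sketch leaves out. The function name write_manifest is made up for illustration.

```python
import csv
from pathlib import Path


def write_manifest(data_dir, csv_path):
    """Write one CSV row per .wav file that has a sibling .txt transcript.

    Hypothetical helper: the column names below are an assumption about
    the training CSV layout, not taken from the importer scripts.
    """
    rows = []
    for wav in sorted(Path(data_dir).rglob("*.wav")):
        txt = wav.with_suffix(".txt")
        if not txt.exists():
            continue  # skip clips that have no transcript file
        transcript = txt.read_text(encoding="utf-8").strip().lower()
        if not transcript:
            continue  # skip clips whose transcript is empty
        rows.append((str(wav.resolve()), wav.stat().st_size, transcript))

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        writer.writerows(rows)
    return len(rows)
```

Calling write_manifest("my_corpus", "train.csv") would return the number of clips written. A real importer typically also downloads the corpus, converts the audio to the format the model expects, and splits the data into train, dev, and test sets.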