Noisy speech database for training speech enhancement algorithms and TTS models
Data Creator: Valentini-Botinhao, Cassia
Publisher: University of Edinburgh. School of Informatics. Centre for Speech Technology Research (CSTR)
Relation (Is Referenced By): http://www.research.ed.ac.uk/portal/en/publications/speech-enhancement-for-a-noiserobust-texttospeech-synthesis-system-using-deep-recurrent-neural-networks(08deb6fd-79c0-490f-ae46-f37034b6bfb4).html
Citation: Valentini-Botinhao, Cassia. (2017). Noisy speech database for training speech enhancement algorithms and TTS models, 2016 [sound]. University of Edinburgh. School of Informatics. Centre for Speech Technology Research (CSTR). http://dx.doi.org/10.7488/ds/2117.
Description: Parallel clean and noisy speech database, designed to train and test speech enhancement methods that operate at 48 kHz. A more detailed description can be found in the papers associated with the database. For the 28-speaker dataset, see: C. Valentini-Botinhao, X. Wang, S. Takaki & J. Yamagishi, "Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System using Deep Recurrent Neural Networks", In Proc. Interspeech 2016. For the 56-speaker dataset, see: C. Valentini-Botinhao, X. Wang, S. Takaki & J. Yamagishi, "Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech", In Proc. SSW 2016. Some of the noises used to create the noisy speech were obtained from the DEMAND database, available here: http://parole.loria.fr/DEMAND/. The clean speech was obtained from the CSTR VCTK Corpus, available here: http://dx.doi.org/10.7488/ds/1994. The speech-shaped and babble noise files that were used to create this dataset are available here: http://homepages.inf.ed.ac.uk/cvbotinh/se/noises/.
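Because the database is parallel, each noisy recording has a clean counterpart. The following is a minimal Python sketch of how such pairs might be collected for training. It assumes, purely for illustration, that the clean and noisy files sit in two separate directories and share the same file name; the function name `pair_parallel_files` and this directory layout are hypothetical and are not documented in this record.

```python
from pathlib import Path


def pair_parallel_files(clean_dir, noisy_dir, suffix=".wav"):
    """Return sorted (clean_path, noisy_path) pairs that share a file name.

    Assumption (not stated in the record): clean and noisy versions of an
    utterance are stored under the same file name in two directories.
    """
    clean_dir, noisy_dir = Path(clean_dir), Path(noisy_dir)
    # Index the noisy files by name for constant-time lookup.
    noisy_by_name = {p.name: p for p in noisy_dir.glob(f"*{suffix}")}
    pairs = []
    for clean_path in sorted(clean_dir.glob(f"*{suffix}")):
        noisy_path = noisy_by_name.get(clean_path.name)
        if noisy_path is not None:  # skip clean files with no noisy match
            pairs.append((clean_path, noisy_path))
    return pairs
```

Files without a counterpart are skipped rather than raising an error, which is one possible design choice; a stricter loader could instead report the mismatch.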
Showing items related by title, author, creator and subject.
Mayo, Catherine (LISTA Consortium: (i) Language and Speech Laboratory, Universidad del Pais Vasco, Spain and Ikerbasque, Spain; (ii) Centre for Speech Technology Research, University of Edinburgh, UK; (iii) KTH Royal Institute of Technology, Sweden; (iv) Institute of Computer Science, FORTH, Greece, 2013-09-24) Single male native British English talker recorded producing 25 TIMIT sentences in 5 conditions, two natural: (i) quiet, (ii) while the talker listened to high-intensity speech-shaped noise, and three acted: (i) as if to ...
Listening test materials for "Evaluating comprehension of natural and synthetic conversational speech". Wester, Mirjam; Watts, Oliver; Henter, Gustav Eje. Current speech synthesis methods typically operate on isolated sentences and lack convincing prosody when generating longer segments of speech. Similarly, prevailing TTS evaluation paradigms, such as intelligibility ...
Sarfjoo, Seyyed Saeed; Yamagishi, Junichi. ## This item has been replaced by the one which can be found at http://dx.doi.org/10.7488/ds/2316 ## This dataset is a new variant of the voice cloning toolkit (VCTK) dataset: device-recorded VCTK (DR-VCTK), where the ...