Reverberant speech database for training speech dereverberation algorithms and TTS models
Data Creator: Valentini-Botinhao, Cassia
Publisher: University of Edinburgh
Citation: Valentini-Botinhao, Cassia (2016). Reverberant speech database for training speech dereverberation algorithms and TTS models [dataset]. University of Edinburgh. http://dx.doi.org/10.7488/ds/1425.
Description: Reverberant speech database, designed to train and test speech dereverberation methods that operate at 48 kHz. Clean speech was made reverberant by convolving it with a room impulse response. The room impulse responses used to create this dataset were selected from:
- The ACE Challenge (http://www.commsp.ee.ic.ac.uk/~sap/projects/ace-challenge/);
- The MIRD database (http://www.iks.rwth-aachen.de/en/research/tools-downloads/multichannel-impulse-response-database/);
- The MARDY database (http://www.commsp.ee.ic.ac.uk/~sap/resources/mardy-multichannel-acoustic-reverberation-database-at-york-database/).
The underlying clean speech data can be found at: http://dx.doi.org/10.7488/ds/2117.
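The convolution step described above can be sketched in a few lines. This is a minimal illustration, assuming NumPy and SciPy are available; the synthetic signals and the peak-rescaling step are assumptions for the example, not part of the dataset's documented pipeline:

```python
import numpy as np
from scipy.signal import fftconvolve

def reverberate(clean, rir):
    # Convolve clean speech with a room impulse response ("full" mode
    # keeps the reverberant tail), then rescale the result back to the
    # clean signal's peak level so levels stay comparable.
    wet = fftconvolve(clean, rir, mode="full")
    peak = np.max(np.abs(wet))
    if peak > 0:
        wet = wet / peak * np.max(np.abs(clean))
    return wet

# Toy example at 48 kHz: 1 s of noise standing in for speech, and a
# short exponentially decaying RIR (both hypothetical stand-ins).
fs = 48000
rng = np.random.default_rng(0)
clean = 0.1 * rng.standard_normal(fs)
rir = np.exp(-np.linspace(0.0, 8.0, fs // 2)) * rng.standard_normal(fs // 2)
wet = reverberate(clean, rir)
```

In practice one would load a 48 kHz clean waveform and a measured room impulse response (e.g. from the ACE, MIRD, or MARDY collections listed above) in place of the synthetic arrays used here.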
Reverberant speech 48 kHz waveforms containing 2 native English speakers with around 400 sentences each (Test set) (151.7 MB)
Reverberant speech 48 kHz waveforms containing 28 native English speakers with around 400 sentences each (Train set 1) (2.427 GB)
Reverberant speech 48 kHz waveforms containing 56 native English speakers with around 400 sentences each (Train set 2) (4.899 GB)
4 text files describing the conditions under which each audio file was created (underlying speech and reverberant condition) (125.8 KB)
Showing items related by title, author, creator and subject.
Mayo, Catherine (LISTA Consortium: (i) Language and Speech Laboratory, Universidad del Pais Vasco, Spain and Ikerbasque, Spain; (ii) Centre for Speech Technology Research, University of Edinburgh, UK; (iii) KTH Royal Institute of Technology, Sweden; (iv) Institute of Computer Science, FORTH, Greece, 2013-09-24). Single male native British English talker recorded producing 25 TIMIT sentences in 5 conditions, two natural: (i) quiet, (ii) while the talker listened to high-intensity speech-shaped noise, and three acted: (i) as if to ...
Sarfjoo, Seyyed Saeed; Yamagishi, Junichi. (This item has been replaced by the one which can be found at http://dx.doi.org/10.7488/ds/2316.) This dataset is a new variant of the voice cloning toolkit (VCTK) dataset: device-recorded VCTK (DR-VCTK), where the ...
Listening test materials for "Evaluating comprehension of natural and synthetic conversational speech". Wester, Mirjam; Watts, Oliver; Henter, Gustav Eje. Current speech synthesis methods typically operate on isolated sentences and lack convincing prosody when generating longer segments of speech. Similarly, prevailing TTS evaluation paradigms, such as intelligibility ...