The Centre for Speech Technology Research (CSTR) is an interdisciplinary research centre linking Informatics with Linguistics and English Language.

Founded in 1984, CSTR is concerned with research in all areas of speech technology including speech recognition, speech synthesis, speech signal processing, information access, multimodal interfaces and dialogue systems. We have many collaborations with the wider community of researchers in speech science, language, cognition and machine learning for which Edinburgh is renowned.

Recent Submissions

  • Device Recorded VCTK (Small subset version) 

    Sarfjoo, Seyyed Saeed; Yamagishi, Junichi
    This dataset is a new variant of the voice cloning toolkit (VCTK) dataset: device-recorded VCTK (DR-VCTK), where the high-quality speech signals recorded in a semi-anechoic chamber using professional audio devices are ...
  • The Blizzard Challenge 2017 

    Ronanki, Srikanth; EPSRC - Engineering and Physical Sciences Research Council
    The Blizzard Challenge 2017 was the thirteenth annual Blizzard Challenge and was once again organised by Simon King at the University of Edinburgh, with support from the other members of the Blizzard Challenge committee ...
  • Noisy reverberant speech database for training speech enhancement algorithms and TTS models 

    Valentini-Botinhao, Cassia
    Noisy reverberant speech database. The database was designed to train and test speech enhancement (noise suppression and dereverberation) methods that operate at 48kHz. Clean speech was made reverberant and noisy by ...
  • Noisy speech database for training speech enhancement algorithms and TTS models 

    Valentini-Botinhao, Cassia
    Clean and noisy parallel speech database. The database was designed to train and test speech enhancement methods that operate at 48kHz. A more detailed description can be found in the papers associated with the database. ...
  • The 2nd Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017) Database 

    Kinnunen, Tomi; Sahidullah, Md; Delgado, Héctor; Todisco, Massimiliano; Evans, Nicholas; Yamagishi, Junichi; Lee, Kong Aik
    This is a database used for the Second Automatic Speaker Verification Spoofing and Countermeasures Challenge, for short, ASVspoof 2017 (http://www.asvspoof.org) organized by Tomi Kinnunen, Md Sahidullah, Héctor Delgado, ...
  • 96kHz version of the CSTR VCTK Corpus 

    Veaux, Christophe; Yamagishi, Junichi
    This dataset includes a 96kHz version of the CSTR VCTK Corpus, comprising speech data uttered by 109 native speakers of English with various accents. The main dataset can be found at http://dx.doi.org/10.7488/ds/1994 (containing ...
  • Key files for Spoofing and Anti-Spoofing (SAS) corpus v1.0 

    Wu, Zhizheng; Khodabakhsh, Ali; Demiroglu, Cenk; Yamagishi, Junichi; Saito, Daisuke; Toda, Tomoki; Ling, Zhen-Hua; King, Simon
    These files are complementary to the fileset: Wu et al. (2015). Spoofing and Anti-Spoofing (SAS) corpus v1.0, [dataset]. University of Edinburgh. The Centre for Speech Technology Research (CSTR). http://dx.doi.org/10.7488/ds/252. ...
  • Thesis Material - Rasmus Dall 

    Dall, Rasmus
    Data released in relation to the PhD thesis of Rasmus Dall. This contains: 1. Thesis pdf. 2. Released parallel corpora of read and spontaneous speech suitable for speech synthesis. 3. Experimental Data to enable ...
  • CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit 

    Veaux, Christophe; Yamagishi, Junichi; MacDonald, Kirsten
    This CSTR VCTK Corpus (Centre for Speech Technology Voice Cloning Toolkit) includes speech data uttered by 109 native speakers of English with various accents. 96kHz versions of the recordings are available at http://dx. ...
  • The SIWIS French Speech Synthesis Database 

    Yamagishi, Junichi; Honnet, Pierre-Edouard; Garner, Philip; Lazaridis, Alexandros
    The SIWIS French Speech Synthesis Database includes high-quality French speech recordings and associated text files, aimed at building TTS systems and investigating multiple speaking styles and emphasis. A total of 9750 utterances from ...
  • The Voice Conversion Challenge 2016 

    Tomoki, Toda; Chen, Ling-Hui; Saito, Daisuke; Villavicencio, Fernando; Wester, Mirjam; Wu, Zhizheng; Yamagishi, Junichi
    The Voice Conversion Challenge (VCC) 2016, one of the special sessions at Interspeech 2016, deals with speaker identity conversion, referred to as Voice Conversion (VC). The task of the challenge was speaker conversion, i.e., ...
  • The Voice Conversion Challenge, 2016: multidimensional scaling (MDS) listening test results 

    Toda, Tomoki; Chen, Ling-Hui; Saito, Daisuke; Villavicencio, Fernando; Wester, Mirjam; Wu, Zhizheng; Yamagishi, Junichi
    The Voice Conversion Challenge (VCC) 2016, one of the special sessions at Interspeech 2016, deals with speaker identity conversion, referred to as Voice Conversion (VC). The task of the challenge was speaker conversion, i.e., ...
  • SUPERSEDED - CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit 

    Veaux, Christophe; Yamagishi, Junichi; MacDonald, Kirsten
    # SUPERSEDED - This item has been replaced by the one which can be found at http://dx.doi.org/10.7488/ds/1994 . # This CSTR VCTK Corpus (Centre for Speech Technology Voice Cloning Toolkit) includes speech data uttered by ...
  • Listening test materials for "A template-based approach for speech synthesis intonation generation using LSTMs" 

    Ronanki, Srikanth; Henter, Gustav Eje; Wu, Zhizheng; King, Simon
    This data release contains listening test materials associated with the paper "A template-based approach for speech synthesis intonation generation using LSTMs", presented at Interspeech 2016 in San Francisco, USA.
  • Listening test materials for "Waveform generation based on signal reshaping for statistical parametric speech synthesis" 

    Espic, Felipe; Valentini-Botinhao, Cassia; Wu, Zhizheng; King, Simon
    The dataset contains the testing stimuli and listeners' MUSHRA test responses for the Interspeech 2016 paper "Waveform generation based on signal reshaping for statistical parametric speech synthesis". In this paper, we ...
  • SUPERSEDED - The Voice Conversion Challenge 2016 

    Toda, Tomoki; Chen, Ling-Hui; Saito, Daisuke; Villavicencio, Fernando; Wester, Mirjam; Wu, Zhizheng; Yamagishi, Junichi
    THIS VERSION HAS BEEN REPLACED DUE TO SOME OF THE FILES BEING CORRUPTED. PLEASE SEE THE NEW VERSION OF THIS DATASET AT http://dx.doi.org/10.7488/ds/1575 . > The Voice Conversion Challenge (VCC) 2016, one of the special ...
  • Reverberant speech database for training speech dereverberation algorithms and TTS models 

    Valentini-Botinhao, Cassia
    Reverberant speech database. The database was designed to train and test speech dereverberation methods that operate at 48kHz. Clean speech was made reverberant by convolving it with a room impulse response. The room impulse ...
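    The entry above notes that clean speech was made reverberant by convolving it with a room impulse response. As an illustration only (not the dataset's actual pipeline; `make_reverberant` is a hypothetical helper), the core operation can be sketched as:

    ```python
    import numpy as np

    def make_reverberant(clean, rir):
        """Convolve a clean signal with a room impulse response (RIR),
        trimming the result back to the clean signal's length."""
        wet = np.convolve(clean, rir)
        return wet[: len(clean)]

    # Toy example: a unit impulse as the "clean" signal and a two-tap RIR
    # (direct path plus one attenuated echo).
    clean = np.array([1.0, 0.0, 0.0, 0.0])
    rir = np.array([1.0, 0.5])
    reverberant = make_reverberant(clean, rir)
    ```

    In practice the dataset's recordings would be full 48kHz waveforms and measured room impulse responses; the trimming step simply keeps the clean and reverberant files time-aligned and equal in length.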
  • ## SUPERSEDED: THIS DATASET HAS BEEN REPLACED. ## Noisy speech database for training speech enhancement algorithms and TTS models 

    Valentini-Botinhao, Cassia
    ## SUPERSEDED: THIS DATASET HAS BEEN REPLACED by the one which can be found at http://dx.doi.org/10.7488/ds/2117. ## Clean and noisy parallel speech database. The database was designed to train and test speech enhancement ...
  • Listening test materials for "Evaluating comprehension of natural and synthetic conversational speech" 

    Wester, Mirjam; Watts, Oliver; Henter, Gustav Eje
    Current speech synthesis methods typically operate on isolated sentences and lack convincing prosody when generating longer segments of speech. Similarly, prevailing TTS evaluation paradigms, such as intelligibility ...
  • Listening test materials for "Robust TTS duration modelling using DNNs" 

    Henter, Gustav Eje; Ronanki, Srikanth; Watts, Oliver; Wester, Mirjam; Wu, Zhizheng; King, Simon
    This data release contains listening test materials associated with the paper "Robust TTS duration modelling using DNNs", presented at ICASSP 2016 in Shanghai, China.
