    • Opinions on Weblinks 

      Albakry, Sara; Vaniea, Kami; Wolters, Maria
      This document provides a concrete version of the survey used in the URL reading experiment conducted in April 2017 and reported in the associated paper to appear at CHI'20 on 25 April, 2020. This documentation serves two ...
    • CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit (version 0.92) 

      Yamagishi, Junichi; Veaux, Christophe; MacDonald, Kirsten
      This CSTR VCTK Corpus includes speech data uttered by 110 English speakers with various accents. Each speaker reads out about 400 sentences, which were selected from a newspaper, the rainbow passage and an elicitation ...
    • ManySStuBs4J Dataset 

      Karampatsis, Rafael-Michael
      The ManySStuBs4J corpus contains simple statement bugs mined from open-source Java projects hosted on GitHub. There are two variations of the dataset. One mined from the 100 Java Maven Projects and one mined from the top ...
    • WikiCatSum 

      Perez-Beltrachini, Laura; Liu, Yang; Lapata, Mirella
      WikiCatSum is a domain-specific Multi-Document Summarisation (MDS) dataset. It assumes the summarisation task of generating Wikipedia lead sections for Wikipedia entities of a certain domain (e.g. Companies) from the set ...
    • ASVspoof 2019: The 3rd Automatic Speaker Verification Spoofing and Countermeasures Challenge database 

      Yamagishi, Junichi; Todisco, Massimiliano; Sahidullah, Md; Delgado, Héctor; Wang, Xin; Evans, Nicholas; Kinnunen, Tomi; Lee, Kong Aik; Vestman, Ville; Nautsch, Andreas
      This is a database used for the Third Automatic Speaker Verification Spoofing and Countermeasures Challenge, for short, ASVspoof 2019 (http://www.asvspoof.org) organized by Junichi Yamagishi, Massimiliano Todisco, Md ...
    • A Survey on Developer-Centred Security 

      Tahaei, Mohammad; Vaniea, Kami
      Our research reports a systematic literature review of 49 publications on security studies with software developer participants. These attached files are: - A BibTeX file: includes all 49 references in BibTeX format. - ...
    • SUPERSEDED - ManySStuBs4J Dataset 

      Karampatsis, Rafael-Michael
      ## This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2628 ## The ManySStuBs4J corpus contains simple statement bugs mined from open-source Java projects hosted on GitHub. There are ...
    • Listening-test materials for "Modern speech synthesis for phonetic sciences: a discussion and an evaluation" 

      Malisz, Zofia; Henter, Gustav Eje; Valentini-Botinhao, Cassia; Watts, Oliver; Beskow, Jonas; Gustafson, Joakim
      This data release contains listening-test materials associated with the paper "Modern speech synthesis for phonetic sciences: a discussion and an evaluation", presented at ICPhS 2019 in Melbourne, Australia.
    • Alba speech corpus 

      Valentini-Botinhao, Cassia; Yamagishi, Junichi
      Single-speaker read speech corpus of a Scottish-accented female native English speaker (Alba). The corpus was recorded in four speaking styles: plain (normal read speech, around 4 hours of recordings), fast (speaking as ...
    • Listening test results of the Voice Conversion Challenge 2018 

      Yamagishi, Junichi; Wang, Xin
      This dataset is associated with a paper and a dataset below: (1) Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Tomi Kinnunen, Zhenhua Ling, "The Voice Conversion Challenge ...
    • UltraSuite Repository - sample data 

      Eshky, Aciel; Ribeiro, Manuel Sam; Cleland, Joanne; Renals, Steve; Richmond, Korin; Roxburgh, Zoe; Scobbie, James; Wrench, Alan
      UltraSuite is a repository of ultrasound and acoustic data from child speech therapy sessions. The current release includes three data collections, one from typically developing children -- Ultrax Typically Developing ...
    • Hurricane natural speech corpus - higher quality version 

      Valentini-Botinhao, Cassia; Mayo, Cassie; Cooke, Martin
      Single male native British-English talker recorded producing three speech sets (Harvard sentences, Modified Rhyme Test, news sentences) in quiet and while the talker was listening to speech-shaped noise at 84dB(A). This ...
    • Parallel Audiobook Corpus 

      Ribeiro, Manuel Sam
      The Parallel Audiobook Corpus (version 1.0) is a collection of parallel readings of audiobooks. The corpus consists of approximately 121 hours of speech at 22.05 kHz across 4 books and 59 speakers. The data is provided in ...
    • CINIC-10 Is Not ImageNet or CIFAR-10 

      Darlow, Luke N; Crowley, Elliot J; Antoniou, Antreas; Storkey, Amos
      CINIC-10 is an augmented extension of CIFAR-10. It contains the images from CIFAR-10 (60,000 images, 32x32 RGB pixels) and a selection of ImageNet database images (210,000 images downsampled to 32x32). It was compiled as ...
    • Manual and automatic labels for version 1.0 of UXTD, UXSSD, and UPX core data -- version 1.0 

      Eshky, Aciel; Ribeiro, Manuel Sam; Cleland, Joanne; Renals, Steve; Richmond, Korin; Roxburgh, Zoe; Scobbie, James; Wrench, Alan
      UltraSuite is a repository of ultrasound and acoustic data from child speech therapy sessions. The current release includes three data collections, one from typically developing children (UXTD) and two from children with ...
    • The Voice Conversion Challenge 2018: database and results 

      Lorenzo-Trueba, Jaime; Yamagishi, Junichi; Toda, Tomoki; Saito, Daisuke; Villavicencio, Fernando; Kinnunen, Tomi; Ling, Zhenhua
      Voice conversion (VC) is a technique to transform a speaker identity included in a source speech waveform into a different one while preserving linguistic information of the source speech waveform. In 2016, we have ...
    • The 2nd Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017) Database, Version 2 

      Kinnunen, Tomi; Sahidullah, Md; Delgado, Héctor; Todisco, Massimiliano; Evans, Nicholas; Yamagishi, Junichi; Lee, Kong Aik
      This is a database used for the Second Automatic Speaker Verification Spoofing and Countermeasures Challenge, for short, ASVspoof 2017 (http://www.asvspoof.org) organized by Tomi Kinnunen, Md Sahidullah, Héctor Delgado, ...
    • Device Recorded VCTK (Small subset version) 

      Sarfjoo, Seyyed Saeed; Yamagishi, Junichi
      This dataset is a new variant of the voice cloning toolkit (VCTK) dataset: device-recorded VCTK (DR-VCTK), where the high-quality speech signals recorded in a semi-anechoic chamber using professional audio devices are ...
    • SUPERSEDED - The 2nd Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017) Database, Version 2 

      Kinnunen, Tomi; Sahidullah, Md; Delgado, Héctor; Todisco, Massimiliano; Evans, Nicholas; Yamagishi, Junichi; Lee, Kong Aik
      ## This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2332 ##