    • WikiCatSum 

      Perez-Beltrachini, Laura; Liu, Yang; Lapata, Mirella
      WikiCatSum is a domain-specific Multi-Document Summarisation (MDS) dataset. It assumes the summarisation task of generating Wikipedia lead sections for Wikipedia entities of a certain domain (e.g. Companies) from the set ...
    • ASVspoof 2019: The 3rd Automatic Speaker Verification Spoofing and Countermeasures Challenge database 

      Yamagishi, Junichi; Todisco, Massimiliano; Sahidullah, Md; Delgado, Héctor; Wang, Xin; Evans, Nicholas; Kinnunen, Tomi; Lee, Kong Aik; Vestman, Ville; Nautsch, Andreas
      This is a database used for the Third Automatic Speaker Verification Spoofing and Countermeasures Challenge, for short, ASVspoof 2019 (http://www.asvspoof.org) organized by Junichi Yamagishi, Massimiliano Todisco, Md ...
    • A Survey on Developer-Centred Security 

      Tahaei, Mohammad; Vaniea, Kami
      Our research reports a systematic literature review of 49 publications on security studies with software developer participants. The attached files are: - A BibTeX file: includes all 49 references in BibTeX format. - ...
    • ManySStuBs4J Dataset 

      Karampatsis, Rafael-Michael
      The ManySStuBs4J corpus contains simple statement bugs mined from open-source Java projects hosted on GitHub. There are two variations of the dataset: one mined from the 100 Java Maven Projects and one mined from the top ...
    • Listening-test materials for "Modern speech synthesis for phonetic sciences: a discussion and an evaluation" 

      Malisz, Zofia; Henter, Gustav Eje; Valentini-Botinhao, Cassia; Watts, Oliver; Beskow, Jonas; Gustafson, Joakim
      This data release contains listening-test materials associated with the paper "Modern speech synthesis for phonetic sciences: a discussion and an evaluation", presented at ICPhS 2019 in Melbourne, Australia.
    • Alba speech corpus 

      Valentini-Botinhao, Cassia; Yamagishi, Junichi
      Single-speaker read speech corpus of a Scottish-accented female native English speaker (Alba). The corpus was recorded in four speaking styles: plain (normal read speech, around 4 hours of recordings), fast (speaking as ...
    • Listening test results of the Voice Conversion Challenge 2018 

      Yamagishi, Junichi; Wang, Xin
      This dataset is associated with a paper and a dataset below: (1) Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, Fernando Villavicencio, Tomi Kinnunen, Zhenhua Ling, "The Voice Conversion Challenge ...
    • UltraSuite Repository - sample data 

      Eshky, Aciel; Ribeiro, Manuel Sam; Cleland, Joanne; Renals, Steve; Richmond, Korin; Roxburgh, Zoe; Scobbie, James; Wrench, Alan
      UltraSuite is a repository of ultrasound and acoustic data from child speech therapy sessions. The current release includes three data collections, one from typically developing children -- Ultrax Typically Developing ...
    • Hurricane natural speech corpus - higher quality version 

      Valentini-Botinhao, Cassia; Mayo, Cassie; Cooke, Martin
      Single male native British-English talker recorded producing three speech sets (Harvard sentences, Modified Rhyme Test, news sentences) in quiet and while the talker was listening to speech-shaped noise at 84dB(A). This ...
    • Parallel Audiobook Corpus 

      Ribeiro, Manuel Sam
      The Parallel Audiobook Corpus (version 1.0) is a collection of parallel readings of audiobooks. The corpus consists of approximately 121 hours of speech at 22.05 kHz across 4 books and 59 speakers. The data is provided in ...
    • CINIC-10 Is Not ImageNet or CIFAR-10 

      Darlow, Luke N; Crowley, Elliot J; Antoniou, Antreas; Storkey, Amos
      CINIC-10 is an augmented extension of CIFAR-10. It contains the images from CIFAR-10 (60,000 images, 32x32 RGB pixels) and a selection of ImageNet database images (210,000 images downsampled to 32x32). It was compiled as ...
    • Manual and automatic labels for version 1.0 of UXTD, UXSSD, and UPX core data -- version 1.0 

      Eshky, Aciel; Ribeiro, Manuel Sam; Cleland, Joanne; Renals, Steve; Richmond, Korin; Roxburgh, Zoe; Scobbie, James; Wrench, Alan
      UltraSuite is a repository of ultrasound and acoustic data from child speech therapy sessions. The current release includes three data collections, one from typically developing children (UXTD) and two from children with ...
    • The Voice Conversion Challenge 2018: database and results 

      Lorenzo-Trueba, Jaime; Yamagishi, Junichi; Toda, Tomoki; Saito, Daisuke; Villavicencio, Fernando; Kinnunen, Tomi; Ling, Zhenhua
      Voice conversion (VC) is a technique to transform a speaker identity included in a source speech waveform into a different one while preserving linguistic information of the source speech waveform. In 2016, we have ...
    • The 2nd Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017) Database, Version 2 

      Kinnunen, Tomi; Sahidullah, Md; Delgado, Héctor; Todisco, Massimiliano; Evans, Nicholas; Yamagishi, Junichi; Lee, Kong Aik
      This is a database used for the Second Automatic Speaker Verification Spoofing and Countermeasures Challenge, for short, ASVspoof 2017 (http://www.asvspoof.org) organized by Tomi Kinnunen, Md Sahidullah, Héctor Delgado, ...
    • Device Recorded VCTK (Small subset version) 

      Sarfjoo, Seyyed Saeed; Yamagishi, Junichi
      This dataset is a new variant of the voice cloning toolkit (VCTK) dataset: device-recorded VCTK (DR-VCTK), where the high-quality speech signals recorded in a semi-anechoic chamber using professional audio devices are ...
    • SUPERSEDED - The 2nd Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017) Database, Version 2 

      Kinnunen, Tomi; Sahidullah, Md; Delgado, Héctor; Todisco, Massimiliano; Evans, Nicholas; Yamagishi, Junichi; Lee, Kong Aik
      ## This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2332 ##
    • SUPERSEDED - The 2nd Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2017) Database, Version 2 

      Kinnunen, Tomi; Sahidullah, Md; Delgado, Héctor; Todisco, Massimiliano; Evans, Nicholas; Yamagishi, Junichi; Lee, Kong Aik
      ## This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2332 ##
    • Dutch English Lombard Speech Native and Non-Native (DELNN) 

      Marcoux, Katherine; Ernestus, Mirjam; King, Simon
      The DELNN (Dutch English Lombard speech Native and Non-Native) corpus consists of 30 native Dutch speakers reading sentences in a quiet environment and in a noisy environment, to elicit Lombard speech. The Dutch speakers ...
    • Radboud Lombard Corpus_Dutch 

      Shen, Chen; Janse, Esther; King, Simon
      This data set contains Dutch sentence-reading material from 54 native Dutch speakers (12 for now), with 48 sentences in the natural condition and 48 sentences in the Lombard condition per speaker.
    • Triangulating Context Lemmas 

      McLaughlin, Craig; McKinna, James; Stark, Ian
      Agda formalisation to accompany the paper "Triangulating Context Lemmas" by Craig McLaughlin, James McKinna and Ian Stark. DOI 10.1145/3167081.