The School of Informatics is the largest, longest-established and highest-quality research group in informatics in the UK.

Research within the School is carried out across a number of institutes. The research programmes organised by the School of Informatics encompass a wide range of domains. Currently these include Artificial Life, Bioinformatics, Computational Thinking, Machine Learning, Music Informatics, Processes, Events & Activity, Software Engineering and System Level Integration.

Recent Submissions

  • Thesis Material - Rasmus Dall 

    Dall, Rasmus
    Data released in relation to the PhD thesis of Rasmus Dall. This contains: 1. Thesis PDF. 2. Released parallel corpora of read and spontaneous speech suitable for speech synthesis. 3. Experimental data to enable ...
  • CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit 

    Veaux, Christophe; Yamagishi, Junichi; MacDonald, Kirsten
    This CSTR VCTK Corpus (Centre for Speech Technology Voice Cloning Toolkit) includes speech data uttered by 109 native speakers of English with various accents. Each speaker reads out about 400 sentences, most of which were ...
  • Path-ZVA Implementation 

    Reijsbergen, Daniel
    This Java project is an implementation of the algorithm presented in the paper 'Path-ZVA: general, efficient and automated importance sampling for highly reliable Markovian systems' by Daniël Reijsbergen, Pieter-Tjerk de ...
  • The SIWIS French Speech Synthesis Database 

    Yamagishi, Junichi; Honnet, Pierre-Edouard; Garner, Philip; Lazaridis, Alexandros
    The SIWIS French Speech Synthesis Database includes high-quality French speech recordings and associated text files, aimed at building TTS systems and investigating multiple styles and emphasis. A total of 9750 utterances from ...
  • GitHub Java Corpus 

    Allamanis, Miltiadis; Sutton, Charles
    The GitHub Java Corpus is a snapshot of all open-source Java code on GitHub in October 2012 that is contained in open-source projects that at the time had at least one fork. It contains code from 14,785 projects amounting ...
  • EEMBC FPMark Benchmark Suite Simulations 

    Tomusk, Erik
    This dataset contains gem5 simulation results and McPAT power consumption figures for 3000 out-of-order CPU cores running EEMBC FPMark benchmarks. The benchmarks have been compiled for the ARM ISA and have been simulated ...
  • Software and data for endoscopic sensing of alveolar pH 

    Choudhury, Debaditya; Tanner, Michael G.; McAughtrie, Sarah; Yu, Fei; Mills, Bethany; Choudhary, Tushar R.; Seth, Sohan; Craven, Thomas H.; Stone, James M.; Mati, Ioulia K.; Campbell, Colin J.; Bradley, Mark; Williams, Christopher K. I.; Dhaliwal, Kevin; Birks, Timothy A.; Thomson, Robert Roderick
    Previously unobtainable measurements of alveolar pH were obtained using an endoscope-deployable optrode. The pH sensing was achieved using functionalized gold nanoshell sensors and surface enhanced Raman spectroscopy (SERS). ...
  • SPEC 2006 Integer Benchmark Suite Simulations 

    Tomusk, Erik
    This dataset contains gem5 simulation results and McPAT power consumption figures for 3000 out-of-order CPU cores running SPEC 2006 integer benchmarks. The benchmarks have been compiled for the ARM ISA and have been simulated ...
  • The Voice Conversion Challenge 2016 

    Toda, Tomoki; Chen, Ling-Hui; Saito, Daisuke; Villavicencio, Fernando; Wester, Mirjam; Wu, Zhizheng; Yamagishi, Junichi
    The Voice Conversion Challenge (VCC) 2016, one of the special sessions at Interspeech 2016, deals with speaker identity conversion, referred to as Voice Conversion (VC). The task of the challenge was speaker conversion, i.e., ...
  • Robust Data-driven Macro-socioeconomic-energy Model, 7see-GB, 2016 

    Roberts, Simon; Axon, Colin; Foran, Barney; Goddard, Nigel; Warr, Benjamin
    In a resource-constrained world with growing population and demand for energy, goods, and services with commensurate environmental impacts, we need to understand how these trends relate to various aspects of economic ...
  • EEMBC Benchmark Suite Simulations 

    Tomusk, Erik
    This dataset contains gem5 simulation results and McPAT power consumption figures for 3000 out-of-order CPU cores running EEMBC DENBench (digital entertainment) and Networking 2.0 benchmarks. The benchmarks have been ...
  • The Voice Conversion Challenge, 2016: multidimensional scaling (MDS) listening test results 

    Toda, Tomoki; Chen, Ling-Hui; Saito, Daisuke; Villavicencio, Fernando; Wester, Mirjam; Wu, Zhizheng; Yamagishi, Junichi
    The Voice Conversion Challenge (VCC) 2016, one of the special sessions at Interspeech 2016, deals with speaker identity conversion, referred to as Voice Conversion (VC). The task of the challenge was speaker conversion, i.e., ...
  • SUPERSEDED - CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit 

    Veaux, Christophe; Yamagishi, Junichi; MacDonald, Kirsten
    SUPERSEDED - This item has been replaced by the one which can be found at http://dx.doi.org/10.7488/ds/1994. This CSTR VCTK Corpus (Centre for Speech Technology Voice Cloning Toolkit) includes speech data uttered by ...
  • Analysis Software for Model Checking Edinburgh Buses 

    Reijsbergen, Daniel; Gao, Wulinjian
    This software is supplementary material for the paper 'An automated methodology for analysing urban transportation systems using model checking' by Daniël Reijsbergen and Stephen Gilmore. It was used to construct the figures ...
  • Lothian Buses Full Fleet GPS Traces, 2014 to 2015 

    Reijsbergen, Daniël; European Commission
    These datasets have been collected and provided to us by Lothian Buses. They consist of Automatic Vehicle Location (AVL) data obtained using periodic GPS location measurements. Each data entry consists of a bus identifier, ...
  • Listening test materials for "A template-based approach for speech synthesis intonation generation using LSTMs" 

    Ronanki, Srikanth; Henter, Gustav Eje; Wu, Zhizheng; King, Simon
    This data release contains listening test materials associated with the paper "A template-based approach for speech synthesis intonation generation using LSTMs", presented at Interspeech 2016 in San Francisco, USA.
  • Listening test materials for "Waveform generation based on signal reshaping for statistical parametric speech synthesis" 

    Espic, Felipe; Valentini-Botinhao, Cassia; Wu, Zhizheng; King, Simon
    The dataset contains the testing stimuli and listeners' MUSHRA test responses for the Interspeech 2016 paper "Waveform generation based on signal reshaping for statistical parametric speech synthesis". In this paper, we ...
  • SUPERSEDED - The Voice Conversion Challenge 2016 

    Toda, Tomoki; Chen, Ling-Hui; Saito, Daisuke; Villavicencio, Fernando; Wester, Mirjam; Wu, Zhizheng; Yamagishi, Junichi
    THIS VERSION HAS BEEN REPLACED DUE TO SOME OF THE FILES BEING CORRUPTED. PLEASE SEE THE NEW VERSION OF THIS DATASET AT http://dx.doi.org/10.7488/ds/1575. The Voice Conversion Challenge (VCC) 2016, one of the special ...
  • Reverberant speech database for training speech dereverberation algorithms and TTS models 

    Valentini-Botinhao, Cassia
    Clean and reverberant parallel speech database. The database was designed to train and test speech dereverberation methods that operate at 48 kHz. A more detailed description can be found in the paper associated with the ...
  • The Human Know-How Dataset 

    Pareti, Paolo; Klein, Ewan H.
    The Human Know-How Dataset describes 211,696 human activities from many different domains. These activities are decomposed into 2,609,236 entities (each with an English textual label). These entities represent over two ...
