Wong, Mark; Leng, Rhodri; Viry, Gil; Liscovsky Barrera, Rodrigo; Garcia-Sancho, Miguel. (2019). Human, yeast and pig genomics: sequence submissions and first sequence descriptions in the literature (1980-2015), 1980-2015 [dataset]. University of Edinburgh. School of Social and Political Science. Science, Technology and Innovation Studies. https://doi.org/10.7488/ds/2589.
This data collection is derived from two sources: 1) submissions of DNA sequences of S. cerevisiae (yeast), Sus scrofa (pig) and Homo sapiens (human) to the European Nucleotide Archive, and 2) the first descriptions of these sequences in the scientific literature. The records cover 1980-2000 (yeast), 1985-2005 (human) and 1990-2015 (pig). Each species has two associated datasets: 1) a .csv file documenting the PubMed ID of each article describing new sequences, all authors of the article, all institutional affiliations of each author, the country of each institution, the year of first submission to the European Nucleotide Archive, and the year of article publication, and 2) a .csv file documenting all institutions submitting to the European Nucleotide Archive, the number of nucleotides sequenced, the number of submissions per institution, and the year of submission to the database. The collection comprises approximately 28,000 publications and over 13 million sequence submissions. Some of the data on submitting institutions have not been fully cleaned. We also include three files containing the software code used to obtain the submission and publication records.
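As an illustration of how the publication files might be read, the following is a minimal Python sketch that counts distinct articles per publication year. The column names (`pubmed_id`, `publication_year`, and so on), the one-row-per-author layout, and the sample rows are all assumptions made for illustration. They are placeholders, not values from the deposited files, whose actual headers should be checked before use.

```python
import csv
import io
from collections import Counter

# Placeholder rows with HYPOTHETICAL column names and made-up values.
# The real headers and layout of the deposited .csv files may differ.
SAMPLE = """pubmed_id,author,institution,country,submission_year,publication_year
0000001,Author A,Institute X,UK,1990,1991
0000001,Author B,Institute Y,France,1990,1991
0000002,Author C,Institute X,UK,1992,1992
"""

def publications_per_year(handle):
    """Count distinct PubMed IDs per publication year.

    Assumes one row per author, so the same article can appear on
    several rows and must be counted only once.
    """
    seen = set()
    counts = Counter()
    for row in csv.DictReader(handle):
        pubmed_id = row["pubmed_id"]
        if pubmed_id in seen:
            continue  # article already counted via another author row
        seen.add(pubmed_id)
        counts[int(row["publication_year"])] += 1
    return dict(counts)

result = publications_per_year(io.StringIO(SAMPLE))
```

To apply the same function to a real file, one would pass an open file handle, for example `open(path, newline="")`, in place of the in-memory sample, after adjusting the column names to match the actual headers.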