104 Works

C M Taylor Keylogging Data: 07 Jan 2015 – 09 Feb 2015

C M Taylor
This dataset comprises keylogging data from the author C M Taylor, captured October 2014 – 9 February 2015 (keystroke files: 07/01/2015 – 09/02/2015). The dataset consists of screenshots and keystroke logs. The screenshots are saved individually as JPGs and BMPs, and also as an AVI file in which the individual captures play as a film. Keystrokes are saved as either .rtf or .txt files.

Digitised Books - Images identified as Embellishments. c. 1510 - c. 1946. JPG.

British Library Labs
The dataset comprises c. 41,6951 images identified as ‘Embellishments’ from the British Library's Flickr Commons collections, dating between c. 1510 - c. 1946. The images were algorithmically gathered from 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1946. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The images are in .JPEG format.

Digitised Hebrew Manuscripts: Or 2626 - Or 6425

British Library
This dataset comprises 43 digitised Hebrew manuscripts (920 - 1845; unknown date), with their shelfmarks in alphabetical order (Or 2626 - Or 6425). These manuscripts are out of copyright.

Volumes of Lysons Collectanea (Trades), comprising advertisements, cuttings, and illustrations relating to trades, professions, medical cures. 1660-1825.

British Library Labs
The dataset comprises four digitised volumes of a collection of advertisements, cuttings and illustrations relating to trades, professions and medical cures from 1660 - 1825 (with OCR-derived text).

OCR text derived from digitised books published 1830 - 1839. ALTO XML.

British Library Labs
This set consists of 2639 volumes, published between 1830 and 1839. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO) Extensible Markup Language (XML) format.

Pelagios Project: Digitised Insularium Illustratum. Additional MS 15760

British Library
This dataset comprises 123 images from the Insularium Illustratum, an account of the islands of the Mediterranean and of some others, produced by Henricus Martellus Germanus in 1495. The digitisation was sponsored by the A. W. Mellon Foundation through the Pelagios Project.

C M Taylor Keylogging Data: 17 April 2016 – 22 July 2016

C M Taylor
This dataset comprises keylogging data from the author C M Taylor, captured April – May 2016 (keystroke files: 17/04/2016 – 22/07/2016). The dataset consists of screenshots and keystroke logs. The screenshots are saved individually as JPGs and BMPs, and also as an AVI file in which the individual captures play as a film. Keystrokes are saved as either .rtf or .txt files.

Pelagios Project: Digitised Liber insularum Arcipelagi Cotton MS Vespasian a.XIII.art.1

British Library
This dataset comprises 82 images from the Liber insularum Arcipelagi, an illustrated account of the islands and major ports of the Mediterranean produced by Christophori Bondelmonti around 1422. The digitisation was sponsored by the A. W. Mellon Foundation through the Pelagios Project.

Judicial Committee of the Privy Council: Linked Appeals Data

From a spreadsheet created by Jonathan Sims and Sophie Flynn Piercy; Linked Data produced by Sarah Middle
Linked Data about appeal cases heard by the Judicial Committee of the Privy Council between 1860 and 1998. Personal and organisation names have been reconciled to VIAF and Wikidata, and place names have been reconciled to Geonames (where possible).

AAS Card Catalogues: Punjabi

British Library
This dataset contains digitised microfilms of Punjabi card catalogues (1961-1983).

Volumes of Madden's cuttings, views, and pamphlets about the British Museum. 1755-1870.

British Library Labs
The dataset comprises four digitised volumes of a collection of cuttings, views and pamphlets made by Sir Frederic Madden about the British Museum, dating 1755 - 1870 (with OCR-derived text).

Digitised Hebrew Manuscripts: Harley 5772 to Or 14580

British Library
This dataset comprises 42 digitised Hebrew manuscripts (1200 - 1871), with their shelfmarks in alphabetical order (Harley 5772 - Or 14580). These manuscripts are out of copyright.

OCR text derived from digitised books published 1860 - 1869. ALTO XML.

British Library Labs
This set consists of 7498 volumes, published between 1860 and 1869. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO) Extensible Markup Language (XML) format.

Volumes of Lysons Collectanea (Amusements), comprising broadsides, cuttings, advertisements on amusements. 1660-1840.

British Library Labs
The dataset comprises nine digitised volumes of a collection of broadsides, cuttings and advertisements relating to public exhibitions and places of amusement from 1660 - 1840 (with OCR-derived text).

OCR text derived from digitised books published 1700 - 1799. ALTO XML.

British Library Labs
This set consists of 2070 volumes, published between 1700 and 1799. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO) Extensible Markup Language (XML) format.

OCR text derived from digitised books published 1850 - 1859. ALTO XML.

British Library Labs
This set consists of 5818 volumes, published between 1850 and 1859. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO) Extensible Markup Language (XML) format.

Theatrical playbills from Britain and Ireland.

British Library Labs
264 volumes, PDF format with embedded OCR-derived text. The dataset comprises 264 volumes of digitised theatrical playbills published between 1660 and 1902 (mostly 19th century) from England, Scotland, Wales and Ireland, digitised from the British Library's physical collection of over 500 volumes of playbills. The dataset is in Portable Document Format (PDF). The playbills cover theatres in Bath (Royal), Bristol (Royal), Dublin (Royal), Edinburgh (miscellaneous), Hull (Royal), King's Lynn, Liverpool (Royal), London (Covent Garden, Drury Lane,...

Digitised Hebrew Manuscripts: Or 2510 - Or 2588

British Library
This dataset comprises 40 digitised Hebrew manuscripts (900 - 1747; unknown date), with their shelfmarks in alphabetical order (Or 2510 - Or 2588). These manuscripts are out of copyright.

OCR text derived from digitised books published c. 1510 - 1699. ALTO XML.

British Library Labs
This set consists of 693 volumes, published between c. 1510 and 1699. The dataset comprises text from the collection of digitised books created using Optical Character Recognition (OCR) technology. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in Analysed Layout and Text Object (ALTO) Extensible Markup Language (XML) format.

Digitised Books. c. 1510 - c. 1946. JSON (OCR derived text).

British Library Labs
The dataset comprises text created by OCR from the 49,455 digitised books, equating to 65,227 volumes (25+ million pages), published between c. 1510 - c. 1946. The books cover a wide range of subject areas including philosophy, history, poetry and literature. The dataset is in JavaScript Object Notation (JSON) text format. Links metadata, PDFs, Flickr images, digital versions

AAS Card Catalogues: Sindhi

British Library
This dataset contains digitised microfilms of Sindhi card catalogues.

Pelagios Project: Liber insularum Cycladum. Arundel MS 93.art.7

British Library
This dataset comprises 45 images from the Liber insularum Cycladum, produced by Christophori Bondelmonti around 1422. The digitisation was sponsored by the A. W. Mellon Foundation through the Pelagios Project.

AAS Card Catalogues: Telugu

British Library
This dataset contains digitised microfilms of Telugu card catalogues.

Digitised Hebrew Manuscripts: Or 2406 - Or 2509

British Library
This dataset comprises 41 digitised Hebrew manuscripts (1300 - 1799; unknown date), with their shelfmarks in alphabetical order (Or 2406 - Or 2509). These manuscripts are out of copyright.

AAS Card Catalogues: Sanskrit

British Library
This dataset contains digitised microfilms of Sanskrit card catalogues (1926-1983).
