
HATHI 1M: Introducing a Million Page Historical Prose Dataset in English from the Hathi Trust
Author(s) - Sunyam Bagga, Andrew Piper
Publication year - 2022
Publication title - Journal of Open Humanities Data
Language(s) - English
Resource type - Journals
ISSN - 2059-481X
DOI - 10.5334/johd.71
Subject(s) - metadata, computer science, set (abstract data type), feature (linguistics), natural language processing, blank, word (group theory), artificial intelligence, english language, linguistics, information retrieval, world wide web, programming language, engineering, mechanical engineering, philosophy