Open Access
The Page Image: Towards a Visual History of Digital Documents
Author(s) - Andrew Piper, Chad Wellmon, Mohamed Cheriet
Publication year - 2020
Publication title - Book History
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.102
eISSN - 1529-1499
pISSN - 1098-7371
DOI - 10.1353/bh.2020.0010
Subject(s) - history , computer science
France, to convene the first ever conference on “document analysis and recognition.”1 The meeting brought together researchers from all over the world who for roughly the previous decade had been slowly changing the paradigm through which they approached the problem of the machinic understanding of the digitized page. Instead of thinking in terms of “characters” and “recognition,” which underlay the long-standing field of Optical Character Recognition (OCR), they were gradually moving towards a more global and formal understanding of the page image as a whole. Researchers in the field of Document Image Analysis, or DIA as it came to be known, discarded the common assumption that the letter or the text was the ultimate referent of the bibliographic page. They focused instead on the heterogeneous visual qualities of the page, or what they termed “the page image.” “Document image analysis,” writes George Nagy in a survey of twenty years of research in the field, is the “theory and practice of recovering the symbol structure of digital images scanned from paper or produced by computer.”2 DIA researchers turned the page image into an analytical object.

In moving away from a text-centric understanding of the page, research in Document Image Analysis offers an important new way of thinking about the bibliographic page that is different from what has traditionally been the case in computational approaches to studying culture, but that has deep roots in the fields of book history, bibliography, and textual studies. Whether in the guise of “natural language processing” (NLP), “optical character recognition” (OCR), or “text mining,” computational approaches to pages have remained heavily influenced by a text-centric mentality, using the page image as an (often imperfect) means to an end, an object to be passed through rather than studied as something potentially meaningful in itself.
At the same time, the fast-growing field of “image analytics,” which ranges from facial detection to the analysis of newspaper illustrations, has largely maintained the text-image divide that has long dominated the study
