Semantics to the rescue of document‐based XML diff: A JATS case study
Author(s) - Cuculovic Milos, Fondement Frederic, Devanne Maxime, Weber Jonathan, Hassenforder Michel
Publication year - 2022
Publication title - Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.3074
Subject(s) - computer science, xml validation, document structure description, efficient xml interchange, xml schema editor, information retrieval, streaming xml, xml, xml encryption, xml schema (w3c), xml database, world wide web
Writing a digital text document has become a lengthy process that usually goes through several rounds of revision. Document comparison matters to human readers who want to see the changes the authors made. Such documents carry structural data, and text‐centric XML is one of their main storage formats. Current XML diff algorithms represent differences with a limited set of edit operations: insert, delete, move, and update. This approach is ill suited to comparing digital text documents, where the human reader needs to understand the actual modifications made by the author. Because JATS is a text‐centric XML vocabulary, we propose in this paper a new XML diff algorithm, jats‐diff, that supports a bijection between the higher‐level modifications made by authors, such as structural changes and restyling, and the changes detected between XML documents. In addition, jats‐diff provides similarity information between nodes in order to measure the impact of text changes on the XML tree.
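
To make the contrast between low‐level edit operations and higher‐level modifications concrete, here is a minimal Python sketch. It is not the jats‐diff algorithm, whose details the abstract does not give. The function names (`node_text`, `text_similarity`, `classify_change`) and the labels they return are hypothetical, chosen for illustration only. The sketch assumes that a "restyling" change is one where the readable text of a node is unchanged while its inline markup differs, and it uses the standard library's `difflib.SequenceMatcher` as a stand‐in similarity measure.

```python
import difflib
import xml.etree.ElementTree as ET


def node_text(elem):
    """Return the readable text of a node, with whitespace normalized."""
    return " ".join("".join(elem.itertext()).split())


def text_similarity(a, b):
    """Similarity in [0, 1] between the text content of two nodes.

    Assumption: SequenceMatcher stands in for whatever measure jats-diff uses.
    """
    return difflib.SequenceMatcher(None, node_text(a), node_text(b)).ratio()


def classify_change(a, b):
    """Label the change between two nodes (hypothetical labels)."""
    if ET.tostring(a) == ET.tostring(b):
        return "unchanged"
    if node_text(a) == node_text(b):
        # Same words, different inline markup: the author restyled the text.
        return "restyle"
    return "text-edit"


old = ET.fromstring("<p>The <italic>quick</italic> fox</p>")
restyled = ET.fromstring("<p>The <bold>quick</bold> fox</p>")
edited = ET.fromstring("<p>The slow fox</p>")

print(classify_change(old, restyled), text_similarity(old, restyled))
print(classify_change(old, edited), text_similarity(old, edited))
```

A generic tree diff would likely report the first pair as a node‐level operation, such as a delete of `italic` and an insert of `bold`. A reader comparing manuscript revisions would more naturally describe it as a change of style on the word "quick".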
