Application of a Finite-State Model to Text Compression
Author(s) -
J. Tuehola
Publication year - 1993
Publication title -
the computer journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.319
H-Index - 64
eISSN - 1460-2067
pISSN - 0010-4620
DOI - 10.1093/comjnl/36.7.607
Subject(s) - computer science , finite state machine , compression (physics) , alphabet , data compression , markov chain , algorithm , coding (social sciences) , arithmetic coding , state (computer science) , theoretical computer science , compression ratio , context adaptive binary arithmetic coding , mathematics , machine learning , engineering , statistics , mechanical engineering , linguistics , philosophy , materials science , composite material , internal combustion engine
The bit-oriented finite-state model applied in Dynamic Markov Compression (DMC [5]) is here generalized to a larger alphabet. The finite-state machine is built adaptively during compression, by applying two tyes of modifications to the machine structure: state cloning and shortcut creation. The machine size is kept tolerable by an escape transition mechanism. Similar to DMC, the new method is combined with arithmetic coding, based on the maintained transition frequencies. The experiments show that the new approach produces notably better compression gains for different sorts of texts in natural and formal languages. In some cases the results are better than for any compression technique found in the literature
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom