Word‐based text compression
Author(s) - Alistair Moffat
Publication year - 1989
Publication title - Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.4380190207
Subject(s) - computer science , data compression , word (group theory) , lossless compression , implementation , coding (social sciences) , compression (physics) , character (mathematics) , arithmetic coding , natural language processing , algorithm , speech recognition , theoretical computer science , artificial intelligence , programming language , context adaptive binary arithmetic coding , mathematics , geometry , composite material , statistics , materials science
The development of efficient algorithms to support arithmetic coding has meant that powerful models of text can now be used for data compression. Here the implementation of models based on recognizing and recording words is considered. Move‐to‐the‐front and several variable‐order Markov models have been tested with a number of different data structures, and first the decisions that went into the implementations are discussed and then experimental results are given that show English text being represented in under 2.2 bits per character. Moreover, the programs run at speeds comparable to other compression techniques, and are suited for practical use.
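As a rough illustration of the word-based move-to-the-front idea mentioned in the abstract, the following Python sketch splits text into alternating word and non-word tokens and replaces each repeated token with its current rank in a recency list. It is not the paper's implementation: the function names (`split_tokens`, `mtf_encode`, `mtf_decode`), the use of a single shared list for words and non-words, and the `(0, token)` escape for novel tokens are all assumptions made for this example, and the arithmetic coding stage that would turn the ranks into bits is omitted.

```python
import re


def split_tokens(text):
    # Assumed tokenisation: maximal runs of alphanumerics are "words",
    # everything between them is a "non-word" token.
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]+", text)


def mtf_encode(tokens):
    recent = []  # most recently seen token is kept at the front
    codes = []
    for tok in tokens:
        if tok in recent:
            rank = recent.index(tok)
            recent.pop(rank)
            codes.append((rank + 1, None))  # ranks start at 1
        else:
            codes.append((0, tok))  # rank 0 is an escape carrying a novel token
        recent.insert(0, tok)
    return codes


def mtf_decode(codes):
    recent = []
    tokens = []
    for rank, tok in codes:
        if rank != 0:
            tok = recent.pop(rank - 1)  # recover the token from its rank
        recent.insert(0, tok)
        tokens.append(tok)
    return tokens


if __name__ == "__main__":
    sample = "the cat and the dog and the cat"
    encoded = mtf_encode(split_tokens(sample))
    print(encoded)
    print("".join(mtf_decode(encoded)) == sample)
```

Frequently repeated tokens tend to stay near the front of the list and so receive small ranks, which a subsequent entropy coder can represent in few bits. The linear list search here is chosen for brevity only; faster data structures would be needed for practical speeds.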