EFFICIENT PREPROCESSING FOR WEB LOG COMPRESSION

Sebastian Deorowicz, Szymon Grabowski

Open Access

Publication year - 2014

Publication title - Computing

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.184

H-Index - 11

eISSN - 2312-5381

pISSN - 1727-6209

DOI - 10.47839/ijc.7.1.487

Subject(s) - web log analysis software, computer science, lossless compression, byte, data compression, transaction log, preprocessor, compression ratio, timestamp, upload, prefix, database, web server, data mining, operating system, the internet, algorithm, real time computing, artificial intelligence, web api, database transaction, automotive engineering, engineering, internal combustion engine, linguistics, philosophy

Web log files, which record user activity on a server, may grow by hundreds of megabytes a day, or even more, on popular sites. They are usually archived because this enables later analysis, e.g., detecting attacks or other server abuse patterns. In this work we present a specialized lossless preprocessor for Apache web logs and test it in combination with several popular general-purpose compressors. Our method works on individual fields of the log data (each storing information such as the client’s IP, date/time, requested file or query, download size in bytes, etc.) and uses compression techniques such as finding and extracting common prefixes and suffixes, dictionary-based phrase sequence substitution, move-to-front coding, and more. The test results show that the proposed transform improves the average compression ratio 2.70 times for gzip and 1.86 times for bzip2.
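The abstract names its techniques without showing them, so the following is a minimal Python sketch of two of them applied field by field: move-to-front coding and common-prefix extraction. It is an illustration only and not the authors' preprocessor. The function names, the choice of which technique to apply to which field, and the three sample log lines are all assumptions made for this example; the input is assumed to be in Apache Common Log Format.

```python
import re
from os.path import commonprefix

# Apache Common Log Format: host ident user [time] "request" status size
CLF = re.compile(r'(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')

def split_fields(line):
    """Split one Common Log Format line into its seven fields."""
    m = CLF.fullmatch(line)
    if m is None:
        raise ValueError(f"not a CLF line: {line!r}")
    return list(m.groups())

def mtf_encode(values):
    """Move-to-front: emit the recency rank (int) of a value seen before,
    or the value itself (str) the first time it appears."""
    recent, out = [], []
    for v in values:
        if v in recent:
            rank = recent.index(v)
            out.append(rank)
            recent.pop(rank)
        else:
            out.append(v)
        recent.insert(0, v)  # most recent value always moves to the front
    return out

def mtf_decode(codes):
    """Invert mtf_encode; ints are ranks, strings are literals."""
    recent, out = [], []
    for c in codes:
        v = recent.pop(c) if isinstance(c, int) else c
        out.append(v)
        recent.insert(0, v)
    return out

def prefix_encode(values):
    """Store each value as (length of prefix shared with the previous value,
    remaining suffix)."""
    prev, out = "", []
    for v in values:
        n = len(commonprefix([prev, v]))
        out.append((n, v[n:]))
        prev = v
    return out

def prefix_decode(pairs):
    """Invert prefix_encode."""
    prev, out = "", []
    for n, suffix in pairs:
        prev = prev[:n] + suffix
        out.append(prev)
    return out

# Hypothetical sample lines, invented for this sketch.
lines = [
    '10.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.2 - - [10/Oct/2000:13:55:37 -0700] "GET /img/logo.png HTTP/1.0" 200 512',
    '10.0.0.1 - - [10/Oct/2000:13:55:39 -0700] "GET /img/bg.png HTTP/1.0" 200 1024',
]

# Transpose rows into per-field columns so each field is processed separately.
columns = list(zip(*(split_fields(line) for line in lines)))
hosts = mtf_encode(columns[0])     # repeated client addresses become small ranks
times = prefix_encode(columns[3])  # consecutive timestamps share long prefixes

# Both transforms are lossless: decoding restores the original columns.
assert mtf_decode(hosts) == list(columns[0])
assert prefix_decode(times) == list(columns[3])
```

In a full pipeline the transformed columns would then be serialized and handed to a general-purpose compressor such as gzip or bzip2. The gain comes from the back-end compressor seeing short ranks and short suffixes in place of repeated long strings.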
