Scanning HTML at Tens of Gigabytes Per Second on ARM Processors
Author(s) - Daniel Lemire
Publication year - 2025
Publication title - Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.3420
Subject(s) - computer science, operating system
ABSTRACT
Background - Modern processors feature Single Instruction, Multiple Data (SIMD) instructions capable of processing 16 bytes or more simultaneously, enabling significant performance enhancements in data‐intensive tasks. Two major Web browser engines (WebKit and Blink) have adopted SIMD algorithms for parsing HTML.
Objective - This study reviews recent advances in utilizing SIMD instructions to accelerate HTML parsing through vectorized classification techniques.
Methods - We compare these HTML parsing techniques with a faster alternative. Performance is benchmarked against traditional methods on recent ARM processors.
Results - Our measurements demonstrate a 20‐fold performance improvement in HTML scanning using SIMD‐based approaches compared to conventional parsing methods on modern ARM architectures.
Conclusion - These findings underscore the transformative potential of SIMD‐based algorithms in optimizing Web browser performance, offering substantial speedups for processing Internet formats and HTML parsing.
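The abstract refers to vectorized classification without showing it. The following is a minimal sketch of one common form of the idea, a low-nibble table lookup over 16-byte blocks, and is not necessarily the algorithm evaluated in the paper. The function name `find_html_special`, the table name `kLowNibbleTable`, and the set of characters searched for (`<`, `&`, carriage return, and NUL) are assumptions made for illustration. On 64-bit ARM the block loop uses NEON intrinsics; elsewhere only the scalar loop runs, which applies the same test one byte at a time.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) && defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Table indexed by the low 4 bits of a byte. Each target character is
 * stored in the slot of its own low nibble:
 *   '\0' = 0x00 (slot 0), '&' = 0x26 (slot 6),
 *   '<'  = 0x3C (slot 12), '\r' = 0x0D (slot 13).
 * A byte is a target exactly when table[byte & 0x0F] == byte.
 * The choice of target characters is an assumption for this sketch. */
static const uint8_t kLowNibbleTable[16] = {
    0x00, 0, 0, 0, 0, 0, 0x26, 0, 0, 0, 0, 0, 0x3C, 0x0D, 0, 0};

/* Returns the index of the first target byte in data[0..len),
 * or len if there is none. Hypothetical helper, not from the paper. */
size_t find_html_special(const uint8_t *data, size_t len) {
    size_t i = 0;

#if defined(__ARM_NEON) && defined(__aarch64__)
    const uint8x16_t table = vld1q_u8(kLowNibbleTable);
    const uint8x16_t nibble_mask = vdupq_n_u8(0x0F);
    for (; i + 16 <= len; i += 16) {
        uint8x16_t block = vld1q_u8(data + i);
        /* Look up the expected character for each byte's low nibble. */
        uint8x16_t expected = vqtbl1q_u8(table, vandq_u8(block, nibble_mask));
        /* Lanes become 0xFF where the byte equals its expected character. */
        uint8x16_t matches = vceqq_u8(expected, block);
        if (vmaxvq_u8(matches) != 0) {
            break; /* a match is in this block; the scalar loop locates it */
        }
    }
#endif

    /* Scalar tail (and portable fallback): same test, one byte at a time. */
    for (; i < len; i++) {
        if (kLowNibbleTable[data[i] & 0x0F] == data[i]) {
            return i;
        }
    }
    return len;
}
```

For example, calling `find_html_special` on the 9 bytes of `hello <b>` returns 6, the position of `<`. The lookup works because the four assumed target characters have distinct low nibbles, so a single 16-entry table and one equality comparison classify all 16 bytes of a block at once.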