Multi-threaded checksum computation for ATLAS high-performance storage software
Author(s) - Fabrice Le Goff, G. Avolio
Publication year - 2020
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1525/1/012026
Subject(s) - checksum , computer science , computer data storage , embedded system , software , atlas experiment , server , large hadron collider , computer hardware , reliability engineering , real time computing , operating system , detector , engineering , telecommunications , physics , quantum mechanics
ATLAS is one of the general-purpose experiments observing hadron collisions at the LHC at CERN. Its trigger and data acquisition system (TDAQ) is responsible for selecting interesting physics events and transporting them from the detector to permanent storage, where the data are used for further processing. The transient storage of ATLAS TDAQ is the last component of the online system in the data flow. It records selected events at several GB/s to non-volatile storage before they are transferred to offline permanent storage. The transient storage is a distributed system of high-performance direct-attached storage servers with a total of 480 hard drives. A distributed multi-threaded C++ application operates the hardware. The transient storage is also responsible for computing a checksum of the data, which is used to verify the integrity of the transferred data. The reliability and efficiency of this system are critical to TDAQ operations. This paper presents the existing multi-threading strategy of the software and how it uses the available hardware resources. We then describe how multi-threaded checksum computation was introduced to significantly increase the maximum throughput of the system. We discuss the key concepts of the implementation, with a focus on the importance of minimizing overhead. Finally, the paper reports on tests performed on the production system to demonstrate the validity of the implementation, and on measurements of the performance improvement in view of future LHC and ATLAS upgrades.
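
To illustrate the general idea of multi-threaded checksum computation, here is a minimal C++ sketch. It is not the ATLAS implementation. The abstract does not name the checksum algorithm, so Adler-32 is used here purely as an assumption, chosen because partial checksums of adjacent chunks can be merged arithmetically. All names (`Adler`, `adler32_chunk`, `adler32_combine`, `adler32_parallel`) are hypothetical. Each thread checksums one chunk of the buffer independently, and the partial results are then combined in order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Assumption: Adler-32 stands in for whatever checksum the real system uses.
constexpr std::uint32_t kMod = 65521;  // largest prime below 2^16

// The two running sums of Adler-32, kept separate until the final packing.
struct Adler {
    std::uint32_t a;
    std::uint32_t b;
};

// Serial Adler-32 over one chunk, starting from the initial state (a=1, b=0).
Adler adler32_chunk(const std::uint8_t* data, std::size_t len) {
    std::uint32_t a = 1, b = 0;
    for (std::size_t i = 0; i < len; ++i) {
        a = (a + data[i]) % kMod;
        b = (b + a) % kMod;
    }
    return {a, b};
}

// Merge the sums of a left block with those of the right block that follows
// it; len2 is the byte length of the right block.
Adler adler32_combine(Adler left, Adler right, std::size_t len2) {
    const std::uint64_t rem = len2 % kMod;
    const std::uint32_t a = (left.a + right.a + kMod - 1) % kMod;
    const std::uint64_t b =
        (left.b + right.b + rem * ((left.a + kMod - 1) % kMod)) % kMod;
    return {a, static_cast<std::uint32_t>(b)};
}

// Split the buffer into n_threads contiguous chunks, checksum each chunk in
// its own thread, then merge the partial results from left to right.
std::uint32_t adler32_parallel(const std::vector<std::uint8_t>& buf,
                               unsigned n_threads) {
    assert(n_threads > 0);
    const std::size_t chunk = (buf.size() + n_threads - 1) / n_threads;
    std::vector<Adler> parts(n_threads, Adler{1, 0});
    std::vector<std::size_t> lens(n_threads, 0);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        const std::size_t begin = std::min(buf.size(), t * chunk);
        const std::size_t end = std::min(buf.size(), begin + chunk);
        lens[t] = end - begin;
        // Each worker writes only its own slot, so no locking is needed.
        workers.emplace_back([&parts, &buf, t, begin, end] {
            parts[t] = adler32_chunk(buf.data() + begin, end - begin);
        });
    }
    for (auto& w : workers) w.join();

    Adler total{1, 0};  // checksum state of the empty prefix
    for (unsigned t = 0; t < n_threads; ++t)
        total = adler32_combine(total, parts[t], lens[t]);
    return (total.b << 16) | total.a;
}
```

Calling `adler32_parallel(buffer, 4)` gives the same value as a single serial pass over `buffer`. In this sketch, a new thread is created for every call, which is precisely the kind of overhead a production system would try to minimize, for example by reusing long-lived worker threads.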
