Open Access
Advanced Unstructured Data Processing for ESG Reports: A Methodology for Structured Transformation and Enhanced Analysis
Jiahui Peng,
Jing Gao,
Xin Tong,
Jing Guo,
Hang Yang,
Jianchuan Qi,
Ruiqiao Li,
Nan Li,
Ming Xu
Publication year: 2024
In the evolving field of corporate sustainability, analyzing unstructured Environmental, Social, and Governance (ESG) reports is a complex challenge due to their varied formats and intricate content. This study introduces an innovative methodology utilizing the "Unstructured Core Library", specifically tailored to address these challenges by transforming ESG reports into structured, analyzable formats. Our approach significantly advances the existing research by offering high-precision text cleaning, adept identification and extraction of text from images, and standardization of tables within these reports. Emphasizing its capability to handle diverse data types, including text, images, and tables, the method adeptly manages the nuances of differing page layouts and report styles across industries. This research marks a substantial contribution to the fields of industrial ecology and corporate sustainability assessment, paving the way for the application of advanced NLP technologies and large language models in the analysis of corporate governance and sustainability. Our code is available at
