Open Access
Trustworthiness Evaluation of Large Language Models Using Multi-Criteria Decision Making
Author(s) -
Meltem Aksoy,
Aylin Adem,
Metin Dagdeviren
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3612568
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
As large language models (LLMs) become increasingly integrated into high-stakes applications, ensuring their trustworthiness has emerged as a critical research concern. This study proposes a novel evaluation framework that applies a multi-criteria decision-making (MCDM) methodology, specifically the hesitant fuzzy analytic hierarchy process (AHP), to assess and rank LLMs across five key trust dimensions: fairness, robustness, integrity, explainability, and safety. Drawing on expert evaluations, the framework systematically determines the relative importance of each criterion and applies a weighted scoring approach to compare seven leading LLMs, including proprietary models such as GPT-3.5, GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5, and open-source models such as Llama 3.1, Mistral Large 2, and DeepSeek V3. Results reveal GPT-4o as the most trustworthy model, significantly outperforming its peers, particularly in robustness and fairness. Open-source models scored lower, especially in safety and explainability, highlighting persistent gaps in their alignment with trust expectations. The findings demonstrate the effectiveness of MCDM in capturing expert uncertainty and prioritizing trust criteria, offering a robust and adaptable framework for evaluating LLMs in dynamic and sensitive domains.
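To illustrate the weighted-scoring step the abstract describes, the sketch below applies criterion weights to per-model scores and ranks the models by weighted sum. All weights and scores here are made-up placeholders for illustration only; they are not the paper's actual expert-derived values, and the hesitant fuzzy AHP step that would produce the weights is not reproduced.

```python
# Minimal sketch of MCDM weighted scoring over the paper's five trust criteria.
# Weights and scores are hypothetical, not taken from the study.

# Illustrative criterion weights (in the paper these come from hesitant
# fuzzy AHP over expert pairwise comparisons); they sum to 1.
weights = {
    "fairness": 0.25,
    "robustness": 0.25,
    "integrity": 0.15,
    "explainability": 0.15,
    "safety": 0.20,
}

# Illustrative per-model scores on a 0-1 scale.
scores = {
    "GPT-4o":    {"fairness": 0.90, "robustness": 0.90, "integrity": 0.80,
                  "explainability": 0.80, "safety": 0.85},
    "Llama 3.1": {"fairness": 0.70, "robustness": 0.70, "integrity": 0.70,
                  "explainability": 0.60, "safety": 0.60},
}

def weighted_score(model_scores: dict, weights: dict) -> float:
    """Weighted sum of criterion scores for one model."""
    return sum(weights[c] * model_scores[c] for c in weights)

# Rank models from highest to lowest weighted score.
ranking = sorted(scores, key=lambda m: weighted_score(scores[m], weights),
                 reverse=True)
```

With these placeholder numbers, the weighted sum for a model is simply the dot product of its criterion scores with the weight vector, and the ranking follows directly; the study's actual contribution lies in deriving the weights from hesitant fuzzy expert judgments rather than fixing them by hand.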
