Investigating Data Consistency in the ASHRAE Dataset Using Clustering and Label Matching
Author(s) - Hui-Hui Tan, Yi-Fei Tan, Wooi-Haw Tan, Chee-Pun Ooi
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3615311
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Data is a critical component in many fields, enabling researchers to perform analyses, improve decision-making, and carry out optimization and scientific research. However, poor data quality can lead to flawed decisions and inefficiencies. Data quality is typically evaluated along four primary aspects: accuracy, completeness, consistency, and timeliness. Human preference features in a dataset can introduce inconsistency. One popular dataset in the thermal comfort field, the ASHRAE Comfort Database II, contains such human preference features, which may cause data inconsistency and degrade the performance of machine learning models. In this paper, a Clustering and Label Matching method is proposed to clean the data and improve its consistency. Outlier detection techniques serve as reference methods for comparing how well different approaches improve dataset consistency; performance is assessed using cluster analysis and validated by evaluating prediction models. The results show that the proposed method can perform deep cleaning that enhances dataset consistency.
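The abstract names the Clustering and Label Matching approach without giving its steps. As an illustration only, the following minimal sketch shows one plausible reading of the idea: cluster the feature vectors, take the majority label within each cluster, and keep only the samples whose own label matches it. The plain k-means routine, the function names `kmeans` and `cluster_label_match`, and the synthetic two-blob data are all assumptions made for this example; they are not taken from the paper and do not use the ASHRAE data.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns a cluster index per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            X[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign


def cluster_label_match(X, y, k, seed=0):
    """Hypothetical helper: True where a sample's label equals its cluster's majority label."""
    assign = kmeans(X, k, seed=seed)
    keep = np.zeros(len(X), dtype=bool)
    for j in range(k):
        members = assign == j
        if not members.any():
            continue
        labels, counts = np.unique(y[members], return_counts=True)
        majority = labels[counts.argmax()]
        keep[members] = y[members] == majority
    return keep


# Synthetic stand-in data: two well-separated blobs, one label flipped in each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
y[[3, 25]] = [1, 0]  # two deliberately inconsistent labels

keep = cluster_label_match(X, y, k=2)
X_clean, y_clean = X[keep], y[keep]
```

In this toy setting the two flipped samples disagree with their cluster's majority label and are dropped, while the rest of the data is retained. The number of clusters and the matching rule are free design choices in this sketch and would need to follow the paper's actual procedure in practice.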