Blind test evaluation of consistency in macroscopic lithic raw material sorting
Author(s) - Agam Aviad, Wilson Lucy
Publication year - 2018
Publication title - Geoarchaeology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.696
H-Index - 44
eISSN - 1520-6548
pISSN - 0883-6353
DOI - 10.1002/gea.21720
Subject(s) - consistency (knowledge bases) , sorting , reliability (semiconductor) , computer science , classification scheme , process (computing) , set (abstract data type) , test (biology) , calibration , strengths and weaknesses , archaeology , artificial intelligence , statistics , geology , machine learning , mathematics , algorithm , psychology , geography , paleontology , physics , social psychology , power (physics) , quantum mechanics , programming language , operating system
Most archaeological lithic raw material studies depend upon macroscopic classification. However, since the human eye is a limited tool, inconsistencies in classification may arise; a process for evaluating and improving the reliability of macroscopic classification is therefore needed. We present the results of a blind test designed to evaluate consistency in macroscopic lithic raw material analysis, based on archaeological material from the Acheulo‐Yabrudian site of Qesem Cave (Israel). The test focuses on interobserver error and is aimed at identifying strengths and weaknesses within our own classification scheme. Twelve students, with varying degrees of experience and familiarity with the Qesem material, sorted 100 randomly selected flint pieces into flint types after a brief tutorial, based on a previously established database. In addition, the authors, LW and AA, performed the same test. We then compared the results, using LW's classifications as an anchor. Our results show that experience affects consistency in classification, demonstrating that it is an acquired skill. Furthermore, the blind test allowed us to identify weaknesses within the classification scheme. We suggest that blind tests should be used regularly to check the accuracy and reproducibility of results and to assess the definitions set by the analyst, allowing fine‐tuning and calibration of the classification process.
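As a minimal sketch of the kind of comparison such a blind test implies (not the authors' actual analysis, which the abstract does not specify), interobserver consistency against an anchor classification can be quantified with raw percent agreement and a chance-corrected statistic such as Cohen's kappa. All function names and the ten-piece example data below are hypothetical; a real study would load each participant's classifications of the full 100-piece sample.

# Hypothetical sketch: scoring one observer's flint-type assignments
# against an anchor classification (e.g., LW's results in this study).
from collections import Counter

def percent_agreement(anchor, observer):
    # Fraction of pieces assigned the same flint type as the anchor.
    matches = sum(a == o for a, o in zip(anchor, observer))
    return matches / len(anchor)

def cohens_kappa(anchor, observer):
    # Chance-corrected agreement between two categorical classifications.
    n = len(anchor)
    p_observed = percent_agreement(anchor, observer)
    # Expected chance agreement, from each rater's marginal type frequencies.
    freq_a = Counter(anchor)
    freq_o = Counter(observer)
    p_expected = sum(freq_a[t] * freq_o[t] for t in freq_a) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Invented classifications of ten flint pieces into types "T1".."T4".
anchor  = ["T1", "T1", "T2", "T3", "T2", "T4", "T1", "T3", "T2", "T4"]
student = ["T1", "T2", "T2", "T3", "T2", "T4", "T1", "T1", "T2", "T3"]

print(f"agreement: {percent_agreement(anchor, student):.0%}")  # 70%
print(f"kappa:     {cohens_kappa(anchor, student):.2f}")       # 0.59

Kappa is preferable to raw agreement here because with a small number of flint types two observers will match some pieces by chance alone; correcting for that makes scores comparable across observers with different type-frequency habits.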