
Standing on the Feet of Giants — Reproducibility in AI
Author(s) - Odd Erik Gundersen
Publication year - 2019
Publication title - AI Magazine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.597
H-Index - 79
eISSN - 2371-9621
pISSN - 0738-4602
DOI - 10.1609/aimag.v40i4.5185
Subject(s) - documentation, reproducibility, quality (philosophy), set (abstract data type), psychology, artificial intelligence, computer science, statistics, mathematics, epistemology, programming language, philosophy
Abstract - A recent study implies that research presented at top artificial intelligence conferences is not documented well enough for the research to be reproduced. My objective was to investigate whether the quality of the documentation is the same for industry and academic research or whether differences actually exist. My hypothesis is that industry and academic research presented at top artificial intelligence conferences is equally well documented. A total of 325 International Joint Conferences on Artificial Intelligence and Association for the Advancement of Artificial Intelligence research papers reporting empirical studies were surveyed. Of these, 268 were conducted by academia, 47 were collaborations, and 10 were conducted by industry. A set of 16 variables specifying how well the research is documented was reviewed for each paper, and each variable was analyzed individually. Three reproducibility metrics were used to assess the documentation quality of each paper. The findings indicate that academic research scores higher than industry and collaborations on all three reproducibility metrics. Academic research also scores highest on 15 of the 16 surveyed variables. The result is statistically significant for 3 of the 16 variables, but for none of the reproducibility metrics. The conclusion is that, while the results are largely not statistically significant, they still indicate that my hypothesis should probably be refuted. This is surprising, as the conferences use double-blind peer review and all research is judged according to the same standards.
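The abstract does not define the three reproducibility metrics or the 16 documentation variables. As a minimal illustrative sketch of how such a documentation score could be computed, assuming a simple equal-weight aggregation over surveyed variables (the variable names, the toy data, and the scoring rule below are assumptions, not the paper's actual instrument):

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical subset of the 16 surveyed documentation variables; the
# actual survey instrument is defined in the paper, not reproduced here.
VARIABLES = {"method_description", "pseudocode", "training_data",
             "validation_data", "test_data", "source_code"}

@dataclass
class Paper:
    affiliation: str   # "academia", "collaboration", or "industry"
    documented: set    # variables this paper documents

def doc_score(paper: Paper) -> float:
    """Fraction of surveyed variables the paper documents (assumed metric)."""
    return len(paper.documented & VARIABLES) / len(VARIABLES)

# Toy data only; these are not the study's survey results.
papers = [
    Paper("academia", {"method_description", "pseudocode", "test_data"}),
    Paper("collaboration", {"method_description", "training_data"}),
    Paper("industry", {"method_description"}),
]

for group in ("academia", "collaboration", "industry"):
    scores = [doc_score(p) for p in papers if p.affiliation == group]
    print(f"{group}: mean documentation score = {mean(scores):.2f}")
```

Per-variable differences between the three groups could then be tested with, for example, a chi-squared test on documented versus not-documented counts; the abstract reports that only 3 of the 16 variables reach statistical significance.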