
Investigating vulnerability datasets
Author(s) - Rodrigo Andrade, Vinícius Souza dos Santos
Publication year - 2021
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5753/vem.2021.17213
Subject(s) - commit, computer science, vulnerability (computing), secure coding, data science, software, work (physics), computer security, world wide web, software security assurance, information security, database, engineering, security service, programming language, mechanical engineering
Insecure software can cause severe damage to user experience and privacy. Therefore, developers should be able to prevent software vulnerabilities. However, detecting such problems is expensive and time-consuming. To mitigate this issue, researchers have proposed vulnerability datasets that make it easier to investigate the properties of vulnerabilities. In this work, we investigate one such dataset to better understand common vulnerabilities, the authors who introduce them into open-source projects, and commit properties. We use the Big-Vul dataset as a case study to answer the six research questions we define for this work. Our preliminary results indicate that the most common vulnerabilities occur in the Chromium project. Furthermore, it is mostly experienced authors who introduce these vulnerabilities. We conclude that such findings could help developers detect vulnerabilities.