Tests from traces: automated unit test extraction for R
Author(s) -
Filip Křikava,
Jan Vítek
Publication year - 2018
Publication title -
hal (le centre pour la communication scientifique directe)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/3213846.3213863
Subject(s) - computer science , unit testing , dynamism , source lines of code , code (set theory) , test (biology) , software , code coverage , programming language , test case , test data , extreme programming , machine learning , software development , software development process , paleontology , physics , regression analysis , set (abstract data type) , quantum mechanics , biology
Unit tests are labor-intensive to write and maintain. This paper looks into how well unit tests for a target software package can be extracted from the execution traces of client code. Our objective is to reduce the effort involved in creating test suites while minimizing the number and size of individual tests, and maximizing coverage. To evaluate the viability of our approach, we select a challenging target for automated test extraction, namely R, a programming language that is popular for data science applications. The challenges presented by R are its extreme dynamism, coerciveness, and lack of types. This combination decrease the efficacy of traditional test extraction techniques. We present Genthat, a tool developed over the last couple of years to non-invasively record execution traces of R programs and extract unit tests from those traces. We have carried out an evaluation on 1,545 packages comprising 1.7M lines of R code. The tests extracted by Genthat improved code coverage from the original rather low value of 267,496 lines to 700,918 lines. The running time of the generated tests is 1.9 times faster than the code they came from
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom