Automated Test Generation and Performance Improvement using Dynamic Program Analysis
Abstract
The software development process consists of various stages such as design, implementation,
and testing. Since programmers are deeply involved in these stages, their intuition and
expertise play a vital role in the success of software deployment. Therefore, various tools
have been developed that assist programmers in creating, debugging, and managing their
programs. Program analysis, the process of analyzing program behavior by examining its
source code or execution, forms the basis of such software tools. In this thesis, we identify
and address the following three research gaps in the area of program analysis.
Type-awareness in concolic testing
Software testing is a widely used dynamic program analysis technique. It helps to determine
program correctness by executing the program to identify undesired behaviors. Since manually
writing an effective test suite is challenging, various automated test generation approaches
have been proposed. Concolic testing is one such approach that outputs a test suite for a given
program with the intent of maximizing code coverage. This technique has been shown to be very
effective for statically typed languages. With the modernization of web programming, the
dynamically typed JavaScript language has become popular. We study the extension of concolic
testing to JavaScript programs. We show that, for JavaScript programs, such an approach
generates a large number of test cases that repeatedly explore the same paths and variable
types. Due to the resulting sizable test suite, adoption of this technique becomes impractical.
In our first work, we propose an effective approach that scales concolic testing by incorporating
type-awareness into conventional concolic testing. Our experiments demonstrate that,
for a significant percentage of the functions, type-aware concolic testing generates a small
fraction (less than 5%) of the test cases generated by conventional concolic testing. On
average, this approach achieves over 97% line coverage and over 94% branch coverage
across all functions in all benchmarks. Our experimental results show that type-aware
concolic testing is as effective as conventional concolic testing in detecting bugs
and achieving coverage, even with a small percentage of the test cases.
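To illustrate why type-awareness matters in a dynamically typed language, consider the
following minimal Python sketch (a stand-in for the JavaScript setting; the function and
its inputs are hypothetical). The same expression behaves differently depending on the
runtime types of its arguments, so a type-unaware concolic engine tends to re-explore
each path with many values of the same type, while a type-aware one can keep a single
representative per (path, type) pair:

```python
def double(x):
    # In a dynamically typed language, "x + x" means different things
    # depending on the runtime type of x, so each branch below is a
    # distinct (path, type) behavior worth one representative test.
    if isinstance(x, str):
        return x + x          # string concatenation: "ab" -> "abab"
    if isinstance(x, (int, float)):
        return x + x          # numeric addition: 3 -> 6
    return None               # any other type

# A type-unaware generator may emit many tests for the numeric path
# (1, 2, 3, ... all exercise it); a type-aware one needs far fewer.
tests = ["ab", 3, [1]]
results = [double(t) for t in tests]
```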
Test synthesis to detect redundant traversal bugs
Test generation techniques like concolic testing focus on verifying the functional correctness
of a program. However, producing correct results is not the only requirement of a program:
these results should also be achieved within a budgeted time. Efficiency, another important
aspect of a program, can be validated with performance testing. Our second work focuses
on automated performance testing in the context of redundant traversal bugs. Such
bugs occur when a program fragment performs the same computation repeatedly across
loop iterations. Various static and dynamic analysis techniques have been proposed to detect
such bugs automatically. However, the effectiveness of dynamic analyses depends
on the input test cases, and static analyses are less useful in validating the presence of this
problem.
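As a concrete illustration (a minimal Python sketch, not drawn from the studied libraries),
the following function re-traverses the same list on every iteration of its outer loop; these
repeated scans are the kind of redundant traversal that the generated tests aim to expose,
and hoisting the traversal out of the loop removes the inefficiency:

```python
def has_pair_with_sum(items, target):
    # Redundant traversal bug: "(target - x) in items" re-scans the
    # whole list on every outer iteration, repeating the same work.
    for x in items:
        if (target - x) in items:   # O(n) scan inside an O(n) loop
            return True
    return False

def has_pair_with_sum_fast(items, target):
    # Fix: perform the traversal once, building a set up front.
    seen = set(items)
    for x in items:
        if (target - x) in seen:    # O(1) lookup per iteration
            return True
    return False
```

A performance test for this bug would feed both versions a large input and flag the
quadratic variant when its running time grows disproportionately.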
We propose an approach to automatically generate tests that expose redundant traversal of
loops. Our experiments reveal the effectiveness of the proposed approach, which generates
224 tests that reveal 46 bugs across seven libraries, including 34 previously unknown
bugs. The tests generated using our approach significantly outperform randomly generated
tests in their ability to expose these inefficiencies, demonstrating the usefulness of
our design.
Performance hints for stable multithreaded systems
Assuring the correctness and efficiency of sequential programs is easier than doing so for
multithreaded programs. This is mainly because the execution of multithreaded programs
depends not only on the input but also on thread interleavings. Stable multithreaded
systems restrict the set of possible schedules such that the schedule remains unaffected for a
given input, even with minor changes. This eases the process of bug reproduction and
thereby helps guarantee correctness. However, the determinism imposed by these systems
can potentially serialize the schedule, resulting in performance degradation. Our third
work focuses on automatically deriving performance hints so that such serialization can
be avoided. Our experimental results demonstrate that we reduce the overall
execution time of programs by up to 34% compared to execution times where
barriers are inserted manually. Moreover, we observe a performance improvement ranging
from 38% to 88% compared to programs without barriers. Our experimental results also
show that adapting PEGASUS to infer barriers for multiple versions of a source program is
seamless.
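The role of a barrier as a performance hint can be sketched in Python (a simplified,
hypothetical analogy, not the PEGASUS implementation). The barrier marks the point
where all workers must meet, so a deterministic scheduler is free to run the phase
before the barrier in parallel rather than serializing the threads:

```python
import threading

N = 4
barrier = threading.Barrier(N)   # hint: all N threads synchronize here
partial = [0] * N
total = []

def worker(i):
    # Phase 1: independent work, safe to run in parallel.
    partial[i] = sum(range(i * 100, (i + 1) * 100))
    barrier.wait()               # all partial results are ready past this point
    if i == 0:
        # Phase 2: a single thread combines the results after the barrier.
        total.append(sum(partial))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the barrier, a deterministic scheduler would have to order the threads
conservatively to preserve reproducibility; the hint makes the safe parallelism explicit.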
In this thesis, we motivate the problems mentioned above and discuss the proposed solutions
in detail. We implement the above ideas, develop research prototypes, and evaluate them
on well-known real-world programs. The ease of using these prototypes makes our proposed
approaches a viable contribution toward designing software tools.
All the tools developed as part of this thesis have been released publicly.