Typestates and Beyond: Verifying Rich Behavioral Properties Over Complex Programs
Abstract
Statically verifying behavioral properties of programs is an important research problem. An
efficient solution to this problem will have visible effects over multiple domains, ranging from
program development, program debugging, program correction and verification, etc. Type
systems are the most prevalently used static, light-weight verification systems for verifying
certain properties of programs. Unfortunately, simple types are inadequate at verifying many
behavioral/dynamic properties of programs. Typestates can tame this inadequacy of simple
types by associating each type in a programming language with a state information. However,
there are two major challenges in statically analyzing and verifying typestate properties over
programs.
The first challenge may be attributed to \increasing complexity of programs". The original
work on typestates can only verify/analyze a typestate property over very simple programs
which lacked dynamic memory allocation or aliasing. Subsequently, the following works on
typestates extended and improvised the analysis over programs with aliasing and heaps. However,
the state-of-the-art static typestate analysis works still cannot handle formidably rich
programming features like asynchrony, library calls and callbacks, concurrency, etc. The second
challenge may be attributed to \complexity of the property being verified". The original
and the current notion of typestates can only verify a property de nable through a finite-state
abstraction. This makes the state-of-the-art typestate analysis and verification works inadequate
to verify useful but richer non-regular program properties. For example, using classical
typestates we can verify a property like, \pop be called on a stack only after a push operation",
but we cannot verify a non-regular program property like, \number of push operations should
be at least equal to the number of pop operations". Currently, these behavioral properties are
mostly verified/enforced by programmers at runtime via explicit checks. Unfortunately, these
runtime checks are costly, error-prone, and lay an extra burden on the programmer.
In this thesis we take small steps towards tackling both these challenges. Addressing complex
program features, we present an asynchrony-aware static analysis, taking Android applications
as our use case. Android applications have convoluted control
flow, and complex features like asynchronous inter-component communications, library callbacks, Android enforced control flows (called as lifecycles), resource XMLs, which de ne and register event-handlers etc. Unfortunately,
none of the available static analysis works for Android soundly captures all these
features. We provide a formal semantics for Android asynchronous control
flow, capturing
these features. We use this semantics to introduce an intermediate program representation for
Android applications, called the Android Inter-Component Control Flow Graph (AICCFG),
and develop an asynchrony-aware interprocedural static analysis framework for Android applications.
We use this framework to develop a static typestate analysis to capture Android
resource API usage protocol violations. We present a set of benchmark applications for different
resource types, and empirically compare our typestate analysis with the state-of-the-art
synchronous static analyses for Android applications.
Addressing the challenges associated with increasing complexity of properties, we present an
expressive notion of typestates called, Parameterized typestates (p-typestates). p-typestates,
associate an extra Pressburger de nable property along with states of regular typestates. This
allows p-typestates to express many useful non-regular properties. We formally de ne this
notion of p-typestates, and a p-typestate property automaton, to represent a p-typestate property.
We present a dependent type system for these parameterized typestates and present
a simple typestate-oriented language incorporating p-typestates. Further, typechecking such
rich p-typestate properties require a programmer to provide invariants for loops and recursive
structures. Unfortunately, providing these invariants is a non-trivial task even for expert programmers.
To solve this problem, we present a simple and novel loop invariant calculation
approach for Pressburger de nable systems. We encode a number of real-world programs in
our dependently-typed language and use p-typestates type system, and loop-invariant calculation
to verify several rich properties which are not amenable to regular typestate analysis and
verification.
Finally we discuss several possible extensions of our thesis along both these directions