A Fine-Grained Dynamic Information Flow Analysis for Android Apps
Android has been steadily gaining popularity ever since its launch in 2008. One of the major factors behind this is the easy availability of a large variety of apps, ranging from simple calculators to apps that help people maintain their schedules and thus manage many aspects of their lives. In addition, many apps are free to the user thanks to in-app purchases and advertisements. However, these also raise many security concerns. Apps are privy to a lot of private information about the user, such as their contacts and location. It is essential to ascertain that apps do not leak such information to untrustworthy entities. To address this problem, many static and dynamic analyses have been proposed that track private data accessed or generated by an app to its destination. Such analyses are commonly known as information flow analyses. Dynamic analysis techniques, such as TaintDroid, track private information and alert the user when it is accessed by specific API calls. However, they do not track the path taken by the information, which can be useful in debugging and validation scenarios. The first key contribution of this thesis is a model for dynamic information flow analysis, inspired by FlowDroid and TaintDroid, that retains the path information of sensitive data in an efficient manner. The model instruments the app and uses path-edges to track information flows during a dynamic run. We describe the data structure and transfer functions used, and explain how the design follows from the challenges posed by the Android programming model and from efficiency requirements. The second key contribution is the capability to trace the path taken by sensitive information using the data collected during the analysis, as well as the capability to complement static analyses such as FlowDroid with the output of this analysis.
Tests conducted on the implemented model using DroidBench and GeekBench 3 demonstrate the precision and soundness of the analysis and show a performance overhead of 25%, while real-world apps show negligible lag. All leaks seen in DroidBench were successfully tracked and verified to be true positives. We also tested the model on 10 real-world apps, where the dynamic analysis found on average about 16.4% of the total path-edges found by FlowDroid.
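To make the idea of retaining path information concrete, the following is a minimal sketch of how path-edges recorded during a run can later be walked back from a sink to the originating source. It is not the data structure used in the thesis: the class `PathEdgeTracker`, its methods, the variable-level granularity, and the string statement labels are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the thesis implementation: each tainted variable
// maps to the path-edge that tainted it, and each edge keeps a link to its
// predecessor edge so the full path can be rebuilt after the run.
public class PathEdgeTracker {

    // One step of a flow: the statement that propagated the taint, plus the
    // edge that tainted the value being read (null for a source).
    static final class PathEdge {
        final PathEdge predecessor;
        final String statement;

        PathEdge(PathEdge predecessor, String statement) {
            this.predecessor = predecessor;
            this.statement = statement;
        }
    }

    private final Map<String, PathEdge> taintedVars = new HashMap<>();

    // Assumed to be called by instrumentation when a source API returns into `var`.
    public void source(String var, String statement) {
        taintedVars.put(var, new PathEdge(null, statement));
    }

    // Transfer function for an assignment `to = from`: the taint and its
    // history are copied if `from` is tainted, otherwise `to` becomes clean.
    public void assign(String to, String from, String statement) {
        PathEdge incoming = taintedVars.get(from);
        if (incoming != null) {
            taintedVars.put(to, new PathEdge(incoming, statement));
        } else {
            taintedVars.remove(to);
        }
    }

    public boolean isTainted(String var) {
        return taintedVars.containsKey(var);
    }

    // Walks predecessor links from the value reaching a sink back to its
    // source and returns the statements in execution order. The result is
    // empty if `sinkArg` is not tainted.
    public List<String> trace(String sinkArg) {
        List<String> path = new ArrayList<>();
        for (PathEdge e = taintedVars.get(sinkArg); e != null; e = e.predecessor) {
            path.add(e.statement);
        }
        Collections.reverse(path);
        return path;
    }
}
```

Because each edge links to its predecessor edge rather than to a variable name, the recorded path survives even if an intermediate variable is later overwritten with untainted data. A real analysis for Android would additionally have to handle fields, aliasing, method calls, and callbacks, which this sketch ignores.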