Presented by Yichi Zhang 2019-4-4

ESEC/FSE 2018, CCF-A

Experiment Report or Conference Paper • Idea: Incremental, but pragmatic • Result outweighs Implementation: A Systematic Comparison of Existing Tools • Output: Take-aways for personal research

Focus on Taint Analysis

Motivation • “What tool is the optimal choice in which application context?” • Challenges: • 1. Incongruence in Sink/Source Labelling • 2. Incongruence in Output format

Figure 1: Inconsistent outputs

Motivation • “What tool is the optimal choice in which application context?” • Challenges: • 1. Incongruence in Sink/Source Labelling • 2. Incongruence in Output format • 3. Imprecise “Ground truth”

Which source? Which sink? What are internal nodes? Sometimes incorrect. Figure 2: Imprecise “Ground Truth” in DroidBench and ICC-Bench

Overview

Main Contribution • 1. Android App Analysis Query Language (AQL) QUESTION Figure 3: Get all flows in one apk Figure 4: Yes-or-no question: does a flow exist?

Main Contribution • 1. Android App Analysis Query Language (AQL) ANSWER Figure 5: Answer to “Flows IN ... ?”

Main Contribution • 2. AQL System: (From a user perspective) • Input: AQL Question • Output: AQL Answer • Procedure: • 1. Configure a tool with an analysis target and runtime parameters • 2. Run the tool • 3. Convert the tool's output into an AQL Answer Figure 6: AQL system
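The three-step procedure above can be sketched as a small pipeline. This is only an illustration of the idea, not the AQL system's actual code: `ToolConfig`, `fake_tool_run`, `fake_tool_convert`, and the "source -> sink" output format are all hypothetical, and a real setup would launch an actual analysis tool instead of the stand-in function.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Flow:
    """One taint flow in a tool-independent form (hypothetical)."""
    source: str
    sink: str


@dataclass
class ToolConfig:
    """Step 1: a tool configured with how to run it and how to convert its output."""
    name: str
    run: Callable[[str], str]             # apk path -> raw tool output
    convert: Callable[[str], List[Flow]]  # raw tool output -> common answer format


def fake_tool_run(apk_path: str) -> str:
    # Stand-in for launching a real analysis tool on apk_path.
    return "getDeviceId -> sendTextMessage\ngetLastKnownLocation -> Log.d"


def fake_tool_convert(raw: str) -> List[Flow]:
    # Parse the assumed "source -> sink" lines into Flow objects.
    flows = []
    for line in raw.splitlines():
        source, sink = (part.strip() for part in line.split("->"))
        flows.append(Flow(source, sink))
    return flows


def answer_question(apk_path: str, tool: ToolConfig) -> List[Flow]:
    raw = tool.run(apk_path)   # Step 2: run the tool
    return tool.convert(raw)   # Step 3: convert its output into the common format


tool = ToolConfig("FakeTool", fake_tool_run, fake_tool_convert)
answer = answer_question("app.apk", tool)
```

Because every tool is wrapped behind the same `run`/`convert` pair, answers from different tools end up in one format and can be compared directly.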

Main Contribution • 3. Benchmark Refinement and Execution Wizard (BREW) (From a user perspective) • Input: .apk file • Output: Ground truth, i.e. the exact data leaks and their number • Procedure: • 1. Case identification • 2. Source/sink labeling (SuSi, machine-learning based) • 3. Flows are preselected automatically; the user manually deselects unwanted ones • 4. Generate the ground truth as an AQL Answer • 5. Compare an analysis tool's result with the ground truth
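Step 5 of the procedure above, comparing a tool's result with the ground truth, can be illustrated as follows. This is a minimal sketch under assumptions: a flow is reduced to a hypothetical `(source, sink)` pair, and the ground truth maps each benchmark case to whether the flow is expected to be found. BREW's real data model is richer than this.

```python
from typing import Dict, Set, Tuple

Flow = Tuple[str, str]  # (source, sink), a simplified stand-in for an AQL flow


def compare_with_ground_truth(
    ground_truth: Dict[Flow, bool], reported: Set[Flow]
) -> Dict[Flow, str]:
    """Label every benchmark case by comparing expectation with the tool's report."""
    verdicts = {}
    for flow, expected in ground_truth.items():
        found = flow in reported
        if expected and found:
            verdicts[flow] = "true positive"
        elif expected and not found:
            verdicts[flow] = "false negative"
        elif not expected and found:
            verdicts[flow] = "false positive"
        else:
            verdicts[flow] = "true negative"
    return verdicts


# Hypothetical benchmark: one positive case and one negative case.
ground_truth = {
    ("getDeviceId", "sendTextMessage"): True,   # positive case: flow must be found
    ("getDeviceId", "Log.d"): False,            # negative case: flow must not be found
}
reported = {("getDeviceId", "sendTextMessage"), ("getDeviceId", "Log.d")}
verdicts = compare_with_ground_truth(ground_truth, reported)
```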

Main Contribution • 4. Ground truths, e.g. 21 newly developed apps, of which 18 apps provide 18 positive and 6 negative benchmark cases and 3 apps are dedicated to ICC/IAC features; plus 22 precise positive benchmark cases on DialDroid, which encompasses 30 large real-world apps.

Analysis Tools in the study Figure 7: Tools involved in the Study

Result: 1. Do Android app analysis tools keep their promises*? Figure 8: Result of Supported Feature *Promises: Supported Feature and Accuracy

Result: 1. Do Android app analysis tools keep their promises? Figure 9(a): Result of F-score on Different benchmark suites

Result: 1. Do Android app analysis tools keep their promises? Figure 9(b): Result of F-score on Different benchmark suites

Result: 2. How do the tools compare to each other with respect to accuracy? Figure 10: Result of F-score in different features on DroidBench 3.0 On average, FlowDroid and Amandroid win.

Result: 3. Which tools support large-scale analyses of real-world apps? Figure 11: No tool successfully finishes all 30 apps. DIDFail and FlowDroid win.

Result: 3. Which tools support large-scale analyses of real-world apps? API 26: Android 8.0; API 19: Android 4.4. Figure 12: Ability to analyze newer apps. Limited by the tools' dependencies on ApkTool (decompiler) and ApkCombiner (for the IAC feature).

Limitation • 1. Analysis tools were run in their default configuration. • Implication: Before relying on these results, check whether a tool supports additional parameters that yield better results. • 2. Bugs in the AQL system. Because the format of a tool's output can be imprecise, the translation over-approximates. • Implication: Overloaded methods (same name, different parameters) may be treated as the same method.
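The second limitation can be illustrated with a small sketch. It assumes, purely for illustration, that a tool reports only a qualified method name without the parameter list; the class `com.example.Net`, its methods, and the helper names are hypothetical and not taken from the paper or any tool.

```python
def drop_parameters(signature: str) -> str:
    """Reduce a full method signature to its qualified name only."""
    return signature.split("(")[0]


def matches(tool_method_name: str, ground_truth_signature: str) -> bool:
    # With no parameter list in the tool output, the comparison can only use
    # the name, so it over-approximates.
    return drop_parameters(ground_truth_signature) == tool_method_name


leaking = "com.example.Net.send(java.lang.String)"  # hypothetical sink
harmless = "com.example.Net.send(byte[])"           # hypothetical method with the same name

tool_output = "com.example.Net.send"  # assumed imprecise tool output

leaking_matched = matches(tool_output, leaking)
harmless_matched = matches(tool_output, harmless)
```

Both comparisons succeed, so a flow into either method would be counted as the same flow. That is the over-approximation the slide warns about.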

Discussion Ideas: 1. This tool facilitates analysis with different analysis tools 2. The precisely defined ground truth can be used for further research 3. One can use BREW to generate new ground truth and compare the performance of analysis tools, e.g. whether an analysis tool still finds the flow correctly after the code is altered. Bounced off Ideas: ........

Benchmark case • Component: An App, or combination of Apps • Positive case: the flow is expected to be detected • Negative case: the flow is not expected to be detected • Success: True positive and true negative • Failure: False positive and false negative
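The F-scores reported in the result figures follow from these counts. As a reminder, the standard definitions of precision, recall, and F-score (their harmonic mean) can be computed as below. The function name and the example counts are illustrative only and are not results from the paper.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """Standard F-score: the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of reported flows that are real
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of expected flows that were found
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 8 true positives, 2 false positives, 2 false negatives.
example = f_score(tp=8, fp=2, fn=2)
```

True negatives do not enter the formula, so a tool cannot raise its F-score merely by staying silent on negative cases.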
