This research paper explores the relationship between user reviews and the security and privacy of Android applications. By analyzing app reviews and permissions, the study uncovers insights into the impact of user feedback on app security.
Short Text, Large Effect: Measuring the Impact of User Reviews on Android App Security & Privacy
Duc Cuong Nguyen, Michael Backes, Erik Derr, Sven Bugiel
CISPA, Saarland University; CISPA Helmholtz Center i.G.
Introduction
• What are application markets?
  • Places where developers distribute their apps and where users find and install them.
• How do application markets help app developers and users?
  • They act as a communication channel between users and app developers.
• What are reviews?
  • Short text messages: an opinion about the product, a deciding factor for other users, feedback, feature requests, bug reports.
• Issues studied in the paper.
• Goals.
Related work
• Using natural language processing:
  • CHABADA, WHYPER, and AutoCog
  • These work on app descriptions written by the developer
• Processing app reviews:
  • ChangeAdvisor: classifies app reviews that are useful for maintenance
  • WisCom: studies review patterns to determine why users dislike an application
• App security evolution
  • As applications evolve, they tend to request more and more permissions, violating the principle of least privilege.
• This study deals with the connection between user reviews and application security and privacy.
Methodology contd.
• App and Review Crawler (Mining User Reviews)
  • Most popular apps in the Google Play Store
  • At least 50,000,000 downloads
  • 2,583 apps with their reviews and app history
  • Only reviews written in English
• Why the Google Play Store?
• Why only applications with 50M downloads? What is the downside of this constraint for the research?
  • 2.6 million apps as of December 2018
  • Collecting reviews and app history is time-consuming, so the authors limited the data set with a threshold of 50M downloads.
  • Downside: only a small share of apps is studied, and most of them come from big-name publishers.
Methodology contd.
• App and Review Crawler (Crawling App History)
  • Created an app repository to store each app's version history
  • Used an undocumented API to query the Google Play Store
  • Collected 62,838 distinct app versions (about 24 versions per app)
• Issues faced in this approach
  • The API takes a package name and a version code as query parameters. Mapping was possible for 82.3% (2162/2583).
  • Release dates could not be queried, which makes it difficult to map old reviews to older versions. The authors used data from market analysis companies for this and were able to map 81.52% of all app versions (51225/62838).
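Once release dates are known, assigning a review to the version it was written against is a sorted lookup. The following is a minimal sketch of that idea, not the authors' implementation: the function name `version_for_review`, the sample release history, and the rule "a review belongs to the latest version released on or before the review date" are all assumptions made for illustration.

```python
from bisect import bisect_right
from datetime import date


def version_for_review(releases, review_date):
    """Return the version code that was live on review_date, or None.

    releases: list of (release_date, version_code), sorted by release_date.
    Assumption: a review refers to the latest version released on or
    before the day the review was posted.
    """
    dates = [released for released, _ in releases]
    idx = bisect_right(dates, review_date) - 1
    return releases[idx][1] if idx >= 0 else None


# Hypothetical release history for a single app.
releases = [
    (date(2017, 1, 10), 100),
    (date(2017, 3, 5), 110),
    (date(2017, 6, 20), 120),
]

print(version_for_review(releases, date(2017, 4, 1)))
```

Reviews posted before the first known release return `None`, which mirrors the situation on the slide where some versions could not be dated.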
Methodology contd.
• Review Classifier
  • Used keywords to identify candidate reviews
  • Neighboring sentences are also used to expand the classifier's knowledge
  • Classified reviews that mention Android permissions or resources as security- and privacy-relevant reviews (SPRs)
  • The keyword list was compiled manually from SPRs and from the permissions listed in the Android documentation
• Could they have used a better list?
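The keyword step with neighboring sentences can be sketched as follows. This is an illustrative approximation only: the `KEYWORDS` set is a hypothetical stand-in for the study's list, the sentence splitter is deliberately naive, and matching is on exact lower-cased words without stemming.

```python
import re

# Hypothetical keyword list; the study's actual list is not reproduced here.
KEYWORDS = {"permission", "privacy", "location", "contacts", "camera"}


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def candidate_context(review, window=1):
    """Return keyword-bearing sentences plus `window` neighbors on each side."""
    sentences = split_sentences(review)
    keep = set()
    for i, sentence in enumerate(sentences):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        if words & KEYWORDS:  # exact-word match, no stemming
            start = max(0, i - window)
            stop = min(len(sentences), i + window + 1)
            keep.update(range(start, stop))
    return [sentences[i] for i in sorted(keep)]


review = "Great game. Why does it need my contacts? Uninstalling now. Five stars otherwise."
print(candidate_context(review))
```

A review with no keyword hit yields an empty list and would not be passed on to the classifier in this sketch.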
Methodology contd.
• Training Data
  • The keyword list yielded 1.85M potential SPRs
  • Chose 4,000 random reviews to label manually
  • Ended up with 3,891 labeled reviews (SPR: 586, non-SPR: 3,305)
• Feature Extraction
  • Character n-grams: n = 3, 4, 5
• Machine Learning Model
  • Bag of words
  • Each occurrence of a word is treated as a feature for training classifiers
• Validation
  • K-fold cross-validation (k = 10)
  • AUC (area under the ROC curve) was chosen because it is insensitive to class imbalance. Mean AUC: 0.93.
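The feature extraction and validation steps can be illustrated without any particular classifier. The sketch below shows character n-grams for n = 3, 4, 5, a simple 10-fold split, and AUC computed as the probability that a random positive outscores a random negative. The function names are hypothetical, the folds are neither shuffled nor stratified, and the slide does not say which classifier or library the authors used.

```python
from collections import Counter


def char_ngrams(text, sizes=(3, 4, 5)):
    """Count character n-grams of the given sizes in lower-cased text."""
    text = text.lower()
    counts = Counter()
    for n in sizes:
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def kfold_indices(n_items, k=10):
    """Yield (train, test) index lists; item i goes to test fold i % k."""
    for fold in range(k):
        test = list(range(fold, n_items, k))
        held_out = set(test)
        train = [i for i in range(n_items) if i not in held_out]
        yield train, test


def auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Because AUC only compares the ranking of positives against negatives, it does not reward a classifier for simply predicting the majority class, which matters here given the 586 to 3,305 class split.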
Methodology contd.
• Static App Analysis
  • Compares the permissions the app actually requires against the permissions it declares.
  • Further checks whether the app implements runtime permissions.
  • Scans apps for dangerous permissions.
• Attribution Analysis
  • Checks whether each requested permission is used by the application's own code or by a third-party library.
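The comparison and attribution steps reduce to set operations once the hard part, extracting permission-protected API calls from the app's bytecode, has been done. The sketch below assumes that extraction already happened and that each usage is available as a (permission, calling class) pair. The function names, the package-prefix heuristic for attribution, and the sample data are assumptions for illustration. `DANGEROUS` is only a small subset of Android's dangerous permissions.

```python
# Small illustrative subset of Android's dangerous permissions.
DANGEROUS = {
    "android.permission.READ_CONTACTS",
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.CAMERA",
}


def permission_gaps(declared, used):
    """Compare declared permissions with those the code actually needs."""
    declared, used = set(declared), set(used)
    return {"unused": declared - used, "undeclared": used - declared}


def attribute(usages, app_package):
    """Map each used permission to {'app'} and/or {'library'}.

    Assumption: code outside the app's own package namespace belongs
    to a third-party library.
    """
    origins = {}
    for permission, caller in usages:
        own = caller == app_package or caller.startswith(app_package + ".")
        origins.setdefault(permission, set()).add("app" if own else "library")
    return origins


# Hypothetical analysis output for one app version.
declared = {
    "android.permission.CAMERA",
    "android.permission.READ_CONTACTS",
    "android.permission.INTERNET",
}
usages = [
    ("android.permission.CAMERA", "com.example.app.ui.Scanner"),
    ("android.permission.ACCESS_FINE_LOCATION", "com.adsdk.Tracker"),
]

gaps = permission_gaps(declared, {p for p, _ in usages})
dangerous_declared = declared & DANGEROUS
origins = attribute(usages, "com.example.app")
```

In this made-up example, `READ_CONTACTS` is declared but never used, and the location access originates from library code rather than from the app itself.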
Methodology contd.
• Mapping SPRs to security- and privacy-relevant updates (SPUs)
• Identifying candidate app versions
• SPR-to-SPU version distance
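The slide only names these steps, so the following is a loose sketch under explicit assumptions rather than the paper's procedure: candidate versions are taken to be all versions released after the reviewed one, an SPU is taken to be the first candidate that no longer holds the permission the review complained about, and the version distance is the number of releases between the two. The function name `first_spu` and the sample history are hypothetical.

```python
def first_spu(history, review_version, permission):
    """Find the first later version that drops `permission`.

    history: ordered list of (version_code, set_of_permissions).
    Returns (version_code, distance_in_releases) or None.
    Assumption: an SPU is an update that removes the permission
    mentioned in the review.
    """
    codes = [code for code, _ in history]
    start = codes.index(review_version)
    if permission not in history[start][1]:
        return None  # the reviewed version never held the permission
    for distance, (code, perms) in enumerate(history[start + 1:], start=1):
        if permission not in perms:
            return code, distance
    return None


# Hypothetical version history for one app.
history = [
    (100, {"CAMERA", "READ_CONTACTS"}),
    (110, {"CAMERA", "READ_CONTACTS"}),
    (120, {"CAMERA"}),
]

print(first_spu(history, 100, "READ_CONTACTS"))
```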
Findings
• Words mentioned in SPRs
• Sample responses by developers
Findings contd.
• Identified 5,527 SPRs from 4.5M reviews, covering 1,269 apps
• 2,898 SPRs could be mapped to permissions (via related words)
• 5,994 SPUs were identified following an SPR
• Mapping succeeded in 60.8% of cases
• Developers responded to 75.68% of SPRs