Randomizing Smartphone Malware Profiles against Statistical Mining Techniques. Abhijith Shastry Murat Kantarcioglu Yan Zhou Bhavani Thuraisingham. Outline. Introduction Related Work Smartphone Malware Development Attacking Statistical Data Mining Tools Experimental Results Conclusions.
Randomizing Smartphone Malware Profiles against Statistical Mining Techniques • Abhijith Shastry, Murat Kantarcioglu, Yan Zhou, Bhavani Thuraisingham
Outline • Introduction • Related Work • Smartphone Malware Development • Attacking Statistical Data Mining Tools • Experimental Results • Conclusions
Introduction • Smartphone usage: • Calling • Web browsing • Social networking • Online banking
Introduction (cont.) • Smartphone OS: • iOS • Android • Symbian • Microsoft Windows Phone OS
Introduction (cont.) • Common malicious activities on smartphones • Collecting user information (61%) • Sending premium-rate SMS messages (52%) • Amusement • Credential theft • SMS spam • Search engine optimization • Ransom
Introduction (cont.) • Smartphone Defense Mechanisms • Application permission (Android) • Review (iOS) • Review and sign (Symbian)
Introduction (cont.) • Smartphone application market • Apple app store • Applications are reviewed by Apple for security • Personal spyware is disallowed • Android market • Users can install applications from unofficial markets • Google does not review applications prior to their listing • Google can remotely uninstall known malware from users’ devices • Nokia Ovi • Symbian does not prevent or discourage users from installing applications from other sources • Symbian provides application signing services; only signed applications have permission to access dangerous privileges
Introduction (cont.) • User-approved permissions • Android: user-approved install-time permissions control access to the phone’s number, contact lists, camera, Bluetooth, etc. • iOS: requires user approval at runtime, rather than at installation time, to access location and send notifications • Symbian: user permission is needed to install unsigned apps.
Introduction (cont.) • Smartphone malware incentives • Amusement • Selling user information • Applications query mobile APIs for user’s location, lists of contacts, browser and download history, list of installed apps, IMEI (unique device id), inter-application data theft • Steal user credentials • Bank account credentials ($1000), credit card number ($25), e-mail account password ($30) • Intercepting SMS messages, phishing, keylogging and document scanning • Premium rate calls and SMS • On Android and Symbian, premium SMS can be sent completely unnoticed • SMS spam, search engine optimization, blackmail
Introduction (cont.) • Root exploit • Available for 74% of a smartphone’s lifetime • Become available on average 5.2 days after a firmware version is released • Prior to or on the official release date
Introduction (cont.) • Incentives for root exploits • Restricted app distributions • Users cannot perform complete system backup • Restrictions on sharing internet connections with computers • Users cannot remove pre-installed apps • Users cannot install custom versions of the OS
Introduction (cont.) • Smartphone malware activities • eavesdropping on phone calls • reading e-mail and call-logs • tracking callers’ locations • Malware detection techniques • Static analysis • Dynamic analysis
Related Work • Permission-based behavioral foot-printing (Zhou et al.) • Behavior checker (Yap and Ewe) • Power signature (Kim et al.) • Association rules and frequent episodes (Lee and Stolfo)
Smartphone Malware Development • Android platform (Samsung Captivate) • We assume that either through a direct installation or an indirect installation (through the payload of a benign application), the victim’s mobile phone is infected with the developed malware • Six parameterized malware programs • Privacy intrusion • Information theft • Denial-of-service
Smartphone Malware Development (cont.) • Call recorder • Eavesdropping on incoming and outgoing calls • Recorded files stored locally or uploaded to a remote server • Parameters: max_duration, max_filesize, num_skipped_calls, interval_record, interval_sleep, _upload, delete_local
Smartphone Malware Development (cont.) • Smart recorder • Eavesdropping on specific phone numbers on both incoming and outgoing calls • Specific phone numbers can be changed at runtime • Phone numbers are stored on a remote server and the recorded phone calls are uploaded to the server • Parameters: max_duration, max_filesize, num_skipped_calls, interval_record, interval_sleep
Smartphone Malware Development (cont.) • Mass Uploader • Upload contents of memory to the server • Also download contents from a remote server • Parameters: upload/download_BW, upload/download_interval, upload/download_inter_lim
Smartphone Malware Development (cont.) • DoS • Spawn a large number of threads, each performing expensive multiplications • Freeze smartphone with 200 threads • Parameters: max_threads, num_multiplications, interval_restart, interval_sleep
Smartphone Malware Development (cont.) • Spy recorder • Remotely turn on the microphone and start recording any voice input • Initiated and terminated by a phone call from a specific number • Microphone is turned on when a call comes from the number and the call is automatically rejected afterwards • User is not notified • Parameters: max_duration, max_filesize, interval_record, interval_sleep, _upload, delete_local
Smartphone Malware Development (cont.) • Spy camera • Taking snapshots from the camera • Pictures uploaded to a remote server • User is not notified when pictures are taken • Parameters: snap_interval, pic_dsample_ratio, pic_comp_quality, _upload, delete_local
Attacking Statistical Data Mining Tools • Data mining tools • Decision Tree • Logistic Regression • Naïve Bayes • Artificial neural network • Support vector machine • Common assumption • training data and test data are independent and identically distributed (iid).
Attacking Statistical Data Mining Tools (cont.) • Randomizing malware profiles • Malware parameters are randomized around an expected mean value. • Randomized profiles differ from one another in how far the parameters deviate from their mean values. • For example, Randx refers to a profile in which the variance from the mean value of the parameters is x. • Randomized profiles violate the iid assumption.
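The idea of a Randx profile can be illustrated with a small simulation. This is only a sketch under stated assumptions: the slides do not say which distribution was used, so a Gaussian centred on each mean with variance x is assumed here, and the mean values and the helper names `randomized_profile` and `spread` are hypothetical.

```python
import random
import statistics

def randomized_profile(means, variance, rng):
    """Draw one value per parameter around its mean.

    Assumption: a Gaussian with the given variance. Values are clamped at
    zero because intervals and sizes cannot be negative.
    """
    std_dev = variance ** 0.5
    return {name: max(0.0, rng.gauss(mean, std_dev)) for name, mean in means.items()}

def spread(profiles, name):
    """Population standard deviation of one parameter across profiles."""
    return statistics.pstdev([p[name] for p in profiles])

# Hypothetical mean values for two parameters named in the slides.
means = {"interval_record": 30.0, "interval_sleep": 60.0}
rng = random.Random(0)
rand_1 = [randomized_profile(means, 1.0, rng) for _ in range(1000)]
rand_25 = [randomized_profile(means, 25.0, rng) for _ in range(1000)]

print(spread(rand_1, "interval_record"), spread(rand_25, "interval_record"))
```

A classifier trained only on behavior produced under `rand_1` sees a much narrower range of values than `rand_25` produces at test time, which is one way the iid assumption can fail.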
Experimental Results • Data collection • Recording run-time behavior of applications • Features including CPU consumption, network traffic, memory usage, power, etc. • Features are recorded every five seconds and stored in a database on the mobile phone. • Data sets • 20 benign apps • 10 tools, 10 games • 6 malware programs plus their randomized profiles
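The on-device feature log described above could be organized roughly as follows. This is a hedged sketch, not the authors' collector: the table layout, column names, units, and the functions `open_store` and `record_sample` are assumptions, and SQLite is used only because the slides mention a database stored on the phone.

```python
import sqlite3

SAMPLE_PERIOD_S = 5  # the slides state one sample every five seconds

def open_store(path=":memory:"):
    """Create (or open) the sample table; the schema is illustrative."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS samples (
               ts REAL NOT NULL,
               app TEXT NOT NULL,
               cpu_pct REAL,
               net_bytes REAL,
               mem_kb REAL,
               power_mw REAL
           )"""
    )
    return conn

def record_sample(conn, ts, app, cpu_pct, net_bytes, mem_kb, power_mw):
    """Append one feature vector observed at time ts (seconds)."""
    conn.execute(
        "INSERT INTO samples VALUES (?, ?, ?, ?, ?, ?)",
        (ts, app, cpu_pct, net_bytes, mem_kb, power_mw),
    )
    conn.commit()

conn = open_store()
for i in range(3):
    # Placeholder readings; a real collector would query the OS here.
    record_sample(conn, i * SAMPLE_PERIOD_S, "example.app", 12.5, 2048.0, 51200.0, 310.0)
row_count = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
```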
Experimental Results (cont.) • Evaluation metrics • True positive rate • False positive rate • Accuracy • Area Under Curve (AUC)
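For reference, the four metrics can be computed as in the sketch below, assuming label 1 means malware and 0 means benign, and that both classes are present. The function names and the toy labels and scores are illustrative, not data from the experiments. AUC is computed here as the fraction of (malware, benign) pairs in which the malware sample receives the higher score, with ties counted as one half.

```python
def rates(labels, predictions):
    """Return (true positive rate, false positive rate, accuracy)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn), (tp + tn) / len(pairs)

def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: three malware samples, three benign samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]

tpr, fpr, accuracy = rates(labels, predictions)
area = auc(labels, scores)
```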
Experimental Results (cont.) • Experiment I: no randomized profiles • 20 benign programs, 6 malware programs • 10-fold cross-validation
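A minimal sketch of how 10-fold cross-validation partitions the data, assuming the folds are formed over individual feature samples (the slides do not say whether splitting was per sample or per application). The function name `k_fold_indices` and the sample count are illustrative.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(n_samples=100, k=10))
```

Each sample is used for testing exactly once and for training in the other nine folds; the reported metric is then typically the average over the ten folds.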
Experimental Results (cont.) • Experiment I (cont.): no randomized profiles • Training on games (right) has higher accuracy than training on tools (left)
Experimental Results (cont.) • Experiment II: with randomized profiles
Experimental Results (cont.) • Experiment II: observations • No consistent winner • When the training set does not contain randomized malware profiles, accuracy is limited to below 70%. • When the training set contains Randx and the test set contains Randy, good accuracy is obtained only when x ≈ y. • Averaging samples over 25-second durations helps improve predictive accuracy. Longer durations do not result in further improvement.
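The averaging step can be sketched as follows. With one sample every five seconds, a 25-second duration corresponds to five consecutive samples. Non-overlapping windows are an assumption here, since the slides do not say whether windows overlap, and the function name `consolidate` and the toy data are illustrative.

```python
def consolidate(samples, window=5):
    """Average consecutive feature vectors in non-overlapping windows.

    A trailing incomplete window is dropped.
    """
    averaged = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        averaged.append([sum(column) / window for column in zip(*chunk)])
    return averaged

# Toy data: 12 samples with two features each.
samples = [[float(i), 2.0 * i] for i in range(12)]
consolidated = consolidate(samples, window=5)
print(consolidated)  # [[2.0, 4.0], [7.0, 14.0]]
```

Averaging smooths out short-lived fluctuations in each feature, which may be why it helps against profiles whose parameters vary from run to run.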
Conclusions • We developed six custom parameterized malware programs on the Android platform. • We demonstrate that data mining algorithms are vulnerable to attacks with randomized malware profiles. • We also demonstrate that simple consolidation may effectively improve classification performance.