A First Step Toward Detecting SSH Identity Theft in HPC Cluster Environments: Discriminating Masqueraders Based on Comm

1. "A First Step Toward Detecting SSH Identity Theft in HPC Cluster Environments: Discriminating Masqueraders Based on Command Performance"

2. Motivation "SSH Identity Theft" - an open problem: a user has an SSH password or unencrypted keys stolen. Masquerade detection - an open problem: detecting when stolen authentication credentials allow an attacker to masquerade as a legitimate user.

3. Motivation-2 "SSH Identity Theft" - an open problem: a user has an SSH password or unencrypted keys stolen. Masquerade detection - an open problem: detecting when stolen authentication credentials allow an attacker to masquerade as a legitimate user. How? Capturing logins and confirming back with users; sample and inspect; syslog inspection.

4. Overview: Process Accounting, Experimental Design, Results, Conclusions


6. Process Accounting A method of recording and summarizing commands executed on systems. Originally created for billing purposes. Automation supporting human monitoring.
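To make "recording and summarizing commands" concrete, here is a minimal sketch of the summarizing step. It assumes the accounting records have already been reduced to (user, command) pairs; the record layout, the function name `summarize_commands`, and the sample users and commands are all hypothetical, not taken from the slides.

```python
from collections import Counter, defaultdict


def summarize_commands(records):
    """Return {user: Counter({command: count})} from (user, command) pairs.

    The (user, command) record layout is an assumption for illustration;
    real process accounting records carry more fields.
    """
    summary = defaultdict(Counter)
    for user, command in records:
        summary[user][command] += 1
    return dict(summary)


# Hypothetical sample records, for illustration only.
records = [
    ("alice", "ls"),
    ("alice", "mpirun"),
    ("alice", "ls"),
    ("bob", "ssh"),
]
summary = summarize_commands(records)
for user, counts in sorted(summary.items()):
    print(user, dict(counts))
```

A per-user command profile of this kind is one plausible input for the later classification step, though the slides do not say which features were actually used.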


8. Experimental Design The data. Technique: Support Vector Machine (SVM).
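The slide names the technique but not its configuration, so the following is only a sketch of how a linear SVM could separate two user populations by command counts. It uses a Pegasos-style subgradient method on the hinge loss with no bias term. The feature choice (counts of two commands), the toy data, and the parameters `lam` and `epochs` are assumptions for illustration, not the authors' setup.

```python
def train_linear_svm(samples, labels, lam=0.01, epochs=10):
    """Pegasos-style subgradient descent on the hinge loss (no bias term)."""
    w = [0.0] * len(samples[0])
    t = 0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            # Margin is measured with the weights before this step's update.
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # Regularization step: shrink the weights.
            w = [(1.0 - eta * lam) * wi for wi in w]
            if margin < 1.0:
                # The sample violates the margin, so pull w toward it.
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Hypothetical features: [count of "mpirun", count of "ssh"] per user.
# Label +1 = cluster user, -1 = Internet server user (toy data).
X = [[5, 0], [4, 1], [6, 0], [0, 5], [1, 4], [0, 6]]
y = [1, 1, 1, -1, -1, -1]
w = train_linear_svm(X, y)
print([predict(w, x) for x in X])
```

A production experiment would more likely rely on an established SVM library and a richer feature set; the hand-rolled trainer here only keeps the example self-contained.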


10. Results-1 Put TP, FP, and overall accuracy here first. 229 samples: 75 cluster, 154 Internet. Only 6 misidentified. Accuracy = ; precision = ; recall =
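The overall accuracy can be worked out from the counts on this slide (229 samples, 6 misidentified). The sketch below does that arithmetic. Precision and recall are defined but not evaluated, because the slide does not give the breakdown of the 6 errors into false positives and false negatives; the function names are illustrative.

```python
def accuracy(total, misidentified):
    """Fraction of samples classified correctly."""
    return (total - misidentified) / total


def precision(tp, fp):
    """Fraction of predicted positives that are true positives."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of actual positives that were found."""
    return tp / (tp + fn)


total_samples = 75 + 154  # cluster + Internet users, from the slide
acc = accuracy(total_samples, 6)
print(f"accuracy = {acc:.3f}")  # 223 of 229 correct
```

With 223 of 229 samples classified correctly, the accuracy comes to roughly 97.4%.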

11. Results-2

12. Results-3

13. Results-4 Same scale for both: 1-99, 100-199, 200-299, 300-399, 400-499, ..., 1K-10K, >10K. Similar distributions, but the total number of commands for the cluster is 1,834,905 (4X Internet), with five (1K-10K) and eight (>10K), while the total number of commands for Internet is 538,000, with 27 (1K-10K) and 2 (>10K).

14. Results-5 Figures 5 and 6 show a visualization of command popularity for HPC clusters and Internet servers, respectively. This visualization was first developed in [Schonlau2001] and has appeared in subsequent masquerade detection papers [ ]. Within this visualization, each vertical panel corresponds to one user, so there are 75 panels for HPC cluster users and 154 for Internet server users. Within each user panel, each command is represented by a single dot and assigned a distinct vertical position based on popularity, such that more popular commands have larger Y-axis ordinates. Commands are also plotted against time within each panel; a command used multiple times appears as a horizontal line starting at the ordinate given at its first entry time.
A uniqueness metric for individual commands is a useful concept for understanding how the SVM makes user classifications. For masquerade detection, commands not previously seen in a data set may indicate an attempted masquerade. Uniqueness = 1 - (total users of a command / total users)
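The uniqueness formula can be computed directly from per-user command lists. The sketch below assumes a mapping from user name to the commands that user ran; the function name `command_uniqueness` and the sample data are hypothetical.

```python
def command_uniqueness(user_commands):
    """Map each command to 1 - (users of the command / total users)."""
    total_users = len(user_commands)
    users_per_command = {}
    for commands in user_commands.values():
        # Count each user at most once per command.
        for command in set(commands):
            users_per_command[command] = users_per_command.get(command, 0) + 1
    return {c: 1 - n / total_users for c, n in users_per_command.items()}


# Hypothetical usage data for four users.
user_commands = {
    "u1": ["ls", "mpirun", "ls"],
    "u2": ["ls", "ssh"],
    "u3": ["ls", "ssh"],
    "u4": ["ls"],
}
uniqueness = command_uniqueness(user_commands)
print(uniqueness)
```

A command run by every user scores 0, while a command run by a single user approaches 1 as the user population grows, which is why rarely shared commands carry the most information about who is typing.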

15. Results-6 Similar. Cluster: 300 unique commands, most unique. Internet: 100 unique commands (Internet activity).

16. Results-7 Distinct command usage, different scales: Internet 90 (much higher), Cluster 25. Cluster = 426 different commands (unique cluster commands), tri-modal (20-29 peak). Internet = 263 different commands (10-19 peak, much higher); command usage more common.

17. Overview: Process Accounting, Experimental Design, Results, Conclusions

18. Conclusions Empirical results show it is feasible to effectively distinguish cluster users from enterprise users using command behavior. An intuitive explanation of why is still a work in progress. Questions remain (follow-on work): How will this masquerade detection technique work in a production environment? What is the sensitivity of these results? How do SVM results compare with other techniques? Will masquerade detection change attacker behavior?

19. Questions?
