1. A First Step Toward Detecting SSH Identity Theft in HPC Cluster Environments: Discriminating Masqueraders Based on Command Performance
2. Motivation SSH Identity Theft - an Open Problem
A user has an SSH password or unencrypted keys stolen
Masquerade Detection - an Open Problem
Detecting when stolen authentication credentials allow an attacker to masquerade as a legitimate user
3. Motivation-2 SSH Identity Theft - an Open Problem
A user has an SSH password or unencrypted keys stolen
Masquerade Detection - an Open Problem
Detecting when stolen authentication credentials allow an attacker to masquerade as a legitimate user
How?
capturing logins and confirming back with users
sample and inspect
syslog inspection
4. Overview Process Accounting
Experimental Design
Results
Conclusions
5. Overview Process Accounting
Experimental Design
Results
Conclusions
6. Process Accounting A method of recording and summarizing commands executed on systems
originally created for billing purposes
automation supporting human monitoring
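As a rough illustration of the "recording and summarizing" idea, the sketch below tallies commands per user from a list of accounting records. The record layout (user, command) and the sample users and commands are hypothetical; real process accounting logs carry more fields and are read with system tools rather than a hard-coded list.

```python
from collections import Counter, defaultdict

# Hypothetical accounting records as (user, command) pairs.
# Assumption: real records would be parsed from the system's accounting log.
RECORDS = [
    ("alice", "mpirun"), ("alice", "qsub"), ("alice", "mpirun"),
    ("bob", "ls"), ("bob", "mail"), ("bob", "ls"), ("bob", "ls"),
]


def summarize(records):
    """Return a mapping of user -> Counter of how often each command was run."""
    per_user = defaultdict(Counter)
    for user, command in records:
        per_user[user][command] += 1
    return per_user


summary = summarize(RECORDS)
for user in sorted(summary):
    # most_common() lists a user's commands from most to least frequent
    print(user, summary[user].most_common())
```

Per-user command counts of this kind are one plausible input to a classifier; the slides do not state the exact features used.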
7. Overview Process Accounting
Experimental Design
Results
Conclusions
8. Experimental Design The Data
Technique: Support Vector Machine (SVM)
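To make the technique concrete, here is a minimal linear SVM trained by stochastic subgradient descent on the hinge loss. It is a pure-Python stand-in, not the authors' implementation: the slides do not state the SVM library, kernel, or features. The two features (fraction of scheduler-style commands, fraction of mail/web-style commands) and all sample values are invented for illustration.

```python
import random

# Hypothetical per-user features: [scheduler-command fraction, mail/web-command fraction].
# Label +1 = cluster user, -1 = Internet server user. All values are made up.
SAMPLES = [
    [0.8, 0.1], [0.7, 0.2], [0.9, 0.0], [0.6, 0.1],
    [0.1, 0.7], [0.0, 0.9], [0.2, 0.8], [0.1, 0.6],
]
LABELS = [1, 1, 1, 1, -1, -1, -1, -1]


def train_linear_svm(samples, labels, epochs=200, lr=0.1, lam=0.01, seed=0):
    """Fit weights w and bias b by subgradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # L2 regularization shrinks the weights on every step
            w = [(1 - lr * lam) * wj for wj in w]
            if margin < 1:
                # Sample is inside the margin or misclassified: move toward it
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 (cluster) or -1 (Internet) for feature vector x."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


w, b = train_linear_svm(SAMPLES, LABELS)
print(predict(w, b, [0.75, 0.15]), predict(w, b, [0.05, 0.85]))
```

In practice a tested library implementation would be preferable; the hand-rolled version is used here only to keep the sketch self-contained.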
9. Overview Process Accounting
Experimental Design
Results
Conclusions
10. Results-1 Put TP, FP, and overall accuracy here first.
229 samples
75 cluster
154 Internet
Only 6 misidentified
Accuracy=
Precision=
Recall=
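The slide leaves the metric values blank, but overall accuracy follows from the two counts it does give, assuming "misidentified" means misclassified out of all 229 samples. Precision and recall cannot be derived from the slide because it does not split the 6 errors into false positives and false negatives; the helper below only shows how they would be computed once those counts are known.

```python
def accuracy(total, misidentified):
    """Fraction of samples classified correctly."""
    return (total - misidentified) / total


def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, and false-negative counts.

    The slide does not report these counts, so this helper is not called with slide data.
    """
    return tp / (tp + fp), tp / (tp + fn)


# Counts stated on the slide: 75 cluster + 154 Internet samples, 6 misidentified.
TOTAL = 75 + 154
MISIDENTIFIED = 6
print(f"accuracy = {accuracy(TOTAL, MISIDENTIFIED):.1%}")  # prints "accuracy = 97.4%"
```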
11. Results-2
12. Results-3
13. Results-4 Same scale for both: 1-99, 100-199, 200-299, 300-399, 400-499, 1K-10K, >10K
Similar distributions, but the total number of cluster commands is 1,834,905 (about 3.4X Internet), with five (1K-10K) and eight (>10K)
The total number of Internet commands is 538,000, with 27 (1K-10K) and 2 (>10K)
14. Results-5 Figures 5 and 6 show a visualization of command popularity for HPC clusters and Internet servers, respectively. This visualization was first developed in [Schonlau2001] and has appeared in subsequent masquerade detection papers [ ]. Within this visualization, each vertical panel corresponds to one user, so there are 75 panels for HPC cluster users and 154 panels for Internet server users. Within each user panel, each command is represented by a single dot and assigned a distinct vertical position (ordinate) based on popularity, such that more popular commands have larger Y-axis ordinates. Commands are also plotted against time within each panel; a command used multiple times appears as a horizontal line that starts at its first entry time, at the ordinate assigned to that command.
A uniqueness metric for individual commands is a useful concept for understanding how the SVM makes user classifications. For masquerade detection, commands not previously seen in a data set may indicate an attempted masquerade.
Uniqueness = 1 - (total users of a command / total users)
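The uniqueness formula can be sketched directly in code. The users and commands below are hypothetical; only the formula itself comes from the slide.

```python
# Hypothetical data: the set of distinct commands each user has run.
COMMANDS_BY_USER = {
    "u1": {"ls", "mpirun", "qsub"},
    "u2": {"ls", "qsub"},
    "u3": {"ls"},
    "u4": {"ls"},
}


def uniqueness(commands_by_user):
    """Return command -> 1 - (users of the command / total users)."""
    total_users = len(commands_by_user)
    users_per_command = {}
    for commands in commands_by_user.values():
        for command in commands:
            users_per_command[command] = users_per_command.get(command, 0) + 1
    return {
        command: 1 - count / total_users
        for command, count in users_per_command.items()
    }


scores = uniqueness(COMMANDS_BY_USER)
for command in sorted(scores):
    print(command, scores[command])
```

A command run by every user scores 0, and a command run by a single user approaches 1 as the user population grows, which is why rarely shared commands are the most informative.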
15. Results-6 Similar; Cluster: 300 unique commands, most unique
Internet: 100 unique commands (Internet activity)
16. Results-7 Distinct command usage, different scales: Internet 90 (much higher), Cluster 25
Cluster = 426 different commands (unique cluster commands), tri-modal (20-29 peak)
Internet = 263 different commands (10-19 peak, much higher); command usage more common
17. Overview Process Accounting
Experimental Design
Results
Conclusions
18. Conclusions Empirical results show it is feasible to effectively distinguish cluster users from enterprise users using command behavior.
An intuitive explanation of why is still a work in progress
Questions remain: (follow-on work)
How will this masquerade detection technique work in a production environment?
What is the sensitivity of these results?
How do SVM results compare with other techniques?
Will masquerade detection change attacker behavior?
19. Questions?