Hacker Influence and SCADA Devices – A Temporal Shift Approach to Risk Identification MIS 510, Cyber Security Project, Spring 2014 Michael Byrd, Eric Case, Bradley Dorn, Wenli Zhang
Introduction • The temporal dimension of data is often overlooked in network analysis • Additional richness can be gleaned from how network attributes change over time • Automated techniques can be quite useful in identifying and analyzing temporal trends • In this study, we focus on hacker influence and the connectivity of vulnerable devices • We leverage the data in Hackerweb to examine how changes in network centrality over time are related to the topics discussed • We examine Shodan data to learn about the temporal trends in the presence of Supervisory Control and Data Acquisition (SCADA) devices on the Internet • Stuxnet
Data Collection: Forum Selection • Plotted posts by day over time • Selected three forums that were representative by language and post counts • Selected 2010 for analysis due to good coverage in data
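The daily post counts and the "good coverage" check could be computed along these lines. This is a minimal sketch, not the project's actual script: the function names, the coverage definition (share of days in a year with at least one post), and the sample dates are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date

def posts_per_day(post_dates):
    """Count posts for each calendar day."""
    return Counter(post_dates)

def coverage_by_year(post_dates):
    """Fraction of days in each year that have at least one post (assumed coverage metric)."""
    days_seen = {}
    for d in set(post_dates):
        days_seen[d.year] = days_seen.get(d.year, 0) + 1
    coverage = {}
    for year, n_days in days_seen.items():
        days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
        coverage[year] = n_days / days_in_year
    return coverage

# Hypothetical post dates for one forum
sample_dates = [date(2009, 12, 31), date(2010, 1, 1), date(2010, 1, 1), date(2010, 1, 2)]
daily = posts_per_day(sample_dates)
coverage = coverage_by_year(sample_dates)
```

Plotting `daily` per forum gives the posts-by-day view, and comparing `coverage` across years is one way to justify picking a single year for analysis.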
Data Collection: Extraction and cleansing • Created SQL queries to extract data for the three selected forums from Hackerweb • Cleansed date formats extensively using SQL and Perl • Cleansed text: • Remove variations in case • Remove URLs, hashtags, and unnecessary or excess punctuation • Remove stop words • Standardize references to other users • Remove some typos (such as repeated letters) • Remove words that do not start with letters • Ran queries on Shodan using Python scripts
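The text-cleansing steps listed above can be sketched in Python as follows. The slides do not give the actual rules, so every detail here is an assumption: the regular expressions, the `user_<name>` convention for user references, the rule of collapsing runs of three or more repeated characters to two, and the tiny stop-word list are all illustrative.

```python
import re

# Tiny illustrative stop-word list; a real run would use a fuller, language-specific list.
STOP_WORDS = {"the", "a", "an", "to", "is", "and", "of", "at"}

def clean_post(text, stop_words=STOP_WORDS):
    """Apply the cleansing steps in order and return a list of tokens."""
    text = text.lower()                                  # remove variations in case
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # remove URLs
    text = re.sub(r"#\w+", " ", text)                    # remove hashtags
    text = re.sub(r"@(\w+)", r"user_\1", text)           # standardize user references (assumed form)
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)           # collapse runs of 3+ repeated characters to two
    text = re.sub(r"[^\w\s]", " ", text)                 # remove remaining punctuation
    tokens = text.split()
    # keep tokens that start with a letter and are not stop words
    return [t for t in tokens if t[0].isalpha() and t not in stop_words]

cleaned = clean_post("Check THIS out!!! http://example.com/x #hack @Bob is sooooo good at 3xploits")
```

For the sample string, `cleaned` is `['check', 'this', 'out', 'user_bob', 'soo', 'good']`. Because `\w` and `str.isalpha` are Unicode-aware in Python 3, the same function also passes through non-Latin text such as Chinese.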
Analysis: Overview • Hacker Influence – Network Centrality • Topic Analysis – Hacker Topic Profiles • Topic Analysis – Important Forum Topics • Connection of SCADA devices
Analysis: Hacker Influence • Used network centrality as a proxy for influence • Centrality is a descriptive metric that can be indicative of a particular node’s connectivity across a network • Previous research has shown it to be related to influence in a social network setting • We calculated eigenvector centrality for all users • We then selected users with discrete transitions in centrality • From low or no centrality to high • Next three slides illustrate the selected users and their centrality scores across the three forums analyzed
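The centrality step can be illustrated with a small, dependency-free sketch. It assumes an undirected interaction network per month built from (user, user) pairs; the slides do not say how the network was constructed or what counted as a "discrete transition", so the edge lists, the power-iteration implementation, and the `low`/`high` thresholds below are all hypothetical.

```python
def eigenvector_centrality(edges, iterations=200, tol=1e-9):
    """Power iteration on (A + I) for an undirected graph given as (u, v) pairs.

    Adding the identity leaves the eigenvectors unchanged but avoids
    oscillation on bipartite graphs. Scores are scaled so the maximum is 1.
    """
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    if not neighbours:
        return {}
    scores = {node: 1.0 for node in neighbours}
    for _ in range(iterations):
        updated = {n: scores[n] + sum(scores[m] for m in neighbours[n])
                   for n in neighbours}
        norm = max(updated.values())
        updated = {n: s / norm for n, s in updated.items()}
        change = max(abs(updated[n] - scores[n]) for n in neighbours)
        scores = updated
        if change < tol:
            break
    return scores

def find_transitions(monthly_scores, low=0.1, high=0.5):
    """Return (user, month) pairs where centrality jumps from <= low to >= high
    between consecutive months. Thresholds are illustrative assumptions."""
    transitions = []
    months = sorted(monthly_scores)
    for prev, curr in zip(months, months[1:]):
        for user, score in sorted(monthly_scores[curr].items()):
            before = monthly_scores[prev].get(user, 0.0)  # absent users count as zero
            if before <= low and score >= high:
                transitions.append((user, curr))
    return transitions

# Hypothetical reply networks for two months
monthly_edges = {
    "2010-01": [("a", "b"), ("b", "c"), ("b", "d")],
    "2010-02": [("a", "b"), ("b", "c"), ("b", "d"),
                ("e", "a"), ("e", "b"), ("e", "c"), ("e", "d")],
}
monthly_scores = {m: eigenvector_centrality(e) for m, e in monthly_edges.items()}
transitions = find_transitions(monthly_scores)
```

In this toy data, user `e` is absent in January and ties for the highest score in February, so it is the only user flagged. A library such as NetworkX could replace the hand-written power iteration on real data.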
Analysis: Topic Analysis • Extracted topics • Frequently used words • Frequent word associations • Bigrams • For individuals • Examine topics in the months before increases in influence (centrality) • Compare and contrast with topics in the months following increases • Results presented in slides 12 through 14 • Follow-up analysis: broader topic baseline • Extract topics for the top 5 or 10 users across the entire forum • Allows insight into baseline topics for comparison with individual trends • Results on slide 15
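The before/after comparison for an individual user can be sketched as below. This is an assumed, simplified form of the topic extraction: it only counts frequent words and adjacent-word bigrams over already-tokenized posts, and the function names, the `YYYY-MM` month strings, and the sample posts are hypothetical.

```python
from collections import Counter

def top_terms(posts, n=3):
    """Most frequent words and bigrams across a list of tokenized posts."""
    words = Counter()
    bigrams = Counter()
    for tokens in posts:
        words.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return words.most_common(n), bigrams.most_common(n)

def split_by_shift(dated_posts, shift_month):
    """Split (month, tokens) pairs into posts before and from the shift month.

    Months are 'YYYY-MM' strings, which sort correctly as text.
    """
    before = [tokens for month, tokens in dated_posts if month < shift_month]
    after = [tokens for month, tokens in dated_posts if month >= shift_month]
    return before, after

# Hypothetical tokenized posts for one user whose centrality rose in 2010-06
dated_posts = [
    ("2010-03", ["how", "register", "account"]),
    ("2010-04", ["register", "account", "help"]),
    ("2010-06", ["sql", "injection", "tutorial"]),
    ("2010-07", ["sql", "injection", "tool"]),
]
before, after = split_by_shift(dated_posts, "2010-06")
before_words, before_bigrams = top_terms(before, n=1)
after_words, after_bigrams = top_terms(after, n=1)
```

Running the same `top_terms` function over the posts of the top 5 or 10 users in a forum would give the broader baseline against which an individual's shift can be compared.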
Results: Topics • There is an obvious permeation of administrative topics. • Many users are discussing logistical details like how to post and register • Future research should work to filter these types of posts out • There are a large number of topics related to the acquisition of software • While this could be an interesting area to explore independently, filtering out these topics would probably make the results of our individual-shift analysis more interesting • This would put more emphasis on the actual type of hack (where methods and targets might be discussed, for example) instead of the logistical issues of acquiring tools • There is evidence of an external event impacting the discussions on the forum: the 2010 earthquake in the Sichuan province in China • This illustrates the connection between forums frequented by hackers and the broader social context • Information on the earthquake was heavily censored in the mainstream Chinese media. The hacker forums are more difficult for the state to control or censor and as such may provide an important parallel mechanism for social discourse
Results: SCADA • SCADA devices continue to be present • The number of SCADA devices continues to increase
Discussion • We found that shifts in network centrality can be used to identify the timing of increases in influence for hackers (Research question 1) • Topic analysis during periods of increasing hacker importance can be used to develop representative hacker profiles (Research question 2) • Individual shifts in influence can be better understood, and trends of broader significance can be found, by investigating topics frequently discussed by influential hackers (Research question 3) • By examining data from Shodan, we were able to identify and analyze temporal trends in the connection of SCADA devices to the Internet (Research question 4) • Taken together, these results illustrate the power of examining temporal trends in data, not simply to identify changes in the data itself, but as a tool to better understand the dynamic nature of human and information systems
Conclusions • Identifying hackers and providing context to their most influential actions is a meaningful and automatable exercise. In this study we have demonstrated the possibility of conducting such an analysis in an automated fashion • While our study is subject to limitations given the timeframe and data set, the methodology could easily be extended in a more robust exercise • We believe our set of results is indicative of the types of results that would be found with other similar data sets • The techniques used in this study can also be applied with little modification to data collected about hacker communities of any type to enhance the generalizability of the findings