Cyber Analytics Project • Team: Desert Cyber Police • Team Members and Roles • Jagdeesh Narayanan - Leader • Prajwal Shirurkar - Researcher • Sagnik Roy Chowdhury - Researcher • Krishna Pavan Bhat - Analytics Lead
Introduction • Project Overview • A cyber analytics research project • Objective • To conduct research on data related to users of hacker web forums • To provide analytics and insights into the world of hackers and help reduce cyber crime • To identify trends in how, why, when and where attacks happened
Research questions • Hacker Web data set questions • How sensitive (potentially dangerous) are the discussions happening in hacker forums? • Which antivirus software is most frequently discussed in these communities, and how does the volume of discussion trend over time? • What are the trends in DDoS attack warnings or threats posted across forums over time? • Which topics and programming languages are widely discussed across forums? • Shodan data set questions • Which areas around the world have Apache servers that reveal their geographical locations, exposing them to a future attack?
Sample dataset • Hacker Web - raw data from a MySQL database
Sample dataset • Shodan - data extracted using the Shodan web API
Hacking trends and activities • Question 1: What is the sensitivity of each forum in Hacker Web? • What does the sensitivity of a forum mean? • Sensitivity of a forum = number of sensitive posts / total number of posts • The keywords that mark a post as sensitive have been made public by the government • http://www.rense.com/general66/scgh.htm • Select the data set corresponding to discussions involving sensitive information • Used 5 English-speaking forums to collect data • Ran SQL queries to filter discussions containing "sensitive" keywords • Filtered the results by time for further analysis • Analytics through Tableau Desktop and Microsoft Excel • Observed antivirus discussion trends across all forums for the years 2009-2013 • Used Excel pivot-table charts and Tableau to visualize antivirus popularity across years • Determined that Avira, AVG and Avast are the antivirus products of most concern to the community • Determined that new software keeps appearing in the forum discussions, showing increased concern over better security functionality
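The sensitivity ratio defined above can be illustrated with a minimal sketch. It assumes posts are already loaded as (forum, text) pairs; the function name `forum_sensitivity`, the sample posts and the two keywords are hypothetical stand-ins, not the project's actual SQL queries or the keyword list from the linked page.

```python
import re
from collections import defaultdict


def forum_sensitivity(posts, keywords):
    """Return {forum: sensitive_posts / total_posts}.

    posts    -- iterable of (forum_name, post_text) pairs (assumed layout)
    keywords -- list of "sensitive" words; the real list is not reproduced here
    """
    if not keywords:
        raise ValueError("at least one keyword is required")

    # Case-insensitive, whole-word match on any of the keywords.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )

    total = defaultdict(int)
    sensitive = defaultdict(int)
    for forum, text in posts:
        total[forum] += 1
        if pattern.search(text):
            sensitive[forum] += 1

    return {forum: sensitive[forum] / total[forum] for forum in total}


# Hypothetical sample data, for illustration only.
sample_posts = [
    ("forumA", "how to build a botnet"),
    ("forumA", "hello world"),
    ("forumB", "nothing to see here"),
]
scores = forum_sensitivity(sample_posts, ["botnet", "exploit"])
```

A post counts once no matter how many keywords it contains, which matches the "sensitive posts / total posts" definition.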
Hacking trends and activities • Question 2: Which antivirus software is most frequently discussed in these communities, and how does the volume of discussion trend over time? • Which antivirus products are of interest in hacker forum discussions? • Select the data set corresponding to forum discussions involving antivirus software • Narrowed down to 5 English forums in the Hacker Web database • Used data pertaining to 15 well-known antivirus products on the market (http://anti-virus-software-review.toptenreviews.com/) • Filtered the results by forum • Analytics through Tableau and Microsoft Excel • Observed antivirus discussion trends across all forums for the years 2009-2013 • Used Excel pivot-table charts and Tableau to visualize antivirus popularity across years • Determined that Avira, AVG and Avast are the antivirus products of most concern to the community • Determined that new software keeps appearing in the forum discussions, showing increased concern over better security functionality
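The yearly antivirus trend described above amounts to counting posts per product and year. The sketch below is one way to do that, assuming posts are available as (year, text) pairs; `mentions_by_year` and the sample posts are hypothetical, and the project itself did this step with Excel pivot tables and Tableau.

```python
import re
from collections import Counter


def mentions_by_year(posts, products):
    """Count posts that mention each product, keyed by (year, product).

    posts    -- iterable of (year, post_text) pairs (assumed layout)
    products -- product names to look for, e.g. ["Avira", "AVG", "Avast"]
    """
    # One whole-word, case-insensitive pattern per product.
    patterns = {
        p: re.compile(r"\b" + re.escape(p) + r"\b", re.IGNORECASE)
        for p in products
    }
    counts = Counter()
    for year, text in posts:
        for product, pattern in patterns.items():
            if pattern.search(text):
                counts[(year, product)] += 1
    return counts


# Hypothetical sample posts, for illustration only.
sample_posts = [
    (2009, "Avira beats AVG on detection"),
    (2009, "avg is fine for me"),
    (2010, "Avast pushed an update"),
]
trend = mentions_by_year(sample_posts, ["Avira", "AVG", "Avast"])
```

The resulting counter can be written out as rows of (year, product, count) and charted in a tool such as Excel or Tableau.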
Hacking trends and activities • Question 3: What are the trends in DDoS attack warnings or threats posted across forums over time? • What recent DDoS threats have been made? • Attack warnings, threats and information about attacks that might have happened • Select understandable data with rich information • Narrowed down to 5 English forums • Used posts containing attack threats and warnings • Streamlined the data based on the authenticity of each attack warning • Analytics through Tableau Desktop • Observed attack activity by year • Visualized how frequently hackers discuss these posts • Determined which specific attacks specific authors are talking about
Hacking trends and activities • Question 4: Which topics and programming languages are widely discussed across forums? • What are the emerging trends in the world of hackers? • Activity, hot topics, popular programming languages • Select understandable data with rich information • Narrowed down to 5 English forums • Used data pertaining to the authors with the top 10 reputation scores • Further filtered for important data using NoOfViews (in-link concept) • Analytics through Tableau Desktop • Observed increased hacker activity by year • Visualized the hot technology topics being discussed by hackers • Determined that C and C++ are the most popular programming languages
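The selection steps above (top 10 authors by reputation, then a view-count filter, then counting language mentions) could look roughly like the following sketch. The data layout, the function names and the `min_views` threshold are assumptions made for illustration; the slides do not state the cut-off that was used with NoOfViews.

```python
import re
from collections import Counter


def top_authors(reputation, n=10):
    """Return the n author names with the highest reputation score."""
    return sorted(reputation, key=reputation.get, reverse=True)[:n]


def language_mentions(posts, authors, languages, min_views=0):
    """Count posts by the given authors that mention each language.

    posts     -- iterable of (author, no_of_views, text) tuples (assumed layout)
    min_views -- hypothetical threshold standing in for the NoOfViews filter
    """
    # Custom boundaries so that "C" does not match inside "C++" or "C#".
    patterns = {
        lang: re.compile(
            r"(?<![\w+#])" + re.escape(lang) + r"(?![\w+#])", re.IGNORECASE
        )
        for lang in languages
    }
    allowed = set(authors)
    counts = Counter()
    for author, views, text in posts:
        if author not in allowed or views < min_views:
            continue
        for lang, pattern in patterns.items():
            if pattern.search(text):
                counts[lang] += 1
    return counts


# Hypothetical sample data, for illustration only.
reputation = {"alice": 50, "bob": 10, "carol": 30}
sample_posts = [
    ("alice", 100, "I write C++ and Python"),
    ("carol", 5, "C is great"),
    ("bob", 500, "Python rocks"),
    ("alice", 200, "plain C code"),
]
best = top_authors(reputation, n=2)
langs = language_mentions(sample_posts, best, ["C", "C++", "Python"], min_views=10)
```

Matching "C" as a single letter is inherently noisy on free text, so counts for it should be read as rough indicators.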
Hacking trends and activities • Question 5: Which geographical locations have vulnerable Apache servers that reveal their IP address, longitude and latitude? • What are some of the vulnerability factors of a system connected to the internet? • Its IP address and geographical location are exposed to the outside world • Select understandable data with rich information • Used Shodan as the data source • Retrieved the data needed for the research through the Python API • Retrieved IP addresses, longitudes and latitudes of Apache servers across the world • Analytics through Tableau Desktop • The fetched data was saved to an Excel spreadsheet and then loaded into Tableau • The geographical locations of the servers were plotted on a world map • Determined that the USA, South East Asia and Western Europe had high concentrations of vulnerable Apache servers
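As a rough illustration of the retrieval step above, the sketch below flattens Shodan-style search matches into rows of IP, latitude and longitude, and writes them as CSV for a mapping tool. It assumes each match carries `ip_str` and a `location` dictionary with `latitude`, `longitude` and `country_name`; the function names and the sample match are hypothetical, and no network call is made here.

```python
import csv
import io

FIELDS = ["ip", "latitude", "longitude", "country"]


def extract_locations(matches):
    """Flatten Shodan-style matches into rows, skipping ones without coordinates."""
    rows = []
    for match in matches:
        location = match.get("location") or {}
        lat = location.get("latitude")
        lon = location.get("longitude")
        if lat is None or lon is None:
            continue  # cannot be plotted on a map
        rows.append({
            "ip": match.get("ip_str"),
            "latitude": lat,
            "longitude": lon,
            "country": location.get("country_name"),
        })
    return rows


def to_csv(rows):
    """Serialize rows to CSV text that a spreadsheet or Tableau can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# With the `shodan` package installed and an API key, matches would likely come from:
#   matches = shodan.Shodan(API_KEY).search("apache")["matches"]
# Hypothetical sample data (documentation-range IPs), for illustration only.
sample_matches = [
    {"ip_str": "192.0.2.1",
     "location": {"latitude": 1.5, "longitude": 2.5, "country_name": "Exampleland"}},
    {"ip_str": "192.0.2.2", "location": {"latitude": None, "longitude": None}},
]
rows = extract_locations(sample_matches)
csv_text = to_csv(rows)
```

Keeping the extraction separate from the API call makes it possible to check the flattening logic on saved results without querying Shodan again.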
Summary • Conducted extensive research on the Hacker Web data set • Also conducted research on the Shodan data set • Used data concerning only the most reputed hackers • Selected important research questions and answered them with insights and analytics • Predicted an increase in hacker activity involving specific programming languages and methods