Group 5 Abhishek Das, Bharat Jangir.
Project Overview
• We received a total of 119 responses.
• The responses were divided as follows:
– 73 from surveymonkey.com
– 26 from Facebook and Reddit
– 20 from friends, email and paper responses
• We first divided the task of collecting responses by source:
– Bharat Jangir: surveymonkey.com
– Abhishek Das: Facebook, Reddit
– Kevin Talmagde: friends, email and paper
Surveymonkey.com Benefits
• Easy to use
• Secure
• Trusted website
• Results can be formatted in various ways
• Free, with an easy registration process
Survey Overview
• We decided to keep the survey short to increase the number of responses, since people tend to lose interest in long surveys. We wanted the questions to be universally applicable.
• We then trimmed our question set from 18 questions to 6. We did this by giving each question a score from 1 to 5 based on how easy it was to answer and how likely it was to get a response.
Questionnaire
• There were only 6 questions in total:
– Age
– Gender
– Highest level of education
– Do you use antivirus? If so, which one?
– Reuse of username
– Reuse of password
Outcome
• We received a total of 119 responses in about 7 weeks. The majority of the responses came in during the initial weeks. Fewer females than males were open to taking the survey.
• Validity?
– Completely anonymous
– Website is secure
– Plenty of time to publicize the survey
More about algorithms and implementation
• Password reuse and use of antivirus: An initial pie chart of the data set showed that people do reuse passwords across more than one website. We built a decision tree to support this claim.
• The C4.5 algorithm (J48 in Weka) was used to generate this decision tree.
• The algorithm selects splitting attributes on the basis of entropy. At each node it chooses the attribute that most effectively splits the data, as measured by information gain.
• The higher an attribute's information gain, the closer it lies to the root of the tree.
• Nodes are split and re-split until the information gain is 0 or no splitting attributes remain.
• Education level affecting the use of antivirus: The table below shows how education level relates to whether a user runs antivirus software for protection.
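The entropy-based split selection described in the bullets above can be illustrated with a minimal Python sketch. This is not the Weka J48 implementation: it computes plain information gain, whereas C4.5 additionally normalizes it into a gain ratio. The function names and the four toy rows are hypothetical and are not taken from the survey data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` obtained by splitting on `attribute`."""
    base = entropy([row[target] for row in rows])
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted average entropy of the groups after the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_split(rows, attributes, target):
    """Pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy rows for illustration only (not survey results).
rows = [
    {"education": "high school", "gender": "M", "antivirus": "no"},
    {"education": "high school", "gender": "F", "antivirus": "no"},
    {"education": "graduate", "gender": "M", "antivirus": "yes"},
    {"education": "graduate", "gender": "F", "antivirus": "yes"},
]

chosen = best_split(rows, ["education", "gender"], "antivirus")
print(chosen)
```

In this toy data, education separates the two classes perfectly while gender does not separate them at all, so education would be placed at the root of the tree.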
More about algorithms and implementation
• The Apriori algorithm was used for association rule learning.
• It identifies frequent individual items and extends them to larger itemsets as long as those itemsets still appear frequently in the data.
• The frequent itemsets found by Apriori are then used to derive association rules.
• Example: market basket analysis.
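The level-wise growth of frequent itemsets described above can be sketched in a few lines of Python. This is a simplified illustration under assumed inputs, not the Weka implementation used in the project: the `apriori` and `confidence` helpers, the 0.5 support threshold, and the market-basket transactions are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical market-basket transactions for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

frequent = apriori(baskets, min_support=0.5)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
print(confidence(frequent, {"butter"}, {"bread"}))
```

Here the pair milk and butter appears in only one of four baskets, so it is dropped, and the three-item candidate is pruned without being counted because one of its subsets is infrequent. That pruning is what keeps the search tractable as itemsets grow.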
J48 classification tree used to classify the population by antivirus use versus education level.
References:
1) Blog report: Cyber security survey shows low internet security confidence across EU. http://www.stemgroup.co.uk/news/cyber-security-survey-shows-low-internet-security-confidence-across-eu
2) Password reuse opens doors for cyber criminals: End-users must have a different password for every website and security domain. Feb 15, 2011. http://www.infoworld.com/d/security-central/password-reuse-opens-doors-cyber-criminals-457
3) University of Waikato, New Zealand, Weka documentation. http://grepcode.com/file/repo1.maven.org/maven2/nz.ac.waikato.cms.weka/weka-dev/3.7.9/weka/classifiers/trees/J48.java#J48
4) University of Waikato, New Zealand, Weka documentation. http://grepcode.com/file/repo1.maven.org/maven2/nz.ac.waikato.cms.weka/predictiveApriori/1.0.2/weka/associations/PredictiveAprioriTest.java#PredictiveAprioriTest