On the Semantic Patterns of Passwords and their Security Impact Rafael Veras, Christopher Collins, Julie Thorpe University of Ontario Institute of Technology Presenter: Kyle Wallace
A Familiar Scenario… User Name: CoolGuy90 Password: “What should I pick as my new password?”
A Familiar Scenario… “Musical!Snowycat90”
A Familiar Scenario… • But how secure is “Musical!Snowycat90”, really? (18 chars) • “Musical” – Dictionary word, possibly related to a hobby • “!” – Filler character • “Snowy” – Dictionary word, an attribute of “cat” • “cat” – Dictionary word, animal, possibly a pet • “90” – Number, possibly a truncated year of birth • 15 of 18 characters come from dictionary words! Why do we pick the passwords that we do?
Password Patterns? • “Even after half a century of password use in computing, we still do not have a deep understanding of how people create their passwords” –Authors • Are there ‘meta-patterns’ or preferences that can be observed across how people choose their passwords? • Do these patterns/preferences have an impact on security?
Contributions • Use NLP to segment, classify, and generalize semantic categories • Describe most common semantic patterns in RockYou database • A PCFG that captures structural, semantic, and syntactic patterns • Evaluation of security impact, comparison with previous studies
Segmentation • Decomposition of passwords into constituent parts • Passwords contain no whitespace characters (usually) • Passwords contain filler characters (“gaps”) between segments • Ex: crazy2duck93^ -> {crazy, duck} & {2, 93^} • Issue: What about strings that parse multiple ways?
Coverage • Prefer segmentations that cover the most characters with words, leaving fewer and smaller gaps (sketched below) • Ex: Anyonebarks98 (13 characters long)
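To make the segmentation and coverage ideas concrete, here is a minimal Python sketch (an illustration against a toy word list, not the authors' algorithm): it enumerates candidate splits into word and gap segments, then keeps the split that covers the most characters with words, preferring fewer gaps.

```python
# Toy segmentation: split a password into word segments and gap segments,
# then pick the split with the best coverage (most characters in words).
WORDS = {"any", "anyone", "one", "bark", "barks", "crazy", "duck"}  # toy corpus

def segmentations(s):
    """Yield all splits of s into (text, is_gap) segments."""
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        if s[:i].lower() in WORDS:
            for rest in segmentations(s[i:]):
                yield [(s[:i], False)] + rest
    # Otherwise treat the leading character as (part of) a gap segment.
    for rest in segmentations(s[1:]):
        if rest and rest[0][1]:                       # merge adjacent gap chars
            yield [(s[0] + rest[0][0], True)] + rest[1:]
        else:
            yield [(s[0], True)] + rest

def best(s):
    # Coverage heuristic: maximize characters inside words,
    # then prefer fewer gaps, then fewer segments overall.
    return max(segmentations(s),
               key=lambda segs: (sum(len(t) for t, g in segs if not g),
                                 -sum(1 for _, g in segs if g),
                                 -len(segs)))

print(best("Anyonebarks98"))  # [('Anyone', False), ('barks', False), ('98', True)]
```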
Splitting Algorithm • Source corpora: raw word list • Taken from COCA (Corpus of Contemporary American English) • Trimmed version of COCA: • 3-letter words: frequency of 100+ • 2-letter words: top 37 • 1-letter words: a, I • Also collected lists of names, cities, surnames, months, and countries
Splitting Algorithm • Reference Corpus: Collection of N-Grams, where N=3 (Full COCA) • N-Gram: Sequence of tokens (words) • Ex: “I love my cats” • Unigrams: I, love, my, cats (4) • Bigrams: I love, love my, my cats (3) • Trigrams: I love my, love my cats (2)
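A tiny helper reproducing the slide's n-gram counts:

```python
# Extract the n-grams of a token sequence (sketch).
def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "I love my cats".split()
print(ngrams(tokens, 1))  # 4 unigrams
print(ngrams(tokens, 2))  # 3 bigrams
print(ngrams(tokens, 3))  # 2 trigrams
```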
Part-of-Speech Tagging • Necessary step for semantic classification • Ex: “love” is a noun (my true love) and a verb (I love cats) • Given a sequence of segments, returns a part-of-speech tag for each one • Gap segments are not tagged
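As a stand-in for this step (the authors trained their own tagger; NLTK's default English tagger is used here purely for illustration):

```python
# POS-tag password segments with NLTK's default tagger (illustrative only).
import nltk
# nltk.download('averaged_perceptron_tagger')  # one-time setup

segments = ["i", "love", "my", "cats"]
print(nltk.pos_tag(segments))
# e.g. [('i', ...), ('love', 'VBP'), ('my', 'PRP$'), ('cats', 'NNS')]
```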
Semantic Classification • Assigns a semantic classifier to each password segment • Only assigned to nouns and verbs • WordNet: a lexical graph of concepts, each expressed as a set of synonyms • “Synsets” are arranged into hierarchies, with more general concepts at the top • Fall back to the source corpora for proper nouns • Tag with female name, male name, surname, country, or city
Semantic Classification Tags represented as word.pos.#, where # is the WordNet ‘sense’
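A quick look at the WordNet structure behind these tags (via NLTK; output abbreviated):

```python
# Inspect synsets and their hypernyms; names follow the word.pos.# convention.
from nltk.corpus import wordnet as wn
# import nltk; nltk.download('wordnet')  # one-time setup

for syn in wn.synsets("love")[:3]:
    print(syn.name(), "->", [h.name() for h in syn.hypernyms()])
# love.n.01 -> ['emotion.n.01']  (more general concepts sit higher in the tree)
```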
Semantic Generalization • Where in the synset hierarchy should a word be represented? • Apply a tree cut model to the synset tree • Goal: minimize the combined description length of the model parameters and the data (the MDL principle)
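For reference, a sketch of the objective in the style of Li and Abe's MDL tree cut model, which this generalization step builds on (the notation here is an assumption based on that model, not copied from the slides): a cut Γ over the synset tree is chosen to minimize the total description length

```latex
L(\Gamma, S) \;=\; \underbrace{\tfrac{k}{2}\,\log |S|}_{\text{parameter description length}}
\;+\; \underbrace{\Bigl(-\sum_{s \in S} \log \hat{P}_{\Gamma}(s)\Bigr)}_{\text{data description length}}
```

where S is the observed sample of words and k is the number of classes in the cut: a deeper cut fits the data better but costs more parameters, and vice versa.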
Contributions • Use NLP to segment, classify, and generalize semantic categories • Describe most common semantic patterns in RockYou database • A PCFG that captures structural, semantic, and syntactic patterns • Evaluation of security impact, comparison with previous studies
Classification • RockYou leak (2009) contained over 32 million passwords • Effect of generalization can be seen in a few cases (in blue) • Some generalizations better than others (Ex: ‘looted’ vs ‘bravo100’) • Some synsets are not generalized (in red) • Ex: puppy.n.01 -> puppy.n.01
Summary of Categories • Love (6, 7) • Places (3, 13) • Sexual Terms (29, 34, 54, 69) • Royalty (25, 59, 60) • Profanity (40, 70, 72) • Animals (33, 36, 37, 92, 96, 100) • Food (61, 66, 76, 82, 93) • Alcohol (39) • Money (46, 74) • *Some categories expanded from two-letter acronyms • +Some categories contain noise from the names dictionary
Contributions • Use NLP to segment, classify, and generalize semantic categories • Describe most common semantic patterns in RockYou database • A PCFG that captures structural, semantic, and syntactic patterns • Evaluation of security impact, comparison with previous studies
Probabilistic Context-Free Grammar • A CFG whose productions have associated probabilities • A vocabulary set (terminals) • A variable set (non-terminals) • A start variable • A set of rules rewriting non-terminals into strings of terminals and non-terminals • A set of probabilities on rules, such that the probabilities of all rules sharing the same left-hand side sum to 1 (see below)
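Spelled out, the constraint left dangling on the slide is the standard PCFG normalization condition:

```latex
\forall A \in V:\qquad \sum_{(A \rightarrow \beta)\,\in\,R} P(A \rightarrow \beta) \;=\; 1
```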
Semantic PCFG • In the authors' PCFG: • The terminal vocabulary comprises the source corpora and the learned gap segments • The variable set is the set of all semantic and syntactic categories • All rules are of the form S → N1 N2 … Nk (a base structure of categories) or Ni → w (a category rewriting to a single terminal) • This grammar is regular (describable by a finite automaton)
Sample PCFG • Training data: • iloveyou2 • ihatedthem3 • football3 • Rules that rewrite the start variable are base structures • The grammar can generate passwords • The probability of a password is the product of the probabilities of all rules in its derivation • Ex: P(youlovethem2) = 0.0103125
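A toy illustration of the probability computation (the rule probabilities below are hypothetical, which is why the result differs slightly from the slide's 0.0103125):

```python
# The probability of a password is the product of the probabilities
# of every rule used in its derivation (hypothetical numbers).
from math import prod

rules = {
    "S -> PRP VB PRP NUM": 2/3,
    "PRP -> you":          1/4,
    "VB -> love":          1/2,
    "PRP -> them":         1/4,
    "NUM -> 2":            1/2,
}
print(f"P(youlovethem2) = {prod(rules.values()):.6f}")  # 0.010417
```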
Contributions • Use NLP to segment, classify, and generalize semantic categories • Describe most common semantic patterns in RockYou database • A PCFG that captures structural, semantic, and syntactic patterns • Evaluation of security impact, comparison with previous studies
Building a Guess Generator • Cracking attacks consist of three steps: • Generate a guess • Hash the guess using the same algorithm as the target • Check for matches in the target database • Most popular methods (using the John the Ripper program) • Word lists (from previous breaches) • Brute force (usually after exhausting word lists)
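The three-step loop, sketched (assuming unsalted SHA-1 targets, as in the 2012 LinkedIn leak):

```python
# Generate a guess, hash it, check for a match (minimal sketch).
import hashlib

targets = {"5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8"}  # sha1("password")

def crack(guesses):
    for guess in guesses:
        if hashlib.sha1(guess.encode()).hexdigest() in targets:
            yield guess

print(list(crack(["iloveyou", "password", "qwerty"])))  # ['password']
```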
Guess Generator • At a high level: • Outputs terminals in highest-probability order • Iteratively replaces higher-probability terminals with lower-probability ones • Uses a priority queue to maintain order • Will this produce the same list of guesses every time?
Guess Generator Example • Suppose only one base structure: • Initialized with most probable terminals: “I love Susie’s cat” • Pop first guess off queue (“IloveSusiescat”) • Replace first segment: “youloveSusiescat” • Replace second segment: “IhateSusiescat” • Replace third segment: “IloveBobscat” • Replace fourth segment: “IloveSusiesdog”
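A runnable sketch of this enumeration in the spirit of Weir-style "next" generation (toy probabilities, not the authors' code): a max-heap keyed on guess probability, where each popped guess spawns children by advancing one slot at a time, never left of the slot last advanced, so every combination is produced exactly once and in decreasing probability order.

```python
import heapq
from math import prod

# One base structure; each slot lists (terminal, probability), sorted descending.
slots = [
    [("I", 0.6), ("you", 0.4)],
    [("love", 0.7), ("hate", 0.3)],
    [("Susies", 0.8), ("Bobs", 0.2)],
    [("cat", 0.5), ("dog", 0.5)],
]

def guesses(slots):
    def p(idx):
        return prod(slots[i][j][1] for i, j in enumerate(idx))
    start = (0,) * len(slots)
    heap = [(-p(start), start, 0)]               # (neg. probability, indices, pivot)
    while heap:
        negp, idx, pivot = heapq.heappop(heap)
        yield "".join(slots[i][j][0] for i, j in enumerate(idx)), -negp
        for i in range(pivot, len(slots)):       # advance only at or right of pivot
            if idx[i] + 1 < len(slots[i]):
                child = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                heapq.heappush(heap, (-p(child), child, i))

for g, pr in guesses(slots):
    print(f"{pr:.4f}  {g}")   # starts with 0.1680  IloveSusiescat
```

Since the heap's tie-breaking is fixed, the answer to the slide's question is yes: the generator is deterministic and emits the same guess list every run.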
Mangling Rules • Passwords aren’t always strictly lowercase • Beardog123lol • bearDOG123LoL • BearDog123LoL • Three types of rules: • Capitalize first word segment • Capitalize whole word segment • CamelCase on all segments • Any others?
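One possible reading of the three rules, sketched over tagged segments (word segments only; gap segments keep their case):

```python
# Apply the three case-mangling rules to (text, is_word) segments (sketch).
segs = [("bear", True), ("dog", True), ("123", False), ("lol", True)]

def apply_rule(rule, segs):
    out, first = [], True
    for text, is_word in segs:
        if is_word:
            if rule == "cap_first" and first:
                text = text.capitalize()   # capitalize first word segment
            elif rule == "upper":
                text = text.upper()        # capitalize whole word segment
            elif rule == "camel":
                text = text.capitalize()   # CamelCase all segments
            first = False
        out.append(text)
    return "".join(out)

for rule in ("cap_first", "upper", "camel"):
    print(apply_rule(rule, segs))
# Beardog123lol, BEARDOG123LOL, BearDog123Lol
```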
Comparison to Weir Approach • The authors' approach can be seen as an evolution of Weir's • Weir contains far fewer non-terminals (less precise probability estimates) • Weir does not learn semantic rules (fewer overall terminals) • Weir treats grammar and dictionary input separately • The authors' semantic classification must be re-run whenever the inputs change
Password Cracking Experiments • Considered 5 methods: • Semantic approach w/o mangling rules • Semantic approach w/ custom mangling rules • Semantic approach w/ JtR’s mangling rules • Weir approach • Wordlist w/ JtR’s default rules + incremental brute force • Attempted to crack LinkedIn and MySpace leaks
Experiment 1: RockYou vs LinkedIn • 5,787,239 unique passwords • Results: • Semantic outperforms the non-semantic versions • The Weir approach performs worst (the semantic approach guesses up to 67% more passwords) • The authors' approach is more robust to differing demographics
Experiment 2: RockYou vs MySpace • 41,543 unique passwords • Results: • Semantic approach outperforms all • The no-rules variant performs best • The Weir approach performs worst (the semantic approach guesses 32% more) • Passwords were obtained by phishing; perhaps of lower quality?
Experiment 3: Maximum Crack Rate • Since the method is grammar-based, a recognizer can be built to check which passwords the grammar could ever guess (sketched below) • Results: • Semantic matches brute-force coverage with fewer guesses • The Weir approach generates fewer guesses and cracks about 30% fewer passwords
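A minimal recognizer sketch (toy grammar, not the authors' tool): a password is guessable iff some split of it matches a base structure with every piece drawn from that slot's terminal set.

```python
# Check whether a password is derivable from a toy grammar.
base_structures = [("PRP", "VB", "PRP", "NUM")]
terminals = {
    "PRP": {"i", "you", "them"},
    "VB":  {"love", "hate"},
    "NUM": {"2", "3"},
}

def derivable(pw, structure):
    if not structure:
        return pw == ""
    tag, rest = structure[0], structure[1:]
    return any(pw[:k] in terminals[tag] and derivable(pw[k:], rest)
               for k in range(1, len(pw) + 1))

print(any(derivable("youlovethem2", s) for s in base_structures))  # True
```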
Experiment 3: Time to Maximum Crack • Fit a non-linear regression to a sample of guess probabilities • Results: • The semantic method has a lower guesses-per-second rate • Its grammar is much larger than the Weir method's
Issues with Semantic Approach • Further study needed into performance bottlenecks • (Though the semantic method is more efficient in guesses needed per hit) • The approach requires a significant amount of memory • A workaround involves a probability threshold for adding entries to the queue • Duplicates can be produced due to ambiguous splits • Ex: (one, go) vs (on, ego)
Conclusions • There are underlying semantic patterns in password creation • These semantics can be captured in a probabilistic grammar • This grammar can be used to efficiently generate probable passwords • This generator shows (up to) a 67% improvement over previous efforts
Thank you! Questions?