Automated Puzzle Generation Simon Colton Universities of Edinburgh and York
Background • Train journey with Jeremy Gow • To meet Herb Simon • Puzzle generation rather than problem solving • Wrote some puzzles for Jeremy • Jeremy kept getting the “wrong” answer • Puzzle generation is a difficult task • Reviewer’s comment • View puzzles independently of implementation
Some Example Puzzles • Which is the odd one out? • Hair, triangles, squares, plants, words, trees • Answer: triangles (others have roots) • Jingle is to corporation as ? is to politician • Campaign, platform, slogan, promises • Answer: slogan • What is next in the sequence? • 4, 3, 6, 6, 2, 9, ? • Answer: later
Overview of What’s Needed • Structure for puzzles • Characterisation of puzzles • Puzzles must have single solutions • Theory formation helps here • Puzzles must be of correct difficulty • Methods for disguising the answer
Queendom.com Examples • What’s the odd one out? • Coconuts, oysters, clams, eggs, walnuts, haddock • A: haddock (the others have shells) • Hair is to stubble as potatoes are to ? • French fries, sweet potatoes, potato skins, vegetable • A: French fries • What’s next in the sequence? • 3, 8, 15, 24, 35, ? • A: 48 (square the integers and subtract 1)
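The rule quoted for the sequence puzzle can be checked directly. A one-line sketch, assuming the sequence starts from the integer 2 (the name `terms` is only illustrative):

```python
# Square each integer from 2 to 7 and subtract 1.
terms = [n * n - 1 for n in range(2, 8)]

# The first five terms are the ones shown in the puzzle; the sixth is the answer.
shown, answer = terms[:5], terms[5]
```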
A Characterisation of Puzzles • Three (of many) types of puzzle are: • Odd one out, analogy, next in sequence • Have (almost) the same structure: • Question statement • Set of choices, one of which is answer • Solution which is an embedded concept • Some tweaking necessary to make a fit • Next in sequence puzzles have no choices • Analogy puzzles have no solution concept
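The shared structure on this slide could be captured in a small record type. This is only an illustrative sketch: the `Puzzle` class and its field names are hypothetical and are not taken from HR. The two "tweaks" are modelled by allowing an empty choice list (next in sequence) and a missing concept (analogy).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Puzzle:
    question: str                                      # question statement
    answer: str                                        # the intended answer
    choices: List[str] = field(default_factory=list)   # empty for next-in-sequence puzzles
    concept: Optional[str] = None                      # embedded concept; None for analogy puzzles


# The Queendom odd-one-out example expressed in this structure.
haddock = Puzzle(
    question="What's the odd one out?",
    answer="haddock",
    choices=["coconuts", "oysters", "clams", "eggs", "walnuts", "haddock"],
    concept="has a shell",
)

# A next-in-sequence puzzle has no choices.
squares_minus_one = Puzzle(
    question="What's next in the sequence: 3, 8, 15, 24, 35?",
    answer="48",
    concept="square the integers and subtract 1",
)
```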
Solutions to Puzzles • Solution is a single embedded concept • Fairly simple and positively stated • Which is the odd one out: 4, 9, 8, 36? • A: 9 (others are even numbers) or A: 8 (others are square numbers) • Puzzle is unsatisfying if there are two answers • Which is the odd one out: 2, 3, 9, 20? • A: 9 (it is a square number) • Which is the odd one out: 23, 25, 27, 29? • A: 27 (others are primes or squares)
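The ambiguity in the first example can be made concrete by testing each candidate concept against the choices. The sketch below assumes a tiny hand-picked concept set (even, square, prime) and counts a solution only when every choice but one satisfies the concept, which is the "positively stated" reading. The function names are illustrative.

```python
from math import isqrt


def is_even(n):
    return n % 2 == 0


def is_square(n):
    return n >= 0 and isqrt(n) ** 2 == n


def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))


CONCEPTS = {"even": is_even, "square": is_square, "prime": is_prime}


def odd_one_out_solutions(choices, concepts=CONCEPTS):
    """Return (answer, concept name) pairs where all choices but one satisfy the concept."""
    solutions = []
    for name, holds in concepts.items():
        outsiders = [c for c in choices if not holds(c)]
        if len(outsiders) == 1:
            solutions.append((outsiders[0], name))
    return solutions


ambiguous = odd_one_out_solutions([4, 9, 8, 36])       # two competing solutions
not_positive = odd_one_out_solutions([2, 3, 9, 20])    # no positively stated solution
not_simple = odd_one_out_solutions([23, 25, 27, 29])   # needs a disjunction of concepts
```

Under these assumptions the first puzzle yields two solutions, while the second and third yield none, because their intended answers rely on a negatively stated concept and a disjunctive concept respectively.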
The Difficulty of Puzzles • Embedded concept is usually not complex • Probably in order to ensure single solution • Number of possible answers • Increases the search space for answer • Could make the problem easier • Disguising concepts • Odd one out: haddock puzzle, they’re all foodstuffs • Next in sequence (from queendom): 2, 7, 4, 14, 6? • Another concept interleaved (or stuck on)
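One way to see the interleaving disguise in the Queendom sequence is to split it into its alternating terms. This is a sketch of one plausible reading, not a statement of Queendom's intended solution. The helper name is hypothetical.

```python
def split_interleaved(terms):
    """Split a sequence into its odd-position and even-position strands."""
    return terms[0::2], terms[1::2]


first, second = split_interleaved([2, 7, 4, 14, 6])
# first is the strand 2, 4, 6 and second is the strand 7, 14.
# The missing sixth term belongs to the second strand, so under this
# reading (multiples of 7) it would be 21.
```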
The HR Program • Automated theory formation • Concepts (examples & definitions), conjectures, proofs • Theory is a collection of concepts (in this case) • Concept formation via 8 production rules • Builds new concepts from old ones • Compose, disjunct, exists, forall, match, negate, size, split • Complexity of a concept: • Number of production rule steps • Specialisation concepts important • Specialisation of objects of interest (e.g., prime numbers)
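The idea of building new concepts from old ones, and of measuring complexity by counting rule steps, can be illustrated with a toy model. This is not HR's implementation: the `Concept` class, the `compose` function (modelled here as simple conjunction) and the convention that user-supplied concepts cost zero steps are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from math import isqrt
from typing import Callable


@dataclass(frozen=True)
class Concept:
    name: str
    holds: Callable[[int], bool]
    steps: int = 0  # assumption: user-supplied concepts cost zero rule steps


def compose(a, b):
    """Toy stand-in for a 'compose' rule: the conjunction of two concepts."""
    return Concept(
        name=f"({a.name} and {b.name})",
        holds=lambda n: a.holds(n) and b.holds(n),
        steps=a.steps + b.steps + 1,  # one extra production rule step
    )


def within_limit(concepts, limit):
    """Keep only the concepts whose step count does not exceed the limit."""
    return [c for c in concepts if c.steps <= limit]


even = Concept("even", lambda n: n % 2 == 0)
square = Concept("square", lambda n: isqrt(n) ** 2 == n)
even_square = compose(even, square)

examples = [n for n in range(1, 31) if even_square.holds(n)]
simple_only = within_limit([even, square, even_square], limit=0)
```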
Extension for Puzzles (General) • HR generates theory, then builds puzzles • Embed each concept, make all puzzles, choose rep. • From characterisation of solution: • Don’t use negate or disjunct production rules in automated theory formation • From single solution: • Exhaust theory up to a complexity limit • Check for alternative solutions and discard • From difficulty consideration: • Present puzzles in order of concept complexity, disguise • Actively add disguise where possible
Extension for Puzzles (Special) • User: chooses the number of possible answers (n) • Answers are presented in random order • Odd one out: • Choose n positive and 1 negative example of a specialisation concept • Check all other concepts for a different solution • Next in sequence (only in domain of integers): • Embed number type (e.g. primes, 2, 3, 5, 7, ?) • Embed function (e.g. number of divisors, 1, 2, 2, 3, ?) • Actively disguise by interleaving simple sequences • Analogy: A is to B as C is to: D, E, F, G? • A, B, C and D share a specialisation property, E, F and G do not
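The odd-one-out procedure on this slide can be sketched as follows. This is a minimal illustration over integers, not HR's code: the function names, the concept set and the `num_positive` parameter are assumptions. A candidate puzzle is discarded when some other concept singles out a different choice.

```python
import random
from math import isqrt


def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))


def is_even(n):
    return n % 2 == 0


def is_square(n):
    return n >= 0 and isqrt(n) ** 2 == n


def has_alternative_solution(choices, answer, other_concepts):
    """True if another concept holds for all choices except one that is not the answer."""
    for holds in other_concepts:
        outsiders = [c for c in choices if not holds(c)]
        if len(outsiders) == 1 and outsiders[0] != answer:
            return True
    return False


def make_odd_one_out(objects, concept, other_concepts, num_positive, rng):
    """Pick positive examples plus one negative; return None if the puzzle is ambiguous."""
    positives = [o for o in objects if concept(o)]
    negatives = [o for o in objects if not concept(o)]
    if len(positives) < num_positive or not negatives:
        return None
    answer = rng.choice(negatives)
    choices = rng.sample(positives, num_positive) + [answer]
    if has_alternative_solution(choices, answer, other_concepts):
        return None
    rng.shuffle(choices)  # answers are presented in random order
    return choices, answer


puzzle = make_odd_one_out(range(1, 31), is_prime, [is_even, is_square], 3, random.Random(0))
```

The earlier example 4, 9, 8, 36 with intended answer 9 would be rejected by `has_alternative_solution` once the square-number concept is among the other concepts.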
Experiment 1: Animals • Animals dataset (distributed with Progol) • 18 animals (dog, platypus, snake, eagle, etc.) • 12 properties (class, homeothermic, eggs, etc.) • Theory formation up to comp. limit 5 • Compose, exists, forall, match, size, split • Asked for all odd one out & analogy puzzles • User specifies: 4 answers possible
Animals Results • 31 puzzles about animals formed • Good examples: • [15] Which is the odd one out: penguin, ostrich, cat, bat? • [31] Eel is to platypus as shark is to: snake, eagle, turtle, lizard? • Bad example: • [27] Cat is to dog as eagle is to: lizard, eel, ostrich, trout? • Observations: • Low complexity of concepts, little disguise found • Need more examples of animals • Conclusion: • Single solutions worked OK, but fairly easy to solve
Experiment 2: Integer Sequences • Integers 1 to 30 provided • Addition, multiplication, digits, divisors • Compose, exists, match, size, split • Theory formed up to complexity 4 • Disguise simple concepts (comp. < 3) • By interleaving other simple concepts • All next in sequence puzzles asked for • User specifies: 6 terms of the sequence given
Sequences Results • 24 next in sequence puzzles generated • Good examples: • [2] 4, 3, 6, 6, 2, 9, ? [numdiv, 27, mult. of 3] • [3] 21, 3, 24, 6, 27, 9, ? [mult 3, mult 3] • [10] 21, 22, 24, 25, 26, 28, ? [digit is a div] • Bad examples: • [20] 6, 0, 2, 0, 4, 0, ? [# even divisors of 24, …] • [22] 11, 12, 12, 13, 13, 14, ? • Observations: • Functions should start earlier on number line • Embedded concepts are in general too complex
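Puzzle [2] can be reproduced under one reading of its annotation: the number-of-divisors function evaluated from 27 onwards, interleaved with the multiples of 3. That reading is an assumption, as are the helper names below. The sketch is not HR's code.

```python
from itertools import count, islice


def num_divisors(n):
    """Count the positive divisors of n by trial division."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def interleave(a, b):
    """Alternate terms from two (possibly infinite) iterables."""
    for x, y in zip(a, b):
        yield x
        yield y


numdiv_from_27 = (num_divisors(n) for n in count(27))  # 27, 28, 29, ...
multiples_of_3 = (3 * k for k in count(1))             # 3, 6, 9, ...

terms = list(islice(interleave(numdiv_from_27, multiples_of_3), 7))
shown, hidden = terms[:6], terms[6]
```

Under this reading the six shown terms match the puzzle, and the hidden seventh term is the number of divisors of 30. It also illustrates the observation that functions should start earlier on the number line: a solver has little reason to begin counting divisors at 27.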
Remarks about Creativity • Setter: creative act is finding concept/examples • Solver: creative act is finding the answer/solution • Having a single solution: • Want the solver to be P-creative, not H-creative • Difference between answer and solution • IQ tests: interested in answer, not solution • More will come to light after field testing • Comments very welcome
Conclusions and Future Work • Characterisation of puzzles • Single, positively stated, simple solution; difficulty (disguise) • Puzzle generation can be automated • Results not stunning, but still preliminary • Puzzle generation needs improvement • Also needs hand crafting of input files • More answers/questions about puzzle solver/setter creativity • After a field test of HR’s puzzles