CYK Algorithm & CFL reachability


  1. CYK Algorithm & CFL Reachability • By Lohit Krishnan, Chetas Mahajan

  2. Outline • CYK Algorithm • Background • Problem statement. • Intuition • Terminologies • Formal description and example

  3. Background • Named after Cocke, Younger, and Kasami. • Some notable properties: • It shows that deciding whether s ∈ L(G) is in P for any CFG G in Chomsky normal form (CNF). • It uses dynamic programming (a “table-filling” algorithm) to solve the decision problem.

  4. Problem Statement • Given the CFG G: S -> AB | BC; A -> BA | a; B -> CC | b; C -> AB | a • Let L be the language generated by G. • Is the string “baaba” a member of L? • How many substrings of “baaba” are members of L? • How many distinct substrings of “baaba” are members of L? • How many non-empty substrings of “baaba” are not members of L? • How many substrings of “baaba” are generated only by B?

  5. Problem Statement • Given a context-free grammar G and a string w • G = (V, Σ, P, S) where • V is a finite set of variables • Σ (the alphabet) is a finite set of terminal symbols • P is a finite set of rules • S is the start symbol (a distinguished element of V) • V and Σ are assumed to be disjoint • G generates the strings of the language L(G) • Is w ∈ L(G)? (Membership problem)
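
To make the 4-tuple concrete, here is a minimal Python sketch of one way to store G = (V, Σ, P, S). The names `Grammar`, `is_well_formed`, `is_cnf` and `EXAMPLE` are illustrative assumptions, not part of the slides; the rules are those of the example grammar from slide 4.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    variables: frozenset   # V
    terminals: frozenset   # Σ
    rules: tuple           # P, as (head, body) pairs; body is a tuple of symbols
    start: str             # S

def is_well_formed(g: Grammar) -> bool:
    """V and Σ are disjoint, S is in V, and rules only use known symbols."""
    if g.variables & g.terminals or g.start not in g.variables:
        return False
    symbols = g.variables | g.terminals
    return all(head in g.variables and all(s in symbols for s in body)
               for head, body in g.rules)

def is_cnf(g: Grammar) -> bool:
    """Every body is two variables or one terminal (S -> ε is not handled here)."""
    for _, body in g.rules:
        two_vars = len(body) == 2 and all(s in g.variables for s in body)
        one_term = len(body) == 1 and body[0] in g.terminals
        if not (two_vars or one_term):
            return False
    return True

# Example grammar: S -> AB | BC; A -> BA | a; B -> CC | b; C -> AB | a
EXAMPLE = Grammar(
    variables=frozenset("SABC"),
    terminals=frozenset("ab"),
    rules=(("S", ("A", "B")), ("S", ("B", "C")),
           ("A", ("B", "A")), ("A", ("a",)),
           ("B", ("C", "C")), ("B", ("b",)),
           ("C", ("A", "B")), ("C", ("a",))),
    start="S",
)

print(is_well_formed(EXAMPLE), is_cnf(EXAMPLE))  # True True
```

The CNF check matters because the CYK table-filling step only combines pairs of non-terminals, so every rule body must be either two variables or a single terminal.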

  6. Terminology • Let n be the length of the string w. • Partition the string using n+1 lines, one before each symbol and one after the last. • Number those lines from 0 to n. • Now, we define • x_ij as the substring of w that lies between lines i and j (here i < j). • T_ij as the set of non-terminals that generate x_ij.

  7. Terminology • Grammar: S -> AB | BC; A -> BA | a; B -> CC | b; C -> AB | a • String to be checked is “baaba”, with lines numbered 0 1 2 3 4 5. • x_13 = aa • x_35 = ba • x_05 = baaba • T_23 = non-terminals generating x_23 (i.e. “a”) • T_23 = { A, C } • Build a table T of T_ij with 0 ≤ i ≤ n-1, 1 ≤ j ≤ n, and i < j.
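
The line numbering above coincides with Python's half-open slicing, which the following small sketch illustrates. The helper name `x` and the list `cells` are assumptions made for this example only.

```python
w = "baaba"
n = len(w)

def x(i: int, j: int) -> str:
    """Substring of w between lines i and j, with 0 <= i < j <= n."""
    assert 0 <= i < j <= n
    return w[i:j]

# Index pairs (i, j) of the table T: one cell per non-empty substring.
cells = [(i, j) for i in range(n) for j in range(i + 1, n + 1)]

print(x(1, 3), x(3, 5), x(0, 5), x(2, 3))  # aa ba baaba a
print(len(cells))                          # 15, i.e. n(n+1)/2 for n = 5
```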

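  8. Intuition of the algorithm • The T_ij are the subproblems of the dynamic program. • We need to decide whether the start symbol belongs to T_0n. • Formulation of the DP: • $T(T_1 T_2) = \{ X \mid X \to t_1 t_2,\ t_1 \in T_1,\ t_2 \in T_2 \}$ • $T_{ij} = \bigcup_{k=i+1}^{j-1} T(T_{ik} T_{kj})$ for j > i+1 • Base case (j = i+1): T_ij = { X | X -> x_ij }, the non-terminals that produce the single symbol x_ij.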
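
The recurrence above can be sketched in Python as follows. This is a minimal illustration, not the presenters' code: the dictionaries `UNIT` and `BINARY` and the functions `combine` and `cyk_table` are names chosen here, and the grammar is hard-coded from the running example.

```python
from itertools import product

# Example grammar, split into terminal rules (X -> a) and binary rules (X -> Y Z).
UNIT = {"a": {"A", "C"}, "b": {"B"}}
BINARY = {("A", "B"): {"S", "C"}, ("B", "C"): {"S"},
          ("B", "A"): {"A"}, ("C", "C"): {"B"}}

def combine(t1, t2):
    """T(T1 T2): heads X of rules X -> Y Z with Y in T1 and Z in T2."""
    out = set()
    for pair in product(t1, t2):
        out |= BINARY.get(pair, set())
    return out

def cyk_table(w):
    """Return a dict mapping (i, j) to T_ij for all 0 <= i < j <= len(w)."""
    n = len(w)
    T = {}
    for i in range(n):                      # base case: substrings of length 1
        T[i, i + 1] = set(UNIT.get(w[i], set()))
    for length in range(2, n + 1):          # longer substrings, shortest first
        for i in range(n - length + 1):
            j = i + length
            T[i, j] = set()
            for k in range(i + 1, j):       # split line k strictly between i and j
                T[i, j] |= combine(T[i, k], T[k, j])
    return T

T = cyk_table("baaba")
print(sorted(T[0, 5]))   # ['A', 'C', 'S']
print("S" in T[0, 5])    # True, so "baaba" is in L(G)
```

Filling the table by increasing substring length guarantees that T_ik and T_kj are already available when T_ij is computed. The three nested loops over i, j and k give a running time cubic in n for a fixed grammar.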

  9-20. CYK table for “baaba” filled in step by step (figures; line positions 0 1 2 3 4 5) • Grammar: S -> AB | BC; A -> BA | a; B -> CC | b; C -> AB | a

  28. Answers • Is the string “baaba” a member of L? • Yes (S ∈ T_05). • How many substrings of “baaba” are members of L? • 5 (ba, ab, ba, aaba, baaba) • How many distinct substrings of “baaba” are members of L? • 4 (ba, ab, aaba, baaba) • How many non-empty substrings of “baaba” are not members of L? • 15 - 5 = 10 • How many substrings of “baaba” are generated only by B? • 5 (b, aa, b, aab, aba)
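
The answers above can be cross-checked mechanically. The sketch below recomputes the CYK table for the example grammar and counts the relevant cells; all names (`UNIT`, `BINARY`, `cyk_table`, `in_L`, and so on) are assumptions made for this illustration, and "generated only by B" is read as T_ij = { B }.

```python
# Example grammar: S -> AB | BC; A -> BA | a; B -> CC | b; C -> AB | a
UNIT = {"a": {"A", "C"}, "b": {"B"}}
BINARY = {("A", "B"): {"S", "C"}, ("B", "C"): {"S"},
          ("B", "A"): {"A"}, ("C", "C"): {"B"}}

def cyk_table(w):
    """Compact CYK: maps (i, j) to the set of non-terminals generating w[i:j]."""
    n = len(w)
    T = {(i, i + 1): set(UNIT.get(w[i], ())) for i in range(n)}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            T[i, j] = {head
                       for k in range(i + 1, j)
                       for y in T[i, k] for z in T[k, j]
                       for head in BINARY.get((y, z), ())}
    return T

w = "baaba"
T = cyk_table(w)

in_L = [(i, j) for (i, j), heads in T.items() if "S" in heads]
member = "S" in T[0, len(w)]                     # is w itself in L?
count = len(in_L)                                # substrings (by position) in L
distinct = len({w[i:j] for i, j in in_L})        # distinct substrings in L
not_in_L = len(T) - count                        # non-empty substrings not in L
only_B = sum(1 for heads in T.values() if heads == {"B"})

print(member, count, distinct, not_in_L, only_B)  # True 5 4 10 5
```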

  30. CFL Reachability

  31. Outline • CFL reachability • Motivation • Problem definition • Variants of CFL Reachability problem • Relation with other Problems • Algorithm • Example

  32. Motivation • “Program Analysis via Graph Reachability” by Thomas Reps

  33. Motivation • Program analysis requires extracting information from a program without actually running it. • Classical data-flow analysis associates a set of “dataflow facts” with each program point. • Program analysis → graph reachability problem (GRP): the analysis question is recast as reachability in a graph. • GRP is a special case of the CFL reachability problem.

  34. Problem Definition • Let L be a context-free language over an alphabet Σ, and let G be a graph whose edges are labeled with members of Σ. • Each path in G defines a word in Σ*, namely the word obtained by concatenating, in order, the labels of the edges on the path. • A path in G is an L-path if its word is a member of L.

  35. Variants of CFL Reachability Problem • The all-pairs L-path problem. • The single-source L-path problem. • The single-target L-path problem. • The single-source/single-target L-path problem. • Other variants: the multi-source L-path problem, the multi-target L-path problem, and the multi-source/multi-target L-path problem.

  36. Example • Let L be the language of strings of matched parentheses and square brackets, with zero or more e’s interspersed. • The example graph (figure) has only one L-path, with word [(e[])eee[e]]
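
As a small check of the example word, the following sketch tests membership in the matched-bracket language with a stack. It assumes L is generated by S -> S S | ( S ) | [ S ] | e | ε, so the empty word and e's outside brackets are accepted; the slides do not spell the grammar out, and the function names are illustrative.

```python
PAIRS = {")": "(", "]": "["}

def word_of_path(labels):
    """Word of a path: its edge labels concatenated in order."""
    return "".join(labels)

def is_matched(word: str) -> bool:
    """True if brackets in word are balanced and all other symbols are 'e'."""
    stack = []
    for ch in word:
        if ch in "([":
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False        # closing bracket without matching opener
        elif ch != "e":
            return False            # symbol outside the alphabet
    return not stack                # every opener must have been closed

labels = list("[(e[])eee[e]]")      # edge labels along the path, in order
print(is_matched(word_of_path(labels)))   # True
print(is_matched("[(e])"))                # False: ']' closes '('
```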

  37. Relation with other problems • Ordinary graph reachability problem • Label every edge with e and let L = e*. • CFL recognition problem • “Given a string w and a context-free language L, is w ∈ L?” • Create a linear graph s → ... → t that has |w| edges, and label the i-th edge with the i-th letter of w. • There is an L-path from s to t iff w ∈ L.
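
Both reductions above are short enough to sketch directly. In the Python below, edges are (source, label, target) triples; the function names `linear_graph`, `relabel_as_e` and `reachable` are assumptions for this illustration.

```python
from collections import deque

def linear_graph(w):
    """Linear graph s -> ... -> t: edge i goes from node i to i+1, labeled w[i]."""
    return [(i, ch, i + 1) for i, ch in enumerate(w)]

def relabel_as_e(edges):
    """Ordinary reachability as CFL reachability: every label becomes e (L = e*)."""
    return [(u, "e", v) for u, _, v in edges]

def reachable(edges, source):
    """Plain breadth-first reachability; with L = e*, every path is an L-path."""
    succ = {}
    for u, _, v in edges:
        succ.setdefault(u, []).append(v)
    seen, queue = {source}, deque([source])
    while queue:
        u = queue.popleft()
        for v in succ.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

w = "baaba"
G = linear_graph(w)
s, t = 0, len(w)
path_word = "".join(label for _, label, _ in G)   # word of the only s-to-t path

print(G)                 # [(0, 'b', 1), (1, 'a', 2), (2, 'a', 3), (3, 'b', 4), (4, 'a', 5)]
print(path_word == w)    # True: the s-to-t path is an L-path iff w is in L
print(t in reachable(relabel_as_e(G), s))   # True
```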

  38. Algorithm • Normalize the grammar so that the right-hand side of each production has at most two symbols (terminals or non-terminals). • Add additional edges as shown in the figure below. • Here A ∈ N and B, C ∈ (N ∪ T), where N is the set of non-terminals and T the set of terminals. • The solution is read off from the edges labelled with the start symbol of the grammar.
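
Since the figure is not reproduced here, the sketch below assumes the usual edge-addition rules for a normalized grammar: for A -> ε add a self-loop labeled A at every node; for A -> B add an A-edge alongside every B-edge; for A -> B C add an A-edge from u to v whenever a B-edge from u to k meets a C-edge from k to v. It is a simple worklist version written for clarity, not the optimized cubic-time algorithm, and `cfl_reach` and its parameter names are hypothetical.

```python
from collections import deque

def cfl_reach(nodes, edges, eps, unary, binary):
    """Saturate a labeled graph under the productions of a normalized grammar.

    edges:  iterable of (u, label, v)
    eps:    heads A of productions A -> ε
    unary:  pairs (A, B) for productions A -> B
    binary: triples (A, B, C) for productions A -> B C
    """
    found = set(edges)
    work = deque(found)

    def add(edge):
        if edge not in found:
            found.add(edge)
            work.append(edge)

    for a in eps:                               # A -> ε: self-loop at each node
        for v in nodes:
            add((v, a, v))
    while work:
        u, b, v = work.popleft()
        for a, b1 in unary:                     # A -> B
            if b1 == b:
                add((u, a, v))
        for a, b1, c in binary:                 # A -> B C
            if b1 == b:                         # popped edge is the B part
                for x, label, y in list(found):
                    if x == v and label == c:
                        add((u, a, y))
            if c == b:                          # popped edge is the C part
                for x, label, y in list(found):
                    if y == u and label == b1:
                        add((x, a, v))
    return found

# Example grammar on the linear graph for "baaba" (CFL recognition as reachability).
UNARY = [("A", "a"), ("C", "a"), ("B", "b")]
BINARY = [("S", "A", "B"), ("S", "B", "C"), ("A", "B", "A"),
          ("B", "C", "C"), ("C", "A", "B")]
w = "baaba"
edges = {(i, ch, i + 1) for i, ch in enumerate(w)}
result = cfl_reach(range(len(w) + 1), edges, [], UNARY, BINARY)
s_pairs = sorted((u, v) for u, label, v in result if label == "S")

print((0, "S", len(w)) in result)   # True
print(s_pairs)                      # [(0, 2), (0, 5), (1, 5), (2, 4), (3, 5)]
```

On a linear graph the S-labeled edges coincide with the CYK table cells that contain S, which is one way to see CYK as a special case of this algorithm.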

  39. Example • Grammar: S -> AB | BC; A -> BA | a; B -> CC | b; C -> AB | a • Graph G (figure) with edge labels b, a, a, b • All-pairs L-path problem.

  40. Questions?
