570 likes | 707 Views
Syntactic analysis using Context Free Grammars. Analysis of language. Morphological analysis Chairs <chair, n, plural>, <chair, v, 3 rd person, present> Part Of Speech (POS) tagging The/DT man/NN left/VBD the/DT room/NN
E N D
Analysis of language • Morphological analysis • Chairs <chair, n, plural>, <chair, v, 3rd person, present> • Part Of Speech (POS) tagging • The/DT man/NN left/VBD the/DT room/NN • The/DT red/ADJ block/NN on/PREP the/DT blue/ADJ cylinder/NN was/AUX moved/VBD onto/PREP the/DT brown/ADJ table/NN • Any further analysis?
Analysis of language • Part Of Speech (POS) tagging • The/DT man/NN left/VBD the/DT room/NN • The/DT red/ADJ block/NN on/PREP the/DT blue/ADJ cylinder/NN was/AUX moved/VBD onto/PREP the/DT brown/ADJ table/NN • Any further analysis? • chunks, clauses, syntax, semantics, word senses etc… • Today’s lecture on analyzing syntax
What is Syntax? • Study of structure of language • how words can connect to each other • Specifically, goal is to relate surface form (i.e. the sentence) to semantics (the meaning) • Representational device is tree structure
Structure in StringsProposal 1 • Some words: the a small nice big very boy girl sees likes • Some good sentences: • (the) boy (likes a girl) • (the small) girl (likes the big girl) • (a very small nice) boy (sees a very nice boy) • Some bad sentences: • *(the) boy (the girl) • *(small) boy (likes the nice girl)
Structure in StringsProposal 2 • Some words: the a small nice big very boy girl sees likes • Some good sentences: • (the boy) likes (a girl) • (the small girl) likes (the big girl) • (a very small nice boy) sees (a very nice boy) • Some bad sentences: • *(the boy)(the girl) • *(small boy) likes (the nice girl)
More Structure in StringsProposal 2 -- ctd • Some words: the a small nice big very boy girl sees likes • Some good sentences: • ((the) boy) likes ((a) girl) • ((the) (small) girl) likes ((the) (big) girl) • ((a) ((very) small) (nice) boy) sees ((a) ((very) nice) girl) • Some bad sentences: • *((the) boy)((the) girl) • *((small) boy) likes ((the) (nice) girl)
likes boy girl the a From Substrings to Trees • (((the) boy) likes ((a) girl))
Context-Free Grammars • Terminals • This would be the lexicon/vocabulary • Non-Terminals • The constituents in a language • Like noun phrase, verb phrase, prepositional phrase and sentence • Rules • Rules consist of a single non-terminal on the left and any number of terminals and non-terminals on the right. • Describe the allowed structure of the constituents • Express the ways in which symbols of the language can be grouped or ordered together
S nonterminal symbols = constituents NP likes NP boy girl DetP DetP the a terminal symbols = words Phrase Structure Tree • (((the/Det) boy/N) likes/V((a/Det) girl/N)) Phrase-structure tree
Context? • The notion of context in CFGs is not the same as the ordinary meaning of the word context in language. • All it really means is that the non-terminal on the left-hand side of a rule is out there all by itself • A -> B C • Means that I can rewrite an A as a B followed by a C regardless of the context in which A is found
CFG: Example • Many possible CFGs for English, here is an example (fragment): • S NP VP • VP V NP • NP DetP N | AdjP NP • AdjP Adj | Adv AdjP • N boy | girl • V sees | likes • Adj big | small • Adv very • DetP a | the the very small boy likes a girl
Derivations in a CFG S NP VP VP V NP NP DetP N | AdjP NP AdjP Adj | Adv AdjP N boy | girl V sees | likes Adj big | small Adv very DetP a | the S S
Derivations in a CFG S NP VP VP V NP NP DetP N | AdjP NP AdjP Adj | Adv AdjP N boy | girl V sees | likes Adj big | small Adv very DetP a | the NP VP S NP VP
Derivations in a CFG S NP VP VP V NP NP DetP N | AdjP NP AdjP Adj | Adv AdjP N boy | girl V sees | likes Adj big | small Adv very DetP a | the DetP N VP S NP VP DetP N
Derivations in a CFG S NP VP VP V NP NP DetP N | AdjP NP AdjP Adj | Adv AdjP N boy | girl V sees | likes Adj big | small Adv very DetP a | the the boy VP S NP VP DetP N boy the
Derivations in a CFG S NP VP VP V NP NP DetP N | AdjP NP AdjP Adj | Adv AdjP N boy | girl V sees | likes Adj big | small Adv very DetP a | the the boy likes NP S NP VP DetP N V NP the boy likes
Derivations in a CFG S NP VP VP V NP NP DetP N | AdjP NP AdjP Adj | Adv AdjP N boy | girl V sees | likes Adj big | small Adv very DetP a | the the boy likes a girl S NP VP DetP N V NP the boy likes N DetP girl a
Generativity • We can view these rules as either analysis or synthesis machines • Generate strings in the language • Reject strings not in the language • Impose structures (trees) on strings in the language
A CFG defines a formal language • Sentences (strings of words) that can be derived by the grammar are in the formal language defined by the grammar • Sentences that cannot be derived from the grammar are not in the language • Ungrammatical
Derivations • A derivation is a sequence of rules applied to a string that accounts for that string • Covers all the elements in the string • Covers only the elements in the string
Recursion • We’ll have to deal with rules such as the following where the non-terminal on the left also appears somewhere on the right (directly). • NP -> NP PP [[The flight] [to Boston]] • VP -> VP PP [[departed Miami] [at noon]]
Recursion • Of course, this is what makes syntax interesting • flights from Denver • Flights from Denver to Miami • Flights from Denver to Miami in February • Flights from Denver to Miami in February on a Friday • Flights from Denver to Miami in February on a Friday under $300 • Flights from Denver to Miami in February on a Friday under $300 with lunch
The Point • If you have a rule like • VP -> V NP • It only cares that the thing after the verb is an NP. It doesn’t have to know about the internal affairs of that NP
The Point • VP -> V NP • I hate • flights from Denver • Flights from Denver to Miami • Flights from Denver to Miami in February • Flights from Denver to Miami in February on a Friday • Flights from Denver to Miami in February on a Friday under $300 • Flights from Denver to Miami in February on a Friday under $300 with lunch
Potential Problems in CFG • Agreement • Subcategorization • Movement
Agreement • By agreement, we have in mind constraints that hold among various constituents that take part in a rule or set of rules • For example, in English, determiners and the head nouns in NPs have to agree in their number. *This flights *Those flight This flight Those flights
Problem • Our earlier NP rules are clearly deficient since they don’t capture this constraint • NP Det Nominal • Accepts, and assigns correct structures, to grammatical examples (this flight) • But its also happy with incorrect examples (*these flight) • Such a rule is said to overgenerate.
Verb Phrases • English VPs consist of a head verb along with 0 or more following constituents which we’ll call arguments.