LING 581: Advanced Computational Linguistics Lecture Notes January 30th
Relative clause constructions • Terminology • gap (__): • indicates where the head of the construction is interpreted • Subject RC: the man (that|who) __ saw me • Object RC: the man (that|who) I saw __ • Subject and object RCs can appear in subject and object positions freely: • The man that saw me left the room • The man that I saw left the room • I saw the man that saw me • I again saw the man that I saw • Note: the relative pronoun is that/who/which
Relative clause constructions • Terminology contd.: • Infinitival/untensed vs. tensed • John saw Mary (tensed) • John sees Mary (tensed) • John to see Mary (untensed) • In RC constructions: • the man to see Mary • a person to see • a time to go see Mary • Note: the subject is always missing… but it’s not always the RC gap
Relative clause constructions • Terminology contd.: • Zero refers to a missing relative pronoun • Zero RCs: • the man I saw (tensed) • the man to see (untensed) • *Zero: • *the man saw me / the man who saw me • *the man was seen by me / the man who was seen by me • The horse raced past the barn fell • must be zero: • *a person that to see • *the man that to see Mary
Homework Exercise Frequency counts
Homework Exercise Review • Use tregex to search for relative clauses as defined in Parsing Guidelines section 4.2.2: • zero relative clauses
Homework Exercise Review • Use tregex to search for relative clauses as defined in Parsing Guidelines section 4.2.2: • infinitival relative clauses
Homework Exercise Review • From page 17:
Homework Exercise Review • Use tregex to search for relative clauses as defined in Bracketing Guidelines (prsguid1.pdf) section 4.2.2: • wh- and that- relative clauses • Two subtypes: WHNP with an NP trace, WHADVP with an ADVP trace • Note: the format in the guide doesn’t always match exactly with WSJ trees … -NONE-
Homework Exercise Review • Use tregex to search for relative clauses as defined in Bracketing Guidelines (prsguid1.pdf) section 4.2.2: • wh- and that- relative clauses
Matches and pattern:
1. 11598 @NP < NP < SBAR
2. 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i)
3. 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i << (@NP < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))
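To make the third pattern concrete, here is a minimal pure-Python sketch of the same check: an NP with an NP child and an SBAR child, where the SBAR has a WHNP-n child and dominates an NP over a -NONE- node holding the coindexed *T*-n trace. The tiny bracket parser, the helper names (`parse`, `subtrees`, `kids`, `base`, `has_trace`, `wh_relatives`) and the two example trees are assumptions made for illustration; this is not how tregex is implemented, and its hit counts need not equal tregex's match counts.

```python
import re

def parse(s):
    """Parse one bracketed tree into (label, children); leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    pos = 0

    def node():
        nonlocal pos
        label = tokens[pos + 1]          # tokens[pos] is "("
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                         # consume ")"
        return (label, children)

    return node()

def subtrees(t):
    """Yield t and every labelled node below it."""
    yield t
    for c in t[1]:
        if isinstance(c, tuple):
            yield from subtrees(c)

def kids(t):
    """Labelled children only (leaf strings are skipped)."""
    return [c for c in t[1] if isinstance(c, tuple)]

def base(label):
    """Basic category, roughly what @ compares: NP-SBJ-2 -> NP."""
    return re.split(r"[-=]", label)[0]

def has_trace(sbar, index):
    """SBAR << (@NP < (-NONE- < *T*-index))"""
    for d in subtrees(sbar):
        if d is not sbar and base(d[0]) == "NP":
            for k in kids(d):
                if k[0] == "-NONE-" and f"*T*-{index}" in k[1]:
                    return True
    return False

def wh_relatives(tree):
    """Yield (np, index) for each NP/WHNP pair satisfying the pattern."""
    for np in subtrees(tree):
        if base(np[0]) != "NP" or not any(k[0] == "NP" for k in kids(np)):
            continue
        for sbar in kids(np):
            if sbar[0] != "SBAR":
                continue
            for wh in kids(sbar):
                m = re.match(r"^WHNP-([0-9]+)$", wh[0])
                if m and has_trace(sbar, m.group(1)):
                    yield np, m.group(1)

# Hand-built example trees (not taken from the WSJ corpus).
OBJECT_RC = ("(NP (NP (DT the) (NN man)) (SBAR (WHNP-1 (WP who)) "
             "(S (NP-SBJ (PRP I)) (VP (VBD saw) (NP (-NONE- *T*-1))))))")
MISMATCH = OBJECT_RC.replace("*T*-1", "*T*-2")   # trace index no longer agrees

hits = list(wh_relatives(parse(OBJECT_RC)))
misses = list(wh_relatives(parse(MISMATCH)))
```

The object relative "the man who I saw __" is found because WHNP-1 and *T*-1 share an index; changing the trace index makes the same tree fail the coindexation check.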
Homework Exercise Review • Browsing through the matches and refining the search is always a good idea … to see what we have inadvertently picked up or have not thought of
Homework Exercise Review • Note: 2nd matching tree has an intervening PP:
Homework Exercise Review • Note: 5th matching tree has an intervening PP: • Note: intervening punctuation is also common, e.g. The plant, which is owned by Hollingsworth & Vose Co., was under contract …
Homework Exercise Review • @NP < NP < SBAR • 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i) • Note: *ICH* non-subject relative clause • Note: the SBAR from NP-SBJ was extraposed to the VP
Homework Exercise Review • @NP < NP < SBAR • 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i) • This is NOT a relative clause construction!
Homework Exercise Review • @NP < NP < SBAR • 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i) • The relative clause gap here is ADVP • Infinitival/non-tensed clause
Homework Exercise Review • @NP < NP < SBAR • 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i) • *ICH* subject relative clause • Note: the SBAR from the NP object was right-extraposed to the VP
Homework Exercise Review • @NP < NP < SBAR • 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i) • Coordination: SBAR → SBAR CC SBAR
Homework Exercise Review • 9028 @NP < NP < (SBAR < /^WHNP-([0-9]+)$/#1%i) • 10290 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i) • Excludes *ICH* cases • Excludes coordination …
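The variable binding changes from #1 to #2 in the second pattern because the added (NP|ADVP) alternation is itself a capture group, so the index digits move to group 2. The short check below uses Python's `re` module as a stand-in for the Java regexes that tregex uses; for patterns this simple the two behave the same. The function name `wh_index` and the sample labels are illustrative assumptions.

```python
import re

OLD = re.compile(r"^WHNP-([0-9]+)$")          # index is capture group 1
NEW = re.compile(r"^WH(NP|ADVP)-([0-9]+)$")   # index is now capture group 2

def wh_index(label):
    """Return the coindexation number of a WHNP/WHADVP label, else None."""
    m = NEW.match(label)
    return m.group(2) if m else None

# Group 1 of the new pattern is the category, not the index.
category = NEW.match("WHADVP-2").group(1)
indices = {label: wh_index(label)
           for label in ["WHNP-1", "WHADVP-2", "WHNP", "WHPP-3"]}
```

Binding #1 in the widened pattern would compare the string NP or ADVP against the trace index, which could never match.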
Homework Exercise Review • 10290 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i) • 10326 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (/^(NP|ADVP)/ < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))
Homework Exercise Review • 8575 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (NP-SBJ < /^-NONE-$/)) • 5975 @NP < NP < (SBAR < /^WH(NP|ADVP)-([0-9]+)$/#2%i << (NP-SBJ < (/^-NONE-$/ < /^\*T\*-([0-9]+)$/#1%i)))
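The two patterns above separate subject gaps from other gaps by asking whether the coindexed trace sits directly under NP-SBJ. A rough Python sketch of that classification follows. It is a simplification: it inspects every SBAR with an indexed WHNP/WHADVP child and does not require the enclosing @NP < NP configuration. The parser, the helper names and the three example trees are assumptions for illustration, not WSJ data.

```python
import re

def parse(s):
    """Parse one bracketed tree into (label, children); leaves are strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", s)
    pos = 0

    def node():
        nonlocal pos
        label = tokens[pos + 1]          # tokens[pos] is "("
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                         # consume ")"
        return (label, children)

    return node()

def subtrees(t):
    yield t
    for c in t[1]:
        if isinstance(c, tuple):
            yield from subtrees(c)

def kids(t):
    return [c for c in t[1] if isinstance(c, tuple)]

WH = re.compile(r"^WH(NP|ADVP)-([0-9]+)$")

def gap_positions(tree):
    """Yield 'subject' or 'non-subject' for each coindexed *T* trace."""
    for sbar in subtrees(tree):
        if sbar[0] != "SBAR":
            continue
        for wh in kids(sbar):
            m = WH.match(wh[0])
            if not m:
                continue
            trace = f"*T*-{m.group(2)}"
            for d in subtrees(sbar):
                # d is the parent of the empty node holding the trace
                if any(k[0] == "-NONE-" and trace in k[1] for k in kids(d)):
                    yield "subject" if d[0] == "NP-SBJ" else "non-subject"

# Hand-built example trees.
SUBJECT_RC = ("(NP (NP (DT the) (NN man)) (SBAR (WHNP-1 (WP who)) "
              "(S (NP-SBJ (-NONE- *T*-1)) (VP (VBD saw) (NP (PRP me))))))")
OBJECT_RC = ("(NP (NP (DT the) (NN man)) (SBAR (WHNP-1 (WP who)) "
             "(S (NP-SBJ (PRP I)) (VP (VBD saw) (NP (-NONE- *T*-1))))))")
ADVP_RC = ("(NP (NP (DT the) (NN time)) (SBAR (WHADVP-1 (WRB when)) "
           "(S (NP-SBJ (PRP I)) (VP (VBD left) (ADVP-TMP (-NONE- *T*-1))))))")

results = {name: list(gap_positions(parse(tree)))
           for name, tree in [("subject", SUBJECT_RC),
                              ("object", OBJECT_RC),
                              ("advp", ADVP_RC)]}
```

Like the pattern on the slide, the sketch compares against the exact label NP-SBJ, so a subject label carrying an extra index or tag would be counted as non-subject.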
Homework Exercise Review Let’s look at the *ICH* subcases:
Homework Exercise Review 159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/))
Homework Exercise Review 159 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/)) This is NOT a relative clause construction!
Homework Exercise Review • @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/)) 155 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : /^SBAR-([0-9]+)$/#1%i Only 1 out of the 4 is NOT a relative clause construction!
Homework Exercise Review • @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*/)) 155 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : /^SBAR-([0-9]+)$/#1%i • Search string is too restrictive: it misses SBAR labels that carry function tags, e.g. SBAR-PRP, SBAR-NOM
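The restriction comes from the label regex itself: /^SBAR-([0-9]+)$/ only accepts an index immediately after SBAR, so an indexed label with a function tag in between is skipped. The sketch below compares it with a looser regex of the form /^SBAR.*-([0-9]+)$/, again using Python's `re` as a stand-in for Java regexes. The concrete labels and the helper `index_of` are illustrative assumptions.

```python
import re

STRICT = re.compile(r"^SBAR-([0-9]+)$")
LOOSE = re.compile(r"^SBAR.*-([0-9]+)$")

def index_of(pattern, label):
    """Return the captured index as a string, or None if no match."""
    m = pattern.match(label)
    return m.group(1) if m else None

LABELS = ["SBAR-1", "SBAR-PRP-2", "SBAR-NOM-3", "SBAR"]

strict_hits = {l: index_of(STRICT, l) for l in LABELS}
loose_hits = {l: index_of(LOOSE, l) for l in LABELS}
```

One caveat of the looser form is that `.*` also lets through other categories that merely start with SBAR, such as an indexed SBARQ label, so the matches are still worth browsing.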
Homework Exercise Review • 116 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : (/^SBAR.*-([0-9]+)$/#1%i < /^WH(NP|ADVP)-([0-9]+)$/) • 115 @NP < NP < (SBAR < (/^-NONE-$/ < /^\*ICH\*-([0-9]+)/#1%i)) : (/^SBAR.*-([0-9]+)$/#1%i < /^WH(NP|ADVP)-([0-9]+)$/#2%j << /\*T\*-([0-9]+)/#1%j) Not a trace? BUG?
Relevance of Treebanks • Statistical parsers typically construct syntactic phrase structure • they’re trained on Treebank corpora like the Penn Treebank • Note: some use dependency graphs, not trees
Parsers trained on the Treebank • Don’t recover fully-annotated trees • not trained using nodes with indices or empty (-NONE-) nodes • not trained using functional tags, e.g. -SBJ • Therefore they don’t fully parse • Example (Stanford parser): no SBAR node in … a movie to see
Parsers trained on the Treebank • SBAR can be forced by the presence of an overt relative pronoun, but note there is no subject gap:
Parsers trained on the Treebank • Probabilities are estimated from frequency information of each node given surrounding context (e.g. parent node, or the word that heads the node) • Still these systems have enormous problems with prepositional phrase (PP) attachment • Example: (borrowed from Igor Malioutov) • A boy with a telescope kissed Mary on the lips • Mary was kissed by a boy with a telescope on the lips • PP with a telescope should adjoin to the noun phrase (NP) a boy • PP on the lips should adjoin to the verb phrase (VP) headed by kiss
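As a rough illustration of estimating probabilities from frequency information, the sketch below computes plain relative-frequency estimates for phrase structure rules, P(rule | left-hand side), from two toy trees. This is the simplest unconditioned variant; the parsers discussed here additionally condition on context such as the parent node or the head word, which this sketch does not attempt. The two trees and the function names are made up for the example.

```python
from collections import Counter

def rules(tree):
    """Yield (lhs, rhs) for every non-lexical expansion in the tree."""
    label, children = tree
    phrasal = [c for c in children if isinstance(c, tuple)]
    if phrasal:
        yield (label, tuple(c[0] for c in phrasal))
        for c in phrasal:
            yield from rules(c)

def rule_probs(trees):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    counts = Counter(r for t in trees for r in rules(t))
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {r: n / lhs_totals[r[0]] for r, n in counts.items()}

# Toy trees as (label, children); words are plain strings.
VP_TREE = ("VP", [("VBD", ["kissed"]),
                  ("NP", [("NNP", ["Mary"])]),
                  ("PP", [("IN", ["on"]),
                          ("NP", [("DT", ["the"]), ("NNS", ["lips"])])])])
NP_TREE = ("NP", [("NP", [("DT", ["a"]), ("NN", ["boy"])]),
                  ("PP", [("IN", ["with"]),
                          ("NP", [("DT", ["a"]), ("NN", ["telescope"])])])])

probs = rule_probs([VP_TREE, NP_TREE])
```

With only these two trees, NP is expanded five times, so NP → DT NN (seen twice) gets 0.4 and NP → NP PP (seen once) gets 0.2. Nothing in such counts says which phrase a particular PP should attach to.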
Active/passive sentences • Examples using the Stanford Parser: Both active and passive sentences are parsed incorrectly
Active/passive sentences • Examples: • ✗ on the lips modifies Mary • ✗ on the lips modifies telescope
Homework Exercise • Use tregex to find out how many passive sentences there are in the Treebank WSJ section • The passive construction (according to the Bracketing Guidelines) • Note: the by-phrase containing the logical subject (LGS) is optional