Context-Free Parsing, Part 2: Features and Unification

Parsing with features
• We need to constrain the rules in CFGs, for example
  • to enforce agreement within and between constituents
  • to pass features around
  • to enforce subcategorisation constraints
• Features can easily be added to our grammars
• Later we’ll see that feature bundles can completely replace constituents

Parsing with features
• Rules can stipulate values, or placeholders (variables) for values
• Features can be used within the rule, or passed up via the mother node
• Example: subject-verb agreement
  S → NP VP   [if NP and VP agree in number]
  • number of NP depends on noun and/or determiner
  • number of VP depends on verb
  S → NP(num=X) VP(num=X)
  NP(num=X) → det(num=X) n(num=X)
  VP(num=X) → v(num=X) NP(num=?)

Declarative nature of features
  NP(num=X) → det(num=X) n(num=X)
The rules can be used in various ways
• To build an NP only if det and n agree (bottom-up)
• When generating an NP, to choose an n which agrees with the det (if working L-to-R) (top-down)
• To show that the num value for an NP comes from its components (percolation)
• To ensure that the num value is correctly set when generating an NP (inheritance)
• To block ill-formed input
Lexicon:
  this   det(num=sg)
  these  det(num=pl)
  the    det(num=?)
  man    n(num=sg)
  men    n(num=pl)
Example: this man
  NP(num=sg) → det(num=sg) n(num=sg)

Use of variables
  NP(num=X) → det(num=X) n(num=X)
• Unbound (unassigned) variables (i.e. variables with a free value):
  • the can combine with any value for num
• Unification means that the num value for the is set to sg
Lexicon:
  this   det(num=sg)
  these  det(num=pl)
  the    det(num=?)
  man    n(num=sg)
  men    n(num=pl)
Example: the man
  NP(num=sg) → det(num=?) n(num=sg)
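
The behaviour of unbound variables described above can be sketched in a few lines of Python. This is only an illustration, not the formalism used on the slides: `None` is an assumed encoding of `num=?`, and `unify_value` and `build_np` are hypothetical helper names.

```python
# None stands for an unbound value, written num=? on the slides (an assumption
# of this sketch, not part of the original notation).
LEXICON = {
    "this": ("det", "sg"),
    "these": ("det", "pl"),
    "the": ("det", None),
    "man": ("n", "sg"),
    "men": ("n", "pl"),
}


def unify_value(a, b):
    """Unify two atomic values; return (ok, value)."""
    if a is None:
        return True, b
    if b is None or a == b:
        return True, a
    return False, None


def build_np(det_word, noun_word):
    """Apply NP(num=X) -> det(num=X) n(num=X); return None if blocked."""
    det_cat, det_num = LEXICON[det_word]
    n_cat, n_num = LEXICON[noun_word]
    if det_cat != "det" or n_cat != "n":
        return None
    ok, num = unify_value(det_num, n_num)
    return ("NP", num) if ok else None


print(build_np("this", "man"))   # ('NP', 'sg')
print(build_np("the", "men"))    # ('NP', 'pl')
print(build_np("these", "man"))  # None
```

The same function covers the bottom-up check and percolation: the NP is built only when det and n agree, and its num value comes from whichever daughter supplies one.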

Parsing with features
• Features must be compatible
• Formalism should allow features to remain unspecified
• Feature mismatch can be used to block false analyses, and to disambiguate
  • e.g. they can fish ~ he can fish ~ he cans fish
• Formalism may have attribute-value pairs, or rely on argument position, e.g.
  NP(_num,_sem) → det(_num) n(_num,_sem)
  an = det(sing)
  the = det(_num)
  man = n(sing,hum)

Parsing with features
  VP → v         e.g. dance
  VP → v NP      e.g. eat
  VP → v NP NP   e.g. give
  VP → v PP      e.g. wait (for)
• Using features to impose subcategorization constraints
  VP(_num) → v(_num,intr)
  VP(_num) → v(_num,trans) NP
  VP(_num) → v(_num,ditrans) NP NP
  VP(_num) → v(_num,prepobj(_case)) PP(_case)
  PP(_case) → prep(_case) NP
  dance = v(plur,intr)
  dances = v(sing,intr)
  danced = v(_num,intr)
  waits = v(sing,prepobj(for))
  for = prep(for)
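
As a rough illustration of how the subcategorisation feature filters VP rules, the sketch below checks a verb's frame against the complements it is offered. The table `FRAMES`, the function `vp_number`, and the use of `None` for the unbound `_num` are assumptions made for this example; the lexical entries are the ones listed on the slide.

```python
# Complement categories expected by each subcategorisation frame.
FRAMES = {
    "intr": [],
    "trans": ["NP"],
    "ditrans": ["NP", "NP"],
}

# Verb entries from the slide: (num, frame). None = unbound _num.
VERBS = {
    "dance": ("plur", "intr"),
    "dances": ("sing", "intr"),
    "danced": (None, "intr"),
    "waits": ("sing", ("prepobj", "for")),
}


def vp_number(verb, complements):
    """Return (True, num) if the verb accepts these complements, else (False, None).

    complements is a list such as [], ["NP"] or [("PP", "for")].
    """
    num, frame = VERBS[verb]
    if isinstance(frame, tuple):
        # prepobj(_case) requires a PP carrying the same _case value.
        expected = [("PP", frame[1])]
    else:
        expected = FRAMES[frame]
    if complements == expected:
        return True, num
    return False, None


print(vp_number("dances", []))               # (True, 'sing')
print(vp_number("dances", ["NP"]))           # (False, None)
print(vp_number("waits", [("PP", "for")]))   # (True, 'sing')
```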

Parsing with features (top-down)
Grammar:
  S → NP(_num) VP(_num)
  NP(_num) → det(_num) n(_num)
  VP(_num) → v(_num,intrans)
  VP(_num) → v(_num,trans) NP(_1)
Input: the man shot those elephants
Derivation:
  1. S → NP(_num) VP(_num)
  2. NP(_num) → det(_num) n(_num), with the = det(_num) and man = n(sing), so _num=sing
  3. VP(sing) → v(sing,intrans) is tried, but shot = v(sing,trans), so the features do not match
  4. VP(sing) → v(sing,trans) NP(_1), matching shot = v(sing,trans)
  5. NP(_1) → det(_1) n(_1), with those = det(pl) and elephants = n(pl), so _1=pl
Resulting tree:
  [S [NP(sing) [det the] [n man]] [VP(sing) [v(sing,trans) shot] [NP(pl) [det those] [n elephants]]]]

Parsing with features (bottom-up)
Grammar:
  S → NP(_num) VP(_num)
  NP(_num) → det(_num) n(_num)
  VP(_num) → v(_num,intrans)
  VP(_num) → v(_num,trans) NP(_1)
Input: the man shot those elephants
Derivation:
  1. Lexical lookup: the = det(_num), man = n(sing), shot = v(sing,trans), those = det(pl), elephants = n(pl)
  2. NP(_num) → det(_num) n(_num) builds NP(sing) from the man and NP(pl) from those elephants
  3. VP(_num) → v(_num,trans) NP(_1) builds VP(sing) from shot and NP(pl)
  4. S → NP(_num) VP(_num) combines NP(sing) and VP(sing), with _num=sing
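
The bottom-up derivation can be approximated with a small greedy shift-reduce recogniser. This is a sketch under several assumptions: the four rules are hard-coded, `None` encodes an unbound `_num`, reductions are applied greedily without backtracking, and S is given the shared num value purely for inspection. The names `unify`, `reduce_once` and `recognise` are hypothetical.

```python
LEXICON = {
    "the": ("det", None),            # None = unbound _num
    "man": ("n", "sing"),
    "shot": ("v", "sing", "trans"),
    "those": ("det", "pl"),
    "elephants": ("n", "pl"),
}


def unify(a, b):
    """Unify atomic values; None is unbound. Returns (ok, value)."""
    if a is None:
        return True, b
    if b is None or a == b:
        return True, a
    return False, None


def reduce_once(stack):
    """Try each rule on the top of the stack; return True if one applied."""
    top2 = [item[0] for item in stack[-2:]]
    # NP(_num) -> det(_num) n(_num)
    if top2 == ["det", "n"]:
        ok, num = unify(stack[-2][1], stack[-1][1])
        if ok:
            stack[-2:] = [("NP", num)]
            return True
    # VP(_num) -> v(_num,trans) NP(_1)
    if top2 == ["v", "NP"] and stack[-2][2] == "trans":
        stack[-2:] = [("VP", stack[-2][1])]
        return True
    # VP(_num) -> v(_num,intrans)
    if top2[-1:] == ["v"] and stack[-1][2] == "intrans":
        stack[-1] = ("VP", stack[-1][1])
        return True
    # S -> NP(_num) VP(_num)
    if top2 == ["NP", "VP"]:
        ok, num = unify(stack[-2][1], stack[-1][1])
        if ok:
            stack[-2:] = [("S", num)]
            return True
    return False


def recognise(words):
    """Shift each word, reducing as far as possible after every shift."""
    stack = []
    for word in words:
        stack.append(LEXICON[word])
        while reduce_once(stack):
            pass
    return stack


print(recognise("the man shot those elephants".split()))  # [('S', 'sing')]
```

An input such as "those man shot the elephants" leaves more than one item on the stack, because det(pl) and n(sing) fail to unify and no NP is built for the subject.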

Feature structures
• Instead of attaching features to the symbols, we can parse with symbols made up entirely of attribute-value pairs: “feature structures”
  [ATTR1 VAL1, ATTR2 VAL2, ATTR3 VAL3]
• Can be used in the same way as seen previously
• Values can be atomic …
  [CAT NP, NUMBER SG, PERSON 3]
• … or embedded feature structures …
  [CAT NP, AGR [NUM SG, PERS 3]]

Feature structures
• … or they can be coindexed
  [CAT S, HEAD [AGR (1) [NUM SG, PERS 3], SUBJ [AGR (1)]]]

Parsing with feature structures
• Grammar rules can specify assignments to or equations between feature structures
• Expressed as “feature paths”, e.g. HEAD.AGR.NUM = SG
  [CAT S, HEAD [AGR (1) [NUM SG, PERS 3], SUBJ [AGR (1)]]]

Feature unification
• Feature structures can be unified if
  • They have like-named attributes that have the same value:
    [NUM SG] ⊔ [NUM SG] = [NUM SG]
  • Like-named attributes that are “open” get the value assigned:
    [CAT NP, NUMBER ??, PERSON 3] ⊔ [NUMBER SG, PERSON 3] = [CAT NP, NUMBER SG, PERSON 3]

Feature unification
• Complementary features are brought together:
  [CAT NP, NUMBER SG] ⊔ [PERSON 3] = [CAT NP, NUMBER SG, PERSON 3]
• Unification is recursive:
  [CAT NP, AGR [NUM SG]] ⊔ [CAT NP, AGR [PERS 3]] = [CAT NP, AGR [NUM SG, PERS 3]]
• Coindexed structures are identical (not just copies): assignment to one affects all
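
The unification cases on these slides can be reproduced with a short recursive function over nested dictionaries. This is a minimal sketch, not a full implementation: `None` is an assumed encoding of an open value (`??`), the result is a new dictionary, and coindexing is not modelled. `unify` and `UnificationFailure` are names chosen for this example.

```python
class UnificationFailure(Exception):
    """Raised when two feature structures carry incompatible values."""


def unify(a, b):
    """Return the unification of a and b, or raise UnificationFailure."""
    if a is None:            # open value: take whatever the other side has
        return b
    if b is None:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for attr, b_val in b.items():
            # Shared attributes are unified recursively; others are added.
            result[attr] = unify(a[attr], b_val) if attr in a else b_val
        return result
    if a == b:               # identical atomic values
        return a
    raise UnificationFailure(f"{a!r} does not unify with {b!r}")


print(unify({"CAT": "NP", "NUMBER": None, "PERSON": 3},
            {"NUMBER": "SG", "PERSON": 3}))
# {'CAT': 'NP', 'NUMBER': 'SG', 'PERSON': 3}

print(unify({"CAT": "NP", "AGR": {"NUM": "SG"}},
            {"CAT": "NP", "AGR": {"PERS": 3}}))
# {'CAT': 'NP', 'AGR': {'NUM': 'SG', 'PERS': 3}}
```

Calling `unify({"NUM": "SG"}, {"NUM": "PL"})` raises `UnificationFailure`, which is how a mismatch blocks an analysis.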

Example
Rule:
  [CAT NP, AGR _1 ⊔ _2, SEM _3] → [CAT DET, AGR _1] [CAT N, AGR _2, SEM _3]
Lexicon:
  a    [CAT DET, AGR [VAL INDEF, NUM SG]]
  the  [CAT DET, AGR [VAL DEF]]
  man  [CAT N, LEX “man”, AGR [NUM SG], SEM HUM]

Example: the man
  the  [CAT DET, AGR [VAL DEF]]
  man  [CAT N, LEX “man”, AGR [NUM SG], SEM HUM]
Rule:
  [CAT NP, AGR _1 ⊔ _2, SEM _3] → [CAT DET, AGR _1] [CAT N, AGR _2, SEM _3]
Bindings: _1 = [VAL DEF], _2 = [NUM SG], _3 = HUM
Result:
  the man  [CAT NP, AGR [VAL DEF, NUM SG], SEM HUM]

Example: a man
  a    [CAT DET, AGR [VAL INDEF, NUM SG]]
  man  [CAT N, LEX “man”, AGR [NUM SG], SEM HUM]
Rule:
  [CAT NP, AGR _1 ⊔ _2, SEM _3] → [CAT DET, AGR _1] [CAT N, AGR _2, SEM _3]
Bindings: _1 = [VAL INDEF, NUM SG], _2 = [NUM SG], _3 = HUM
Result:
  a man  [CAT NP, AGR [VAL INDEF, NUM SG], SEM HUM]
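
The two worked examples can be replayed in code. The sketch below assumes feature structures are nested dictionaries and hard-codes the single NP rule; `unify` and `build_np` are hypothetical helpers, and the entry for "men" is an added hypothetical item (not on the slides) used only to show a blocked combination.

```python
def unify(a, b):
    """Recursive unification of nested dicts; returns None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for attr, value in b.items():
            if attr in out:
                merged = unify(out[attr], value)
                if merged is None:
                    return None
                out[attr] = merged
            else:
                out[attr] = value
        return out
    return a if a == b else None


LEXICON = {
    "a": {"CAT": "DET", "AGR": {"VAL": "INDEF", "NUM": "SG"}},
    "the": {"CAT": "DET", "AGR": {"VAL": "DEF"}},
    "man": {"CAT": "N", "LEX": "man", "AGR": {"NUM": "SG"}, "SEM": "HUM"},
    # Hypothetical entry, not on the slides, to demonstrate a clash.
    "men": {"CAT": "N", "LEX": "men", "AGR": {"NUM": "PL"}, "SEM": "HUM"},
}


def build_np(det_word, noun_word):
    """Apply the NP rule: AGR is the unification of _1 and _2, SEM is _3."""
    det, noun = LEXICON[det_word], LEXICON[noun_word]
    if det["CAT"] != "DET" or noun["CAT"] != "N":
        return None
    agr = unify(det["AGR"], noun["AGR"])
    if agr is None:
        return None
    return {"CAT": "NP", "AGR": agr, "SEM": noun["SEM"]}


print(build_np("the", "man"))
# {'CAT': 'NP', 'AGR': {'VAL': 'DEF', 'NUM': 'SG'}, 'SEM': 'HUM'}
print(build_np("a", "men"))
# None
```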

Types and inheritance
• Feature typing allows us to constrain the possible values a feature can have
  • e.g. num = {sing,plur}
• Allows grammars to be checked for consistency, and can make parsing easier
• We can express general “feature co-occurrence conditions” …
• And “feature inheritance rules”
• Both of these allow us to make the grammar more compact
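
A consistency check of the kind that feature typing makes possible might look like the following sketch. The table `FEATURE_TYPES` and the function `type_errors` are assumptions for illustration; the `num` type is the one given above, while `gen` and `sex` are borrowed from the later slide on inheritance rules.

```python
# Declared value sets for typed features. Features not listed are unchecked.
FEATURE_TYPES = {
    "num": {"sing", "plur"},
    "gen": {"masc", "fem", "neut"},
    "sex": {"male", "female"},
}


def type_errors(entry):
    """Return the (feature, value) pairs that violate the declared types."""
    errors = []
    for feature, value in entry.items():
        allowed = FEATURE_TYPES.get(feature)
        # None is treated as an unspecified value and is always accepted.
        if allowed is not None and value is not None and value not in allowed:
            errors.append((feature, value))
    return errors


print(type_errors({"cat": "n", "num": "sing"}))   # []
print(type_errors({"cat": "n", "num": "dual"}))   # [('num', 'dual')]
```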

Co-occurrence conditions and inheritance rules
• General rules (beyond simple unification) which apply automatically, and so do not need to be stated (and repeated) in each rule or lexical entry
• Examples:
  [cat=np] → [num=??, gen=??, case=??]
  [cat=v,num=sg] → [tns=pres]
  [attr1=val1] → [attr2=val2]

Inheritance rules
• Inheritance rules can be overridden, e.g.
  [cat=n] → [gen=??, sem=??]
  sex = {male,female}
  gen = {masc,fem,neut}
  [cat=n,gen=fem,sem=hum] → [sex=female]
  uxor      [cat=n,gen=fem,sem=hum]
  agricola  [cat=n,gen=fem,sem=hum,sex=male]
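
The override behaviour in the uxor/agricola example can be sketched as default filling: an inheritance rule supplies a value only where the lexical entry has not already stated one. `DEFAULTS` and `expand` are hypothetical names, and flat dictionaries are an assumed encoding of the entries.

```python
# Each inheritance rule: (condition on the entry, default features it supplies).
DEFAULTS = [
    ({"cat": "n", "gen": "fem", "sem": "hum"}, {"sex": "female"}),
]


def expand(entry):
    """Add defaults whose condition matches, keeping explicitly stated values."""
    full = dict(entry)
    for condition, defaults in DEFAULTS:
        if all(entry.get(attr) == val for attr, val in condition.items()):
            for attr, val in defaults.items():
                full.setdefault(attr, val)   # explicit values win
    return full


uxor = {"cat": "n", "gen": "fem", "sem": "hum"}
agricola = {"cat": "n", "gen": "fem", "sem": "hum", "sex": "male"}

print(expand(uxor)["sex"])      # female (inherited)
print(expand(agricola)["sex"])  # male (the entry overrides the rule)
```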

Unification in Linguistics
• Lexical Functional Grammar
  • If interested, see the PARGRAM project
• GPSG, HPSG
• Construction Grammar
• Categorial Grammar