Fact Extraction using CNL: summary of reasoning (v3). David Mott, Dave Braines(ETS, IBM UK) Stephen Poteet (Boeing) February 2012. Objective of reasoning. Define rules for semantic reasoning entities situations Standardise processing of NL and CNL to use the same concepts and rules
Fact Extraction using CNL: summary of reasoning (v3)
David Mott, Dave Braines (ETS, IBM UK), Stephen Poteet (Boeing)
February 2012
Objective of reasoning
• Define rules for semantic reasoning
  • entities
  • situations
• Standardise processing of NL and CNL to use the same concepts and rules
  • NL processing currently done by basic rules in two agents
  • CNL processing currently done by "linguistic frames"
  • Two different rule interpreters, but they use the same rules
Conceptual Model(s)
Diagram: "Our" Semiotic Triangle, based on the original [Ogden, C. K. and Richards, I. A. (1923)].
  Corners: meaning, thing, symbol.
  Edges: expresses, conceptualises, stands for.
Current NL Processing
Our focus is on the semantics of the conceptual model.
Diagram components (in order of appearance): SYNCOIN Reports; Message PreProcessor; Proper Nouns (places, units); Stanford Parser; Names; Entity Extractor; Situation Extractor; CEStore; CE Aggregator; "Stylistic" CE; Conceptual Model (concepts, logical rules, linguistic expression); For Analysis.
Meta facts – the hard stuff!
• Conceptualise statements:
    conceptualise a ~ person ~ P.
    conceptualise the person P ~ is married to ~ the person P1.
• can be written in "meta facts", about the concepts themselves:
    there is an entity concept named 'person'.
    there is a relation concept named 'is married to' that has the entity concept 'person' as domain and has the entity concept 'person' as range.
  (Diagram: the relation concept 'is married to' has the entity concept 'person' as domain and as range.)
• These meta facts can be used to talk about the concepts:
  • the conceptual model m1 contains the entity concept 'person'.
  • the relation concept 'is married to' is a symmetric relation.
• or to map between words and concepts:
  • the verb '|marry|' expresses the relation concept 'is married to'.
Meta facts and object facts
• Most "normal" facts are not meta facts:
  • the person John is married to the person Jane.
• Sometimes we need to bridge the world of meta facts and "normal" (object) facts
Diagram: three groups of facts linked by a "magic mapping":
  words in a sentence:
    the verb phrase has the verb |marry| as head.
    the noun phrase has the noun |person| as head and |John| as dependent.
  meta facts about the things and relations that exist:
    the relation concept 'is married to' has ....   (What do we put in here?)
  object facts about the things and relations that exist:
    the person John is married to the person Jane.
Meta facts – Mapping entities
Meta level fact:
    the thing John realises the entity concept person
Object level fact (reached through the magic mapping):
    the thing John is a person
Meta rule:
    if ( the thing T realises the entity concept EC )
    then ( the thing T is a < EC > )
Meta facts – Mapping relations
Meta level fact:
    the relation concept 'is married to' has the sequence ( the person John , and the person Jane ) as relation realisation
Object level fact (reached through the magic mapping):
    the person John is married to the person Jane.
Diagram: the sequence holds the person John at position "1" and the person Jane at position "2".
Meta rule:
    if ( the relation concept RC has the sequence ( the thing T , and the thing T2 ) as relation realisation )
    then ( the thing T < RC > the thing T2 )
Meta facts – Mapping attributes
Meta level fact:
    the attribute concept 'sister' has the sequence ( the person John , and the person Jane ) as attribute realisation.
Object level fact (reached through the magic mapping):
    the person John has the person Jane as sister
Diagram: the sequence holds the person John at position "1" and the person Jane at position "2".
Meta rule:
    if ( the attribute concept AC has the sequence ( the thing T , and the thing T2 ) as attribute realisation )
    then ( the thing T has the thing T2 as < AC > )
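The three meta rules for entities, relations and attributes can be sketched in Python. This is only an illustration of the "magic mapping": the tuple encoding of meta facts and the function name to_object_facts are assumptions made for this example, not part of the CE store or the rule interpreter.

```python
# Hypothetical tuple encodings of meta facts (an assumption for this sketch):
#   ("realises", thing, entity_concept)
#   ("relation realisation", relation_concept, thing, thing2)
#   ("attribute realisation", attribute_concept, thing, thing2)

def to_object_facts(meta_facts):
    """Apply the three meta rules and return object level CE sentences."""
    sentences = []
    for fact in meta_facts:
        kind = fact[0]
        if kind == "realises":
            # the thing T realises the entity concept EC => the thing T is a <EC>
            _, thing, concept = fact
            sentences.append(f"the thing {thing} is a {concept}.")
        elif kind == "relation realisation":
            # RC has ( T , and T2 ) as relation realisation => T <RC> T2
            _, concept, thing, thing2 = fact
            sentences.append(f"the thing {thing} {concept} the thing {thing2}.")
        elif kind == "attribute realisation":
            # AC has ( T , and T2 ) as attribute realisation => T has T2 as <AC>
            _, concept, thing, thing2 = fact
            sentences.append(f"the thing {thing} has the thing {thing2} as {concept}.")
    return sentences


meta = [
    ("realises", "John", "person"),
    ("relation realisation", "is married to", "John", "Jane"),
    ("attribute realisation", "sister", "John", "Jane"),
]
for sentence in to_object_facts(meta):
    print(sentence)
```

The sketch substitutes the concept name textually into the sentence, which is the behaviour the angle-bracket notation suggests; the actual semantics would still need to be defined in the rule interpreter.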
Add meta syntax to CE rules?
    if ( the word W expresses the entity concept EC ) and .... then ( the thing T is a < EC > ).
    if ... then ( the thing T < RC > the thing T2 ).
    if ... then ( the thing T has the thing T2 as < AC > ).
Magic mapping occurs in the rule interpreter (need to define semantics)
Logic of Entities
"the patrol in East Rashid discovers the facility."
Diagram:
    the noun phrase np1 has the noun |patrol| as head and stands for the thing s1
    the noun |patrol| expresses the entity concept 'patrol unit'   (Analyst's helper)
    the thing s1 realises the entity concept 'patrol unit'
    the thing s1 is a patrol unit
[ nn_cat_ent_1 ]
    if ( the noun phrase NP has the noun N as head and stands for the thing T ) and
       ( the noun N expresses the entity concept EC )
    then ( the thing T realises the entity concept EC ) .
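A minimal sketch of how the nn_cat_ent_1 rule could be evaluated over parsed facts is given below. The dictionary encodings (heads, stands_for, expresses) are assumptions for illustration, and each noun is simplified to express at most one concept.

```python
def nn_cat_ent_1(heads, stands_for, expresses):
    """Derive 'thing realises entity concept' pairs.

    heads:      {noun phrase: head noun}       (hypothetical encoding)
    stands_for: {noun phrase: thing}
    expresses:  {noun: entity concept}, as supplied by the Analyst's helper
    """
    realises = set()
    for noun_phrase, noun in heads.items():
        thing = stands_for.get(noun_phrase)
        concept = expresses.get(noun)
        # Both premises of the rule must hold before the conclusion is added.
        if thing is not None and concept is not None:
            realises.add((thing, concept))
    return realises


heads = {"np1": "patrol"}
stands_for = {"np1": "s1"}
expresses = {"patrol": "patrol unit"}
print(nn_cat_ent_1(heads, stands_for, expresses))
```

A noun with no "expresses" entry simply produces no conclusion, which is where feedback to the analyst about unrecognised words would be useful.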
Proper Names
Diagram:
    the noun phrase np1 has the proper noun |East Rashid| as proper name head and stands for the thing s1
    the thing s1 has |East Rashid| as common name
A "common name" defines a "well known" name that may be used when viewing the output CE as the name of the entity (but care is needed as uniqueness will only be within a certain context)
[ nn_comname ]
    if ( the noun phrase NP stands for the thing T and has the proper noun N as proper name head )
    then ( the thing T has the value N as common name ) .
Adjectives
"the Christian market"
Handled similarly to nouns, but currently requires the conceptual model to contain a noun form of the adjective, e.g. "christian entity"
Diagram:
    the noun phrase np1 has the adjective |Christian| as dependent and stands for the thing s1
    the adjective |Christian| expresses the entity concept 'christian entity'   (Analyst's helper)
    the thing s1 realises the entity concept 'christian entity'
    the thing s1 is a christian entity
[ nn_cat_ent_2 ]
    if ( the noun phrase NP has the word W as dependent and stands for the thing T ) and
       ( the word W expresses the entity concept EC )
    then ( the thing T realises the entity concept EC ) .
Needs more work here
Containers
"the patrol in East Rashid discovers the facility."
Diagram:
    the noun phrase np1 has the prepositional phrase pp1 as dependent and stands for the patrol unit p1
    the prepositional phrase pp1 has the preposition |in| as head and has the noun phrase np2 as object
    the noun phrase np2 stands for the thing t2
    the thing t2 is a container
    the patrol unit p1 is contained in the thing t2
[ nn_prep_in ]
    if ( the noun phrase NP has the prepositional phrase PP as dependent and stands for the thing T ) and
       ( the prepositional phrase PP has the preposition '|in|' as head and has the noun phrase NP1 as object ) and
       ( the noun phrase NP1 stands for the thing T1 )
    then ( the thing T1 is a container ) .
[ nn_prep_in_1 ]
    if ( the noun phrase NP has the prepositional phrase PP as dependent and stands for the thing T ) and
       ( the prepositional phrase PP has the preposition '|in|' as head and has the noun phrase NP1 as object ) and
       ( the noun phrase NP1 stands for the container T1 )
    then ( the thing T is contained in the container T1 ) .
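The two container rules share their premises, so they can be sketched as a single pass over the parse. The dictionary encodings and the function name container_rules below are assumptions for illustration only.

```python
def container_rules(np_dependents, pp_info, stands_for):
    """Apply nn_prep_in and nn_prep_in_1 together.

    np_dependents: {noun phrase: dependent prepositional phrase}  (hypothetical)
    pp_info:       {prepositional phrase: (preposition, object noun phrase)}
    stands_for:    {noun phrase: thing}
    Returns (containers, contained_in).
    """
    containers = set()
    contained_in = set()
    for noun_phrase, prep_phrase in np_dependents.items():
        preposition, object_np = pp_info.get(prep_phrase, (None, None))
        if preposition != "in":
            continue
        thing = stands_for.get(noun_phrase)
        thing1 = stands_for.get(object_np)
        if thing is None or thing1 is None:
            continue
        containers.add(thing1)              # nn_prep_in
        contained_in.add((thing, thing1))   # nn_prep_in_1
    return containers, contained_in


np_dependents = {"np1": "pp1"}
pp_info = {"pp1": ("in", "np2")}
stands_for = {"np1": "p1", "np2": "t2"}
print(container_rules(np_dependents, pp_info, stands_for))
```

In a general rule engine nn_prep_in_1 would fire only after nn_prep_in has classified T1 as a container; doing both in one pass is a shortcut taken in this sketch.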
"Same as" processing • Two things may be determined to be the same entity, in which case their properties and relations are cross propagated if ( the thing T is the same as the thing T1 ) and ( the thing T is an < EC > ) then ( the thing T1 is an < EC > ). if ( the thing T is the same as the thing T1 ) and ( the thing T < RC> the thing T3 ) then ( the thing T1 < RC > the thing T3 ). if ( the thing T is the same as the thing T1 ) and ( the thing T has V as < AC > ) then ( the thing T1 has V as < AC > ) . A meta statement of propagation Sameas inference is implemented in teh Prolog rule engine, but leads to large number of inferences; may need a better implementation
Common Names
Things with the same common name are the same thing (this is an assumption that the common name is unique)
    if ( there is a thing named T that has the proper noun PN as common name ) and
       ( there is a thing named T1 that has the proper noun PN as common name ) and
       ( the thing T # the thing T1 )
    then ( the thing T is the same as the thing T1 ) .
Proper Names
This is the way to identify the entities in noun phrases as already-known places/people/organisations etc, using a preexisting set of common names
    there is a place named 1234 that has the proper noun |East Rashid| as common name and has '32,33' as coordinates and is located in the place |Afghanistan|.
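One possible way to keep same-as processing cheap is to group things into equivalence classes and store facts once per class, rather than copying every fact to every member. The sketch below uses a union-find structure for this; it is a swapped-in technique offered as an assumption, not the Prolog implementation mentioned in these slides, and all names in it (SameAs, merge_by_common_name, facts_about) are hypothetical.

```python
class SameAs:
    """Union-find over things; each class has one representative."""

    def __init__(self):
        self.parent = {}

    def find(self, thing):
        self.parent.setdefault(thing, thing)
        while self.parent[thing] != thing:
            # Path halving keeps later lookups short.
            self.parent[thing] = self.parent[self.parent[thing]]
            thing = self.parent[thing]
        return thing

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def merge_by_common_name(common_names, same_as):
    """Things sharing a common name are the same (assumes names are unique)."""
    first_seen = {}
    for thing, proper_noun in common_names:
        if proper_noun in first_seen:
            same_as.union(first_seen[proper_noun], thing)
        else:
            first_seen[proper_noun] = thing


def facts_about(thing, facts, same_as):
    """Return (subject, predicate, value) facts for the whole class of `thing`."""
    representative = same_as.find(thing)
    result = set()
    for subject, predicate, value in facts:
        if same_as.find(subject) == representative:
            # Values that are themselves known things are mapped to their class.
            if value in same_as.parent:
                value = same_as.find(value)
            result.add((representative, predicate, value))
    return sorted(result)


same_as = SameAs()
merge_by_common_name([("1234", "East Rashid"), ("t1", "East Rashid")], same_as)
facts = {("1234", "is a", "place"), ("t1", "is a", "container")}
print(facts_about("t1", facts, same_as))
```

Asking about either member of a class returns the facts of both, so the cross propagation described by the rules is obtained at query time without materialising the duplicated facts.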
Places
Factbase of Names:
    there is a place named 1234 that has the proper noun |East Rashid| as common name and has '32,31' as coordinates.
Extracted from the sentence:
    the thing t1 has the proper noun |East Rashid| as common name.
    the patrol unit p1 is contained in the container t1
After sameas processing:
    the thing t1 is a place
    the patrol unit p1 is located in the place t1
[ place_1 ]
    if ( the thing T is contained in the container P ) and ( the container P is a place )
    then ( the thing T is located in the place P ) .
Specific ACM semantics
[ attack_perp_1 ]
    if ( the attack A has the agent A1 as agent role ) then ( the attack A has the agent A1 as perpetrator ) .
[ attack_targ ]
    if ( the attack A has the thing A1 as patient role ) then ( the attack A has the thing A1 as target ) .
[ discovery_finder_1 ]
    if ( the discovery D has the agent A1 as agent role ) then ( the discovery D has the agent A1 as finder ) .
[ discovery_find ]
    if ( the discovery D has the thing A1 as patient role ) then ( the discovery D has the thing A1 as find ) .
This needs to be done for each concept, by the analyst (Analyst's helper).
Is this really necessary? Could it be handled by meta level rules based on the range and domain of the entity concepts? We would need to define a new relation in the conceptualise?
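The four rules above follow one pattern, which suggests they could be driven from a table instead of being written out per concept. The sketch below shows that idea; the table ROLE_NAMES and the function specialise_roles are hypothetical, and whether such a table belongs in the conceptual model is the open question raised on this slide.

```python
# (situation concept, generic role) -> ACM specific attribute.
# The entries mirror the four rules on the slide; the table itself is an assumption.
ROLE_NAMES = {
    ("attack", "agent role"): "perpetrator",
    ("attack", "patient role"): "target",
    ("discovery", "agent role"): "finder",
    ("discovery", "patient role"): "find",
}


def specialise_roles(situation_types, role_facts):
    """Map generic agent/patient roles to concept specific attributes.

    situation_types: {situation: entity concept}
    role_facts:      iterable of (situation, role, filler)
    """
    derived = set()
    for situation, role, filler in role_facts:
        key = (situation_types.get(situation), role)
        if key in ROLE_NAMES:
            derived.add((situation, ROLE_NAMES[key], filler))
    return derived


situation_types = {"s1": "discovery"}
role_facts = [("s1", "agent role", "p1"), ("s1", "patient role", "f1")]
print(sorted(specialise_roles(situation_types, role_facts)))
```

With this layout the analyst would supply only the table rows for each new concept rather than a pair of rules.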
Logic of situations
"the patrol in East Rashid discovers the facility."
Diagram:
    the verb phrase v1 has the verb |discover| as head and stands for the thing s1
    the verb |discover| expresses the relation concept 'finds'   (Analyst's helper)
    the thing s1 is a situation
    the situation s1 is viewed relationally as the relation concept 'finds'
"finds" is really a relation but the situation is an entity, so we need to reconcile these views, hence "is viewed relationally"
[ vb_sit ]
    if ( the verb phrase VB stands for the thing T )
    then ( the thing T is a situation ) .
[ vb_cat_ent ]
    if ( the verb phrase VB has the verb PT as head and stands for the situation T ) and
       ( the verb PT expresses the relation concept RC )
    then ( the situation T is viewed relationally as the relation concept RC ) .
Logic of situations (2)
"the patrol in East Rashid discovers the facility."
Diagram:
    the verb phrase v1 has the verb |discovers| as head and stands for the situation s1
    the verb |discovers| expresses the relation concept 'finds'   (Analyst's helper)
    the entity concept 'discovery' reifies the relation concept 'finds'   (Analyst's helper)
    the situation s1 is viewed relationally as the relation concept 'finds'
    the situation s1 realises the entity concept 'discovery'
    the situation s1 is a discovery
[ gen_reify ]
    if ( the situation S is viewed relationally as the relation concept RC ) and
       ( the entity concept EC reifies the relation concept RC )
    then ( the situation S realises the entity concept EC ) .
Logic of situations (3)
"the patrol in East Rashid discovers the facility."
Diagram:
    the sentence s1 has the verb phrase v1 as head and has the noun phrase np1 as dependent
    the verb phrase v1 has the verb |finds| as head and has the noun phrase np2 as dependent
    the verb |finds| expresses the relation concept 'finds'
    the entity concept 'discovery' reifies the relation concept 'finds'
    the verb phrase v1 stands for the discovery s1, which is viewed relationally as the relation concept 'finds'
    the noun phrase np1 stands for the patrol p1; the noun phrase np2 stands for the facility f1
    the discovery s1 has the patrol p1 as agent role and has the facility f1 as patient role
[ vb_patient ]
    if ( the verb phrase VB has the noun phrase NP as dependent and stands for the situation VBT ) and
       ( the noun phrase NP stands for the thing NPT )
    then ( the situation VBT has the thing NPT as patient role ) .
[ vb_agent ]
    if ( the sentence phrase SP has the noun phrase NP as dependent and has the verb phrase VB as head ) and
       ( the noun phrase NP stands for the thing NPT ) and
       ( the verb phrase VB stands for the situation VBT )
    then ( the situation VBT has the thing NPT as agent role ) .
Logic of situations (4)
"the patrol in East Rashid discovers the facility."
Diagram: as on the previous slide, with the additional conclusions
    the patrol p1 finds the facility f1
    the patrol p1 is an agent
[ gen_relation ]
    if ( the situation S has the thing A as agent role and has the thing B as patient role and is viewed relationally as the relation concept RC )
    then ( the relation concept RC has the sequence ( the thing A , and the thing B ) as relation instance ).
[ gen-relation-domain ]
    if ( the situation S has the thing A as agent role and is viewed relationally as the relation concept RC ) and
       ( the relation concept RC has the entity concept EC as domain )
    then ( the thing A realises the entity concept EC ).
[ gen-relation-range ]
    if ( the situation S has the thing B as patient role and is viewed relationally as the relation concept RC ) and
       ( the relation concept RC has the entity concept EC as range )
    then ( the thing B realises the entity concept EC ).
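The chain of situation rules (vb_sit, vb_agent, vb_patient, vb_cat_ent, gen_reify, gen_relation) can be traced for the example sentence with a short sketch. The function situation_facts and its dictionary inputs are assumptions for illustration; the parse is taken as already reduced to a verb, a situation and the things its agent and patient noun phrases stand for.

```python
def situation_facts(verb, situation, agent, patient, expresses, reifies):
    """Return CE sentences derived for one verb phrase.

    expresses: {verb: relation concept}            (from the Analyst's helper)
    reifies:   {relation concept: entity concept}  (from the Analyst's helper)
    """
    facts = [
        f"the thing {situation} is a situation.",                         # vb_sit
        f"the situation {situation} has the thing {agent} as agent role.",     # vb_agent
        f"the situation {situation} has the thing {patient} as patient role.", # vb_patient
    ]
    relation = expresses.get(verb)
    if relation is None:
        return facts  # the verb is not mapped to a relation concept
    facts.append(                                                          # vb_cat_ent
        f"the situation {situation} is viewed relationally as "
        f"the relation concept '{relation}'."
    )
    entity = reifies.get(relation)
    if entity is not None:                                                 # gen_reify
        facts.append(
            f"the situation {situation} realises the entity concept '{entity}'."
        )
    facts.append(                                                          # gen_relation
        f"the relation concept '{relation}' has the sequence "
        f"( the thing {agent} , and the thing {patient} ) as relation instance."
    )
    return facts


result = situation_facts(
    "discover", "s1", "p1", "f1",
    expresses={"discover": "finds"},
    reifies={"finds": "discovery"},
)
for fact in result:
    print(fact)
```

The domain and range rules are left out of the sketch; they would add "realises" facts for the agent and patient once the relation concept's domain and range are known.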
Processing for CNL and NL
• NL processing: rules implemented as standard CE inference engine in agents running against CE store.
• CNL processing: special purpose inference engine, interpreting CE logic in linguistic frames
• SAME CE logic in both
"Identical" NL and CNL parsers stylistically expressive CE NLP Reference English Grammar CNL Parser NL Parser lexicon Semantic Theory conceptual model basic CE or predicate logic or CE-in-Java stylistically expressive CE Better understanding of linguistics Increase stylistic expressibility of CE
Linguistic Frame for semantics
    there is a linguistic frame named np3 that
      has 'a person' as example and
      defines the noun phrase NP_np3 and
      has the sequence ( the determiner DET_np3 , and the noun COMMON_np3 ) as syntactic pattern and
      is predicated on the thing X and
      has the statement that ( the noun COMMON_np3 expresses the entity concept EC_np3 ) as preconditions and
      has the statement that ( the thing X realises the entity concept EC_np3 ) and ( the noun phrase NP_np3 stands for the thing X ) as semantic statement
    the word |a| belongs to the linguistic category 'determiner'.
    the word |person| is a noun.
    the word |person| expresses the entity concept person.
Diagram labels: syntax (noun phrase; determiner "a"; noun "person"); semantics (lambda variable: v(X), X=COMMON_np3,...; person(COMMON_np3)); Linguistic Model; Analyst's Conceptual Model.
We want exactly the same logic here as in the real NL processing (cf earlier slide on Logic of Entities)
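To make the parts of the np3 frame concrete, the sketch below represents a frame as a small data structure and applies it to the example 'a person'. The class LinguisticFrame, the function apply_frame and the identifiers np_1 and x1 are hypothetical; the real CNL parser interprets CE logic rather than Python.

```python
from dataclasses import dataclass


@dataclass
class LinguisticFrame:
    name: str
    example: str
    pattern: tuple  # sequence of linguistic categories, e.g. ("determiner", "noun")


def apply_frame(frame, words, categories, expresses, np_id, thing):
    """Return the frame's semantic statements, or [] if it does not apply.

    categories: {word: linguistic category}
    expresses:  {noun: entity concept}
    """
    # Syntactic pattern: each word must belong to the expected category.
    if len(words) != len(frame.pattern):
        return []
    if any(categories.get(w) != c for w, c in zip(words, frame.pattern)):
        return []
    # Precondition: the noun expresses some entity concept.
    noun = words[frame.pattern.index("noun")]
    concept = expresses.get(noun)
    if concept is None:
        return []
    # Semantic statement, instantiated for this noun phrase and thing.
    return [
        f"the thing {thing} realises the entity concept '{concept}'.",
        f"the noun phrase {np_id} stands for the thing {thing}.",
    ]


np3 = LinguisticFrame("np3", "a person", ("determiner", "noun"))
categories = {"a": "determiner", "person": "noun"}
expresses = {"person": "person"}
statements = apply_frame(np3, ["a", "person"], categories, expresses, "np_1", "x1")
for statement in statements:
    print(statement)
```

The "realises" statement produced here is the same conclusion that nn_cat_ent_1 reaches in the NL pipeline, which is the point of sharing the logic between the two parsers.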
CNL semantic processing
The result is the set of CE facts representing the sentence. CE facts are passed upwards from box to box.
Diagram, for "the person John is married to the person Jane":
    Linguistic Frame for Sentence: Rules for Sentence, truth box
    Linguistic Frame for VP: Rules for VP, truth box
    Linguistic Frame for NP: Rules for NP, truth box
    Word Category for Verb: Concept Lookup, truth box
    Linguistic Frame for NP: Rules for NP, truth box
    (NP processing not shown)
Analyst Helper
• Provides background linguistic information to NL and CNL parsers, specific to the ACM
• meta information on concepts
  • generated automatically from the ACM
• "expresses" relation between words and concepts
  • only analyst knows what the concepts mean
  • for each concept, ask analyst to say what words express it
  • can use Wordnet to make suggestions
• rules to determine ACM specific relations
• Sets of proper names
  • places, people, organisations,...
  • use (assumed unique) common name to identify
• feedback words that have not been recognised when analysing sentences
  • needs interaction between parser and analyst helper
• Basic user interface is needed
  • could be more elaborate if resources available
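The feedback of unrecognised words could be as simple as the following sketch, which reports each unknown word once in the CE form "the word |xxx| is an unrecognised word". The function name and the set-based lexicon are assumptions for illustration; in practice proper nouns such as East Rashid would first be checked against the sets of proper names.

```python
def unrecognised_word_facts(words, lexicon):
    """Return one CE sentence per word that is missing from the lexicon."""
    seen = set()
    facts = []
    for word in words:
        key = word.lower()  # lexicon lookup is case-insensitive in this sketch
        if key in lexicon or key in seen:
            continue
        seen.add(key)
        facts.append(f"the word |{word}| is an unrecognised word.")
    return facts


lexicon = {"the", "patrol", "in", "discovers", "facility"}
words = "the patrol in East Rashid discovers the facility".split()
for fact in unrecognised_word_facts(words, lexicon):
    print(fact)
```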
What do we need from the Analyst's helper?
• the word X expresses the concept Y.
• the entity concept EC reifies the relation concept RC
• rules, eg [ attack_perp_1 ] ?
From the conceptual model:
• there is an entity concept named EC.
• the relation concept RC has the entity concept EC as domain and the entity concept EC1 as range.
• the attribute concept RC has the entity concept EC as domain and the entity concept EC1 as range.
From Proper Names:
• there is a place named 1234 that has the proper name |East Rashid| as common name and has '32,33' as coordinates...
• Issues:
  • do we name concepts by the user-visible "concept term"? [a detail]
  • the "expresses" type information seems too simplistic
  • at some stage we need far more detailed background semantics for applying semantic constraints to parsing and for disambiguation
Flow of information for NLP
Diagram components: conceptual model; MetaModel generator (generator?); meta information; Analyst Helper; semantic rules; NL parser; "expresses"; Proper Names; Analyst; wordnet/etc; ITAnet; gazetteers etc; translate.
Feedback from the parser: the word |xxx| is an unrecognised word
Actually the CE parser uses the same resources