Multiple-Path Syntactic Analyzer 2003. 10. 8 Dae-Won Park
Introduction
• By S. KUNO and A. G. OETTINGER, 1962
• Method of predictive syntactic analysis
  • Obtains a single most probable description of the structure of an input sentence in a single left-to-right scan through the sentence
  • Prediction pool: similar to a pushdown store
• Difficulties of predictive analysis in handling complex sentence structures
  • Natural texts contain many syntactically ambiguous sentences
  • When a single-path analysis comes to a dead end, it is hard to determine which of the previous branch points caused the failure
  • Without an effective method for distinguishing paths, it is not possible to try different paths in a systematic, loop-free sequence
Introduction
• Extending the predictive approach
  • Adds effective provisions for multiple analyses of syntactically ambiguous sentences
• Variable-size prediction pool consisting of one or more subpools
  • Each subpool is a pushdown store (stack)
• Case: the (k-1)st word has been processed
  • The prediction pool contains a subpool for each sentence structure compatible with the first (k-1) words
  • The topmost prediction of each subpool is tested against all the homographs of the k-th word
• After the last word has been processed, the paths are traced back
Background: Dictionary and Syntactic Word Classes
• During processing, each word of an input sentence
  • is looked up in a dictionary
  • is coded for membership in all syntactic word classes
Background: Grammar Table
• Defines the grammatical matching function G of a language
• G is described in terms of
  • a set of predictions P
  • a set of syntactic word classes S
  • a set of syntactic role indicators R
• Prediction: stands for a certain syntactic structure recognized in the language
• $G(P_i, S_j) = \{\, [(p_{11}, p_{12}, \ldots, p_{1m}), (r_1)],\ [(p_{21}, p_{22}, \ldots, p_{2m}), (r_2)],\ \ldots,\ [(p_{q1}, p_{q2}, \ldots, p_{qm}), (r_q)] \,\}$
• $[(P_i, S_j), G(P_i, S_j)]$ in G: a rule of the grammar
• Example argument pair: $(P_i, S_j)$ = (SENTENCE, PRN), where PRN = personal pronoun in the nominative case
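One way to picture the grammar table is as a lookup keyed by the argument pair (prediction, word class). The sketch below is only illustrative: the `Subrule` type, the `GRAMMAR` dictionary, the `G` helper, and the two subrules listed for (SENTENCE, PRN) are assumptions made for this example, not entries copied from the actual table.

```python
from typing import Dict, List, NamedTuple, Tuple


class Subrule(NamedTuple):
    """One element of G(P_i, S_j): new predictions plus a role indicator."""
    predictions: Tuple[str, ...]  # predictions to push, topmost first
    role: str                     # syntactic role indicator for the word


# Hypothetical fragment of a grammar table. The real table is larger;
# these two subrules are invented for illustration only.
GRAMMAR: Dict[Tuple[str, str], List[Subrule]] = {
    ("SENTENCE", "PRN"): [
        Subrule(("PREDICATE", "PERIOD"), "SUBJECT"),
        Subrule(("ADJECTIVE CLAUSE", "PREDICATE", "PERIOD"), "SUBJECT"),
    ],
}


def G(prediction: str, word_class: str) -> List[Subrule]:
    """Return every subrule for the argument pair, or [] if there is none."""
    return GRAMMAR.get((prediction, word_class), [])


matches = G("SENTENCE", "PRN")   # two subrules in this toy fragment
no_match = G("SENTENCE", "BE1")  # pair absent from this fragment
```

An empty result corresponds to a dead end for that combination of prediction and homograph; each returned subrule corresponds to one continuation of the path.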
Analysis of a sentence (1/4)
• Procedure for analysis
• Sample sentence: THEY ARE FLYING PLANES
• Init
  • Store the prediction SENTENCE
• First word
  • SENTENCE is paired with the syntactic word class (PRN) of the first word (THEY)
  • Looking up the grammar table (Table 2) replaces the initial prediction SENTENCE with eight new sets of predictions, one per subpool
• Second word
  • Three syntactic word classes (BE1, BE2, BE3) are assigned to ARE
  • Each is coupled with the topmost prediction of each of the eight subpools, giving 24 argument pairs: (PREDICATE, BE1), (PREDICATE, BE2), ..., (ADJECTIVE CLAUSE, BE1), ..., (COMMA, BE1), ...
  • All subpools except those with PREDICATE on top are discarded
  • All subrules with PREDICATE have PREDICATE VERB (Table 3)
Analysis of a sentence (3/4)
• Three analyses of the sample sentence (THEY ARE FLYING PLANES)
  • Table 4
  • First: THEY refers to planes
  • Second: THEY refers to people
  • Third: not acceptable (like “The facts are smoking kills”)
    • semantically correct
    • ill-formed
    • acceptable form: The facts are: smoking kills
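The procedure for the sample sentence can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the original IBM 7090 program: the toy `LEXICON` and `GRAMMAR`, every word-class name other than PRN, and all role names are hypothetical, and the toy grammar is small enough that it produces only the first two readings. For brevity each path carries its role assignments forward instead of being traced back after the last word.

```python
from typing import Dict, List, Tuple

Role = Tuple[str, str, str]            # (word, word class, role indicator)
Path = Tuple[Tuple[str, ...], Tuple[Role, ...]]

# Hypothetical toy dictionary: each word is coded for all its word classes.
LEXICON: Dict[str, List[str]] = {
    "THEY": ["PRN"],
    "ARE": ["COP", "AUX"],             # invented stand-ins for the BE classes
    "FLYING": ["ADJ_ING", "VERB_ING"],
    "PLANES": ["NOUN"],
}

# Hypothetical toy grammar table:
# (prediction, word class) -> [(new predictions, topmost first; role)]
GRAMMAR: Dict[Tuple[str, str], List[Tuple[Tuple[str, ...], str]]] = {
    ("SENTENCE", "PRN"): [(("PREDICATE",), "SUBJECT")],
    ("PREDICATE", "COP"): [(("COMPLEMENT",), "PREDICATE VERB")],
    ("PREDICATE", "AUX"): [(("PARTICIPLE",), "AUXILIARY")],
    ("COMPLEMENT", "ADJ_ING"): [(("NOUN_HEAD",), "MODIFIER")],
    ("COMPLEMENT", "NOUN"): [((), "COMPLEMENT")],
    ("NOUN_HEAD", "NOUN"): [((), "COMPLEMENT")],
    ("PARTICIPLE", "VERB_ING"): [(("OBJECT",), "MAIN VERB")],
    ("OBJECT", "NOUN"): [((), "OBJECT")],
}


def analyze(words: List[str]) -> List[Tuple[Role, ...]]:
    """Return one role sequence per complete analysis of the sentence."""
    # One (subpool, roles) entry per surviving path; subpool top is index 0.
    paths: List[Path] = [(("SENTENCE",), ())]
    for word in words:
        next_paths: List[Path] = []
        for pool, roles in paths:
            if not pool:
                continue               # words remain but nothing is predicted
            top, rest = pool[0], pool[1:]
            # Test the topmost prediction against every homograph of the word.
            for word_class in LEXICON[word]:
                for new_preds, role in GRAMMAR.get((top, word_class), []):
                    next_paths.append(
                        (new_preds + rest, roles + ((word, word_class, role),))
                    )
        paths = next_paths             # paths with no matching rule are dropped
    # A path is a complete analysis only if all its predictions were fulfilled.
    return [roles for pool, roles in paths if not pool]


analyses = analyze("THEY ARE FLYING PLANES".split())
for number, roles in enumerate(analyses, start=1):
    print(number, roles)
```

In this toy setup the path in which ARE is a copula and FLYING modifies PLANES corresponds to the reading where THEY refers to planes, and the path in which ARE is an auxiliary and PLANES is the object corresponds to the reading where THEY refers to people.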
Program & Conclusion
• System: IBM 7090
  • Limited core memory (within 32,000 words)
• A path is determined in part by the choice of a single homograph $S_k$ for each word position k (k = 1, 2, ..., n), where n is the number of words in the sentence
• Total number of distinct selections: $N = \prod_{k=1}^{n} \lambda_k$, when the k-th position has $\lambda_k$ homographs $S_k^{j}$ ($j = 1, 2, \ldots, \lambda_k$)
• Running time
  • 12 minutes for the analysis of a 35-word sentence
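The number of distinct homograph selections is the product of the per-position homograph counts, which is why it grows so quickly with sentence length. The short sketch below checks that product against a brute-force enumeration. The single class for THEY and the three classes for ARE follow the slides; the class lists for FLYING and PLANES, and the helper name `count_selections`, are assumptions made only for this example.

```python
from itertools import product
from math import prod
from typing import Iterable


def count_selections(homograph_counts: Iterable[int]) -> int:
    """Number of ways to pick one homograph at every word position."""
    return prod(homograph_counts)


# Word classes per position for THEY ARE FLYING PLANES.
# The last two lists are hypothetical placeholders.
classes = [
    ["PRN"],
    ["BE1", "BE2", "BE3"],
    ["F1", "F2", "F3"],   # assumed: three classes for FLYING
    ["N1", "N2"],         # assumed: two classes for PLANES
]

n_selections = count_selections(len(c) for c in classes)

# Brute-force check: enumerate every selection explicitly.
enumerated = list(product(*classes))
print(n_selections, len(enumerated))
```

Not every selection leads to a complete analysis; the product is only an upper bound on the number of homograph sequences the analyzer may have to consider.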