Natural Language Processing 2010, Tokyo University of Technology, School of Computer Science, Hiroyuki Kameda
Today's topics • Tackling a research-level topic! • Machine learning (an introduction to Inductive Logic Programming)
Tackling a research-level topic! • Let's try to understand a paper presented at Pacling2009! • You can already follow material at this level. • Listen critically. • Take away your own ideas. (Once you have an idea, build your own NLP research on it!)
Unknown Word Acquisition by Reasoning Word Meanings Incrementally and Evolutionarily
Hiroyuki Kameda, Tokyo University of Technology
Chiaki Kubomura, Yamano College of Aesthetics
Overview • Research background • Basic ideas of knowledge acquisition • Demonstrations • Conclusions
Exchange of information and knowledge through language etc.: natural and smooth communication.
Dialogue on various topics calls for natural, smooth and flexible communication, and therefore highly upgraded NLP technology, in particular unknown word processing: new social events, new product releases, youngsters' words and professional slang all bring unknown words into the dialogue.
NLP System
Text: Dogs ran fast.
Syntactic structure:
S -+- NP --- N --- dogs
   +- VP --- V --- ran
   +- Adv -- fast
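As a side note, the mapping from a word list to such a tree can be written directly in Prolog. The following is a minimal sketch (not taken from the slides; the predicate and tree labels are illustrative) in the difference-list style used later in this deck:

sentence(A, D, s(NP, VP, Adv)) :-
    noun(A, B, NP),
    verb(B, C, VP),
    adverb(C, D, Adv).
noun([dogs|T], T, np(n(dogs))).
verb([ran|T], T, vp(v(ran))).
adverb([fast|T], T, adv(fast)).

% Example query:
% ?- sentence([dogs,ran,fast], [], Tree).
% Tree = s(np(n(dogs)), vp(v(ran)), adv(fast)).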
More concretely: [architecture diagram] an NLP Engine maps a sentence to an internal representation, consulting a Program (source codes: knowledge of the objects to be processed) and a Lexicon (domain knowledge); triggers from the engine drive processing-rule acquisition and UW (unknown word) acquisition.
Still more concretely: [architecture diagram] a batch NLP Engine (Prolog interpreter) maps a sentence to an internal representation using a rule-based grammar + α (target knowledge) and a Lexicon (domain knowledge); a failure trigger drives syntactic rule acquisition (by ILP) and UW acquisition.
Let's consider a grammar.
G1 = {Vn, Vt, s, P}, where
  Vn = {noun, verb, adverb, sentence},
  Vt = {dogs, run, fast},
  s = sentence,
  P = { sentence → noun + verb + adverb,
        noun → dogs, verb → run, adverb → fast }.
○ dogs run fast.
× cars run fast.  (Cannot unify: cars <=!=> noun)
Our ideas • Processing modes • Processing strategies
Processing Modes: [table comparing Mode-1 and Mode-2 on the examples: ○ dogs run fast. × cars run fast.]
Adopted Processing Strategies • Parse a sentence in Mode-1 first. • If parsing fails, switch the processing mode from Mode-1 to Mode-2 (as sketched below).
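A minimal sketch of this strategy as a Prolog driver (parse_mode1/2, parse_mode2/2 and mode/1 are hypothetical names, not from the paper):

:- dynamic mode/1.
mode(1).                        % start in Mode-1

parse(Sentence, Result) :-      % 1) try Mode-1 first
    parse_mode1(Sentence, Result), !.
parse(Sentence, Result) :-      % 2) on failure, switch to Mode-2 and retry
    retractall(mode(_)),
    assert(mode(2)),
    parse_mode2(Sentence, Result).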
Grammar G1 in Prolog
Syntactic rule:
  sentence(A,D) :- noun(A,B), verb(B,C), adverb(C,D).
Lexicon:
  noun([dogs|T],T).
  verb([run|T],T).
  adverb([fast|T],T).
(G1 = {Vn, Vt, s, P}, where Vn = {noun, verb, adverb, sentence}, Vt = {dogs, run, fast}, s = sentence, P = {sentence → noun + verb + adverb, noun → dogs, verb → run, adverb → fast}.)
New processing rule (for unknown words):
  noun([A|T],T) :- write('Unknown word found!').
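With the rules above loaded into a Prolog interpreter, the expected behaviour is roughly as follows (a sketch, not output reproduced from the paper):

?- sentence([dogs,run,fast], []).
true.

?- sentence([cars,run,fast], []).
false.    % Mode-1: 'cars' does not unify with any lexical entry

% After adding the new processing rule noun([A|T],T) :- write('Unknown word found!'):
?- sentence([cars,run,fast], []).
Unknown word found!
true.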
References • Kameda, Sakurai and Kubomura: ACAI'99 Machine Learning and Applications, Proceedings of Workshop W01: Machine Learning in Human Language Technology, pp. 62-67 (1999). • Kameda and Kubomura: Proc. of Pacling2001, pp. 146-152 (2001).
Example sentence
Tom broke the cup with the hammer. (Okada 1991)
As a word list: tom broke the cup with the hammer
Grammatical settings
G2 = <Vn, Vt, P, s>
Vn = { s, np, vp, prpn, v, prp, pr, det, n }
Vt = { broke, cup, hammer, the, tom, with }
P = {
  s -> np, vp.
  np -> prpn.
  vp -> v, np, prp.
  prp -> pr, np.
  np -> det, n.
  prpn -> tom.
  v -> broke.
  det -> the.
  n -> cup.
  pr -> with.
  n -> hammer.
}
s: start symbol
Prolog version of Grammar G2
/* Syntactic rules */
s(A, C, s(_np, _vp), a1(_act, _agt, _obj, _inst)) :-
    np(A, B, _np, sem(_agt)),
    vp(B, C, _vp, sem(_act, _agt, _obj, _inst)).
np(A, B, np(_prpn), sem(_)) :-
    prpn(A, B, _prpn, sem(_)).
vp(A, D, vp(_v, _np, _prp), sem(Act, Agt, Obj, Inst)) :-
    v(A, B, _v, sem(Act, Agt, Obj, Inst)),
    np(B, C, _np, sem(Obj)),
    prp(C, D, _prp, sem(Inst)).
vp(A, C, vp(_v, _np), sem(Act, Agt, Obj, Inst)) :-
    v(A, B, _v, sem(Act, Agt, Obj, Inst)),
    np(B, C, _np, sem(Obj)).
prp(A, C, prp(_pr, _np), sem(Z)) :-
    pr(A, B, _pr, sem(_)),
    np(B, C, _np, sem(Z)).
np(A, C, np(_det, _n), sem(W)) :-
    det(A, B, _det, sem(_)),
    n(B, C, _n, sem(W)).
/* Lexicon */
prpn([tom|T], T, prpn(tom), sem(human)).
v([broke|T], T, v1(broke), sem(change_in_shape, human, thing, tool)).
det([the|T], T, det(the), sem(_)).
n([cup|T], T, n(cup), sem(thing)).
pr([with|T], T, pr(with), sem(_)).
n([hammer|T], T, n(hammer), sem(tool)).
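Given these rules, a query for the example sentence would be expected to look roughly like this (a sketch; the exact variable bindings and term layout may differ on a real interpreter):

?- s([tom,broke,the,cup,with,the,hammer], [], Tree, Sem).
Tree = s(np(prpn(tom)),
         vp(v1(broke),
            np(det(the), n(cup)),
            prp(pr(with), np(det(the), n(hammer))))),
Sem  = a1(change_in_shape, human, thing, tool).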
Demonstration (Mode-1) • Input 1: [tom,broke,the,cup,with,the,hammer] • Input 2: [tom,broke,the,glass,with,the,hammer]
Problem • Parsing fails when unknown words occur in the sentence.
Unknown Word Processing • Switching processing modes (from Mode-1 to Mode-2). • When parsing fails, switch the processing mode from Mode-1 to Mode-2. • Execute Prolog's assert predicate to change the mode (a sketch follows below).
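A hedged sketch of what this could look like; mode/1, known_noun/1 and the catch-all noun clause are illustrative, the slides only state that Prolog's assert is used:

:- dynamic mode/1, known_noun/1.
mode(1).

switch_to_mode2 :-
    retract(mode(1)),
    assert(mode(2)).

% Catch-all lexical rule that is active only in Mode-2:
% it accepts the unknown word and records a tentative lexicon entry.
noun([W|T], T) :-
    mode(2),
    \+ known_noun(W),
    assert(known_noun(W)),
    write('Unknown word acquired as noun: '), write(W), nl.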
Demonstration (Mode-2) • Input: • P1: [tom,broke,the,cup,with,the,hammer] • P2: [tom,broke,the,glass,with,the,hammer] • P3: [tom,broke,the,glass,with,the,stone] • P4: [tom,vvv,the,glass,with,the,hammer]
Problem • Learning is sometimes imperfect. • The learning order influences the learning results. • Solution: the influence of the learning order is dealt with by introducing a function of evolutionary learning.
More Explanation • Ideally, all information about an unknown word should be guessed when the word is registered in the lexicon; in practice, spelling and POS are guessed, but not pronunciation (imperfect knowledge). • If the pronunciation can be guessed later, that information is added to the lexicon. → Evolutionary learning!
Solution • Setting (some knowledge may be revised, but some must not): • A priori knowledge (initial knowledge): must not change. • A posteriori knowledge (acquired knowledge): must not change if perfect; may change if imperfect. (A sketch of this setting follows below.)
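One way to encode this setting in Prolog is to tag lexicon entries with their status and allow revision only of imperfect ones; a minimal sketch with illustrative predicate names (lex/3, acquire/2, revise/3):

:- dynamic lex/3.

% lex(Word, POS, Status): Status is perfect (must not change) or imperfect.
lex(dogs, noun, perfect).            % a priori (initial) knowledge

acquire(Word, POS) :-                % acquired knowledge enters as imperfect
    assert(lex(Word, POS, imperfect)).

revise(Word, NewPOS, NewStatus) :-   % evolutionary learning: only imperfect
    retract(lex(Word, _, imperfect)),%  entries may be revised later
    assert(lex(Word, NewPOS, NewStatus)).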
Demonstration of Final version • Input: • P4: [tom,vvv,the,glass,with,the,hammer] • P2: [tom,broke,the,glass,with,the,hammer] • P3: [tom,broke,the,glass,with,the,stone]
Conclusions • Research background • Basic ideas of knowledge acquisition • Some models • Information processing model • Unknown word acquisition model • Modes and strategies • Demonstrations
Future Work • Application to more real-world domains • Therapeutic robots • A robot for schizophrenia rehabilitation
Machine Learning • An introduction to ILP (Inductive Logic Programming)