Natural Language Processing 2010, Tokyo University of Technology, School of Computer Science, Hiroyuki Kameda
Today's topics • Tackling a research-level topic! • Machine learning (an introduction to Inductive Logic Programming)
Tackling a research-level topic! • Let's try to understand a paper presented at Pacling2009! • You can already follow material at this level. • Listen critically. • Take away your own ideas. (Once you have an idea, build your own NLP research on it!)
Unknown Word Acquisition by Reasoning Word Meanings Incrementally and Evolutionarily
Hiroyuki Kameda, Tokyo University of Technology
Chiaki Kubomura, Yamano College of Aesthetics
Overview • Research background • Basic ideas of knowledge acquisition • Demonstrations • Conclusions
Exchange of information and knowledge through language etc.: natural and smooth communication.
Dialogue on various topics calls for natural, smooth and flexible communication, and therefore highly upgraded NLP technology, in particular unknown word processing: new social events, new product releases, youngsters' words and professional slang all bring unknown words into the dialogue.
NLP System
Text: Dogs ran fast.
Syntactic structure:
S -+- NP --- N --- dogs
   +- VP --- V --- ran
   +- Adv -- fast
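As a side note, the mapping from a word list to such a tree can be written directly in Prolog. The following is a minimal sketch (not taken from the slides; the predicate and tree labels are illustrative) in the difference-list style used later in this deck:

sentence(A, D, s(NP, VP, Adv)) :-
    noun(A, B, NP),
    verb(B, C, VP),
    adverb(C, D, Adv).
noun([dogs|T], T, np(n(dogs))).
verb([ran|T], T, vp(v(ran))).
adverb([fast|T], T, adv(fast)).

% Example query:
% ?- sentence([dogs,ran,fast], [], Tree).
% Tree = s(np(n(dogs)), vp(v(ran)), adv(fast)).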
More concretely: [architecture diagram] an NLP Engine maps a sentence to an internal representation, consulting a Program (source codes: knowledge of the objects to be processed) and a Lexicon (domain knowledge); triggers from the engine drive processing-rule acquisition and UW (unknown word) acquisition.
Still more concretely: [architecture diagram] a batch NLP Engine (Prolog interpreter) maps a sentence to an internal representation using a rule-based grammar + α (target knowledge) and a Lexicon (domain knowledge); a failure trigger drives syntactic rule acquisition (by ILP) and UW acquisition.
Let's consider a grammar.
G1 = {Vn, Vt, s, P}, where
  Vn = {noun, verb, adverb, sentence},
  Vt = {dogs, run, fast},
  s = sentence,
  P = { sentence → noun + verb + adverb,
        noun → dogs, verb → run, adverb → fast }.
○ dogs run fast.
× cars run fast.  (Cannot unify: cars <=!=> noun)
Our ideas • Processing modes • Processing strategies
Processing Modes: [table comparing Mode-1 and Mode-2 on the examples: ○ dogs run fast. × cars run fast.]
Adopted Processing Strategies • Parse a sentence in Mode-1 first. • If parsing fails, switch the processing mode from Mode-1 to Mode-2 (as sketched below).
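A minimal sketch of this strategy as a Prolog driver (parse_mode1/2, parse_mode2/2 and mode/1 are hypothetical names, not from the paper):

:- dynamic mode/1.
mode(1).                        % start in Mode-1

parse(Sentence, Result) :-      % 1) try Mode-1 first
    parse_mode1(Sentence, Result), !.
parse(Sentence, Result) :-      % 2) on failure, switch to Mode-2 and retry
    retractall(mode(_)),
    assert(mode(2)),
    parse_mode2(Sentence, Result).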
Grammar G1 in Prolog
Syntactic rule:
  sentence(A,D) :- noun(A,B), verb(B,C), adverb(C,D).
Lexicon:
  noun([dogs|T],T).
  verb([run|T],T).
  adverb([fast|T],T).
(G1 = {Vn, Vt, s, P}, where Vn = {noun, verb, adverb, sentence}, Vt = {dogs, run, fast}, s = sentence, P = {sentence → noun + verb + adverb, noun → dogs, verb → run, adverb → fast}.)
New processing rule (for unknown words):
  noun([A|T],T) :- write('Unknown word found!').
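With the rules above loaded into a Prolog interpreter, the expected behaviour is roughly as follows (a sketch, not output reproduced from the paper):

?- sentence([dogs,run,fast], []).
true.

?- sentence([cars,run,fast], []).
false.    % Mode-1: 'cars' does not unify with any lexical entry

% After adding the new processing rule noun([A|T],T) :- write('Unknown word found!'):
?- sentence([cars,run,fast], []).
Unknown word found!
true.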
References • Kameda, Sakurai and Kubomura: ACAI'99 Machine Learning and Applications, Proceedings of Workshop W01: Machine Learning in Human Language Technology, pp. 62-67 (1999). • Kameda and Kubomura: Proc. of Pacling2001, pp. 146-152 (2001).
Example sentence
Tom broke the cup with the hammer. (Okada 1991)
As a word list: tom broke the cup with the hammer
Grammatical settings
G2 = <Vn, Vt, P, s>
Vn = { s, np, vp, prpn, v, prp, pr, det, n }
Vt = { broke, cup, hammer, the, tom, with }
P = {
  s -> np, vp.
  np -> prpn.
  vp -> v, np, prp.
  prp -> pr, np.
  np -> det, n.
  prpn -> tom.
  v -> broke.
  det -> the.
  n -> cup.
  pr -> with.
  n -> hammer.
}
s: start symbol
Prolog version of Grammar G2
/* Syntactic rules */
s(A, C, s(_np, _vp), a1(_act, _agt, _obj, _inst)) :-
    np(A, B, _np, sem(_agt)),
    vp(B, C, _vp, sem(_act, _agt, _obj, _inst)).
np(A, B, np(_prpn), sem(_)) :-
    prpn(A, B, _prpn, sem(_)).
vp(A, D, vp(_v, _np, _prp), sem(Act, Agt, Obj, Inst)) :-
    v(A, B, _v, sem(Act, Agt, Obj, Inst)),
    np(B, C, _np, sem(Obj)),
    prp(C, D, _prp, sem(Inst)).
vp(A, C, vp(_v, _np), sem(Act, Agt, Obj, Inst)) :-
    v(A, B, _v, sem(Act, Agt, Obj, Inst)),
    np(B, C, _np, sem(Obj)).
prp(A, C, prp(_pr, _np), sem(Z)) :-
    pr(A, B, _pr, sem(_)),
    np(B, C, _np, sem(Z)).
np(A, C, np(_det, _n), sem(W)) :-
    det(A, B, _det, sem(_)),
    n(B, C, _n, sem(W)).
/* Lexicon */
prpn([tom|T], T, prpn(tom), sem(human)).
v([broke|T], T, v1(broke), sem(change_in_shape, human, thing, tool)).
det([the|T], T, det(the), sem(_)).
n([cup|T], T, n(cup), sem(thing)).
pr([with|T], T, pr(with), sem(_)).
n([hammer|T], T, n(hammer), sem(tool)).
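Given these rules, a query for the example sentence would be expected to look roughly like this (a sketch; the exact variable bindings and term layout may differ on a real interpreter):

?- s([tom,broke,the,cup,with,the,hammer], [], Tree, Sem).
Tree = s(np(prpn(tom)),
         vp(v1(broke),
            np(det(the), n(cup)),
            prp(pr(with), np(det(the), n(hammer))))),
Sem  = a1(change_in_shape, human, thing, tool).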
Demonstration (Mode-1) • Input 1: [tom,broke,the,cup,with,the,hammer] • Input 2: [tom,broke,the,glass,with,the,hammer]
Problem • Parsing fails when unknown words occur in the sentence.
Unknown Word Processing • Switching processing modes (from Mode-1 to Mode-2). • When parsing fails, switch the processing mode from Mode-1 to Mode-2. • Execute Prolog's assert predicate to change the mode (a sketch follows below).
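A hedged sketch of what this could look like; mode/1, known_noun/1 and the catch-all noun clause are illustrative, the slides only state that Prolog's assert is used:

:- dynamic mode/1, known_noun/1.
mode(1).

switch_to_mode2 :-
    retract(mode(1)),
    assert(mode(2)).

% Catch-all lexical rule that is active only in Mode-2:
% it accepts the unknown word and records a tentative lexicon entry.
noun([W|T], T) :-
    mode(2),
    \+ known_noun(W),
    assert(known_noun(W)),
    write('Unknown word acquired as noun: '), write(W), nl.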
Demonstration (Mode-2) • Input: • P1: [tom,broke,the,cup,with,the,hammer] • P2: [tom,broke,the,glass,with,the,hammer] • P3: [tom,broke,the,glass,with,the,stone] • P4: [tom,vvv,the,glass,with,the,hammer]
Problem • Learning is sometimes imperfect. • The learning order influences the learning results. • Solution: the influence of the learning order is dealt with by introducing a function of evolutionary learning.
More Explanation • Ideally, all information about an unknown word should be guessed when the word is registered in the lexicon; in practice, spelling and POS are guessed, but not pronunciation (imperfect knowledge). • If the pronunciation can be guessed later, that information is added to the lexicon. → Evolutionary learning!
Solution • Setting (some knowledge may be revised, but some must not): • A priori knowledge (initial knowledge): must not change. • A posteriori knowledge (acquired knowledge): must not change if perfect; may change if imperfect. (A sketch of this setting follows below.)
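One way to encode this setting in Prolog is to tag lexicon entries with their status and allow revision only of imperfect ones; a minimal sketch with illustrative predicate names (lex/3, acquire/2, revise/3):

:- dynamic lex/3.

% lex(Word, POS, Status): Status is perfect (must not change) or imperfect.
lex(dogs, noun, perfect).            % a priori (initial) knowledge

acquire(Word, POS) :-                % acquired knowledge enters as imperfect
    assert(lex(Word, POS, imperfect)).

revise(Word, NewPOS, NewStatus) :-   % evolutionary learning: only imperfect
    retract(lex(Word, _, imperfect)),%  entries may be revised later
    assert(lex(Word, NewPOS, NewStatus)).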
Demonstration of Final version • Input: • P4: [tom,vvv,the,glass,with,the,hammer] • P2: [tom,broke,the,glass,with,the,hammer] • P3: [tom,broke,the,glass,with,the,stone]
Conclusions • Research background • Basic ideas of knowledge acquisition • Some models • Information processing model • Unknown word acquisition model • Modes and strategies • Demonstrations
Future Work • Application to more real-world domains • Therapeutic robots • A robot for schizophrenia rehabilitation
Machine Learning • An introduction to ILP (Inductive Logic Programming)