
FLOW: A First-Language-Oriented Writing Assistant System

FLOW: A First-Language-Oriented Writing Assistant System. Mei-Hua Chen*, Shih-Ting Huang+, Hung-Ting Hsieh*, Ting-Hui Kao+, Jason S. Chang+ * Institute of Information Systems and Applications + Department of Computer Science


Presentation Transcript


  1. FLOW: A First-Language-Oriented Writing Assistant System Mei-Hua Chen*, Shih-Ting Huang+, Hung-Ting Hsieh*, Ting-Hui Kao+, Jason S. Chang+ * Institute of Information Systems and Applications + Department of Computer Science National Tsing Hua University HsinChu, Taiwan, R.O.C. 30013 ACL 2012

  2. Features • First-Language-Oriented • Translations • Paraphrases • N-grams (N=5)

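  3. Introduction • Composing stage: the writer types "We propose a method to 解決問題" (first-language input meaning "solve the problem"); suggestions: "solve the problem", "tackle the problem" • Revising stage: "We propose a method to solve the problem"; for 盡力 (meaning "do one's best"), suggestions: "try our best", "do our best"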

  4. Translation-based N-gram Prediction • Input: {e1, e2, …, em, f1, f2, …, fn} • 1. Predict the possible translations from bilingual phrase alignments (Och and Ney, 2003) • 2. Disambiguate (correct alignment errors) • Ex.: "...on ways to identify tackle 洗錢"; aligned candidates for 洗錢: "money laundering", "money", "His forum entitled", "money laundry"

  5. Paraphrase Suggestion • Input: {e1, e2, …, ek} • Pivot-based method proposed by Bannard and Callison-Burch (2005)

  6. Experiment • Training data: Hong Kong Parallel Text (2,220,570 Chinese-English sentence pairs) • Test data: 10 Chinese sentences • Two students translated the Chinese sentences into English using FLOW

  7. Results • Paraphrase suggestion performs well • N-gram prediction tends to produce shorter phrases

  8. Exploration of Term Dependence in Sentence Retrieval Keke Cai, Jiajun Bu, Chun Chen, Kangmiao Liu College of Computer Science, Zhejiang University Hangzhou, 310027, China ACL 2007

  9. Sentence Retrieval • Sentences contain limited information • Applications: document summarization, question answering, novelty detection

  10. Term Dependence • Query: {Everest, highest, mountain} • Q = {TS1, TS2, …, TSn} • Term combinations: {Everest highest, highest mountain, Everest mountain} • Each combination is further evaluated in each retrieved sentence • Ex.: Everest is the highest mountain
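
The term combinations listed on this slide are all unordered pairs of query terms. A minimal sketch of that enumeration follows; the function name `term_pairs` is hypothetical, and treating "term combinations" as pairs only is an assumption taken from the slide's example, not a statement of the authors' full method.

```python
from itertools import combinations


def term_pairs(query_terms):
    """Return every unordered pair of query terms, preserving input order."""
    return list(combinations(query_terms, 2))


query = ["Everest", "highest", "mountain"]
pairs = term_pairs(query)

for left, right in pairs:
    print(left, right)
```

Each pair can then be looked up in a retrieved sentence to check how closely its two terms are related there.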

  11. MINIPAR • A dependency parser • Ex.: Everest is the highest mountain • Term pairs: {Everest highest, highest mountain, Everest mountain} • Distance = (3 + 1 + 2) / 3 = 2
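
The distances 3, 1 and 2 on this slide can be reproduced by counting edges between words in a dependency tree. The sketch below is an illustration only: the `heads` map is a hand-built parse of the example sentence (an assumption, not actual MINIPAR output), and the helper names are hypothetical.

```python
def path_to_root(word, heads):
    """List the nodes from `word` up to the root of the dependency tree."""
    path = [word]
    while heads[path[-1]] is not None:
        path.append(heads[path[-1]])
    return path


def dependency_distance(a, b, heads):
    """Count the edges on the tree path between words `a` and `b`."""
    path_a = path_to_root(a, heads)
    path_b = path_to_root(b, heads)
    depth_in_b = {node: i for i, node in enumerate(path_b)}
    for steps_a, node in enumerate(path_a):
        if node in depth_in_b:  # first shared ancestor
            return steps_a + depth_in_b[node]
    raise ValueError("words are not in the same tree")


# Hand-built parse of "Everest is the highest mountain" (assumed, not MINIPAR output).
heads = {
    "is": None,
    "Everest": "is",
    "mountain": "is",
    "the": "mountain",
    "highest": "mountain",
}

pairs = [("Everest", "highest"), ("highest", "mountain"), ("Everest", "mountain")]
distances = [dependency_distance(a, b, heads) for a, b in pairs]
average = sum(distances) / len(distances)
print(distances, average)
```

Under this assumed parse the three pairs are 3, 1 and 2 edges apart, which matches the slide's average of 2.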

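  12. Association Strength • $AS(TS_i, S) = \alpha^{1/|TS_i|} \cdot \beta^{D(TS_i, S)}$ (form and symbol names inferred from the worked example on slide 13, where both bases are 0.5) • $|TS_i|$: size of $TS_i$ • $D(TS_i, S)$: dependency distance of $TS_i$ in sentence $S$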

  13. Discussion • Query: {Everest, highest, mountain} • TS1: {Everest, highest, mountain} • TS2: {highest, mountain} • AS(TS1, S1) = 0.5^(1/3) * 0.5^2 = 0.1984 • AS(TS2, S2) = 0.5^(1/2) * 0.5^1 = 0.35355 • The dependency distance tends to favor small term sets
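
The two values on this slide follow from one expression: a base raised to the reciprocal of the term-set size, multiplied by a base raised to the dependency distance. The sketch below only re-does that arithmetic. The function name and the parameter names `alpha` and `beta` are hypothetical, and both are set to 0.5 because that is the value the slide uses.

```python
def association_strength(term_set_size, dependency_distance, alpha=0.5, beta=0.5):
    """Recompute the slide's arithmetic: alpha^(1/size) * beta^distance."""
    return alpha ** (1 / term_set_size) * beta ** dependency_distance


as_ts1 = association_strength(term_set_size=3, dependency_distance=2)
as_ts2 = association_strength(term_set_size=2, dependency_distance=1)

print(round(as_ts1, 4))
print(round(as_ts2, 5))
```

With these inputs the smaller term set with the shorter distance gets the higher score, which is the effect the slide's last bullet points at.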

  14. Experiments • Test data: TREC Novelty Track 2003 and 2004 • Metric: average precision of each retrieval model

  15. Paraphrasing with Bilingual Parallel Corpora Colin Bannard, Chris Callison-Burch School of Informatics University of Edinburgh 2 Buccleuch Place Edinburgh, EH8 9LW ACL 2005

  16. Parallel Corpora • Monolingual • Bilingual (German-English)

  17. Bilingual Parallel Corpora • A much more commonly available resource than monolingual parallel corpora • Paraphrases in one language can be identified using a phrase in another language as a pivot • Here German is the pivot, used to find English paraphrases

  18. Paraphrases • Applications: multidocument summarization, machine translation, question answering

  19. Aligning phrase pairs • Statistical machine translation • Phrase alignment • Och and Ney (2003), GIZA++

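  20. Assigning probabilities • $\hat{e}_2 = \arg\max_{e_2 \neq e_1} p(e_2 \mid e_1)$, where $p(e_2 \mid e_1) = \sum_{f} p(f \mid e_1)\, p(e_2 \mid f)$ • $e_1$: original English phrase • $e_2$: candidate English phrase • $f$: foreign language phrase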
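
The pivot idea can be sketched in a few lines: the score of a candidate English phrase e2 for an original phrase e1 is the sum, over foreign phrases f, of p(f | e1) times p(e2 | f). This is a minimal sketch under stated assumptions, not the authors' implementation: the function name is hypothetical, and the phrase tables and probabilities are invented toy values rather than figures from the paper.

```python
from collections import defaultdict


def paraphrase_probabilities(e1, p_f_given_e, p_e_given_f):
    """Score each candidate e2 != e1 as sum over f of p(f | e1) * p(e2 | f)."""
    scores = defaultdict(float)
    for f, p_f in p_f_given_e.get(e1, {}).items():
        for e2, p_e2 in p_e_given_f.get(f, {}).items():
            if e2 != e1:  # a phrase is not its own paraphrase
                scores[e2] += p_f * p_e2
    return dict(scores)


# Toy phrase tables with invented probabilities (illustration only).
p_f_given_e = {"under control": {"unter kontrolle": 0.75, "im griff": 0.25}}
p_e_given_f = {
    "unter kontrolle": {"under control": 0.5, "in check": 0.5},
    "im griff": {"under control": 0.5, "in hand": 0.5},
}

scores = paraphrase_probabilities("under control", p_f_given_e, p_e_given_f)
best = max(scores, key=scores.get)
print(scores, best)
```

In practice the two probability tables would come from phrase alignments over a bilingual parallel corpus; here they are hard-coded so the arithmetic is easy to follow.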

  21. Experimental Design 1 • 46 English phrases (each occurring multiple times in the first 50,000 sentences) • Corpus: German-English section of the Europarl corpus (1,036,000 German-English sentence pairs) • Manually aligned • 289 evaluation sets (2 to 10 per set) • Judgment: two native English speakers judged meaning and grammar • Precision: 0.605

  22. Experimental Design 2 • Evaluated the accuracy of the top-ranked paraphrases • Conditions: 1. manual alignments; 2. automatic alignments; 3. automatic alignments and multiple corpora in different languages (French-English, Spanish-English, Italian-English; 4,000,000 sentence pairs); 4. re-ranking; 5. limited to the same sense

  23. Trigram language model • Ignore grammar

  24. Image Search by Concept Map Hao Xu† Jingdong Wang‡ Xian-Sheng Hua‡ Shipeng Li‡ †MOE-MS KeyLab of MCC, University of Science and Technology of China, Hefei, 230026, P. R. China ‡Microsoft Research Asia, Beijing 100190, P. R. China SIGIR 2010

  25. Image search schemes

  26. Flowchart

  27. Visual Instance Transformation • Text-based image search (top 50 results) • Affinity propagation (AP) clustering algorithm • Sort the obtained centers in descending order of their group sizes
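
The last step on this slide, ordering cluster centers by group size, can be sketched as follows. The clustering itself is not shown: the `labels` list is assumed to be the output of some clustering run, where each entry is the index of the center assigned to one of the retrieved images. The values and the function name are hypothetical.

```python
from collections import Counter


def rank_centers_by_group_size(labels):
    """Return center ids sorted by how many images each one represents, largest first."""
    counts = Counter(labels)
    return [center for center, _ in counts.most_common()]


# Assumed cluster assignments for six retrieved images (illustrative values).
labels = [0, 0, 3, 0, 3, 5]
ranked_centers = rank_centers_by_group_size(labels)
print(ranked_centers)
```

Centers that represent more of the retrieved images come first, so the most common appearance of the concept is offered before rarer ones.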

  28. Visual Instance • Ex.: snoopy (side view, front view)

  29. Spatial Intention Estimation • Position • Influence scope • Modeled with a 2D Gaussian distribution

  30. Layout Sensitive Relevance Evaluation • Sum up the relevance score for each concept • Appearance consistency: the count of common visual words • Spatial consistency, based on: the desired spatial distribution of concept k; the spatial distribution of visual instance v in the image

  31. Quantitative Search Performance

  32. Visual Results (1)

  33. Visual Results (2)

  34. User Study • Participants: 20 college students • To the question “have you ever had any image search intention concerning the concept layout?” • 20% of respondents replied “yes” and 50% of respondents replied “no, but probably in the future”.

  35. Modeling Higher-Order Term Dependencies in Information Retrieval using Query Hypergraphs Michael Bendersky, W. Bruce Croft Dept. of Computer Science Univ. of Massachusetts Amherst Amherst, MA SIGIR 2012

  36. Features • More accurate modeling of the dependencies between the query terms • Query concepts: n-grams, term proximities, noun phrases, named entities • Verbose natural language queries (grammatical complexity)

  37. Example • Query: “Provide information on the use of dogs worldwide for law enforcement purposes.” • Sequential dependence model • (dog, “law enforcement”) • (information, “law enforcement”)

  38. Hypergraph structure • Query: “international art crime”

  39. Evaluation
