
Paper & Pencil Interface. March 30, 2000. Artificial Intelligence Lab, Dept. of Computer Science, KAIST. Kim Jin-Hyung, Jkim@cs.kaist.ac.kr



  1. Paper & Pencil Interface. March 30, 2000. Artificial Intelligence Lab, Dept. of Computer Science, KAIST. Kim Jin-Hyung, Jkim@cs.kaist.ac.kr, http://ai.kaist.ac.kr/~jkim

  2. Table of Contents • Paper & Pen Interface • Pen Computers: Dream and Realization • Core Technologies for Pen Computing • Recognition Technology for Pen Computing • KAIST Approach • Killer Applications • Innovative Applications • KAIST Pen-based Arithmetic Tutor • Conclusion

  3. "Paper and Pencil" Interface • The most natural interface • (Probably) between computers and humans • The VDT, keyboard, and mouse combination is limiting • VDT syndrome: eyestrain, headaches, backaches, stiff necks, sore wrists • Mouse + keyboard vs. pen • Equation input • Drawing input • Large-character-set input • Hanja, Hangul(?)

  4. Pen Computer • Pen is the major input device • Optional keyboard • Various sizes and shapes • Aims at vertical markets • Not a general-purpose device • Some are already on the market • Not a big success yet • No killer applications yet

  5. Typical Pen Computer Specifications • Small and light enough to carry around • Mobile computing device, PDA • Writing, drawing, and menu selection with an (electronic) pen • Information exchange over wireless communication • Serves as a handy internet terminal • Multimedia processing • Text, drawings, images, voice • Inexpensive

  6. Long History of Research • Dreams • DynaBook - Alan Kay • Knowledge Navigator - Apple Co. • Tablet - U. of Illinois Undergrads • Projects • Pattern Information Processing (Japan) • Electronic Paper (ESPRIT) • IBM, Sony, etc.

  7. Samsung Penmaster 386

  8. Samsung Export Model 1994(?)

  9. PDA models

  10. NeoPoint • Read and send email • Access the Internet • Popular PDA programs • Contacts • Schedule • To-do • Sync data with PC • Wake-up call

  11. E-book: Rocket Book • Download book contents • Touch-screen based interaction • Make notes, annotations, bookmarks, underlining • No handwriting yet • Too expensive • No other usage • Needs Internet access

  12. E-book : everybook

  13. KETI (Korea Electronics Technology Institute) ebook • Internet access and information retrieval • Document authoring(?) • 8.2-inch touch screen • Weight: 1 kg • Battery life: 8 hours • Color screen, 4× the size of a PDA's • Storage: semiconductor memory • Joint development with small and medium enterprises

  14. Pen Computer Component Technologies • Hardware technologies • Flat panel display • Pen/digitizer • Microprocessor • Battery and power management • Storage devices • Packaging • Software technologies • Wireless communications • Operating system • Handwriting recognition • Utilities and accessories • Applications

  15. Recognition Problems for Pen Computing • Recognition targets (online) • Menu selection • Characters (Hangul, Roman letters, Hanja, digits, special symbols) • Drawings, gestures • English recognition systems have appeared on the market • Limited capability • Printed style only: C+ • Cursive style: C- • A major obstacle to active adoption • An active research topic in the pattern recognition community

  16. Handwritten Characters • A sequence of writing units • Temporally ordered • (Mostly) left-to-right

  17. Handwritten Character Recognition • Sources of difficulty • Static variability: personal style • Dynamic variability: shape deviation • Stroke connection: coarticulation effect • Problems to solve for free writing • Variability modeling • Simple models with high flexibility • Resolving coarticulation • The segmentation problem

  18. Handwritten Roman Styles

  19. Non-Roman Character Recognition • Strong demand for pen recognition in East Asian language regions • Japanese PDA products • Kana written in boxes • Carefully hand-printed Kanji • Roman letters written in boxes • Efforts in China • Efforts in Korea • Hangul recognition has reached a practical level • Researched at KAIST for many years

  20. Handwritten Hangul Styles • Printed: all letters separated, one syllable in a box • Cursive: ligatures within a syllable, one syllable in a box • Cursive: ligatures within a syllable, syllables may overlap spatially • Cursive: ligatures across syllables

  21. Approaches for Recognizer Development • Knowledge-based approach • Structural / feature-based • Heuristics / fuzzy • Encoding of expert knowledge • Data-driven approach • Decision-theoretic • Artificial neural networks • Hidden Markov models • Training procedure

  22. Recognizer vs. Recognizer Generator: Give a man a fish, and he'll eat for a day; teach him to fish, and he'll eat for a lifetime. - Lao Tzu

  23. KAIST Online Handwriting Recognition Research • Hangul recognizer (2 Korean patents) • Unconstrained writing styles, continuous writing on a blank page • 95+% recognition rate • Unconstrained cursive English word recognizer (US patent) • Boxed, run-on, and unconstrained cursive words • About 88% recognition rate • Hanja recognizer • Gesture recognizer • Applied research • Drawing editor • Arithmetic tutoring system, etc.

  24. KAIST's HMM-based Methodology for Handwritten Character Recognition • Variability is modeled with HMMs • Letters as character models • Stroke connections as ligature models • Treated as separate entities • A handwritten word is viewed as an alternating sequence of character and ligature models • Network of HMMs • Knowledge of the language is utilized • Hierarchy of networks • Component - character - word - sentence

  25. Hidden Markov Model • Stochastic model of a process with uncertain and incomplete information • Doubly stochastic process • Transition parameters model temporal variability • Output distributions model spatial variability • An efficient and effective modeling tool for • Sequences with temporal constraints • Spatial variability along the sequence • Real-world complex processes • Highly successful in speech recognition

  26. Why HMM for Handwritten Characters and Ligatures? • HMM is a doubly stochastic model • Spatial variation and temporal variation • Well-developed search algorithm • Viterbi algorithm • Well-developed training algorithm • Baum-Welch algorithm • A model is represented by either a single path of the HMM or the set of all paths of the HMM

  27. HMM: Three Problems • What is the probability of generating an observation sequence? • Calculating the model-input matching score, i.e., the likelihood: P[X = x1, x2, ..., xT | λ] = ? • What is the most probable transition sequence? Q* = argmax_Q P[Q, X | λ] • How do we estimate or optimize the parameters of an HMM? • The training problem
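As an illustration of the first (evaluation) problem, P[X | λ] can be computed with the standard forward algorithm. The sketch below uses a hypothetical two-state discrete HMM; the matrices are toy values for illustration, not parameters from these slides.

```python
import numpy as np

def forward_log_likelihood(A, B, pi, obs):
    """Evaluation problem: log P(X | lambda) via the forward algorithm.

    A  : (N, N) state-transition probabilities
    B  : (N, M) per-state output distribution over M symbols
    pi : (N,)   initial state distribution
    obs: list of observation symbol indices
    """
    alpha = pi * B[:, obs[0]]          # alpha_1(i) = pi_i * b_i(x_1)
    for x in obs[1:]:
        # alpha_t(j) = [sum_i alpha_{t-1}(i) * a_ij] * b_j(x_t)
        alpha = (alpha @ A) * B[:, x]
    return float(np.log(alpha.sum()))

# hypothetical 2-state, 2-symbol model
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.6, 0.4])
ll = forward_log_likelihood(A, B, pi, [0, 1, 0])
```

For long chain-code sequences the recursion is usually done in log space or with scaling to avoid underflow.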

  28. Model of Handwritten Word • A handwritten word is viewed as an alternating sequence of characters and ligatures • Handwritten words are modeled as a network of character and ligature models • Ligatures as separate entities

  29. HMM for Character and Ligature • Characters and ligatures are viewed as sequences of chain codes • Each character and ligature is modeled as a Hidden Markov Model • Produces a probability as its score • [Figure: example character model and ligature model]

  30. Character Model • Character • Atomic unit of handwriting • Consistency in shape • Small number of models • Character model • HMM-based • Models variability in time (length) and shape • Simple left-to-right model • Small number of states (< 10)
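The left-to-right structure described above can be sketched as a transition matrix: each state either stays (modeling length variability) or advances to the next state, with no backward jumps. The staying probability `p_stay` below is an illustrative assumption, not a value from the slides.

```python
import numpy as np

def left_to_right_transitions(n_states, p_stay=0.6):
    """Transition matrix for a simple left-to-right HMM.

    Each state loops on itself with probability p_stay (duration/length
    variability) or advances to the next state; the final state absorbs.
    """
    A = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        A[i, i] = p_stay          # stay in the same state
        A[i, i + 1] = 1.0 - p_stay  # advance to the next state
    A[-1, -1] = 1.0               # absorbing final state
    return A

A = left_to_right_transitions(5)  # a character model with 5 (< 10) states
```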

  31. Ligature Model • Ligature • Between-letter stroke pattern • pen-up or pen-down dragging • linear or slightly curved • Ligature Model • HMM-based • represent connecting pattern and variation • simple model structure (1 - 3 states)

  32. Input Data Encoding • 16-directional chain coding • 0-15 for pen-down movement • 16-31 for pen-up movement • 32 for a small dot • [Figure: 16-direction compass with pen-down codes (0-15) and pen-up codes (16-31); example trajectory (down, up, down) encoded as (11, 11, 13, 15, 2, 22, 22, 22, 1, 1, 1)]
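A minimal encoder following this scheme might look like the sketch below. The orientation convention (direction 0 along +x, angles increasing counterclockwise with y pointing up) is an assumption for illustration; the slide's figure may place direction 0 differently.

```python
import math

def chain_code(points, pen_down):
    """Encode a pen trajectory as 16-direction chain codes.

    Codes 0-15 for pen-down moves, 16-31 for pen-up moves, and 32 for
    a small dot (no movement), following the slide's scheme.
    points   : list of (x, y) pen positions
    pen_down : one boolean per segment between consecutive points
    """
    codes = []
    for (x0, y0), (x1, y1), down in zip(points, points[1:], pen_down):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            codes.append(32)                    # small dot
            continue
        angle = math.atan2(dy, dx)              # movement direction
        d = round(angle / (2 * math.pi / 16)) % 16  # quantize to 16 bins
        codes.append(d if down else d + 16)     # pen-up codes offset by 16
    return codes
```

For example, a pen-down move to the right yields code 0, while the same move with the pen lifted yields code 16.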

  33. Hidden Markov Modelling • We search for the maximum-probability model M*: M* = argmax_M P(M|X) = argmax_M P(X|M) P(M) • The HMM produces P(X|M), which is interpreted as the degree of matching between model M and the given data sequence X
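The decision rule can be written down directly in log space. The scorer functions and priors below are hypothetical stand-ins: in the real system P(X|M) would come from each HMM's forward score and P(M) from character or word frequencies.

```python
import math

def recognize(x, models, priors):
    """M* = argmax_M P(X|M) P(M), computed in log space.

    models: name -> function returning log P(X|M) (e.g. an HMM forward score)
    priors: name -> P(M), e.g. from a language model
    """
    return max(models, key=lambda m: models[m](x) + math.log(priors[m]))

# toy demo with hypothetical log-likelihood scorers for two candidates
models = {"a": lambda x: -1.0, "b": lambda x: -2.0}
priors = {"a": 0.3, "b": 0.7}
best = recognize([0, 1, 2], models, priors)
```

Here the prior favors "b", but "a" matches the data well enough to win overall.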

  34. Network Approach for Word Recognition • One HMM per word is unmanageable • HMMs are interconnected • Represents word construction rules from characters and ligatures (English) • Represents syllable construction rules from phonetic symbols and ligatures (Hangul) • Nodes represent the start and termination of a character's chain code sequence • Arcs represent character HMMs

  35. English Word Network • Assumed writing sequence • Left-to-right, in order of appearance • Delayed strokes are allowed • Circular network • Initial node for the start of a word • Final node for the end of a word • Circular path through ligature arcs • Based on character HMMs • Special treatment for delayed strokes • Ligature grouping • Based on starting and terminating locations

  36. Circular Network for English Words

  37. Hangul Syllable Network • Assumed writing sequence • First consonant, vowel, then last consonant if present • Layered network • Initial node for the start of a syllable • Final node for the end of a syllable • Based on symbol HMMs • Ligature grouping • Based on starting and terminating locations • Null transition if there is no last consonant

  38. BongNet: Hangul Syllable Model • [Figure: network from the initial node through consonant, ligature, vowel, ligature, and consonant arcs to the final node]

  39. Recognition Problem • A path corresponds to a Hangul syllable / English word • A complete sequence of states and arcs from the initial node to the final node • Recognition • Finding the maximum-probability path for a given input chain code sequence • Yields the optimal segmentation and character labels simultaneously
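Finding the maximum-probability path is the Viterbi algorithm mentioned earlier. Below is a log-space sketch for a single flat HMM with toy parameters; the network case adds bookkeeping over nodes and arcs, but the dynamic-programming recursion is the same, and the backpointers are what yield the segmentation.

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Most probable state path Q* = argmax_Q P(Q, X | lambda), in log space.

    Returns the state path (via backpointers) and its log probability.
    """
    N, T = len(pi), len(obs)
    logA = np.log(A + 1e-300)            # avoid log(0)
    logB = np.log(B + 1e-300)
    delta = np.log(pi + 1e-300) + logB[:, obs[0]]
    psi = np.zeros((T, N), dtype=int)    # backpointers
    for t in range(1, T):
        scores = delta[:, None] + logA   # scores[i, j]: best-so-far via i -> j
        psi[t] = scores.argmax(axis=0)   # best predecessor of each state j
        delta = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):        # trace backpointers
        path.append(int(psi[t][path[-1]]))
    return path[::-1], float(delta.max())

# same toy 2-state model as in the forward-algorithm sketch
A  = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.6, 0.4])
path, logp = viterbi(A, B, pi, [0, 1, 0])
```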

  40. Max Prob. Path Finding (Hangul)

  41. Max Prob. Path Finding (English)

  42. Model Training Procedure 1) Collect handwriting samples with correct labels 2) Preprocessing and encoding 3) Manual segmentation 4) Collection of each character and ligature; ligature grouping to reduce the number of models 5) Estimation of model parameters with the Baum-Welch algorithm
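Step 5 can be illustrated at its simplest: once manual segmentation (step 3) has grouped the chain-code sequences by class, an output distribution over the 33 symbols can be initialized by relative-frequency counting. This is a deliberate simplification for illustration; full Baum-Welch additionally re-estimates per-state output distributions and transition probabilities via expectation-maximization.

```python
from collections import Counter

def estimate_output_distribution(samples, n_symbols=33):
    """Relative-frequency estimate of an output distribution.

    samples   : chain-code sequences of one character/ligature class
    n_symbols : 33 symbols in the slide's coding (codes 0-32)
    """
    counts = Counter(code for seq in samples for code in seq)
    total = sum(counts.values())
    return [counts[c] / total for c in range(n_symbols)]

# two toy samples of one class: symbol 0 occurs in 3 of 5 codes
dist = estimate_output_distribution([[0, 0, 1], [0, 2]])
```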

  43. Cursive Hangul Recognition

  44. Hangul Recognition (Segmentation Result)

  45. Unconstrained English Word Recognition

  46. Unconstrained English Word Recognition (Segmentation Result)

  47. Advantages of the Network Approach • Segmentation and character labels are obtained simultaneously • No external segmentation needed • Segmentation is obtained from a global viewpoint • A hierarchy of HMMs can be constructed • Handwritten sentences with word models and inter-word gap models • Smooth integration with postprocessing / language models • A framework for a unified recognizer of multiple languages
