This talk by Zhang Xi-Wen (CSE, CUHK and HCI Lab., ISCAS), dated 2005.4.12, explores improving Chinese handwriting recognition by fusing it with speech recognition. It covers handwriting segmentation, character recognition, speech recognition, and text fusion using dynamic programming, and reports improved recognition results through information fusion.
Improving Chinese Handwriting Recognition by Fusing Speech Recognition • Zhang Xi-Wen • CSE, CUHK and HCI Lab., ISCAS • 2005.4.12
Outline • 1 Chinese handwriting recognition • 2 Chinese speech recognition • 3 Information fusion • 4 Experimental results
1 Handwriting recognition • Handwriting segmentation • Character recognition
1.1 Handwriting segmentation • Segmentation is more difficult for Chinese handwriting.
Character extraction using a histogram • Build a histogram of between-stroke gaps. • A dimidiate (bisecting) threshold on the histogram is used to extract lines of strokes. • A dimidiate threshold on the gap histogram of one line of strokes is used to extract characters.
Remaining problems • A single Chinese character may be mis-segmented into several characters. • Several Chinese characters may be mis-grouped into one character. • Segmentation errors inevitably lead to handwriting recognition errors.
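The gap-threshold idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how the threshold is computed, so the sketch places it in the middle of the widest jump between sorted gap values, and it treats each stroke as a hypothetical `(start, end)` extent along the writing direction. The names `bisecting_threshold` and `group_strokes` are invented for this example.

```python
def bisecting_threshold(gaps):
    """Split gaps into a 'small' and a 'large' class.

    Assumption (not from the slides): the threshold sits in the middle
    of the widest jump between consecutive sorted gap values.
    """
    ordered = sorted(gaps)
    if len(ordered) < 2:
        return float("inf")  # too few gaps to split into two classes
    jumps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    widest, i = max(jumps)
    if widest == 0:
        return float("inf")  # all gaps are equal, so keep one group
    return (ordered[i] + ordered[i + 1]) / 2.0


def group_strokes(intervals):
    """Group strokes, given as (start, end) extents, into characters."""
    strokes = sorted(intervals)
    if not strokes:
        return []
    # Gap between each stroke and the right edge of everything before it.
    gaps = []
    right = strokes[0][1]
    for start, end in strokes[1:]:
        gaps.append(max(0.0, start - right))
        right = max(right, end)
    threshold = bisecting_threshold(gaps)
    groups = [[strokes[0]]]
    for stroke, gap in zip(strokes[1:], gaps):
        if gap > threshold:
            groups.append([stroke])    # large gap: start a new character
        else:
            groups[-1].append(stroke)  # small gap: same character
    return groups


if __name__ == "__main__":
    line = [(0, 4), (5, 9), (20, 24), (25, 30), (42, 50)]
    for group in group_strokes(line):
        print(group)
```

The same routine could be applied twice, first to gaps across the writing direction to separate lines and then within each line to separate characters. A purely gap-based split like this one can still split or merge characters when gap sizes vary.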
1.2 Character recognition • Isolated character recognizer from HW • Many candidates
Figure 2. Handwriting recognition: the handwriting, the text recognized from the handwriting, and the ground-truth text.
2 Speech recognition • Chinese speech. • On-line, microphone. • Continuous speech recognizer from MS.
Figure 3. Speech recognition: the text recognized from the speech corresponding to the handwriting, and the ground-truth text.
3 Text fusion • An optimization problem • Dynamic Programming
3.1 Principles • The fused text should contain more semantic information. • Construct a text with the fewest characters and the most semantic information.
3.2 Four ways • Figure 4. Texts to be fused: the text recognized from the handwriting, and the text recognized from the speech corresponding to the handwriting.
3.3 Dynamic Programming • A directed graph. • Optimal paths.
(a) Text recognized from the handwriting. (b) Text recognized from the speech corresponding to the handwriting. (c) The optimal fused text corresponding to the optimal path. (d) The ground-truth text. Figure 6. Text fusion using DP.
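A minimal Python sketch of DP-based text fusion is given here for illustration. It is not the author's algorithm, whose graph and cost function the slides do not specify. The sketch assumes that the handwriting recognizer returns a candidate list per written character (best first) and that the speech recognizer returns a plain string. It aligns the two with a longest-common-subsequence style table, where a "match" means the spoken character appears among the handwriting candidates. The function name `fuse` and the sample data are hypothetical.

```python
def fuse(hw_candidates, speech_text):
    """Fuse handwriting candidates with a speech transcript.

    hw_candidates: one candidate list per written character, best first.
    speech_text:   string recognized from the corresponding speech.
    Returns one fused character per written character.
    """
    n, m = len(hw_candidates), len(speech_text)
    # score[i][j]: most agreements when aligning the first i written
    # characters with the first j spoken characters.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            agree = 1 if speech_text[j - 1] in hw_candidates[i - 1] else 0
            score[i][j] = max(score[i - 1][j - 1] + agree,
                              score[i - 1][j],
                              score[i][j - 1])
    # Trace the optimal path back through the table.
    fused = []
    i, j = n, m
    while i > 0:
        if j > 0:
            agree = speech_text[j - 1] in hw_candidates[i - 1]
            if agree and score[i][j] == score[i - 1][j - 1] + 1:
                fused.append(speech_text[j - 1])  # both sources agree
                i -= 1
                j -= 1
                continue
            if score[i][j] == score[i][j - 1]:
                j -= 1  # spoken character has no written counterpart
                continue
        fused.append(hw_candidates[i - 1][0])  # fall back to top candidate
        i -= 1
    return "".join(reversed(fused))


if __name__ == "__main__":
    # Hypothetical recognizer outputs for the phrase 手写识别.
    hw = [["手", "毛"], ["与", "写"], ["识", "只"], ["别"]]
    print(fuse(hw, "手写十别"))
```

In this toy input the top handwriting candidates alone read 手与识别 and the speech transcript reads 手写十别, while the fused path recovers 手写识别. A fuller version would weight the edges with recognizer confidences or a language model rather than simple agreement counts.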
3.4 A language model • Lexicon • Syntax • Semantics
Thank you very much for your criticism, comments and suggestions! • Email: xwzhang@cse.cuhk.edu.hk • Tel: 3163-4260