Structure-based Web Access Method for Ancient Chinese Characters
Xiaoqing Lu, Yingmin Tang, Zhi Tang, Yujun Gao, Jianguo Zhang
• Institute of Computer Science and Technology, Peking University, Beijing, 100871, China
• Beijing Founder Electronics Co., Ltd., Beijing, 100085, China
• Center for Chinese Font Design and Research, Beijing, 100871, China
• State Key Laboratory of Digital Publishing Technology (Peking University Founder Group Co., Ltd.), Beijing, 100871, China
{lvxiaoqing,tangyingmin,tangzhi}@pku.edu.cn, {gao_yujun,zjg}@founder.com
2013.11.19, Chongqing, China
Outline • Background • Formalization of relationships between ACCs and modern characters • Establishment of Super Large Font • ACC Database • Implementation and results
Background (1/3) • Ancient Chinese Characters (ACCs) • Important heritage of Chinese history • Date back at least 3,300 years • Development is not one-dimensional • Collection, management, and access on the Internet
Background (2/3) • Problem 1 • Involves very large quantities of modern characters
Background (3/3) • Problems 2 & 3 • Lack of software code • Traditional IMEs are not suitable for ACCs
Related work • 1993, Xusheng Ji • 1994, Ning Li • 1996, Fangzheng Chen • 2003, Zaixing Zhang • 2004, Zhiji Liu • 2005, Derming Juang • 2007, Yi Zhuang • 2008, James S. Kirk • 2008, Dan Chen • ... ...
2 Formalization of relationships between ACCs and modern characters • Contemporary encoded characters • Existing encoded Chinese characters • Marks for uncoded Chinese characters • ACCs • Corresponding relationships with contemporary encoded characters • No corresponding relationships with the contemporary encoded characters
2 Formalization of relationships between ACCs and modern characters • Two relations • Three types of ACCs • Recognized characters • Ambiguous characters • Unrecognized characters
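The three types can be illustrated with a short sketch. The slide does not give the exact definitions, so the rule below is an assumption: an ACC with exactly one corresponding contemporary character is treated as recognized, one with several candidates as ambiguous, and one with none as unrecognized. The names `AccType` and `classify_acc` are hypothetical.

```python
from enum import Enum


class AccType(Enum):
    RECOGNIZED = "recognized"
    AMBIGUOUS = "ambiguous"
    UNRECOGNIZED = "unrecognized"


def classify_acc(modern_candidates):
    """Classify an ACC by its candidate contemporary characters.

    Assumed rule (not taken from the slides):
    0 candidates -> unrecognized, 1 -> recognized, 2 or more -> ambiguous.
    """
    distinct = len(set(modern_candidates))
    if distinct == 0:
        return AccType.UNRECOGNIZED
    if distinct == 1:
        return AccType.RECOGNIZED
    return AccType.AMBIGUOUS


if __name__ == "__main__":
    print(classify_acc(["馬"]))        # one candidate
    print(classify_acc(["木", "本"]))  # several candidates
    print(classify_acc([]))            # no candidate
```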
3 Establishment of Super Large Font • Automatic generation of Chinese characters [27-30] • Rules regarding glyph structure decomposition • Redundant expressions of glyph structures are permitted • Multi-level radicals
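A minimal sketch of multi-level decomposition follows. It is not the authors' rule set: the table `DECOMPOSITIONS` and the function `components` are hypothetical, and the structure marks are the Unicode Ideographic Description Characters ⿰ (left-right) and ⿱ (top-bottom), used here only for illustration. Each character maps to a list of decompositions, so more than one (redundant) expression per glyph can be stored.

```python
# Hypothetical decomposition table: character -> list of (structure, parts).
# A list is used so that several alternative expressions may coexist.
DECOMPOSITIONS = {
    "想": [("⿱", ["相", "心"])],
    "相": [("⿰", ["木", "目"])],
    "林": [("⿰", ["木", "木"])],
}


def components(char, table=DECOMPOSITIONS):
    """Return every component reachable from `char` at any level."""
    found = set()
    for _structure, parts in table.get(char, []):
        for part in parts:
            found.add(part)
            # Recurse so that radicals of radicals are included.
            found |= components(part, table)
    return found


if __name__ == "__main__":
    print(sorted(components("想")))
```

Here 想 yields both its direct parts (相, 心) and the second-level parts of 相 (木, 目). The sketch assumes the table contains no cyclic decompositions.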
ACC Database (1/3) • Relation Schema
ACC Database (2/3) • Other relation schemas • Dynasty and Country (DC_RS) • Ancient Chinese Character Classification (ACCC_RS) • ACC Type (ACCT_RS) • Unicode and Glyph (UG_RS) • Radical and Component (RC_RS) • Ancient Image (AI_RS) • Contemporary Image (CI_RS)
ACC Database (3/3) • Relationships of the data tables
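The table relationships could be sketched roughly as below, using SQLite from Python. Only the schema names DC_RS, ACCT_RS, and UG_RS come from the slides; the main table name `ACC`, every column name, and the foreign keys are assumptions made for illustration, since the original diagram is not available in the text.

```python
import sqlite3


def build_sketch_db():
    """Create an in-memory database with a hypothetical subset of the schemas."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce references in SQLite
    conn.executescript(
        """
        CREATE TABLE DC_RS   (dc_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE ACCT_RS (type_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE UG_RS   (unicode_cp TEXT PRIMARY KEY, glyph TEXT NOT NULL);
        -- Hypothetical main table linking one ACC to the lookup tables.
        CREATE TABLE ACC (
            acc_id     INTEGER PRIMARY KEY,
            dc_id      INTEGER NOT NULL REFERENCES DC_RS(dc_id),
            type_id    INTEGER NOT NULL REFERENCES ACCT_RS(type_id),
            unicode_cp TEXT REFERENCES UG_RS(unicode_cp)  -- NULL if no counterpart
        );
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_sketch_db()
    conn.execute("INSERT INTO DC_RS VALUES (1, 'Shang')")
    conn.execute("INSERT INTO ACCT_RS VALUES (1, 'unrecognized')")
    conn.execute("INSERT INTO ACC VALUES (1, 1, 1, NULL)")
    print(conn.execute("SELECT COUNT(*) FROM ACC").fetchone()[0])
```

A nullable `unicode_cp` is one way to hold characters with no contemporary counterpart; the actual design may handle this differently.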
Implementation and results • Retrieval method
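The slide text does not describe the retrieval method itself. As a hedged illustration of what structure-based lookup can look like, the sketch below builds an inverted index from components to the characters that contain them at any level, then intersects the hits for a query. The functions `build_component_index` and `search` and the sample data are hypothetical and are not the authors' implementation.

```python
from collections import defaultdict


def build_component_index(decompositions):
    """Map each component to the set of characters containing it at any level.

    `decompositions` maps a character to its direct parts and is assumed
    to contain no cycles.
    """
    index = defaultdict(set)

    def walk(root, char):
        for part in decompositions.get(char, ()):
            index[part].add(root)
            walk(root, part)  # descend into parts of parts

    for char in decompositions:
        walk(char, char)
    return index


def search(index, *components):
    """Return the characters that contain every requested component."""
    hits = [index.get(c, set()) for c in components]
    return set.intersection(*hits) if hits else set()


if __name__ == "__main__":
    sample = {"想": ["相", "心"], "相": ["木", "目"], "林": ["木", "木"]}
    idx = build_component_index(sample)
    print(sorted(search(idx, "木")))
    print(sorted(search(idx, "木", "心")))
```

Querying by components rather than by pronunciation avoids the need for an IME entry for each ancient character, which fits the problem stated in the background slides.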
Thanks! Contact: Lvxiaoqing@pku.edu.cn