China Patent Information for Western Users
Huabing Liu, liuhuabing@cnipr.com
Intellectual Property Publishing House, SIPO
Main Topics
• Chinese patent information: a new challenge to us
• Language barriers faced by western users
• About machine translation
• Our efforts
Looking at Chinese Patents
• Essential Patent Information Resource
  • Third-largest country by patent filings in 2005
• Fast Increasing Rate
  • Highest growth rate in the world
• Improving Quality
  • Service applications are increasing sharply
  • Hotspots: automobiles, electronics, natural medicine
What Happens in China?
• 2007
  • Shanghai Stock Index increased more than twofold in 2007
  • 17th National Congress of the CPC
    • Attaches greater importance to IP protection and technology innovation
  • GDP is predicted to rank third in the world; it grew by 11.5% in the first half of 2007
• 2008
  • China's Patent Law is revised for the third time
  • Beijing 2008 Olympic Games
Your Needs for Chinese Patent Information
• Patent Filing and Litigation in China
  • The booming economy draws wider attention to IP and leads to more IP lawsuits
  • A patent search is necessary when you prepare to apply for a patent or litigate in China
• Patent Examination
  • More than 0.5 million patents and 0.9 million utility models (in Chinese only)
  • Annual increase of more than 0.1 million patents and 0.15 million utility models (in Chinese only)
• Technical/Competitor Watch
  • Domestic patent filings are growing fast
Everything is getting better. We should all be prepared for the challenge from China!
However, There Are Problems with Patent Data...
• Poor quality of English data
• Shortage of effective search tools
• Lack of machine translation
Condition of English Data
• What is in English
  • Bibliographic data (invention patent, utility model)
  • Abstract (invention patent only)
  • Legal status (invention patent, utility model, design)
• What is not in English
  • Abstract of utility model
  • Claims
  • Description
• What is missing or wrong
  • Applicant/Inventor: mistranslated or missing
  • Title: missing
  • Abstract: missing, poor quality
  • Priority item
  • Inefficient search
Poor Translation Sample
• Missed information
• Translation mistake
Patent Information Asymmetry
• Western user: EP, US, JP, WO, KR, others
• Chinese user: EP, US, JP, WO, CN, KR, others
How to Cross Language Barriers: Commercial Vendors' Efforts
• Improve the quality of English data
• Develop powerful patent search tools with more effective search entries
• Provide Chinese patent research services to western users
• Machine translation
  • Might be a "mission impossible", but it is up to us to make it possible.
MT: Which Approach Is More Intelligent?
• Classical approach: Rule-based MT
• Potential approach: Example-based MT, Statistical MT
• Mixed approach: Hybrid MT, MT+TM
• Human value-added approach: HAMT, MAHT
A Prototype of RBMT for C-E Patent Translation
[Diagram: from Patent Input to Output]
• Processing stages shown: Pre-processing, Format Conversion, Morphology Analysis, Part-of-speech Tagging, Phrase Analysis, Grammar Analysis, Syntax Analysis, Structure Selection, Terminology Selection, Conversion
• Resources shown: Rule & Knowledge Base; IPC-driven Dictionary (Dictionary 1, Dictionary 2, Dictionary 3, Dictionary 4, ...)
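One way to picture the "IPC-driven Dictionary" used for terminology selection is a lookup that prefers a domain dictionary chosen by the patent's IPC code and falls back to a general one. The sketch below is only an illustration under that assumption: the function name, the dictionary layout, and the three sample entries are invented here and do not come from the prototype itself.

```python
# Hypothetical sketch of IPC-driven terminology selection.
# All names and dictionary entries are illustrative assumptions.

# General-purpose fallback dictionary.
GENERAL_DICT = {"载体": "carrier"}

# Domain dictionaries keyed by IPC subclass (first four characters of the code).
IPC_DICTS = {
    "C12N": {"载体": "vector"},   # genetic engineering
    "B01J": {"载体": "support"},  # catalysis
}


def select_term(term: str, ipc_code: str) -> str:
    """Return the translation of `term`, preferring the dictionary for the IPC subclass."""
    domain_dict = IPC_DICTS.get(ipc_code[:4], {})
    if term in domain_dict:
        return domain_dict[term]
    # Fall back to the general dictionary; leave unknown terms untranslated.
    return GENERAL_DICT.get(term, term)


if __name__ == "__main__":
    for code in ("C12N15/63", "B01J23/00", "H04L12/28"):
        print(code, select_term("载体", code))
```

The same Chinese term receives a different English rendering depending on the technical field, which is the reason for keying dictionaries to the classification.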
Barriers to C-E MT
• Chinese Language: Ambiguous Grammar
  • Lack of tense, voice and part-of-speech identifiers
  • Variable methods of expression
  • Highly complex logical structure
• Problems of Morphology
  • Word segmentation
  • Part of speech
  • Terminology
• Easily Affected by Small Errors
  • Punctuation mistakes
  • Wrongly written characters
Even a minor error can result in poor translation.
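The word segmentation barrier can be shown with a textbook technique, dictionary-based maximum matching. This is a generic illustration, not the method used in the project; the tiny lexicon and the sample phrase are assumptions chosen to expose the ambiguity. Scanning left to right and right to left produces two different segmentations of the same string.

```python
# Textbook maximum-matching segmentation, shown only to illustrate ambiguity.
# The lexicon and the sample phrase are illustrative assumptions.

LEXICON = {"研究", "研究生", "生命", "起源"}


def forward_max_match(text, lexicon, max_len=3):
    """Greedy left-to-right segmentation: take the longest lexicon word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            # A single character is always accepted as a fallback.
            if length == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += length
                break
    return tokens


def backward_max_match(text, lexicon, max_len=3):
    """Greedy right-to-left segmentation: take the longest lexicon word ending at each position."""
    tokens = []
    end = len(text)
    while end > 0:
        for length in range(min(max_len, end), 0, -1):
            candidate = text[end - length:end]
            if length == 1 or candidate in lexicon:
                tokens.append(candidate)
                end -= length
                break
    tokens.reverse()
    return tokens


if __name__ == "__main__":
    phrase = "研究生命起源"  # "study the origin of life"
    print(forward_max_match(phrase, LEXICON))   # splits off "研究生" (graduate student)
    print(backward_max_match(phrase, LEXICON))  # keeps "研究" + "生命" + "起源"
```

Only the right-to-left result matches the intended reading here, and a wrong split at this stage would propagate into part-of-speech tagging and syntax analysis.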
Is MT Possible?
• Pros
  • Updated rules and knowledge
  • Richer terminology
  • Pre-processing
  • The quality is improving.
• Cons
  • Poor syntax analysis results
  • Insufficient number of terms
  • Ambiguous grammar
  • The machine is not smart enough.
• Conclusion
  • At this stage, we cannot overcome all the barriers in C-E MT, but human-aided MT can help us.
Blueprint of HAMT
[Flowchart: from Patent Input to Patent Output]
• Patent Input
• Translation Forecasting (decision: Yes / No)
• Pre-processing (human and machine): special element tagging, morphology & syntax analysis, term extraction
• MT
• Post-editing (manual)
• Patent Output
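The HAMT flow can be sketched as a small routing function. This reads the flattened flowchart under one assumption that the slide text does not confirm: a "No" from translation forecasting sends the text through pre-processing before MT, while a "Yes" sends it straight to MT. Every function below is a hypothetical toy stand-in, not a component of the actual system.

```python
# Hypothetical sketch of the HAMT flow; every function here is a toy stand-in.


def run_hamt(text, forecast, preprocess, machine_translate, post_edit):
    """Route one patent text through the workflow and record the steps taken."""
    steps = ["translation forecasting"]
    if not forecast(text):
        # Assumed reading of the "No" branch: prepare the text before MT.
        text = preprocess(text)
        steps.append("pre-processing")
    draft = machine_translate(text)
    steps.append("MT")
    final = post_edit(draft)
    steps.append("post-editing")
    return final, steps


# Toy stand-ins (assumptions, not real components).
def toy_forecast(text):
    # Pretend that short texts are safe to send straight to MT.
    return len(text) <= 20


def toy_preprocess(text):
    # Stand-in for special element tagging, analysis and term extraction.
    return text.strip()


def toy_mt(text):
    return f"<draft translation of {len(text)} characters>"


def toy_post_edit(draft):
    # Stand-in for manual post-editing.
    return draft.replace("draft", "edited")


if __name__ == "__main__":
    samples = ["一种汽车制动装置。", "本发明涉及一种" * 5 + "。"]
    for sample in samples:
        result, steps = run_hamt(sample, toy_forecast, toy_preprocess, toy_mt, toy_post_edit)
        print(steps, result)
```

Passing the stages in as functions keeps the human steps (pre-processing and post-editing) swappable with machine steps, which is where the cost and quality trade-off is decided.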
Cost vs Quality: Where Is the Balance?
The key is: how to set up the quality standards?
Our Efforts on Chinese Patent Information: BJ. Zhongxian Tuofang
• C-E Patent Machine Translation Project
• Aim
  • Make C-E patent MT useful
• Our work
  • Chinese linguists and NLP experts
  • Cooperation with the Chinese Academy of Sciences
  • Three years of R&D
  • 3.5M IPC-driven C-E dictionary
  • Large-scale syntax rule base tailored for Chinese patents
• Achievement
  • MT can demonstrate high readability levels across certain technology fields
• Next step
  • More terms from patents are urgently needed
  • To judge how far we have progressed, we need your suggestions and evaluation
Our Efforts on Chinese Patent Information
• Optimize English data
  • Complementing legal status and design data
  • Translation of missing items (such as utility model abstracts)
  • Error correction
• Provide an integrated Chinese patent web service in English
  • C-PAT search system (claims and specification are also searchable)
  • Patent translation (manual translation, MT, HAMT)
  • Patent research service by Chinese experts (specialized in Chinese patents)