Overcoming language barriers in patent information search

Overcoming language barriers in patent information search. Sep. 2010, Geneva. Daeshik Jeh, Director General, Information Policy Bureau, Korean Intellectual Property Office (KIPO).

Presentation Transcript


  1. Overcoming language barriers in patent information search. Sep. 2010, Geneva. Daeshik Jeh, Director General, Information Policy Bureau, Korean Intellectual Property Office (KIPO)

  2. Contents: 1. Introduction; 2. KIPO’s Activities; 3. Global Efforts; 4. Conclusion

  3. Background (1. Introduction): "Convertibility of information based on automatic translation or interpretation may shake up everything from employment and the organization of the office, to the role of literacy in daily life…" (Alvin Toffler, Power Shift)

  4. Background (1. Introduction): As the world continues to come together in forms such as the UN, WTO, WIPO, EU, BRICs, NAFTA, and APEC, it has become increasingly important to exchange, convert, and analyze information across languages. The EU secretariat has approximately 4,000 translators and interpreters on its payroll, who cost around 800 million euros in 2006. This amounts to 1% of its total budget and 40% of its administrative budget. In spite of all this effort, difficulties remain in multilingual translation, for example when a translation has to pass through a pivot language (Finnish → English → Hungarian). * Source: EU website

  5. Necessities – Patent examination (1. Introduction): Patent applications: a 26% increase from 2001 to 2007, with 1,460,536 applications in ’01, 1,491,494 in ’03, 1,701,179 in ’05, and 1,854,416 in ’07. Patent applications by non-residents: continuously increasing, with 613,379 (42%) in ’01, 621,294 (41.7%) in ’03, 725,506 (42.6%) in ’05, and 802,853 in ’07, which is 43.3% of all applications filed in 2007. The resident share was 58% in ’01, 58.3% in ’03, 57.4% in ’05, and 56.7% in ’07. * Source: WIPO website

  6. Necessities – Patent examination (1. Introduction): PCT applications: a 48% increase from 2001 to 2007, with 108,236 applications in ’01, 115,206 in ’03, 136,753 in ’05, and 159,953 in ’07. PCT applications in non-native English speaking countries: gradually increasing (percentage labels on the chart: 56.0%, 55.2%, 52.5%, 47.3%). Country groups used in the chart: English: US, EP, GB, CA, AU; non-English: JP, KR, CN, DE, RU. The PCT’s official languages now include English, French, German, Japanese, Russian, Chinese, Spanish, Arabic, Korean, and Portuguese. * Source: WIPO website

  7. Necessities – Patent examination (1. Introduction): Patent applications rose 26% and PCT applications rose 48% from 2001 to 2007. Patent applications by non-residents are continuously increasing, reaching 43.3% of all applications filed in 2007, and PCT applications in non-native English speaking countries are gradually increasing. The PCT’s official languages now include English, French, German, Japanese, Russian, Chinese, Spanish, Arabic, Korean, and Portuguese. Consequently, patent examiners now need to cite and refer to foreign documents as much as domestic ones.

  8. Necessities – R&D (1. Introduction): As technologies develop further, they become globalized beyond an enterprise’s nationality and the conventional features of a region. Patent information should be widely used in R&D activities, and the recent advent of “Open Innovation” has made it more necessary than ever to refer to foreign patent information. To improve R&D projects, prior art searches of patent databases should be made mandatory in the planning and evaluation of R&D projects.

  9. How to overcome language barriers (1. Introduction): Study the language of the target country. Merits: faster and high-quality prior art searches. Demerits: it takes a long time to learn and become fluent in a foreign language. Hire multilingual search personnel. Merits: more understandable translations and flexible management of human resources. Demerits: poor prior art searches because such personnel lack expert knowledge. Use a machine translation system. Merits: many translations in a short time. Demerits: low-quality translations, and a big initial investment is required. Machine translation: fast and cost-effective!

  10. Commercial Machine Translation Services (1. Introduction): Many commercial MT services, including Google’s, are available to the public. They offer diverse features such as web page translation and translation toolbars.

  11. Use of Commercial Machine Translation Services (1. Introduction): Merits: Since commercial MT services are being continuously extended to cover many languages, almost all patent documents in the world can be translated through them. There are many free services available to the public. As they cover general sentences, they can be applied to both patent and non-patent literature.

  12. Use of Commercial Machine Translation Services (1. Introduction): Demerits: Commercial MT services make it inconvenient to edit search queries during prior art searches. Moreover, search queries and results have to be copied and pasted one by one. Since commercial services are designed to support broad areas, they may be inefficient for a specialized area like patents. For these reasons, many IPOs, including KIPO, EPO, and JPO, have either customized commercial translation engines or developed their own in-house.

  13. Machine Translation Service Status of Some Major Countries in Asia (1. Introduction): Patent-specific MT services target non-native English speaking countries such as China, Japan, and Korea. KIPO and JPO have customized commercial translation engines, while SIPO’s was developed in-house.

  14. Contents: 1. Introduction; 2. KIPO’s Activities (2.1 MT Services; 2.2 Patent Information Search); 3. Global Efforts; 4. Conclusion

  15. Status of KIPO’s MT Services (2. KIPO’s Activities – MT Services): J2K (Japanese-to-Korean) Translation Service, launched in 2000. It translates patent literature (PL) and non-patent literature (NPL) written in Japanese for KIPO’s examiners, and PL written in Japanese for the general public.

  16. Status of KIPO’s MT Services (2. KIPO’s Activities – MT Services): K2E (Korean-to-English) Translation Service, launched in 2005 and offered through the K-PION service (37 IPOs). It translates Korean patent documents for examiners of foreign IPOs and of KIPO.

  17. Status of KIPO’s MT Services (2. KIPO’s Activities – MT Services): E2K (English-to-Korean) Translation Service, launched in 2008. It translates PL and NPL written in English for KIPO’s examiners, and PL written in English for the general public. It is shown alongside the K2E Translation Service (launched in 2005, offered through K-PION to 37 IPOs, translating Korean patent documents for examiners of foreign IPOs and of KIPO).

  18. Specialized Machine Translation Services for Patent Documents (2. KIPO’s Activities – MT Services): To improve the quality of machine translation engines, the following issues have been considered. Linguistic features: word order (Korean and Japanese share the same word order, Subject + Object + Verb phrase, while Chinese and English use Subject + Verb phrase + Object) and scripts (English, German, and French use characters derived from the Latin alphabet, while Korean, Japanese, and Chinese have their own characters). Digitization of patent documents: the accuracy of digitizing patent documents through OCR greatly influences the quality of machine translations.

  19. Specialized Machine Translation Services for Patent Documents (2. KIPO’s Activities – MT Services): To improve the quality of machine translation engines, the following issues have also been considered. Building of a patent-specific terminology dictionary. Use of markup documents such as XML (e.g., KIPO has published patent gazettes in XML since February 2005).

  20. Methods of improving translation quality (2. KIPO’s Activities – MT Services): Features of patent documents, using the Korean Patent Gazette as an example. Abstract: usually a single long sentence, and thus has a high possibility of error when machine translated. Specification: the brief explanation of the drawings is written in simple sentences, and the other parts in general descriptive sentences. Claims: have a hierarchical tree structure made of independent and dependent claims, and are written as noun phrases.

  21. Methods of improving translation quality (2. KIPO’s Activities – MT Services): Features of patent documents. In XML documents, the tags help users to identify the different sections described in the previous slide: name; abstract and summary; description of drawings; claims; and others. Different translation protocols are applied depending on the tag information of the patent gazette.

  22. Example – Korean Patent Gazette (2. KIPO’s Activities – MT Services): Components: XML of the Korean Patent Gazette, Application Server, K2E Translation Server. Steps: 1. Analyze the XML tag information. 2. Select the appropriate translation protocol. 3. Translate. Protocols with sample translations: REQ_HNM_KE: 오은영 → Oh Eun Young. REQ_KE: 본 발명은… → This invention… REQ_ABS_KE: 본 발명은… → This invention… REQ_DRDES_KE: 도1은 본 발명에.. → Drawing 1 is a… REQ_CLAIM_KE: 폐피혁을 용매에.. → Methodology of…
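
The three steps on this slide (analyze the tags, pick a protocol, translate) can be sketched roughly as follows. This is only an illustration: the protocol names come from the slide, but the XML tag names, the sample document layout, and the function name are assumptions, and the actual translation call is left out.

```python
import xml.etree.ElementTree as ET

# Protocol names are taken from the slide; the tag names are hypothetical.
TAG_TO_PROTOCOL = {
    "inventor-name": "REQ_HNM_KE",         # personal names
    "abstract": "REQ_ABS_KE",              # abstract
    "drawing-description": "REQ_DRDES_KE", # brief description of drawings
    "claim": "REQ_CLAIM_KE",               # claims
}
DEFAULT_PROTOCOL = "REQ_KE"                # everything else


def plan_translation(xml_text):
    """Return (protocol, source_text) pairs for each text-bearing leaf element."""
    root = ET.fromstring(xml_text)
    plan = []
    for element in root.iter():
        text = (element.text or "").strip()
        # Only leaf elements carry translatable text in this toy layout.
        if len(element) == 0 and text:
            protocol = TAG_TO_PROTOCOL.get(element.tag, DEFAULT_PROTOCOL)
            plan.append((protocol, text))
    return plan


# Hypothetical gazette fragment built from the sample strings on the slide.
SAMPLE_GAZETTE = """
<gazette>
  <inventor-name>오은영</inventor-name>
  <description>본 발명은…</description>
  <abstract>본 발명은…</abstract>
  <drawing-description>도1은 본 발명에..</drawing-description>
  <claims><claim>폐피혁을 용매에..</claim></claims>
</gazette>
"""

for protocol, text in plan_translation(SAMPLE_GAZETTE):
    print(protocol, text)
```

Each pair would then be sent to the translation engine with the matching protocol, so that, for example, a personal name is handled differently from a claim.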

  23. Applicability to Patent Documents Produced by Other IPOs (2. KIPO’s Activities – MT Services): Patterns can be distinguished in markup documents such as XML, and each item follows a consistent pattern.

  24. Patent Information Search using MT engines (2. KIPO’s Activities – Patent Information Search): To use MT engines for patent information search, the following issues have been considered. Target users and objectives of MT services: internal examiners or foreign examiners. Building of a database: from original documents (users → machine translator → database of original documents) or from machine-translated documents (machine translator → database of machine-translated documents, kept with the database of original documents). * In terms of cost-benefit analysis, the former is better when foreign documents are used infrequently, while the latter is better when they are used frequently. Formulation of search queries (e.g., operators, terminology dictionary). Screen layout / organization.
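
The cost-benefit note on this slide can be made concrete with a very small cost model. The sketch below is an assumption-laden illustration, not KIPO's analysis: the function names, the cost units, and the idea of charging one translation per viewed document are all hypothetical, and effects such as search latency are ignored.

```python
def on_demand_cost(views, cost_per_translation):
    """Cost when a document is machine-translated each time it is viewed."""
    return views * cost_per_translation


def pretranslated_cost(collection_size, cost_per_translation, storage_cost_per_doc):
    """Cost when the whole collection is translated once and stored."""
    return collection_size * (cost_per_translation + storage_cost_per_doc)


def cheaper_option(views, collection_size, cost_per_translation, storage_cost_per_doc):
    """Name the cheaper architecture under this simplified model."""
    demand = on_demand_cost(views, cost_per_translation)
    batch = pretranslated_cost(collection_size, cost_per_translation, storage_cost_per_doc)
    if demand <= batch:
        return "database of original documents with on-demand translation"
    return "database of machine-translated documents"


# Hypothetical numbers: 100,000 foreign documents, arbitrary cost units.
print(cheaper_option(views=1_000, collection_size=100_000,
                     cost_per_translation=2, storage_cost_per_doc=1))
print(cheaper_option(views=5_000_000, collection_size=100_000,
                     cost_per_translation=2, storage_cost_per_doc=1))
```

With few views the on-demand architecture wins, and with many views the pre-translated database wins, which matches the direction of the slide's footnote.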

  25. KOMPASS (Korean Multifunctional Patent Search System) (2. KIPO’s Activities – Patent Information Search): KOMPASS targets KIPO examiners and supports patent information searches in English and Japanese. It provides integrated search functions for Korean, English, and Japanese. The Korean integrated search function targets Korean and Japanese documents (for Japanese documents, a database built from machine-translated documents). The English integrated search function targets all kinds of data retrieved from English documents, and the search results can be translated into Korean. The Japanese integrated search function targets all kinds of data retrieved from Japanese documents, and the search results can be translated into Korean (only for patents and utility models).

  26. KOMPASS (Korean Multifunctional Patent Search System) (2. KIPO’s Activities – Patent Information Search): Earlier architecture (users → machine translator → database of original documents). Japanese gazettes were previously searchable through machine translation. As use by KIPO examiners increased rapidly, the search speed became slower.

  27. KOMPASS (Korean Multifunctional Patent Search System) (2. KIPO’s Activities – Patent Information Search): Current architecture (machine translator → database of machine-translated documents, kept with the database of original documents). In 2009, for faster search, all the Japanese gazettes were machine-translated and used to build a database. This has greatly improved convenience for KIPO examiners.

  28. KOMPASS (Korean Multifunctional Patent Search System) (2. KIPO’s Activities – Patent Information Search): Korean Search: Korean keyword search of Japanese documents (using the J2K database).

  29. K-PION (Korean Patent Information Online Network) (2. KIPO’s Activities – Patent Information Search): K-PION is a free search service that helps foreign examiners better understand Korean patent information (examinations, gazettes, etc.). It also supports an English keyword search service. Services: a service for retrieving Korean patent and utility model gazettes and examination information from original and machine-translated documents; an English keyword search service for KPAs; a service for Korean industrial designs and trademarks, including PCT-related documents; an English keyword search service for Korean patent and utility model gazettes. English keyword search flow: the foreign examiner inputs English keywords; the keywords are automatically translated into Korean and extended to Korean synonyms; the Korean gazettes are searched; the search results are translated into English.
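
The English keyword search flow on this slide (translate the keyword into Korean, extend it to Korean synonyms, search the Korean gazettes) can be sketched with toy data. Everything in the tables below is made up for illustration, including the document IDs and the dictionary entries; the final step of translating results back into English is omitted.

```python
# Toy data: every entry below is hypothetical.
EN_TO_KO = {"battery": "전지", "display": "표시장치"}
KO_SYNONYMS = {"전지": ["배터리", "축전지"], "표시장치": ["디스플레이"]}
GAZETTES = {
    "KR-A": "리튬 배터리 충전 방법",
    "KR-B": "디스플레이 패널 구동 회로",
    "KR-C": "전지 팩 냉각 구조",
}


def expand_query(english_keyword):
    """Translate an English keyword into Korean and add Korean synonyms."""
    korean = EN_TO_KO.get(english_keyword.lower())
    if korean is None:
        return []
    return [korean] + KO_SYNONYMS.get(korean, [])


def search_gazettes(english_keyword):
    """Return IDs of gazettes whose text contains any expanded Korean term."""
    terms = expand_query(english_keyword)
    return sorted(
        doc_id
        for doc_id, text in GAZETTES.items()
        if any(term in text for term in terms)
    )


print(expand_query("battery"))
print(search_gazettes("battery"))
```

The synonym step is what lets the single English keyword reach gazettes that use different Korean wording for the same concept.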

  30. Contents: 1. Introduction; 2. KIPO’s Activities; 3. Global Efforts (3.1 IP5 Foundation Project on Mutual Machine Translation; 3.2 Cross-Lingual Information Retrieval); 4. Conclusion

  31. IP5 Foundation Project on Mutual Machine Translation (3. Global Efforts): The IP5 offices will improve the quality of machine translation (MT) services and harmonize MT services among themselves. Improvement of the quality of MT: • Joint quality review of non-English to English MT by English speaking Offices • MT system upgrades based on the quality review results • Reduction of errors in original documents. Harmonization of MT services: • Harmonization of the contents of MT services. Regarding searches, this project will help each office to better understand the prior art documents of other offices and to use them in citations.

  32. WIPO’s CLIR (Cross-Lingual Information Retrieval) (3. Global Efforts): CLIR has been newly added to PATENTSCOPE, and the beta version is currently being tested by the public. When searching PCT and national application data, the keywords entered can be extended into other languages such as English, French, German, Japanese, and Spanish. It is linked to the Google translation service, so search results are available in all the languages that service supports. It covers over 1.7 million published international patent applications (PCT), and more than 3 million documents when regional and national collections are included.
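
The idea of extending a keyword into other languages before searching can be sketched as below. This is not PATENTSCOPE's implementation or query syntax: the translation table, the language codes, and the quoted OR query format are assumptions chosen only to show the principle.

```python
# Hypothetical translation table; a real CLIR system would rely on a much
# larger lexicon or on a machine translation service.
TRANSLATIONS = {
    "battery": {"fr": "batterie", "de": "Batterie", "ja": "電池", "es": "batería"},
}


def expand_keyword(keyword, languages=("fr", "de", "ja", "es")):
    """Return the keyword followed by its known translations, without duplicates."""
    variants = [keyword]
    for lang in languages:
        translated = TRANSLATIONS.get(keyword.lower(), {}).get(lang)
        if translated and translated not in variants:
            variants.append(translated)
    return variants


def build_query(keyword, languages=("fr", "de", "ja", "es")):
    """Join all language variants into one OR query (hypothetical syntax)."""
    return " OR ".join(f'"{variant}"' for variant in expand_keyword(keyword, languages))


print(build_query("battery"))
print(build_query("battery", languages=("ja",)))
```

A keyword with no entry in the table simply falls back to a single-language query.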

  33. Contents: 1. Introduction; 2. KIPO’s Activities; 3. Global Efforts; 4. Conclusion

  34. Conclusion (4. Conclusion): Considering the tremendous amount of global patent information, machine translation services will be the most practical and efficient way to search the patent information of other IPOs. There are many ways to implement a patent search system using an MT engine. In selecting a specific methodology, each IPO should consider the frequency of use, budget, and linguistic features. To improve the performance of MT and search systems, each IPO may consider options such as building a machine-translated database, building a patent-specific terminology dictionary, and using IT technologies such as XML. International cooperation among IPOs is very important for the improvement of MT quality. KIPO has done its utmost to overcome language barriers and enable non-Korean speakers to better access Korean patent information. KIPO will continue to collaborate with other IPOs in this regard.

  35. E-mail: daeshik@kipo.go.kr
