
OVERCOMING LANGUAGE BARRIERS IN PATENT INFORMATION SEARCHING – A COMMERCIAL PERSPECTIVE FROM THOMSON REUTERS

Rob Willows, Vice President - Patent Offices and Special Accounts, Thomson Reuters IP Solutions. September 2010.


Presentation Transcript


  1. OVERCOMING LANGUAGE BARRIERS IN PATENT INFORMATION SEARCHING – A COMMERCIAL PERSPECTIVE FROM THOMSON REUTERS Rob Willows Vice President - Patent Offices and Special Accounts Thomson Reuters IP Solutions September 2010

  2. PRESENTATION FRAMEWORK Thomson Reuters today The Language Challenge • The languages of patents Thomson Reuters Foundation approach – translation as a component of Derwent World Patents Index® (DWPI℠) value-added patent information • Translation of patents from 44 Authorities into 1 common language – English • Value added Abstracts, Titles, Keywords and Coding • Creation of the unique DWPI patent family

  3. PRESENTATION FRAMEWORK First introduction of technology = Machine Assisted Translation of Japanese patent full text Recent approaches • Human translation (dedicated resources) of source text • Bulk and “on the fly” Machine translation of source text • Local language interfaces Outlook for the future Lost in translation; the impact of errors in original data Commercial challenges Conclusions

  4. THOMSON REUTERS TODAY MARKETS DIVISION PROFESSIONAL DIVISION Sales & Trading Investment & Advisory Healthcare & Science Tax & Accounting Enterprise Media Legal IP SOLUTIONS Largest Provider of Intelligent Information: Thomson Reuters is the largest provider of intelligent information to business and professional customers in the world, generating total revenue of $13.0bn in 2009. True Global Presence: We operate in 300 cities in over 100 countries across the world. Publicly Traded: We hold ourselves accountable through compliance with Sarbanes-Oxley and a stringent code of business ethics. Strong Brand: Named #40 in the BusinessWeek 2009 ranking of the 100 Best Global Brands.

  5. THOMSON REUTERS IP SOLUTIONS Powering the Intellectual Property Lifecycle with the world’s most comprehensive resources… TRADEMARKS & BRAND MANAGEMENT TM PATENTS & SERVICES IP LAW INTELLIGENT INFORMATION ADVANCED TOOLS & ANALYTICS EXPERT IP SERVICES

  6. MARKET DYNAMIC: GLOBALIZATION IMPACT ON IP • The global nature of IP is affecting how organizations maintain competitive advantage in emerging growth markets • Increased Asian patent and trademark filings • China & Korea are increasingly a source of innovation, creating opportunities & risks • Patent & non-patent prior art research demands improved global coverage

  7. IMPORTANCE OF ASIAN PATENTS

  8. THE LANGUAGES OF PATENTS Asia & Middle East North America Europe Africa Australia & Oceania South America

  9. FOUNDATION APPROACH - DWPI Abstracts for all countries in English Abstracts written by analysts all using the same guidelines – provides consistency and removes legal jargon Abstracts based on entire patent specification, including drawings

  10. THE LANGUAGES OF DWPI North America Languages: English, French Patent authorities: CA, US

  11. THE LANGUAGES OF DWPI Europe Languages: Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Slovak, Spanish, Swedish Patent authorities: CZ, CS, DK, SE, NL, BE, GB, IL, CH, FI, FR, LU, DE, DD, AT, HU, IT, NO, PT, RO, SK, ES, EP, WO (RD, TP)

  12. THE LANGUAGES OF DWPI Africa Languages: Afrikaans, English Patent authorities: ZA

  13. THE LANGUAGES OF DWPI South America Languages: Portuguese, Spanish Patent authorities: BR, AR, MX

  14. THE LANGUAGES OF DWPI Australia & Oceania Languages: English, Maori Patent authorities: AU, NZ

  15. THE LANGUAGES OF DWPI Asia & Middle East Languages: Chinese (Mandarin), English, Hebrew, Hindi, Japanese, Korean, Russian Patent authorities: CN, TW, IN, IL, PH, SG, JP, KR, RU, SU, DD, WO

  16. THE LANGUAGES OF DWPI 44 sources 23 languages 7 major linguistic families

  17. THOMSON REUTERS ASIA PACIFIC COVERAGE Derwent World Patents Index® Thomson Innovation

  18. FIRST INTRODUCTION OF TECHNOLOGY – JP MACHINE-ASSISTED TRANSLATION Machine translation algorithms and dictionaries developed over decades Team of MAT analysts based in Japan

  19. JP MAT - WHAT IS IT? Human Machine Assisted Translations (MAT) of JP full text documents Human intervention, i.e. manual correction • Specific tagged fields (author abstract, claims, use, advantage, etc.) in the output are scanned for: (i) non-translated JP text (ii) failed translation • Manual enhancement consists of fixing errors/non-translations in records according to ranking and priority • Term registration, dictionary enhancement and addition of new rules take place
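The scan for non-translated JP text described on slide 19 can be illustrated with a minimal sketch. This is not the MAT team's actual tooling: the field names, the sample record, and the function `find_untranslated` are hypothetical, and the check simply looks for residual Hiragana, Katakana, or CJK ideographs in the English output of each tagged field.

```python
import re

# Hiragana, Katakana, and CJK Unified Ideographs: the common Japanese ranges.
# Assumption: any run of these characters in English output is untranslated text.
JAPANESE_RUN = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]+")


def find_untranslated(record):
    """Return {field: [residual Japanese runs]} for fields that still contain Japanese."""
    flagged = {}
    for field, text in record.items():
        runs = JAPANESE_RUN.findall(text)
        if runs:
            flagged[field] = runs
    return flagged


# Hypothetical machine-translated record with tagged fields.
record = {
    "abstract": "The antenna 送信 unit is connected to the receiver.",
    "claims": "A transmission device comprising several antennas.",
    "advantage": "Improves reception.",
}

flagged = find_untranslated(record)
print(flagged)
```

Fields returned by such a check could then be queued for manual correction; detecting a failed (as opposed to missing) translation would need a different signal and is not covered here.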

  20. JP MAT – TRANSLATION VARIANTS Multiple translations for a single term can exist • Occurs when differing meanings are possible • Terms separated by ¦ • Example: “..application of the MIMO transmission technique using several antennas and OFDM strong against multipass|multipath transmission is performed briskly conventionally..” All terms fully searchable, so improved recall
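Why keeping every variant improves recall can be shown with a small sketch. It assumes a simple whitespace tokenizer and treats both "¦" and "|" as variant separators, since the slide names one and its example uses the other; the function `index_terms` is hypothetical and does not represent the indexing used in any Thomson Reuters product.

```python
import re

# Assumption: both the broken bar and the vertical bar mark translation variants.
VARIANT_SEPARATORS = re.compile(r"[¦|]")


def index_terms(text):
    """Return the set of searchable terms, expanding each variant group into its members."""
    terms = set()
    for token in text.split():
        for variant in VARIANT_SEPARATORS.split(token):
            cleaned = variant.strip(".,;:\"'()").lower()
            if cleaned:
                terms.add(cleaned)
    return terms


sample = "OFDM strong against multipass|multipath transmission"
terms = index_terms(sample)
print(sorted(terms))
```

A query for either "multipass" or "multipath" matches the same record, so a searcher does not need to guess which rendering the translation engine preferred.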

  21. HOW JP MAT HAS BEEN INTEGRATED INTO IP SOLUTIONS PRODUCTS AND SERVICES Provides the source material for the value-added processing of JP patents into DWPI As searchable English language full text in the Asia patents collection on Thomson Innovation As data feed and Web service options for customers to integrate into their legacy in-house systems

  22. CURRENT APPROACHES Human translation (dedicated resources) of source text Machine translation of source text – bulk and on the fly Local language interfaces Seamless link between the DWPI value add record and translated and/or original language full text documents on our delivery platforms e.g. Thomson Innovation

  23. HUMAN TRANSLATION – CHINESE PATENTS Translated title, abstract and claims from January 2007 onwards for: Applications Utility models

  24. MACHINE TRANSLATION - KOREAN PATENTS Includes application, grants & utility models Coverage from January 2008 onwards MT translation of complete document

  25. TRANSLATION ON THE FLY Chinese French German Italian Japanese Korean Portuguese Russian Spanish

  26. LOCAL LANGUAGE INTERFACES Searching Japanese patent collections in Japanese Search local language data and global content in market-leading platform

  27. NON-PATENT LITERATURE Non-patent prior art research demands improved global coverage Index of journal literature of the sciences, published in Chinese. English-language bibliographic data and abstracts from 2002 1,200 journals in all areas of science, 2,000,000 records Coverage of Agricultural Sciences, Biology, Chemistry, Computer Science, Engineering, Geosciences, Management, Mathematics, Medicine

  28. OUTLOOK FOR THE FUTURE Enhanced machine translation of bulk data Query translation/search against original text Enhanced machine translation on the fly • Into English • From English into local language (already in place in Thomson Innovation) Monitor ongoing and future developments for implementation when viable

  29. IMPACT OF ERRORS IN THE ORIGINAL DATA ON MACHINE TRANSLATION Quality of the original data is essential and the translations are affected by • misleading punctuation • misspellings • wrong word order • missing or repeated words

  30. LOST IN TRANSLATION I: spacing error in the original Korean text • Erroneous Korean original text: 하프톤 마스크의 반투과부 결함 수정 방법 및 이를 이용한 리페어된 하프톤 마스크 • Machine translation: The method of repairing defect in semi-premerable portion and the halftone mask which becomes this with the usage heartburnings repair of the halftone mask. • Korean original text after correction of errors: 하프톤 마스크의 반투과부 결함 수정 방법 및 이를 이용한리페어된 하프톤 마스크 • Machine translation: The method of repairing defect in semi-premerable portion of the Halftone mask and the repaired Halftone mask using the same.

  31. LOST IN TRANSLATION II: misspelling in the original Korean text • Erroneous Korean original text: 그리고 사용중에 파손이 안돼도록 어느정도 압력으로 되었을때 끊어질수 있도록 고안을 해서 연결관을 만든다. • Machine translation: And in order to be cut when damage to some extent consisted of pressure in busy with the An DwaeDo lockit designs and the connection pipe is made. • Korean original text after correction of errors: 그리고 사용중에 파손이 되지 않도록 어느정도 압력으로 되었을때 끊어질수 있도록 고안을 해서 연결관을 만든다. • Machine translation: And the connection pipe is made to be cut when reached to some extent pressure in order not to be damaged in use.

  32. LOST IN TRANSLATION III: misuse of comma in the original Korean text • Erroneous Korean original text: 밀봉 플레이트(70)는, 제 1 측면(72), 대향하는 제 2 측면(74) 및 상기 제 2 측면 상에 위치된 밀봉부(76)를 포함하며, 밀봉부는 밀봉 플레이트를 둘러싸고 있다 • Machine translation: The seal plate (70), is the first side surface (72),and the faced second side (74) and the encapsulant (76) located on the second side are included. And encapsulant surrounds the seal plate. • Korean original text after correction of errors: 밀봉 플레이트(70)는제 1 측면(72), 대향하는 제 2 측면(74) 및 상기 제 2 측면 상에 위치된 밀봉부(76)를 포함하며, 밀봉부는 밀봉 플레이트를 둘러싸고 있다. • Machine translation: The seal plate (70) includes first side surface (72), and the faced second side (74) and the encapsulant (76) located on the second side. And encapsulant surrounds the seal plate.

  33. LOST IN TRANSLATION IV: misspelling in the original Korean text • Erroneous Korean original text: 본 발명은 냉장고에 관한 것으로, 더욱 상세하게는 송풍팬의 구동으로 냉기를 흡입하여 고내 냉기순환이 이루어지도록 제어하는 냉장고의 냉기순환장치 및 방법에 관한 것이다. • Machine translation: The invention relates to refrigerator, more specifically, to the cold air circulating apparatus and method of the refrigerator which inhales the cool air to the drive of the ventilation fan and controlled so that the inside of refrigerator cooling air circulation be made. • Korean original text after correction of errors: 본 발명은 냉장고에 관한 것으로, 더욱 상세하게는 송풍팬의 구동으로 냉기를 흡입하여 냉장고내 냉기순환이 이루어지도록 제어하는 냉장고의 냉기순환장치 및 방법에 관한 것이다. • Machine translation: The invention relates to refrigerator, more specifically, to the cold air circulating apparatus and method of the refrigerator which inhales the cool air to the drive of the ventilation fan and controlled so that the cooling air circulation be made in the inside refrigerator.

  34. TRANSLITERATION ERRORS KVAERNER MASA YARDS OY SAEIKKOE J; VEIKKOLAINEN M KEVANAL MASHA-YADES OY J. SACO; M. WIKLANIN No priority data

  35. NON CONVENTION EQUIVALENTS IN DWPI PATENT FAMILY WPI Acc no: 2002-306437/200235 XRPX Acc No: N2002-239587 Welding structure formation for building applications, involves controlling welding of the constituents arranged on support surface, based on position of weld points determined from recorded image Patent Assignee: KVAERNER MASA YARDS OY (KVAE-N); KVAERNER MASA-YARDS OY (KVAE-N); SAIKKO J (SAIK-I); VEIKKOLAINEN M (VEIK-I); AKER FINNYARDS OY (AKER-N) Inventor: SAEIKKOE J; SAIKKO J; VEIKKOLAINEN M (1) Basic (2) Equivalents (E) # Non-convention Equivalents (NCE)

  36. INPADOC FAMILY IS INCOMPLETE

  37. COMMERCIAL CHALLENGES • Investment planning • Enhancements to coverage and treatment • China; India; Korea; Switzerland; Taiwan; Brazil; Spain • Increases in volumes • The number of basics has doubled over the past 10 years • 1.491 million basics projected in 2010 • Sourcing and managing original patent data • Primary duty of patent offices is to grant patents; information dissemination is secondary • Data provided in multiple different formats • GIGO – much effort required to identify and correct errors in source material
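The statement on slide 37 that the number of basics doubled over 10 years can be turned into an approximate annual growth rate. The sketch below assumes constant compound growth across the decade, which the slide does not claim; the function name is hypothetical.

```python
def implied_annual_growth(multiple, years):
    """Constant annual growth rate that yields `multiple` after `years` years."""
    return multiple ** (1.0 / years) - 1.0


# Doubling (multiple = 2) over 10 years, as stated on the slide.
rate = implied_annual_growth(2.0, 10)
print(f"{rate:.2%}")  # prints 7.18%
```

Under that assumption, volumes grow by roughly 7.2% per year, which gives a sense of the processing capacity that must be added annually just to keep pace.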

  38. CONCLUSIONS The many different languages of patent documents present unique challenges Translation is essential for extracting useful information, but costly Tools and techniques are improving, BUT… We will continue to rely on high quality translation of patent information, by various techniques, reinforced by the skill sets of our value-add production team in order to deliver the Thomson Reuters value-add proposition

  39. THANK YOU THOMSON REUTERS – IP SOLUTIONS IP.THOMSONREUTERS.COM
