An IPC-based vector space model for patent retrieval Presenter: Jun-Yi Wu Authors: Yen-Liang Chen, Yu-Ting Chiu 國立雲林科技大學 National Yunlin University of Science and Technology 2011 IPM
Outline • Motivation • Objective • Methodology • Experiments • Conclusion • Comments
Motivation • A weakness of the traditional vector space model (VSM) is that its indexing vocabulary changes whenever the document set changes, the vocabulary selection algorithm or its parameters change, or the wording used in documents evolves.
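To make the vocabulary-drift problem concrete, here is a minimal Python sketch. It assumes the simplest possible term-based VSM, where the indexing vocabulary is just the set of distinct tokens in the corpus; the function `build_vocabulary` and the toy documents are hypothetical and not taken from the paper.

```python
def build_vocabulary(documents):
    """Return the sorted set of distinct lowercase tokens in the corpus."""
    return sorted({token for doc in documents for token in doc.lower().split()})


# Hypothetical toy corpus; real patent text would be far longer.
corpus = ["optical fiber sensor", "fiber laser amplifier"]
vocab_before = build_vocabulary(corpus)

# Adding one document introduces new terms, so every existing
# document vector would gain new dimensions and need re-indexing.
corpus.append("wireless sensor network")
vocab_after = build_vocabulary(corpus)

new_terms = sorted(set(vocab_after) - set(vocab_before))
print(len(vocab_before), len(vocab_after), new_terms)
```

In this sketch a single new document changes the dimensionality of the vector space, which is the kind of instability the slide describes.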
Objective • The major objective of this research is to design a method that solves the aforementioned problems for patent retrieval. • The proposed method uses a special characteristic of patent documents, their International Patent Classification (IPC) codes, to generate the indexing vocabulary for representing all the patent documents.
Methodology • Phase 1: Collect patent documents from the patent DB
Methodology • Phase 2: Text preprocessing
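The slide does not list the individual preprocessing steps, so the following Python sketch only assumes a common pipeline: lowercasing, tokenizing on letters, and removing stop words. The function `preprocess` and the tiny stop-word list are illustrative assumptions, not the authors' procedure.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}


def preprocess(text):
    """Lowercase, tokenize on runs of letters, drop stop words and one-letter tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


tokens = preprocess("A method for the detection of defects in optical fibers.")
print(tokens)
```

Stemming or lemmatization could be added in the same function if the retrieval task calls for it.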
Methodology • Phase 3: Generate category * term vectors
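One way to picture a category * term vector is as a bag of terms collected from every patent that carries a given IPC code. The Python sketch below assumes raw term counts as weights; the paper's actual weighting scheme is not shown on the slide, and the toy patents and the function `category_term_vectors` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy patents: (IPC codes, preprocessed tokens).
patents = [
    (["G02B"], ["optical", "fiber", "sensor"]),
    (["G02B", "H01S"], ["fiber", "laser"]),
    (["H04W"], ["wireless", "sensor", "network"]),
]


def category_term_vectors(patents):
    """Map each IPC code to a Counter of terms from all patents with that code."""
    vectors = defaultdict(Counter)
    for ipc_codes, tokens in patents:
        for code in ipc_codes:
            # A patent with several IPC codes contributes its terms to each of them.
            vectors[code].update(tokens)
    return vectors


cat_term = category_term_vectors(patents)
print(dict(cat_term["G02B"]))
```

Each IPC category ends up described by the terms that occur in its patents, which is the input the later phases build on.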
Methodology • Phase 4: Generate term * category vectors • Phase 5: Generate document * category vectors
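Phases 4 and 5 can be sketched as a transpose followed by an aggregation. The Python example below assumes that a term * category vector is simply the term's column in the category * term matrix, and that a document * category vector is the sum of the vectors of the document's terms. Both are simplifying assumptions for illustration; the slide does not give the authors' formulas, and all names and counts here are hypothetical.

```python
# Hypothetical category * term counts (category -> term -> count).
cat_term = {
    "G02B": {"optical": 1, "fiber": 2, "sensor": 1, "laser": 1},
    "H01S": {"fiber": 1, "laser": 1},
    "H04W": {"wireless": 1, "sensor": 1, "network": 1},
}
# Fixed axis order: the IPC categories act as the indexing vocabulary.
categories = sorted(cat_term)


def term_category_vectors(cat_term, categories):
    """Transpose: one vector per term, one dimension per IPC category."""
    terms = {t for counts in cat_term.values() for t in counts}
    return {t: [cat_term[c].get(t, 0) for c in categories] for t in terms}


def document_category_vector(tokens, term_cat, n_categories):
    """Sum the category vectors of a document's terms; unknown terms add nothing."""
    vec = [0] * n_categories
    for tok in tokens:
        for i, weight in enumerate(term_cat.get(tok, [0] * n_categories)):
            vec[i] += weight
    return vec


term_cat = term_category_vectors(cat_term, categories)
doc_vec = document_category_vector(["fiber", "sensor", "unseen"], term_cat, len(categories))
print(categories, doc_vec)
```

Because the dimensions are IPC categories rather than terms, a document containing a previously unseen word still maps into the same fixed-size space.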
Experiments
Conclusion • A novel method, IPC-based VSM, was proposed for generating vectors to represent patent documents. • The indexing vocabulary generated in IPC-based VSM was better at finding similar documents than either of the traditional methods.
Comments • Advantage • IPC-based VSM performs better than previous methods. • Application • Information retrieval