6th Biennial International Triple Helix Conference on University-Industry-Government Links, Singapore, May 16-18, 2007
Biomedical innovation at the laboratory, clinical and commercial interface. Mapping research grants, publications and patents in the field of microarrays
• Andrei Mogoutov, Alberto Cambrosio, Peter Keating & Philippe Mustar
Main goals of this paper:
• To analyze biomedical innovation by triangulating three sources of information: publications, patents and research projects (see Verganti et al.)
• In particular: to develop a methodology for linking publication, patent and project databases by using emergent (rather than pre-established) categories
• Methods:
  • Heterogeneous network analysis (ReseauLu X2)
  • Text-mining (SPSS LexiQuest Mine)
Case study: Microarrays
• A DNA microarray (a.k.a. biochip, DNA chip, gene array, etc.) is a collection of microscopic DNA spots, commonly representing single genes, arrayed on a solid surface by covalent attachment to chemically suitable matrices
• Unlike previous molecular genetic approaches, which analyzed single genes, a microarray experiment analyzes many hundreds or thousands of genes simultaneously
• Microarrays have become a key technology of the (post)genomic era
• Compound annual growth rate of the microarray market, 1999-2004: 63%
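To give a sense of scale for the 63% growth figure, here is a minimal sketch of what such a rate implies when compounded. Treating 1999-2004 as five yearly compounding steps is an assumption; the slide does not say how the period was counted, and `growth_factor` is a hypothetical helper name.

```python
def growth_factor(rate: float, periods: int) -> float:
    """Total multiplier for a quantity growing at `rate` per period."""
    return (1.0 + rate) ** periods


# 0.63 is the rate quoted on the slide; 5 periods is an assumption.
factor = growth_factor(0.63, 5)
print(f"Market size multiplier over the period: {factor:.1f}x")
```

Under that assumption the market would have grown roughly 11.5-fold over the period.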
Databases
• Publications:
  • PubMed: robust keyword system; biomedical
  • Web of Science (WoS): addresses and citations; general S&T
  • [PubMed/WoS intersection]
• Research projects:
  • CRISP: NIH-financed projects; biomedical
  • [NSF]
• Patents:
  • Derwent Innovation Index
  • [USPTO]/[EUPTO]
[Figure: Institutional network (4 nearest nodes). Labeled node types: regulatory agency, biotech company, hospital, university]
[Figure: Journal inter-citation network (5 nearest nodes). Labeled region: cancer cluster]
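The "nearest nodes" maps can be read as pruned networks in which each node keeps only its strongest links. The sketch below assumes that reading (the slides do not describe how ReseauLu builds its maps); `k_nearest_edges` and the link weights are hypothetical toy values, not data from the study.

```python
from collections import defaultdict


def k_nearest_edges(weights, k):
    """Keep an edge if it is among the k strongest links of either endpoint.

    weights: dict mapping (node_a, node_b) -> link strength,
             e.g. number of co-authored papers or inter-citations.
    Returns a set of edges, each stored as a sorted tuple.
    """
    neighbours = defaultdict(list)
    for (a, b), w in weights.items():
        neighbours[a].append((w, b))
        neighbours[b].append((w, a))

    kept = set()
    for node, links in neighbours.items():
        # Strongest first; ties broken alphabetically for determinism.
        strongest = sorted(links, key=lambda t: (-t[0], t[1]))[:k]
        for _, other in strongest:
            kept.add(tuple(sorted((node, other))))
    return kept


# Toy weights (hypothetical), using the node types named in the figure.
toy_weights = {
    ("university", "hospital"): 12,
    ("university", "biotech company"): 7,
    ("university", "regulatory agency"): 1,
    ("hospital", "biotech company"): 3,
    ("hospital", "regulatory agency"): 2,
}
print(sorted(k_nearest_edges(toy_weights, k=1)))
```

With a small `k` the map shows only the backbone of the network; raising `k` (4 or 5 in the figures) adds weaker links until the full graph is restored.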
2. Database bridges • 2a. Via authors and pre-established (institutional) categories
[Figure: CRISP projects vs. publications. Link via authors; categories by Institutes]
[Figure: CRISP projects vs. citations. Link via authors; categories by Institutes]
[Figure: CRISP projects vs. patents. Link via authors; categories by Institutes]
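A minimal sketch of the author-based bridge: project records carry an investigator name and a funding institute, publication records carry author lists, and the two are joined on a normalized name. The record layout, the `normalise` rule (surname plus first initial, surname written first) and the sample records are all assumptions for illustration; none of them comes from the CRISP or PubMed schemas.

```python
from collections import Counter, defaultdict


def normalise(name: str) -> str:
    """Reduce 'Smith, John A.' or 'Smith JA' to 'smith j'."""
    parts = name.replace(",", " ").replace(".", " ").lower().split()
    if not parts:
        return ""
    surname, rest = parts[0], parts[1:]
    initial = rest[0][0] if rest else ""
    return f"{surname} {initial}".strip()


def publications_per_institute(projects, publications):
    """Count publications reachable from each institute via shared authors."""
    institutes_by_author = defaultdict(set)
    for project in projects:
        institutes_by_author[normalise(project["pi"])].add(project["institute"])

    counts = Counter()
    for pub in publications:
        linked = set()
        for author in pub["authors"]:
            linked |= institutes_by_author.get(normalise(author), set())
        counts.update(linked)  # each publication counted once per institute
    return counts


# Hypothetical records, for illustration only.
projects = [
    {"pi": "Smith, John A.", "institute": "NCI"},
    {"pi": "Lee, Mary", "institute": "NHGRI"},
]
publications = [
    {"id": "pub-1", "authors": ["Smith JA", "Lee M"]},
    {"id": "pub-2", "authors": ["Smith J", "Doe R"]},
]
print(publications_per_institute(projects, publications))
```

Matching on surname and initial is crude: homonyms inflate the counts and spelling variants deflate them, so a real linkage would need some form of disambiguation.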
2. Database bridges • 2b. Via content (emergent categories)
Text mining: SPSS LexiQuest Mine and Text Mining Builder
• Dictionary interface
• Concept extraction
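As a rough stand-in for the concept-extraction step, the sketch below pulls single words (uniterms) and two-word composite terms out of a text. It is not how SPSS LexiQuest Mine works; the stopword list, the tokenization rule and the function name `extract_concepts` are assumptions chosen to keep the example small.

```python
import re
from collections import Counter

# A deliberately tiny stopword list (assumption).
STOPWORDS = {"a", "an", "and", "are", "by", "for", "in", "is",
             "of", "on", "the", "to", "with"}


def extract_concepts(text):
    """Count uniterms and two-word composite terms found in `text`."""
    counts = Counter()
    # Split at punctuation so composite terms never span a clause boundary.
    for segment in re.split(r"[.,;:()]", text.lower()):
        tokens = re.findall(r"[a-z][a-z0-9-]*", segment)
        run = []
        # The trailing None flushes the last run of content words.
        for tok in tokens + [None]:
            if tok is None or tok in STOPWORDS:
                counts.update(run)
                counts.update(" ".join(pair) for pair in zip(run, run[1:]))
                run = []
            else:
                run.append(tok)
    return counts


title = "Gene expression profiling of breast cancer by DNA microarray."
print(extract_concepts(title).most_common())
```

Composite terms are formed only within an unbroken run of content words, so "breast cancer" is extracted but "profiling breast" is not.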
Methodology for generating emergent categories
• The chosen database is text-mined with NLP software to extract the relevant concepts (composite terms and uniterms):
  • in the present case, WoS was chosen over CRISP because it includes biomedical and non-biomedical domains
• The most relevant (specific) concepts are selected by using a chi-square (ChiSq) filter
• After building a co-occurrence map (nearest nodes), clusters corresponding to sub-domains are identified by a modified fuzzy K-means clustering algorithm
• The list of concepts defining each sub-domain is used to analyze the other databases
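Two of the steps above can be sketched briefly: the chi-square filter and the reuse of concept lists on another database. The slides do not say what the filter compares, so the sketch assumes a 2x2 test of each concept's document frequency in the target corpus against a background corpus; the 3.84 threshold (the 5% critical value for one degree of freedom), the function names and all sample data are likewise assumptions. The clustering step is left out because the "modified" fuzzy K-means variant is not specified.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def specific_concepts(target_docs, background_docs, threshold=3.84):
    """Keep concepts over-represented in target_docs (lists of concept sets)."""
    selected = {}
    for term in set().union(*target_docs):
        a = sum(term in doc for doc in target_docs)      # target, with term
        c = sum(term in doc for doc in background_docs)  # background, with term
        b = len(target_docs) - a
        d = len(background_docs) - c
        score = chi_square(a, b, c, d)
        if score >= threshold and a * d > b * c:
            selected[term] = score
    return selected


def tag_records(records, subdomains):
    """List, per text record, the sub-domains whose concepts it mentions."""
    tags = []
    for text in records:
        lowered = text.lower()
        tags.append(sorted(
            name for name, concepts in subdomains.items()
            if any(concept in lowered for concept in concepts)
        ))
    return tags


# Hypothetical toy corpora and sub-domain definitions.
target = [{"microarray", "gene expression"}, {"microarray", "cancer"},
          {"microarray", "gene expression"}, {"microarray", "cell"}]
background = [{"cell"}, {"cell", "cancer"}, {"protein"}, {"cell"}]
print(specific_concepts(target, background))

subdomains = {"oncology": {"breast cancer", "tumor"},
              "expression profiling": {"gene expression"}}
print(tag_records(["Gene expression in breast cancer", "Protein folding"],
                  subdomains))
```

Matching concepts by plain substring is the simplest possible bridge; it lets a concept list derived from one database (here, WoS) be applied unchanged to project abstracts or patent texts.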
Acknowledgments
• Research for this paper was supported by grants from:
  • CIHR
  • FQRSC
  • SSHRC