Searching Scholarly Literature: A Google Scholar Perspective Anurag Acharya
Overview • Goals & key ideas • Support for libraries • Coverage & usage • Reflections
Goal: Best possible scholarly search • Single place to find scholarly material • All areas, all sources, all languages, all time • Relevance-based ordering (“Google-like”) • Easy to use • Common queries should just work • Researchers, like everyone else, just want answers
Idea: Index all forms of articles • Preferred form: fulltext • Go beyond author-identified features • Facilitate serendipity • Fulltext online for only a small fraction • Influential/seminal papers still offline • Index whatever form is available • Abstract or even just the citation
Idea: Be inclusive • Provide worldwide visibility to all research • Should be able to find research done anywhere • Who knows what triggers discovery • Our goal is to find all scholarly work • Journals, conferences, preprints, reports • All countries, all languages, all sources • Make decisions on a per-article basis • Good work can come from anywhere!
Idea: Universal discovery • Free to all users everywhere • Should be able to find relevant research no matter where you live • Don’t know where the next magic will come from • Access will depend on a variety of factors • Impact of discovery is larger than people think
Idea: Rank as researchers do • Ideal: The Stuff I Need To Know • Approximation: Relevant stuff that is likely to be good • How to estimate “likely to be good”? • who wrote it, where it was published, how many people cite it, where citations are from • Plus usual information retrieval techniques
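One way to picture "relevant stuff that is likely to be good" is as a relevance score scaled by a quality estimate built from the four signals on the slide. The sketch below is only an illustration under stated assumptions: the `Article` fields, the weights, the 1000-citation saturation point, and the multiplicative combination are all hypothetical and are not taken from the talk or from Google Scholar's actual ranking.

```python
import math
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    text_relevance: float      # 0..1, from the "usual information retrieval techniques"
    author_score: float        # 0..1, hypothetical estimate for "who wrote it"
    venue_score: float         # 0..1, hypothetical estimate for "where it was published"
    citation_count: int        # "how many people cite it"
    citing_venue_score: float  # 0..1, hypothetical estimate for "where citations are from"


def quality(article: Article) -> float:
    """Estimate "likely to be good" as a weighted mix of the four signals.

    The weights and the saturation point are illustrative assumptions.
    """
    # Dampen raw counts with a log so heavily cited papers do not swamp the rest.
    citation_signal = min(math.log1p(article.citation_count) / math.log1p(1000), 1.0)
    return (
        0.2 * article.author_score
        + 0.2 * article.venue_score
        + 0.4 * citation_signal
        + 0.2 * article.citing_venue_score
    )


def rank(articles: list[Article]) -> list[Article]:
    """Order articles by relevance scaled by estimated quality, best first."""
    return sorted(articles, key=lambda a: a.text_relevance * quality(a), reverse=True)
```

With equal text relevance, the article with more citations sorts first; with equal quality signals, the more relevant article does.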
Idea: Automate citation extraction • Necessary to be able to scale • Much variance in citation styles • Widely different conventions • Citations error-prone • Desire to compress (unusual abbreviations) • Author sloppiness + error propagation • Need to normalize citations
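To make "normalize citations" concrete, here is a minimal sketch that maps differently abbreviated venue strings to one comparison key. It is an assumption-laden illustration, not the method used by Google Scholar: the `ABBREVIATIONS` and `STOPWORDS` tables and the `normalize_venue` function are hypothetical, and a real system would need far larger tables plus handling for authors, years, and page ranges.

```python
import re
import unicodedata

# Hypothetical abbreviation table; real citation data needs many more entries.
ABBREVIATIONS = {
    "proc": "proceedings",
    "j": "journal",
    "trans": "transactions",
    "int": "international",
    "conf": "conference",
}

# Hypothetical list of filler words that citation styles include or omit freely.
STOPWORDS = {"of", "the", "on", "in", "and", "for"}


def normalize_venue(venue: str) -> str:
    """Reduce a venue string from a citation to a canonical comparison key."""
    # Strip accents so "Société" and "Societe" compare equal.
    decomposed = unicodedata.normalize("NFKD", venue)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase and treat every run of punctuation or whitespace as a separator.
    tokens = re.sub(r"[^a-z0-9]+", " ", unaccented.lower()).split()
    # Expand known abbreviations and drop filler words.
    expanded = (ABBREVIATIONS.get(token, token) for token in tokens)
    return " ".join(token for token in expanded if token not in STOPWORDS)
```

For example, `normalize_venue("Proc. Int. Conf. on Data Engineering")` and `normalize_venue("Proceedings of the International Conference on Data Engineering")` produce the same key, so citations using either style can be matched.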
Idea: Rank works, not instances • Single work may have many forms/versions • Preprint, report, conference paper, journal article • Each may be cited independently • Need to collect citations for true import of work • Grouping versions facilitates ranking/presentation • Collect citations for all versions – improve ranking • Present a single work as a unit – easier to scan
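The grouping idea can be sketched as follows: versions of the same work are collected under one key, their citations are summed, and each work is presented once. This is a simplified illustration under assumptions: the `Version` record, the title-only `work_key`, and the dictionary output are hypothetical, and matching versions in practice would need more than a normalized title.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Version:
    title: str
    kind: str       # e.g. "preprint", "report", "conference paper", "journal article"
    citations: int  # citations that point at this specific version


def work_key(title: str) -> str:
    """Hypothetical grouping key: the title lowercased with punctuation removed."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", title.lower()).split())


def group_versions(versions: list[Version]) -> list[dict]:
    """Group versions into works and sum citations across each group."""
    groups: dict[str, list[Version]] = defaultdict(list)
    for version in versions:
        groups[work_key(version.title)].append(version)

    works = [
        {
            "title": members[0].title,
            "kinds": [m.kind for m in members],
            "citations": sum(m.citations for m in members),
        }
        for members in groups.values()
    ]
    # Present each work once, most cited first.
    return sorted(works, key=lambda w: w["citations"], reverse=True)
```

Summing across the group is what lets a work whose citations are split between a preprint and a journal article outrank a single-version work that neither version would beat alone.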
Idea: Links to offline content • Only a small fraction of articles online • Libraries hold huge repositories • Books, journals, articles, and much more • Link to library resources • Help users find the wealth in their libraries
Support for libraries • Library Links • Links to resources in a given library • For libraries that use link resolvers/OpenURLs • About 325 participating libraries, growing rapidly • Library Search • For libraries participating in OCLC’s Open WorldCat • Find nearby libraries that have the book • Looking to work with other union catalogs!
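For readers unfamiliar with link resolvers, the sketch below shows roughly what an OpenURL for a journal article looks like, using the key/value format of OpenURL 1.0 (Z39.88-2004). The resolver address `https://resolver.example.edu/openurl` and the `build_openurl` helper are hypothetical, and nothing here describes how Google Scholar itself generates its Library Links.

```python
from urllib.parse import urlencode


def build_openurl(resolver_base, *, atitle, jtitle, date,
                  volume=None, spage=None, aulast=None):
    """Build an OpenURL 1.0 key/value query for a journal article.

    resolver_base is the library's link-resolver address (hypothetical here).
    """
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.atitle": atitle,  # article title
        "rft.jtitle": jtitle,  # journal title
        "rft.date": date,
    }
    # Add optional fields only when they are known.
    optional = {"rft.volume": volume, "rft.spage": spage, "rft.aulast": aulast}
    params.update({key: value for key, value in optional.items() if value is not None})
    return f"{resolver_base}?{urlencode(params)}"


link = build_openurl(
    "https://resolver.example.edu/openurl",  # hypothetical resolver
    atitle="As We May Think",
    jtitle="The Atlantic Monthly",
    date="1945",
    aulast="Bush",
)
```

The library's resolver receives the citation metadata in the query string and decides, based on that library's holdings, where to send the user.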
Google Scholar Coverage • Commercial publishers & scholarly societies • Fulltext from all major publishers except Elsevier and ACS • Includes popular papers from all publishers as citations/abstracts • Hosting services – many publishers, societies • HighWire, Allen Press, MetaPress, Atypon, Ingenta, MUSE, others • Public A&Is – PubMed, ADS • Fairly complete, no matter what you read in some reviews… • Open web and institutional repositories • arXiv.org, RePEc, PubMed Central, others • Open access journals – all we can find (including SciELO)
Worldwide usage • Countries with the most queries: • US, UK, Australia, Germany, Mexico, Brazil • Canada, China, Netherlands, India, France • Japan, Israel, Italy, Taiwan, Spain • Switzerland, Colombia, Nigeria, Philippines • S. Africa, S. Korea, Malaysia, Egypt, Turkey
Reflections • Audience will expand beyond scholars • Especially for health/medical research, maybe others • Educated laypeople, patients, caregivers • The service is useful today for many users • In the US as well as internationally • Much more still to do to reach goals
Finally… • “Mendel's concept of the laws of genetics was lost to the world for a generation because his publication did not reach the few who were capable of grasping and extending it; and this sort of catastrophe is undoubtedly being repeated all about us, as truly significant attainments become lost in the mass of the inconsequential.” • Vannevar Bush, “As We May Think”, July 1945 • Hope: loss of Mendel’s laws never repeated