
Breaking and remaking peer review with the SPIRES databases: Our Experience

Travis Brooks, SPIRES Scientific Databases Manager, Stanford Linear Accelerator Center; Pat Kreitz, Director, Technical Information Services, Stanford Linear Accelerator Center. Thanks to Ann Redfield, Michael Peskin, Louise Addis, Heath O’Connell, and Georgia Row for useful input.


Presentation Transcript


  1. Breaking and remaking peer review with the SPIRES databases: Our Experience • Travis Brooks, SPIRES Scientific Databases Manager, Stanford Linear Accelerator Center • Pat Kreitz, Director, Technical Information Services, Stanford Linear Accelerator Center • Thanks to Ann Redfield, Michael Peskin, Louise Addis, Heath O’Connell, and Georgia Row for useful input. Travis Brooks-Trieste

  2. Topics • Part I • History and current situation of SPIRES, arXiv, and Journals • Part II • Citation counting: our experiences and views • Part III • Speculation for the future

  3. Part I Some history, some current data, and some guesses

  4. What is SPIRES? • Bibliographic records for over half a million papers • Entire literature of High-Energy Physics (HEP) • Many papers from related fields • Citations for eprints and journal articles • Over 25,000 searches a day • Main site and personnel at SLAC • DESY, FNAL, Durham U., Kyoto U., IHEP (Moscow)

  5. arXiv • Since 1991: • Makes full-text available for download • Links to SPIRES citation lists • Allows revisions • Divides content into hep-th, hep-ph, hep-ex and many other categories

  6. hep-th vs. hep-ex • Sharp distinction between theory and experiment • Different from other disciplines • Difference between the publishing cultures of the HEP theorist and the HEP experimentalist

  7. th vs. ex Publishing • Experiment: • Large collaborations (>500 authors) • Difficult to referee • Reporting results • Theory (my focus): • Small collaborations (<10 authors) • Self-contained papers • Conversational • hep-th and hep-ph similar

  8. hep-th (Pr)eprints: A Timeline • Mid-1960s: preprints sent by authors to select groups • 1969: SLAC library began ppf (Preprints in Particles and Fields) list • Created demand for distribution • Legitimized preprints/preprint libraries • Led to anti-ppf list

  9. hep-th (Pr)eprints: A Timeline • 1974: SPIRES-HEP database indexed preprints • Allowed more general, worldwide distribution and retrieval of preprint titles • Still needed papers by mail • Preprints used conversationally • On WWW in 1991

  10. hep-th (Pr)eprints: A Timeline • 1991: arXiv.org allowed immediate and universal electronic access to full-text of preprints • Preprints became eprints • Demise of all HEP journals predicted

  11. Preprints not new… • arXiv is a logical extension of the movement towards preprints, not a “bolt from the blue” • Preprints have a long history of use • Preprints are more easily distributed today

  12. History of hep-th arXiv • arXiv is busy • Over 90% of papers published in Phys. Rev. D after 1995 were submitted to arXiv • But authors still publish! • 75% of hep-th papers (prior to 2002) have been published

  13. When are eprints published? • Difference between Phys. Rev. D publication time and eprint appearance time • 6,000 articles from June 1997-2003 • Mode at 5 months • 17 negative times not shown
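
A minimal sketch of how the lag described in slide 13 could be computed, assuming each record holds an eprint appearance date and a journal publication date. The record layout, the helper name lag_months, and the four sample dates are hypothetical illustrations, not SPIRES data or the authors' actual method.

```python
from collections import Counter
from datetime import date


def lag_months(eprint: date, published: date) -> int:
    """Whole-month difference: journal publication minus eprint appearance."""
    return (published.year - eprint.year) * 12 + (published.month - eprint.month)


# Hypothetical sample records: (eprint date, journal publication date).
records = [
    (date(1998, 1, 10), date(1998, 6, 15)),
    (date(1999, 3, 2), date(1999, 8, 20)),
    (date(2000, 5, 5), date(2000, 9, 1)),
    (date(2001, 7, 1), date(2001, 5, 1)),  # journal version preceded the eprint
]

lags = [lag_months(e, p) for e, p in records]

# Negative lags are set aside, as on the slide, rather than plotted.
negative = [lag for lag in lags if lag < 0]

# Histogram of the non-negative lags; its peak is the mode.
histogram = Counter(lag for lag in lags if lag >= 0)
mode_lag, mode_count = histogram.most_common(1)[0]
```

With real data, the same histogram would show where the peak falls and how many records have to be excluded for negative lag.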

  14. When are they published? • What caused the negative times? • Are the large delays from “testing the waters”? • Do researchers wait for peer review to determine if an article is worth reading?

  15. When are papers read? • Q: When does most citing occur? • A: Plot the citations a published hep-th article receives after its arXiv submission • 8,000 published papers in sample • Includes citations from journal papers and arXiv papers (essentially the same set)

  16. Eprints, not journals • Journal lag time 5 months • Citation peak occurs after eprint release, not journal release • Inference: HEP theorists don’t wait for the journal.

  17. Current hep-th situation • Researchers read the arXiv to find out the latest scientific information • They base their work on what is in the arXiv • Scientific priority is given by arXiv time stamp, not journal submission date • They barely notice if it is published

  18. HEP theorist’s viewpoint • arXiv is for immediate communication • A running scientific conversation • Overheard about a paper not sent to hep-ph: “He didn’t publish it, he just sent it to Phys. Rev. D”

  19. Journals Irrelevant? • 75% of hep-th papers (prior to 2002) have been published • Correlation between large cite counts and publication • Journals are still very much alive

  20. Why do authors publish? (4 guesses) • 1-Inertia • There is no other system as developed or as trusted • Journals are ingrained in researchers’ psyches • But journals don’t appear to be going away (quickly)

  21. Why do authors publish? • 2-Feedback • Refereeing is useful for this paper and the next • The paper is already on arXiv while it is being refereed • But arXiv submissions generate comments and revisions as well

  22. Why do authors publish? • 3-Professional Advancement • Do tenured/secure faculty publish fewer of their eprints? • Anecdotally: Witten has seven papers with 50+ cites that exist as eprints only • In general: interesting question to think about… • If professional advancement is the sole purpose of peer review, could we not do better? • Are we using the peer review process as a substitute for performance evaluation?

  23. Why do authors publish? • 4-Archival value • Do authors believe that arXiv is a good archive? • Will arXiv-only eprints still be around (readable, accessible) in 100 years? • Perception, not reality, matters here • E-only journals appear no different • Centralization, not media, should be the concern

  24. Part II Cite counts and the future

  25. Cite Counting • Cite counts present a data-driven picture of the hep-th eprint culture • Much work already (by many here today) • Cites to HEP eprints from journal articles are high and rising (Brown 2001, Youngen 1998, others) • arXiv impact factor is similar to journals (Fabbrichesi and Montolli, 2001) • Many other studies (often using SPIRES-HEP data)

  26. Cite Counting • Cite counting for bibliometric purposes seems reasonable (perhaps) • Cite counting for peer review purposes? • Services like SPIRES (free) and ISI (fee) make cite counts available to other researchers, hiring committees, and tenure review boards.

  27. Cite Counts = Peer Review? • Are citations the electronic answer to refereed journals? • Currently the only answer • Only one widely available • But not a very good answer • arXiv + SPIRES cite counts are not Phys. Rev. Lett.

  28. Cites: Pros and Cons • SPIRES has been making citations available for over 25 years • We have noticed a few things about the process • Some good • Some bad • Some merely interesting

  29. Advantage-Dynamic • Cite counts change with the field • Classics • New papers • Newly discovered classics • Ex: Weinberg’s Standard Model paper • Few cites initially • Over 5,000 now • Ex: M. Peskin’s topcite reviews

  30. Advantage-Fast • Cite counts begin immediately after appearance • Electronic publishing means peer review is the lag time • Lag time makes journals archivists rather than communicators • Led to the replacement of this function by arXiv/SPIRES/etc.

  31. Advantage-Easy • SPIRES tracks citations with 4 staff members • Total staff is about 8 • We are not that technically sophisticated • We are not even especially clever! • Still it is non-trivial

  32. Disadvantage-Accuracy • Speed, ease rely on electronic processing • Accuracy or speed? • Reference lists in a paper change over an article’s life • What counts as a cite? • Which version of the paper?

  33. Disadvantage-Relevance • Theory: Citations are a measure of what scientists read • But does citing = reading? • Simkin & Roychowdhury (cond-mat/0212043 and cond-mat/0305150) • Students, general public

  34. Disadvantage-Relevance • Theory: Cites are a mark of quality • What about brilliant papers out of the mainstream? • Are papers really even referenced for scientific reasons? • Or are they referenced for sociological reasons? • Or are references simply copied?

  35. Disadvantage-Relevance • Tongue-in-cheek reasons for not citing prior work (humorous, but not far off…) • “If it’s old, foreign, or old and foreign” • “They don’t cite us either” • “Rain forest preservation through paper-saving” • “I figured if you’re smart enough to read this paper, you already knew that!” (from The Scientist)

  36. Interesting-Importance • People take it seriously • Funding, careers, reputations, etc. are perceived to depend in some way on SPIRES citation data

  37. Interesting-Importance • We receive ~50 emails a day, most of them revolving around incorrect, incomplete, or missing references • Usually from an author whose paper was cited but missed • Often marked “URGENT” • Occasionally with panicked explanations including the date that the review committee is meeting • Sometimes accusing SPIRES of sabotage, or otherwise expressing outrage at a missed citation

  38. Importance is helpful… • Importance shows that cite counting is useful (or at least used!) • Users of the information are motivated to help maintain it • SPIRES is almost open source • We help eliminate authors’ typos, they help eliminate our errors

  39. …helpful… • SPIRES can replace bad cites with the correct ones • Corrects our errors • Corrects author errors • Even helps limit propagation of errors • Ex: a Witten article with 1,300 cites had 100 incorrect cites, all the same typo
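
A minimal sketch of the idea in slide 39: once a recurring typo is known, every citing reference that carries it can be mapped back to the correct paper. The correction table, the function name normalize, and the journal strings are all hypothetical placeholders; this is not how SPIRES itself stores or repairs references.

```python
from collections import Counter

# Hypothetical correction table: mistyped reference string -> correct one.
CORRECTIONS = {
    "J.Hypo.Phys. 12, 354": "J.Hypo.Phys. 12, 345",
}


def normalize(ref: str) -> str:
    """Return the corrected reference if the string is a known typo."""
    return CORRECTIONS.get(ref, ref)


# Hypothetical raw references extracted from citing papers:
# three correct cites and two copies of the same typo.
raw_refs = ["J.Hypo.Phys. 12, 345"] * 3 + ["J.Hypo.Phys. 12, 354"] * 2

before = Counter(raw_refs)
after = Counter(normalize(r) for r in raw_refs)
```

Because a single table entry repairs every copy of the typo, one reported error fixes all later papers that copied it as well.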

  40. …but also worrisome • Responsibility lies with the maintainers of the citation counts • Previously in the hands of referees and editors • Self-citation • Boost counts artificially • Deception • We have had it happen

  41. Citation Counts: Summary • We do it, and it works • Fast, Easy, and Fluid • Valued by the Community • It is more than imperfect • Relevance and Accuracy • Does not yet replace traditional peer review

  42. Part III What would it take to truly change peer review?

  43. To change peer review • Stakeholders in the peer review system • Editors • Referees • Authors • Readers • Fundamental differences between disciplines • hep-th and hep-ex are different in their adoption of eprints

  44. To change peer review • Functions of peer review when divorced from communication • One must replace (or discard) all of these • Metrics for papers • Metrics for scientists • Metrics for truth?

  45. Peer review = “good science”? • Peer review gives a seal of approval • Laypeople • Medicine, Environmental Science, etc. • Refereeing process is filled with examples of weakness • Yet it feels fundamentally sound • Publishers have taken this role of “vetting” science

  46. Truth is more complex • Community acceptance determines scientific truth • “Yesterday’s sensation, today’s calibration” • The “test of time” is longer than the 6-month lag time for journal articles • Immediacy is needed for communication and conversation • But deliberation is needed for context and community judgment

  47. An Opportunity • Place an article in the context of the surrounding work • Reference linking is only a baby step • Degree to which a finding has been verified or contradicted by earlier or later work • Ex: M. Peskin’s Topcites reviews at SLAC • The numbers are amusing • Context is the real value

  48. Context • Another Example: Particle Data Group • Reports data from all HEP experiments • Sorts and combines data • References to comments on validity • References to interpretations of the data

  49. PDG Example

  50. Opportunities • Intense scrutiny not possible for journals • Context is important • Amazon and Google • Personalized and dynamic • Citebase • Torii
