
Reinventing Science Librarianship

Reinventing Science Librarianship. Education for New Roles Catherine Blake cablake@email.unc.edu http://www.ils.unc.edu/~cablake University of North Carolina @ Chapel Hill. Source: The DCC Curation Lifecycle Model. Creation. Jupiter has moons Galileo, Sidereus Nuncius, 1610


Presentation Transcript


  1. Reinventing Science Librarianship Education for New Roles Catherine Blake cablake@email.unc.edu http://www.ils.unc.edu/~cablake University of North Carolina @ Chapel Hill

  2. Source: The DCC Curation Lifecycle Model

  3. Creation • Jupiter has moons • Galileo, Sidereus Nuncius, 1610 • Relative sizes of the Earth, Sun and Moon • Aristarchus, 3rd century BC • This image: 10th century AD • Source: Wikipedia

  4. Creation • Little Dipper microarray processors • Biology/pharmacology • The first beam in the Large Hadron Collider at CERN was successfully steered around the full 27 kilometers of the world's most powerful particle accelerator • Sources: http://www.scigene.com/products/little_dipper.html and http://mediaarchive.cern.ch/MediaArchive/Photo/Public/2008/0809002/0809002_01/0809002_01-A5-at-72-dpi.jpg

  5. Acquisition & Collection • Data acquired directly from scientists • Heterogeneous formats • multi-media • annotations on a spreadsheet • Varying quality • experimental settings • Student vs verified data

  6. Identification & Cataloging • Collectively identifying resources • Group think • Social bookmarking • Participatory cataloging • E.g., UNC photographs

  7. Storage & Preservation • Storage • 92% on magnetic media • About 5 exabytes of new information produced on print, film, magnetic, and optical storage media in 2002 • Preservation • Heterogeneous • Changing hardware • Changing software • Image source: http://www.cray.com/products/index.html • Data source: http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/

  8. Barriers to access removed • Environment • New information providers (scientists, granting agencies) • NIH-mandated access • Consequences • No single point of access • Different levels of access required • HIPAA compliance • Maintaining cultural norms

  9. Use and Reuse • Data and Text Mining • Use data collected for a different purpose • E.g., a side effect of one drug becomes the purpose of another • Information Synthesis • Combine speculative information • Literature-Based Discovery • Uncover transitive connections from text
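The "transitive connections" idea in literature-based discovery can be illustrated with a minimal sketch: if term A co-occurs with B in one set of papers and B co-occurs with C in another, but A and C never appear together, the pair (A, C) is a candidate link worth examining. The function name `transitive_candidates` and the term labels below are hypothetical placeholders, not data or code from the presentation.

```python
from collections import defaultdict


def transitive_candidates(pairs):
    """Map each unlinked term pair (a, c) to the intermediate terms b that connect them.

    `pairs` is an iterable of (term, term) co-occurrences, for example terms
    that were mentioned together in the same abstract.
    """
    neighbors = defaultdict(set)
    for x, y in pairs:
        neighbors[x].add(y)
        neighbors[y].add(x)

    candidates = defaultdict(set)
    for b, linked in neighbors.items():
        for a in linked:
            for c in linked:
                # Keep each unordered pair once, and only if a and c
                # never co-occur directly.
                if a < c and c not in neighbors[a]:
                    candidates[(a, c)].add(b)
    return dict(candidates)


# Hypothetical co-occurrence pairs, invented for illustration only.
toy_pairs = [
    ("drug X", "protein P"),
    ("protein P", "disease Y"),
    ("drug X", "pathway Q"),
    ("pathway Q", "disease Y"),
]
result = transitive_candidates(toy_pairs)
```

Here "drug X" and "disease Y" are never mentioned together, yet two intermediates connect them, which is the kind of indirect evidence an analyst would then review by hand.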

  10. Data Oriented Roles • Data Consultant • Shares best practices for organizing & sharing data • Data Distributor • Scientists control the data; the distributor makes the data available to others • Data Manager • Manager organizes and keeps the data

  11. New Roles • Data Service Provider • Data conversion and pre-processing • Data and Text Analyst • Scientist provides the data, analyst applies visualization, data and text mining tools. • Embedded Roles (Data Scientist) • Information Work flow

  12. Data Oriented Roles • Information organization • Conceptual Modeling • Create and understand • ER diagrams • UML diagrams • Concept maps

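  13. [Diagram] Reference Model for an Open Archival Information System: an Information Object consists of a Data Object interpreted using 1+ Representation Information (which is itself interpreted using further Representation Information); a Data Object is either a Physical Object or a Digital Object; a Digital Object consists of 1+ Bit Sequences. Source: nost.gsfc.nasa.gov/isoas/presentations/oais_tutorial_200005.ppt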

  14. Data Oriented Roles • Conceptual → relational models • Good database design • Normalization • Methods to enforce • data quality • referential integrity • Ongoing maintenance
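As a minimal sketch of what "methods to enforce data quality and referential integrity" can look like in practice, the example below uses Python's built-in sqlite3 module. The table and column names (`experiment`, `measurement`, `value`) are hypothetical and chosen only for illustration; a CHECK constraint stands in for a data-quality rule and a foreign key for referential integrity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Normalized layout: experiment details are stored once, and each
# measurement points to its experiment by key.
conn.execute(
    "CREATE TABLE experiment (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.execute(
    """
    CREATE TABLE measurement (
        id INTEGER PRIMARY KEY,
        experiment_id INTEGER NOT NULL REFERENCES experiment(id),
        value REAL NOT NULL CHECK (value >= 0)
    )
    """
)
conn.execute("INSERT INTO experiment (id, name) VALUES (1, 'assay-1')")


def try_insert(experiment_id, value):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO measurement (experiment_id, value) VALUES (?, ?)",
            (experiment_id, value),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A row that points at a missing experiment, or that carries a negative value, is rejected by the database itself rather than by application code, so the rule holds no matter which tool writes the data.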

  15. New Roles • Text Mining: A case study • All text is not created equal • Things that get in the way • Page breaks • Figures • Tables • Special characters • Implications for preservation

  16. Human readable form (PDF)

  17. Data Services – Case Study

  18. Machine readable form ></TABLE ><P >Scientists engage in the discovery process more than any other user population, yet their day-to-day activities are often elusive. … The development of accurate models often requires that a scientist resolve conflicting evidence.</P ><P >One activity that consumes much of a scientists' time is <I >synthesis</I >, <IMG SRC="/giflibrary/12/ldquo.gif" BORDER="0">the dialectic combination of thesis and antithesis into a higher stage of truth<IMG SRC="/giflibrary/12/rdquo.gif" BORDER="0"> (<I >Merriam-Webster's Collegiate Dictionary</I >, [<A HREF="#BIB24" >2004</A >]). This dictionary definition reflects the alternative viewpoints that often occur when multiple empirical studies explore the same phenomena. The synthesis activity results in an overall finding&nbsp;-&nbsp;a higher stage of truth&nbsp;-&nbsp;which scientists achieve by …

  19. First phase pre-processing ></TABLE> <P>Scientists engage in the discovery process more than any other user population, yet their day-to-day activities are often elusive. … The development of accurate models often requires that a scientist resolve conflicting evidence.</P> <P>One activity that consumes much of a scientists' time is <I>synthesis</I>, <IMG SRC="/giflibrary/12/ldquo.gif" BORDER="0">the dialectic combination of thesis and antithesis into a higher stage of truth<IMG SRC="/giflibrary/12/rdquo.gif" BORDER="0"> (<I>Merriam-Webster's Collegiate Dictionary</I>, [<A HREF="#BIB24">2004</A>]). This dictionary definition reflects the alternative viewpoints that often occur when multiple empirical studies explore the same phenomena. The synthesis activity results in an overall finding&nbsp;-&nbsp;a higher stage of truth&nbsp;-&nbsp;which scientists achieve by … OLD: <IMG SRC="/giflibrary/12/ldquo.gif" BORDER="0"> NEW: “ OLD: <IMG SRC="/giflibrary/12/rdquo.gif" BORDER="0"> NEW: ” OLD: (Merriam-Webster's Collegiate Dictionary [<A HREF="#BIB24">2004</A>]) NEW: _BIB_24
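The OLD/NEW replacements on this slide can be sketched with regular expressions. This is a minimal illustration and not the presenter's actual tooling: the function name `first_phase` is hypothetical, and the final steps (stripping remaining tags, decoding entities such as `&nbsp;`, collapsing whitespace) are assumptions added so the output is plain text.

```python
import html
import re

# Image tags that stand in for curly quotation marks in the publisher's markup.
LDQUO = re.compile(r'<IMG\s+SRC="[^"]*ldquo\.gif"[^>]*>', re.IGNORECASE)
RDQUO = re.compile(r'<IMG\s+SRC="[^"]*rdquo\.gif"[^>]*>', re.IGNORECASE)
# A parenthetical citation that ends in a link to bibliography entry #BIBnn.
CITATION = re.compile(
    r'\([^()]*\[<A\s+HREF="#BIB(\d+)"\s*>[^<]*</A\s*>\]\)', re.IGNORECASE
)
TAG = re.compile(r"<[^>]+>")


def first_phase(markup):
    """Apply the quote and citation replacements, then reduce to plain text."""
    text = LDQUO.sub("\u201c", markup)
    text = RDQUO.sub("\u201d", text)
    text = CITATION.sub(lambda m: "_BIB_" + m.group(1), text)
    # Assumption: remaining tags are dropped and entities are decoded.
    text = TAG.sub("", text)
    text = html.unescape(text).replace("\u00a0", " ")
    return re.sub(r"\s+", " ", text).strip()


sample = (
    '<P >One activity is <I >synthesis</I >, '
    '<IMG SRC="/giflibrary/12/ldquo.gif" BORDER="0">a higher stage of truth'
    '<IMG SRC="/giflibrary/12/rdquo.gif" BORDER="0"> '
    "(<I >Merriam-Webster's Collegiate Dictionary</I >, "
    '[<A HREF="#BIB24" >2004</A >]). A finding&nbsp;-&nbsp;here.</P >'
)
cleaned = first_phase(sample)
```

The citation is replaced before the tags are stripped because the bibliography number lives inside the `HREF` attribute and would otherwise be lost.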

  20. Second phase pre-processing • Add Identifiers • Break paragraphs into sentences • Add document, section, paragraph, sentence IDs • Replacements • Symbols, references • Output: Identifiers|One activity that consumes much of a scientists' time is synthesis “the dialectic combination of thesis and antithesis into a higher stage of truth” _BIB_24. Identifiers|This dictionary definition reflects the alternative viewpoints that often occur when multiple empirical studies explore the same phenomena.
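The second phase can be sketched as follows. The slide shows only the placeholder "Identifiers" before the pipe, so the dotted `document.section.paragraph.sentence` format used here is an assumption, as are the function name `second_phase` and the deliberately naive sentence splitter (it breaks after `.`, `!` or `?` when followed by whitespace and a capital letter or opening quote, and will mis-split abbreviations).

```python
import re

# Naive boundary: sentence-ending punctuation, whitespace, then a capital
# letter or an opening quotation mark.
SENTENCE_BOUNDARY = re.compile(r'(?<=[.!?])\s+(?=[A-Z\u201c"])')


def second_phase(doc_id, section_id, paragraphs):
    """Return 'identifier|sentence' lines for every sentence in `paragraphs`."""
    lines = []
    for p_num, paragraph in enumerate(paragraphs, start=1):
        sentences = [s for s in SENTENCE_BOUNDARY.split(paragraph.strip()) if s]
        for s_num, sentence in enumerate(sentences, start=1):
            # Hypothetical identifier layout: doc.section.paragraph.sentence
            ident = f"{doc_id}.{section_id}.{p_num}.{s_num}"
            lines.append(f"{ident}|{sentence}")
    return lines


example = second_phase(
    "D1",
    "S2",
    ["One activity is synthesis _BIB_24. This definition reflects viewpoints."],
)
```

Carrying the identifiers on every line means a text-mining result can be traced back to the exact sentence, paragraph, and section it came from.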

  21. Text Analytics • Clustering • Categorization • Association Rules • IBM Intelligent Miner for Text (Clustering) • SAS Text Miner (Association Rules)

  22. Visualization NCI-funded research 1995-2001

  23. Embedded Roles

  24. Embedded Roles • Workflow • Deep understanding • Data formats • Access norms • Reward structures • Custom pre-processing

  25. Closing Remarks • Not everyone will have every skill • Existing skills that will remain critical • Strong ties to faculty • Strong negotiating skills • Knowledge of standards and resources • The ability to think like someone within a discipline • The roles exist; it's not clear where they will live within an institution
