
Effectiveness of tagging laboratory data using Dublin Core in an electronic scientific notebook

Laura M. Bartolo, Cathy S. Lowe, Austin C. Melton, Monica Strah, Louis Feng, Christopher J. Woolverton


Presentation Transcript


  1. Effectiveness of tagging laboratory data using Dublin Core in an electronic scientific notebook. Laura M. Bartolo (1), Cathy S. Lowe (2), Austin C. Melton (3, 4), Monica Strah (5), Louis Feng (3), Christopher J. Woolverton (5). (1) College of Arts & Sciences, (2) School of Library and Information Sciences, (3) Department of Computer Science, (4) Department of Mathematics, (5) Department of Biological Sciences, Kent State University. Thursday, August 29, 2002

  2. Scientific Notebooks • record and document daily in-house activities • manage research results • Objectives of scientific electronic notebook project • provide high quality resource discovery and retrieval for primary data objects • adapt for a multidisciplinary, biotechnology research laboratory • Current Work: one stage of a multi-stage project • record, store, and manipulate • multidisciplinary, multi-institutional scientific information • raw data to finished research papers • Today’s presentation • laboratory data early in the scientific process • prototype modified relational database • Dublin Core metadata

  3. Long-term Project Goals • Organize biotechnology information to encourage multidisciplinary use of information; • Apply knowledge gained and tools developed to the organization of other scientific information; and • Make scientific information from different disciplines more accessible

  4. Biotechnology Research Team • Interdisciplinary, multi-institutional research team: • Department of Biological Sciences • Liquid Crystal Institute at Kent State University • Northeastern Ohio Universities College of Medicine • Summa City Hospital • Workplace and research needs • collectively conceive new ideas • prevent redundancy • exchange results, write papers • perform daily activities in different physical locations

  5. Tagging Data with Dublin Core Metadata • Link data across users and institutions • individual data • analogous data types • non-chronological data entries • Enable the retrieval of laboratory data • facilitate data acquisition and analysis • share data more readily • Ensure reliable quality control • enhanced data integrity • better data analysis

  6. Immediate Research Questions: • Is it feasible to use DC to describe laboratory data (i.e., does it adequately capture necessary information)? • Does DC adequately support functions required in a laboratory environment?

  7. Advantages of Dublin Core: • Concise, simple, and easy to learn • Scientists & staff: little time to create metadata records • Supports multiple formats • Scientific laboratory: text, still images, video, audio, and datasets • Facilitates Internet resource discovery • Scientific data: rapidly and widely available • Approved ANSI/NISO standard • Seeking approval as international standard: resource discovery and information exchange

  8. Scientific Notebook Database Design • Main Notebook is organized hierarchically and includes: • Topic: descriptions of past, present and planned projects • Experiment Goals: experiment design concepts for a given project • Materials & Methods: procedures and materials used in experiments • Experiment Results / Materials & Methods Results: data tables, graphs, images and datasets • Topic Results: drafts and finished papers
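
The hierarchy on this slide could be modeled in a relational database roughly as follows. This is only a minimal sketch using Python's built-in `sqlite3` module; the table names, column names, and the strict parent-child chain (topic, then experiment goal, then materials and methods, then result) are assumptions made for illustration and are not taken from the project's actual prototype.

```python
import sqlite3

# Hypothetical schema: each level of the Main Notebook points to its parent.
SCHEMA = """
CREATE TABLE topic (
    id INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE experiment_goal (
    id INTEGER PRIMARY KEY,
    topic_id INTEGER NOT NULL REFERENCES topic(id),
    design_concept TEXT NOT NULL
);
CREATE TABLE materials_methods (
    id INTEGER PRIMARY KEY,
    experiment_goal_id INTEGER NOT NULL REFERENCES experiment_goal(id),
    procedure_text TEXT NOT NULL
);
CREATE TABLE materials_methods_result (
    id INTEGER PRIMARY KEY,
    materials_methods_id INTEGER NOT NULL REFERENCES materials_methods(id),
    file_name TEXT NOT NULL
);
"""


def build_notebook():
    """Create an empty in-memory notebook database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def results_for_topic(conn, topic_id):
    """Walk the hierarchy downward and list result files under one topic."""
    rows = conn.execute(
        """
        SELECT r.file_name
        FROM materials_methods_result AS r
        JOIN materials_methods AS m ON r.materials_methods_id = m.id
        JOIN experiment_goal AS g ON m.experiment_goal_id = g.id
        WHERE g.topic_id = ?
        ORDER BY r.id
        """,
        (topic_id,),
    )
    return [row[0] for row in rows]
```

Keeping each level in its own table makes it possible to retrieve every result for a topic with a single join, which is the kind of non-chronological retrieval the project aims for.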

  9. Main Notebook [diagram of record types, with record counts where given]: Topic; T/N&As; Experiment Goals (6); Topic Results; EG/N&As (1); Mat'ls & Methods (32); Experiment Results (2); MM/N&As (42); Materials & Methods Results (119)

  10. Supplementary Tables • Materials: detailed information such as MSDS (material safety data sheets), specification sheets, and Materials Lot Analyses about organisms and liquid crystal substances involved with experiments. • User: basic contact information about individual researchers involved with the scientific investigation, specifying authentication and access rights. • Memos: entities such as correspondence, equipment issues, and notes for future experiments.

  11. Supplementary Tables Design [diagram]: Dublin Core Records; Users; Main Notebook; Memos; Materials

  12. Image DC record
      Title = "Bacterial Toxicity Assay of CPCl treated Klebsiella pneumoniae"
      Creator = "Woolverton, Christopher J."
      Subject = "Cetylpyridinium" (MeSH)
      Subject = "Klebsiella pneumoniae" (MeSH)
      Subject = "Toxicity Tests" (MeSH)
      Subject = "Biological Assay" (MeSH)
      Subject = "Bacterial Toxins" (MeSH)
      Description = "Graph of Bacterial Toxicity Assay of CPCl treated Klebsiella pneumoniae. % live standard curve used to evaluate CPCl effects."
      Date = "Created 2000-09-06"
      Type = "image"
      Format = "image/jpg 183 KB"
      Identifier = "CJW2_043001.JPG"
      Identifier = "Materials and Methods Result #39"
      Language = "en-US" (rfc1766)
      Relation = "IsPartOf Materials and Methods #18"
      Source = "topic #4"
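
The image record on this slide can be held as a simple data structure and serialized to XML. The sketch below is one possible representation, not the project's own: the `(element, value, qualifier)` tuple layout, the `record` wrapper element, and the `qualifier` attribute are hypothetical choices made for illustration. The namespace URI is the standard Dublin Core element set namespace.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Values copied from the slide; the third field holds the qualifier, if any.
# Repeated elements (subject, identifier) are simply repeated tuples.
RECORD = [
    ("title", "Bacterial Toxicity Assay of CPCl treated Klebsiella pneumoniae", None),
    ("creator", "Woolverton, Christopher J.", None),
    ("subject", "Cetylpyridinium", "MeSH"),
    ("subject", "Klebsiella pneumoniae", "MeSH"),
    ("subject", "Toxicity Tests", "MeSH"),
    ("subject", "Biological Assay", "MeSH"),
    ("subject", "Bacterial Toxins", "MeSH"),
    ("date", "2000-09-06", "Created"),
    ("type", "image", None),
    ("format", "image/jpg 183 KB", None),
    ("identifier", "CJW2_043001.JPG", None),
    ("identifier", "Materials and Methods Result #39", None),
    ("language", "en-US", "rfc1766"),
    ("relation", "Materials and Methods #18", "IsPartOf"),
    ("source", "topic #4", None),
]


def to_xml(record):
    """Serialize a list of (element, value, qualifier) tuples to an XML string."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for element, value, qualifier in record:
        node = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        node.text = value
        if qualifier is not None:
            node.set("qualifier", qualifier)  # hypothetical attribute name
    return ET.tostring(root, encoding="unicode")
```

Calling `to_xml(RECORD)` yields one `dc:` element per tuple, so repeated subjects and identifiers survive as separate elements.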

  13. LC Image DC record
      Title = "Slide 'B' of strepavidin-bead: anti-strepavidin AB Dose Response Assay"
      Creator = "Woolverton, Christopher J."
      Subject = "Streptavidin" (MeSH)
      Subject = "Polymers" (MeSH)
      Subject = "Microspheres" (MeSH)
      Subject = "LC-5" (uncontrolled)
      Description = "% BEADS= 1.25; AB Concentration 100. Big aggregates, occasional isotropic regions, phase shift around larger eggs. Birefringence 5."
      Date = "Created 2000-10-17"
      Type = "image"
      Format = "image/jpg 67 KB"
      Identifier = "101700\B 10x.jpg"
      Identifier = "Materials and Methods Result #100"
      Language = "en-US" (rfc1766)
      Relation = "IsPartOf Materials and Methods #35"
      Source = "topic #4"

  14. Text DC record
      Title = "Culture purity confirmed via streak plate."
      Creator = "Woolverton, Christopher J."
      Subject = "Escherichia coli" (MeSH)
      Subject = "Klebsiella pneumoniae" (MeSH)
      Subject = "Pseudomonas aeruginosa" (MeSH)
      Subject = "Biological Assay" (MeSH)
      Subject = "Bacterial Toxins" (MeSH)
      Subject = "Toxicity Tests" (MeSH)
      Description = "All three organisms grew well on nutrient agar, 37, overnight. E. coli and Klebsiella grew single colonial morphologies, Pseudomonas grew as two colonial types; a small + larger colony type as expected. Prep to repeat BacLight Assay - E. coli + K. P"
      Date = "Created 2000-08-24"
      Type = "text"
      Format = "ascii"
      Identifier = "Materials and Methods Attachments #38"
      Language = "en-US" (rfc1766)
      Relation = "IsPartOf Materials and Methods #17"
      Source = "topic #4"
      Written notebook entry: 240800 Culture purity confirmed via streak plate. All three organisms grew well on nutrient agar, 37, overnight. E. coli and Klebsiella grew single colonial morphologies, Pseudomonas grew as two colonial types; a small + larger colony type as expected. Prep to repeat BacLight Assay - E. coli + K. Pneumo sub'ed to NB (30 ml).

  15. Distribution of DC descriptions: 202 (88.5%) of the 228 records

  16. Problems Encountered (all involved descriptions of higher-level records): • type designation for records: "text" or "collection" • "text" was used • subject element: extent and level of detail that should be provided • relation element: provide a link between Materials and Methods notes and results • not used

  17. DC Records: Computerized Content Analysis • Use frequency of elements for different types of information objects • Total and unique instances of elements included in frequency counts • DC records visually analyzed to identify any nonstandard DC usages • Good fit between the metadata schema and the data exists if DC elements conform to established standards • Indicates ability of DC element set to be applied to laboratory data as information objects
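
The total and unique counts described on this slide could be computed along the following lines. This is a sketch only: the presentation does not say how the analysis was implemented, and the function name, the `(element, value)` record layout, and the two-record sample (subjects and creator taken from slides 12 and 14) are assumptions for illustration.

```python
from collections import defaultdict


def element_frequencies(records):
    """Return {element: (total instances, unique values)} across all records."""
    totals = defaultdict(int)
    uniques = defaultdict(set)
    for record in records:
        for element, value in record:
            totals[element] += 1
            uniques[element].add(value)
    return {element: (totals[element], len(uniques[element])) for element in totals}


# Illustrative sample: creator and subject values from two of the slide records.
SAMPLE = [
    [
        ("creator", "Woolverton, Christopher J."),
        ("subject", "Cetylpyridinium"),
        ("subject", "Klebsiella pneumoniae"),
        ("subject", "Toxicity Tests"),
        ("subject", "Biological Assay"),
        ("subject", "Bacterial Toxins"),
    ],
    [
        ("creator", "Woolverton, Christopher J."),
        ("subject", "Escherichia coli"),
        ("subject", "Klebsiella pneumoniae"),
        ("subject", "Pseudomonas aeruginosa"),
        ("subject", "Biological Assay"),
        ("subject", "Bacterial Toxins"),
        ("subject", "Toxicity Tests"),
    ],
]
```

For this sample, `element_frequencies(SAMPLE)` reports 11 total and 7 unique subject values, and 2 total and 1 unique creator value.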

  18. DC Element Use Frequencies:

  19. DC Element Qualifier Use Frequencies

  20. Observations: • 11 of the 15 DC elements may be qualified: title, subject, description, date, type, format, identifier, source, language, relation, and coverage. • at least one qualifier was used for five of those elements: subject, date, type, language, relation

  21. Qualified vs. Unqualified Subject Element Frequencies

  22. Observations: • The "MeSH" (Medical Subject Headings, NLM 2002) qualifier was used with the subject element in 677 occurrences, an average frequency of occurrence of 3.351 • Twenty-six unique instances • "MeSH" was very useful in describing the organisms and microbiological techniques used in the laboratory • 166 occurrences (average frequency 0.822) • Unqualified subjects contained keywords generated by the persons entering the metadata • 154 occurrences describe different types of liquid crystals • MeSH general descriptor "Polymer" applied to liquid crystals
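
The average frequencies on this slide are consistent with dividing each occurrence count by the 202 DC records reported on slide 15. The short check below assumes that denominator; the slides do not state it explicitly, and the function name is made up for illustration.

```python
# Assumption: averages are occurrences per DC record, with 202 records in total.
RECORD_COUNT = 202


def average_frequency(occurrences, records=RECORD_COUNT):
    """Average occurrences per record, rounded to three decimals as on the slide."""
    return round(occurrences / records, 3)


mesh_average = average_frequency(677)   # subject occurrences with the MeSH qualifier
other_average = average_frequency(166)  # the slide's 166-occurrence figure
```

Under this assumption both values reproduce the slide: 677 / 202 rounds to 3.351 and 166 / 202 rounds to 0.822.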

  23. Functionality of the DC elements • Examine four specific types of metadata classes: • discovery, use, authentication and administration (Greenberg 2001) • aggregate content analysis element frequencies for each class • identify DC’s effectiveness regarding specific functions among data types

  24. DC Schema’s Ability to Sustain Required Information Functions

  25. Preliminary Conclusions: • Further database development needed • XML database • Small amount of data in the database • Enlarge lab participation • It is feasible to use DC to describe lab data. • many elements used once in each DC record • DC supports functions required in a lab setting. • each function represented reasonably well
