1 / 44

Data Exchange and Conversion Utilities and Tools (DExT)

Data Exchange and Conversion Utilities and Tools (DExT). Angad Bhat, Herve L‘Hours, Louise Corti UK Data Archive. An exchange format for qualitative data. Data exchange models and data conversion tools for primary research data collected in the course of qualitative research.

hop
Download Presentation

Data Exchange and Conversion Utilities and Tools (DExT)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Exchange and Conversion Utilities and Tools (DExT) Angad Bhat, Herve L‘Hours, Louise Corti UK Data Archive

  2. An exchange format for qualitative data • Data exchange models and data conversion tools for primary research data collected in the course of qualitative research. • A standard format for representing richly encoded qualitative data

  3. Applications of formats and standards for UKDA • Long-term preservation requirements (software and platform independent formats) • In-house toolsets for preparing qualitative data for multiple forms of dissemination • Enable ‘added-value’ data to be retained – software-specific functionality • Offers a standard for data creators to store and publish data in multiple formats eg common web-based publishing and search tools e.g ESDS Qualidata Online • More precise searching/browsing of archived qualitative data beyond the catalogue record • Facilitates annotated data exchange and data sharing across dispersed collections and repositories (comparative analysis and e-science) 3

  4. Needs and Advantages • Long-term preservation requirements • Vendor Specific Functionality • Analysis of Legacy Data • Vendor-Specific Markup via 3rd Party Tools • Metadata Enrichment of Resources • UKDA leading social science data archive. Only one archiving and sharing qualitative data on a national scale with a dedicated infrastructure

  5. DExT Project • JISC funded under Repositories Programme • Small budget for one year – proof of concept • Developing, refining and testing models for data exchange for qualitative research data based on METS schema - XML/RDF • Test data selected are from the social sciences (multimedia, linked, annotated data etc.), but these formats are typically found across all domains of primary research

  6. Which XML schema • The selected output format chosen for DExT is the Metadata Encoding and Transmission Standard (METS) which serves to both describe the structure and to package all the files relating to a study • METS – Metadata Encoding and Transmission Standard • is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language • The standard is maintained in the Network Development and MARC Standards Office of the Library of Congress, and is being developed as an initiative of the Digital Library Federation

  7. METS • Enables pointers to existing XML schema in use to describe a study, project, file, extract or say, annotation • Dublin Core • Text Encoding initiative (TEI) • Data Documentation Initiative (DDI) • QDIF • Triple S • Anything else relevant e.g ethno-methodological level annotation • METS Navigator may allow browsing of all objects through a standard web browser but an alternative object browser may need to be built

  8. Progress so far • Produced: • Comparison of relevant metadata/data schema • Overview and Use Case Analysis document • GUI Functional Specification for File Conversion & Metadata Enrichment • GUI front end • Meeting with software vendors tonight for feedback

  9. Data Constructs replicated in DExT • Identify Subsets of the study (e.g. Text or Line selections: Quotation/chunk concepts) • Assign Values to a Subset of a study (e.g. Keywords: Codes/nodes concept) • Create a Value Hierarchy (e.g. Keywords or Codes arranged in a coherent hierarchical structure: tree/hierarchy/SuperCodes concept) • Create a File Hierarchy/ file classification (e.g. Files arranged in a coherent hierarchical structure : family/texts/documents concept) • Assign Notes (e.g. Comments or Notes: Memos concepts)

  10. Overview of Implementation in METS • METS Profile • The DExT-METS standard will be fully documented as a METS profile providing detailed instructions on best practice. • The DExT-METS Generator GUI will also help in generating the DExT-METS file from scratch using the Manual process. • Metadata Enrichment • The DExT-METS Generator GUI will enable the extensive enrichment of the platform-independent versions of vendors markup with standards compliant XML. (TEI & DC) • Subsets in METS • METS start and end point (Audio and Video) • QDIF values (Text and Image)

  11. Overview of Implementation in METS • Structural Maps • These are used to split a study in any way; the usual example is by chapter and page. Each split is identified by a <div> tag. • CAQDAS: are constructed for Values, Value Hierarchies and File Hierarchies • Logical and Physical: Logical (by section) and Physical (file by file) Structural maps provide a mechanism by which 3rd party programs can access the whole of the original study as well as the vendor-specific markup. • Content Packaging • In addition to a DExT-METS version of the core data concepts the METS file (METS File Section) may also retain : • Original files from the study • Any rtf format versions created for analysis • Original vendor-specific xml file describing the resource • Any report output from the vendors program • Any supporting documentation, notes or content delivered with the study but not part of the core deliverables

  12. Technical Approach • Relational or XML indexing back end • Session-based access to studies • Online access to conversion tools • Batch processing of studies • Collaboration on development of tools (via SourceForge)

  13. Feedback • How suitable is the basic model • Level of buy-in • Cost implications of export tools to DExT-METS • 3rd party creation of tools to transform vendor XML output to DExT-METS? Conversion tools must be open source • Mechanism for feedback on model and technical possibilities • Next step throw open to METS community, repositories and sister data archives

  14. Compliant DExT-METS dext-mets.xml

  15. Identifying Subsets from the study (Quotation/chunk concepts)

  16. Assign Values to Subsets (Codes/nodes concept)

  17. Create a value hierarchy (tree/hierarchy/SuperCodes concept)

  18. Create a file hierarchy (family/texts/documents concept)

  19. CAQDAS Maps Back

  20. Audio/Video Subsets Back

  21. Text Subsets Back

  22. Image Subsets Back

  23. DExT-METS Generator GUI Next

  24. Atlas.ti conversion to DExT-METS Next

  25. QDA Miner conversion to DExT-METS Back

  26. Text Encoding Initiative for METS Next

  27. File Description Elements and Attributes Next

  28. Encoding Description Elements and Attributes Next

  29. Profile Description Elements and Attributes Next

  30. Revision Description Elements and Attributes Back

  31. Dublin Core for METS Back

  32. METS File Section Next

  33. Text Files Group Next

  34. XML Files Group Back

  35. Manual DExT-METS Process Steps Next

  36. Manual DExT-METS Process Steps Back

  37. DExT-METS Output // Namespaces for DExT-METS File // METS Header for DExT-METS File

  38. DExT-METS Output // TEI Header for DExT-METS File

  39. DExT-METS Output // TEI Header for DExT-METS File // DC Elements for DExT-METS File

  40. DExT-METS Output // QDIF Elements for DExT-METS File

  41. DExT-METS Output // File Section for DExT-METS File

  42. DExT-METS Output // CAQDAS Maps for DExT-METS File : Codes-Quotations

  43. DExT-METS Output // CAQDAS Maps for DExT-METS File : SuperCodes // CAQDAS Maps for DExT-METS File : SuperCodes

More Related