Working Group: DFT - some use cases - Peter Wittenburg, Raphael Ritz

Presentation Transcript


  1. Working Group: DFT - some use cases - Peter Wittenburg, Raphael Ritz

  2. Use Case: CLARIN Collection Builder
     Building collections across repositories is essential for running analyses. PhD students, for example, create their own collections; each consists of a metadata (MD) object that has a PID and stores many PIDs referring to its components. Such a collection is an aggregation, but it has its own identity, can be cited, etc.
     [Diagram labels: collection builder; PID; data; MD; MD+; repository 1; repository 2]
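The collection described in slide 2 (an MD object with its own PID that stores the PIDs of its components) could be modelled roughly as follows. This is only a minimal sketch: the class name `Collection`, the helper `group_by_repository`, and the PID strings are hypothetical, and the convention that the part of a PID before the first "/" identifies a repository is an assumption made for illustration, not something the slides state.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Collection:
    """A collection: an MD object with its own PID that lists member PIDs."""
    pid: str                                   # identity of the collection itself (citable)
    title: str
    member_pids: List[str] = field(default_factory=list)

    def add(self, pid: str) -> None:
        # Store only the reference; the data stays in its home repository.
        if pid not in self.member_pids:
            self.member_pids.append(pid)


def group_by_repository(collection: Collection) -> Dict[str, List[str]]:
    """Group member PIDs by prefix (assumed here to identify the repository)."""
    groups: Dict[str, List[str]] = {}
    for pid in collection.member_pids:
        prefix = pid.split("/", 1)[0]
        groups.setdefault(prefix, []).append(pid)
    return groups


# Hypothetical PIDs spanning two repositories.
thesis = Collection(pid="repo1/coll-42", title="PhD corpus")
for member in ["repo1/obj-a", "repo2/obj-b", "repo1/obj-c", "repo2/obj-b"]:
    thesis.add(member)
```

The collection holds references only, so it can span repositories without copying data, while its own PID gives it a citable identity.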

  3. Use Case: Replication in EUDAT Data Domain
     Replicating data from different types of data organizations in different communities, i.e. creating a coherent domain of data, is not trivial.
     [Diagram labels: community 1; community 2; community 3; data; MD; MD+; PID; EUDAT data domain]

  4. Use Case: EUDAT Data Domain
     PID information types are urgently needed:
     • cksm
     • cksm type
     • data_URL+
     • metadata_URL+
     • ROR_flag
     • mutability_flag
     • access_rights_store
     • etc.
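To make the PID information types listed in slide 4 more concrete, here is a hedged sketch of a record carrying them, plus a checksum check against a replica. The class `PidRecord` and the function `checksum_matches` are hypothetical names; reading the trailing "+" on `data_URL+` and `metadata_URL+` as "one or more values", treating the two flags as booleans, and using a `hashlib` algorithm name as the checksum type are all assumptions, since the slides do not define them.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class PidRecord:
    cksm: str                  # checksum of the data object (hex digest assumed)
    cksm_type: str             # checksum algorithm, e.g. "sha256" (assumed naming)
    data_urls: List[str]       # data_URL+ : assumed to mean one or more locations
    metadata_urls: List[str]   # metadata_URL+ : assumed to mean one or more locations
    ror_flag: bool             # ROR_flag (boolean type is an assumption)
    mutability_flag: bool      # mutability_flag (boolean type is an assumption)
    access_rights_store: str   # where access rights are kept (assumed to be a reference)


def checksum_matches(record: PidRecord, payload: bytes) -> bool:
    """Recompute the checksum of a replica and compare it with the PID record."""
    digest = hashlib.new(record.cksm_type, payload).hexdigest()
    return digest == record.cksm.lower()


# Hypothetical record for a small payload.
payload = b"example data object"
record = PidRecord(
    cksm=hashlib.sha256(payload).hexdigest(),
    cksm_type="sha256",
    data_urls=["https://repo.example.org/data/1"],
    metadata_urls=["https://repo.example.org/md/1"],
    ror_flag=True,
    mutability_flag=False,
    access_rights_store="https://repo.example.org/acl/1",
)
```

Storing the checksum and its type with the PID lets any site holding a replica verify its copy without contacting the original repository.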

  5. Use Case: Curate Gappy Sensor Data in Seismology
     The data are real-time (limited delay), gappy (the referred data has gaps), and dynamic (gaps are filled in at random times).
     [Diagram labels: sensing: p(t-5), p(t-4), p(t-3), p(t-2), p(t-1), p(t); real-time: p(t-5), p(t-3), p(t-1), p(t); receiving: p(t-4), p(t-2); packaging; completing: window (t1,k), window (t1,k+1); analyzing, referring: window (t2,k), window (t1,k+3)]
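The behaviour described in slide 5 (samples that arrive in real time with gaps, and gaps that are filled later) can be illustrated with a small window object that tracks which time steps are still missing. This is a sketch under stated assumptions: the class `Window`, its integer time steps, and the sample values are hypothetical and do not reflect how any seismology system packages its data.

```python
from typing import Dict, List


class Window:
    """A fixed span of time steps that may be filled out of order."""

    def __init__(self, start: int, length: int) -> None:
        self.start = start
        self.length = length
        self.samples: Dict[int, float] = {}

    def receive(self, t: int, value: float) -> bool:
        """Store a sample if it belongs to this window; late arrivals are fine."""
        if self.start <= t < self.start + self.length:
            self.samples[t] = value
            return True
        return False

    def gaps(self) -> List[int]:
        """Time steps in the window for which no sample has arrived yet."""
        return [t for t in range(self.start, self.start + self.length)
                if t not in self.samples]

    def is_complete(self) -> bool:
        return not self.gaps()


# Real-time delivery: some samples are missing at first.
window = Window(start=0, length=6)
for t in (0, 2, 4, 5):
    window.receive(t, float(t))
gaps_before = window.gaps()

# Delayed samples arrive later and fill the gaps.
for t in (3, 1):
    window.receive(t, float(t))
gaps_after = window.gaps()
```

Because the content of a window changes as gaps are filled, anything that refers to the window has to cope with data that is dynamic rather than fixed at creation time.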

  6. Use Case: Crowd Sourcing Data Management
     • Processing: buffering, chunking, annotating
     • Intermediate store: stream data, metadata, database, annotations
     • Curation/Structuring (with the PID system):
       - create proper MD with all relations
       - register PIDs for all objects
     • Permanent store: stream data, metadata, annotations
     • Analysis

  7. Use Case: Language Technology Workflows
     Processing modules are service objects.
     [Diagram labels: collection (three times); processing module k; processing module l; processing module m; metadata; metadata k; metadata l; PID system]

  8. Use Case: Language Technology Workflows
     Steps of processing module x:
     1. read collection MD
     2. read MD
     3. interpret MD & get PID
     4. get data
     5. process data & create new data
     6. register PID
     7. create MD*
     8. update collection
     9. if more items remain in the collection, go to step 2
     10. end
     Typical processing modules are:
     • tokenize texts
     • tag part of speech in texts
     • parse (shallow) texts
     • recognize named entities in texts
     • filter speech
     • detect voiced segments
     • detect speakers of segments
     • do rough classification of segments
     • etc.
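The processing-module loop from slide 8 can be sketched as a single function. This is a minimal illustration, not the workflow system itself: the in-memory dictionaries standing in for the metadata store, the data store and the PID system, the function name `run_module`, the metadata keys, and the PID naming scheme are all hypothetical.

```python
import itertools
from typing import Any, Callable, Dict, List


def run_module(collection: List[str],
               metadata: Dict[str, dict],
               store: Dict[str, Any],
               process: Callable[[Any], Any],
               module_name: str) -> List[str]:
    """Apply one processing module to every member of a collection."""
    counter = itertools.count(1)
    created: List[str] = []
    for md_pid in list(collection):            # read collection MD (snapshot of members)
        md = metadata[md_pid]                  # read MD
        data_pid = md["data_pid"]              # interpret MD & get PID
        data = store[data_pid]                 # get data
        result = process(data)                 # process data & create new data
        new_pid = f"example/{module_name}-{next(counter)}"   # register PID (made-up scheme)
        store[new_pid] = result
        new_md_pid = new_pid + "-md"
        metadata[new_md_pid] = {               # create MD*
            "data_pid": new_pid,
            "derived_from": data_pid,
            "module": module_name,
        }
        collection.append(new_md_pid)          # update collection
        created.append(new_md_pid)
    return created                             # no more members: end


# Hypothetical one-member collection processed by a whitespace tokenizer.
collection = ["example/md-1"]
metadata = {"example/md-1": {"data_pid": "example/data-1"}}
store = {"example/data-1": "some use cases"}
created = run_module(collection, metadata, store, str.split, "tokenize")
```

Iterating over a snapshot of the collection keeps the loop from reprocessing the objects it has just added; recording `derived_from` in the new MD is one possible way to keep the relation to the source object.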