COLLATE – Collaboratory for Annotation, Indexing and Retrieval of Digitized Historical Archive Material
DELOS International Cooperation Workshop, May 30, 2003
Ingo Frommholz
Fraunhofer IPSI, Darmstadt
frommholz@ipsi.fraunhofer.de
http://ipsi.fraunhofer.de/
Digital Libraries in Cultural Heritage
• Valuable historic document collections exist, but are scattered in national archives
• Sources mostly not available online
• Difficult-to-use database & referencing systems
• Lack of content-based indexing & access
• Valuable expert domain knowledge exists, but is mostly inaccessible to externals
• Tacit knowledge, insufficiently documented
• Professional communities lack technology support for collaborative knowledge working
DELOS International Cooperation Workshop Prague, May 30, 2003
The COLLATE Project (IST-1999-20882)
• Constructing a “Collaborative Information Space”
• Preserve historic documents in a distributed multimedia repository
• European historic film documentation (1920s and 1930s)
• Historic film censorship (legal docs, applications & decisions, correspondence, etc.), press material (articles), photos (stills, portraits) & film posters, digital film/video fragments
• XML metadata (cataloguing & content indexing)
• Ensure accessibility
• Work environment for content indexing & annotation
• Content- and context-based retrieval
• Evaluate acceptability
• Preservation case studies by film experts
• Empirical studies of real-life user behavior
Partners
Content providers / pilot users:
• Deutsches Filminstitut – DIF, Frankfurt, Germany
• Filmarchiv Austria, Vienna, Austria
• Národní Filmový Archiv, Prague, Czechia
Technology developers:
• Fraunhofer IPSI, Darmstadt, Germany
• University of Bari, LACAM Lab, Bari, Italy
• Sword ICT S.r.l., Bari, Italy
Evaluation partner:
• Risø National Laboratory, Systems Analysis Dept, Denmark
Why a Cultural Collaboratory?
• Support existing work processes in cultural sciences
• Interpretative content analysis of documents
• Reconstruct “unity” of cultural phenomena, interlinking scattered knowledge sources
• Offer new knowledge working environment
• Organize collaborative work
• Bring together divergent user communities & roles
• Create enhanced cultural information services
• Raise awareness & visibility of cultural archives
Censorship / Registration Cards
Newspaper Articles
Conceptual Integration
COLLATE Ontology: Collate Entity
• Generic Level (ABC Model): Location, Temporality, Abstraction, Actuality
• Cultural Heritage Domain Level (CIDOC CRM, FRBR): Manifestation, Work, Situation, Action, Event, Agent
• Film Archive Subdomain Level (LC TGM II, FIAF Classification): Form, Genre, Physical Characteristics, Film Situation, Moving Image, Film Event, Film Activity, Film Agent
• COLLATE Application Level (Collate Keywords): Film Censorship Event, Film Censorship Activity, Film Censorship Agent, Censorship Document, Film- and Censorship Topic
Model of the Concept “Film Life Cycle”
Diagram nodes: censorshipdecision x, Directing x, shortedversion x, censorship x, filmcreation x, originalversion x, Filmcopy B x, Filmcopy A x, Work x
Diagram relations: hasParticipant, hasAction, precedes, follows, hasResult, involves, realizesWork
System Architecture (OAIS)
Collaboration in COLLATE
Diagram labels: external vs. internal (COLLATE system); traditional (meetings, phone, mail, email) vs. computer-supported/online (discussion forum); implicit vs. explicit; specified relation types; communication (e.g. requests) about annotation, interlinking, terminology development, cataloguing, indexing
Discourse Structures
Diagram: a thread of Annotation 1 to Annotation 5
• “Discourses represent extended communication between two or more participants in a shared context.” (Rich & Sidner, 1998)
• Establishing a discourse context
• Modeling discourse as interrelated nested annotations
• Annotation thread reflects scientific discourse
• Typed links (DSR) between
• Document and annotation
• Annotation of annotations
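The idea of nested annotations with typed links can be illustrated with a small data structure. This is only a sketch under assumptions: the class names `Document` and `Annotation`, their fields, and the helper `walk` are hypothetical and are not taken from the COLLATE system. The relation names are the ones that appear on the slides.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Annotation:
    """One node in an annotation thread (hypothetical model)."""
    author: str
    text: str
    relation: str  # discourse structure relation to the parent node
    replies: List["Annotation"] = field(default_factory=list)

    def reply(self, author: str, text: str, relation: str) -> "Annotation":
        # An annotation of an annotation, linked by a typed relation.
        child = Annotation(author, text, relation)
        self.replies.append(child)
        return child


@dataclass
class Document:
    """A digitized document that roots an annotation thread."""
    title: str
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, author: str, text: str, relation: str) -> Annotation:
        # A typed link between the document and a first-level annotation.
        note = Annotation(author, text, relation)
        self.annotations.append(note)
        return note


def walk(nodes: List[Annotation], depth: int = 1) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, relation, author) for every annotation, depth-first."""
    for node in nodes:
        yield depth, node.relation, node.author
        yield from walk(node.replies, depth + 1)


doc = Document("Kuhle Wampe censorship decision")
first = doc.annotate("expert A", "I see a political background.", "interpretation")
first.reply("expert B", "I disagree.", "counterargument")

for depth, relation, author in walk(doc.annotations):
    print("  " * depth + f"{relation} by {author}")
```

Keeping the relation on the child node means every link in the thread carries exactly one type, which is what a later analysis of the discourse would need to read.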
Communication Acts: Discourse Structure Relations
Diagram labels: elaboration, analogy, comparison, difference, cause, interpersonal, background information, interpretation, support argument, argumentation, counterargument
Semantic Web Integration – COLLATE RDF(S)
Document Retrieval in COLLATE
For a query q, a ranking of documents is returned. To produce it, a retrieval weight r is calculated for each document. Documents are ranked by descending retrieval weight. The retrieval is based on the document’s metadata (given by film scientists or extracted from the digitized documents) and on the annotation thread.
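The ranking step described above can be sketched as follows. The slides do not say how the retrieval weight is computed, so everything about the scoring here is an assumption: `term_overlap` is a toy stand-in for a real weighting scheme, and `annotation_share` is a hypothetical parameter for mixing metadata and annotation evidence.

```python
def term_overlap(query: str, text: str) -> float:
    """Fraction of query terms found in text (toy stand-in, an assumption)."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def retrieval_weight(query: str, doc: dict, annotation_share: float = 0.5) -> float:
    """Combine metadata evidence and annotation-thread evidence into one weight."""
    meta = term_overlap(query, doc["metadata"])
    # Use the best-matching annotation in the thread; 0.0 if there is none.
    thread = max((term_overlap(query, a) for a in doc["annotations"]), default=0.0)
    return (1 - annotation_share) * meta + annotation_share * thread


def rank(query: str, docs: list) -> list:
    """Return (weight, document id) pairs sorted by descending weight."""
    scored = [(retrieval_weight(query, d), d["id"]) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


docs = [
    {"id": "d1", "metadata": "obscene actions",
     "annotations": ["political background as the main reason"]},
    {"id": "d2", "metadata": "political censorship", "annotations": []},
]
for weight, doc_id in rank("political censorship", docs):
    print(doc_id, weight)
```

In this toy setup a document whose metadata does not match the query can still receive a non-zero weight through its annotations, which is the point the slide makes about using the annotation thread.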
Context-based Retrieval in COLLATE
In COLLATE, we deal with the discourse context. A document is seen in the light of its interpretations. We also consider at which point of the discourse a statement is made and what relation exists between the statement and the entity it refers to. Example: Consider the query for “all censorship decisions made for political reasons”.
Query: “censorship decisions for political reasons”: Metadata Only
Weight shown: 0.01
Document
• Cataloguing: ... <filmtitle> Kuhle Wampe </filmtitle> ... <assessors_chairman> Oberregierungsrat Dr Becker Beisitzer: Justizrat Dr. Rosenthal... </assessors_chairman> ...
• Keyword: ... <controlled_keyword> obscene actions </controlled_keyword> ...
Interpretation: I think the reasons mentioned here are not the real reasons. I see a political background as the main reason.
Counterargument: I disagree. There were a lot of similar decisions with the same argumentation. Of course, there might be a political background, but I think this is not the main reason in this case.
Query: “censorship decisions for political reasons”: Metadata + Interpretation
Weight shown: 0.32
Document
• Cataloguing: ... <filmtitle> Kuhle Wampe </filmtitle> ... <assessors_chairman> Oberregierungsrat Dr Becker Beisitzer: Justizrat Dr. Rosenthal... </assessors_chairman> ...
• Keyword: ... <controlled_keyword> obscene actions </controlled_keyword> ...
Interpretation: I think the reasons mentioned here are not the real reasons. I see a political background as the main reason.
Counterargument: I disagree. There were a lot of similar decisions with the same argumentation. Of course, there might be a political background, but I think this is not the main reason in this case.
Query: “censorship decisions for political reasons”: Analysis of Discourse Structure Relations
Weight shown: 0.19
Document
• Cataloguing: ... <filmtitle> Kuhle Wampe </filmtitle> ... <assessors_chairman> Oberregierungsrat Dr Becker Beisitzer: Justizrat Dr. Rosenthal... </assessors_chairman> ...
• Keyword: ... <controlled_keyword> obscene actions </controlled_keyword> ...
Interpretation: I think the reasons mentioned here are not the real reasons. I see a political background as the main reason.
Counterargument: I disagree. There were a lot of similar decisions with the same argumentation. Of course, there might be a political background, but I think this is not the main reason in this case.
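The three query slides show the weight rising once the interpretation is considered and dropping again once the counterargument is taken into account. The following is a minimal sketch of that qualitative behaviour, not the COLLATE retrieval model: the function names, the `match` scores, the sign table, and the `damping` parameter are all assumptions, and the numbers do not reproduce the weights on the slides.

```python
# Hypothetical effect of a reply's relation on the annotation it refers to.
RELATION_SIGN = {
    "interpretation": 1.0,
    "support argument": 1.0,
    "counterargument": -1.0,
}


def evidence(annotation: dict, damping: float = 0.5) -> float:
    """Query evidence of an annotation, adjusted by the replies it received."""
    value = annotation["match"]  # assumed match score in [0, 1]
    for reply in annotation.get("replies", []):
        sign = RELATION_SIGN.get(reply["relation"], 0.0)
        # Supporting replies raise the evidence, counterarguments lower it.
        value += sign * damping * evidence(reply, damping)
    return min(1.0, max(0.0, value))


def document_weight(metadata_match: float, annotations: list,
                    damping: float = 0.5) -> float:
    """Take the stronger of metadata evidence and annotation-thread evidence."""
    best = max((evidence(a, damping) for a in annotations), default=0.0)
    return max(metadata_match, best)


interpretation_only = {"relation": "interpretation", "match": 0.75}
with_counterargument = {
    "relation": "interpretation",
    "match": 0.75,
    "replies": [{"relation": "counterargument", "match": 0.5}],
}

print(document_weight(0.0, []))                      # metadata only
print(document_weight(0.0, [interpretation_only]))   # metadata + interpretation
print(document_weight(0.0, [with_counterargument]))  # with relation analysis
```

The recursion lets a reply to a counterargument weaken the counterargument in turn, so the position of a statement in the thread affects the final weight.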
COLLATE – User Interface
Current State
• A first prototype was delivered to the archives and is in use there.
• A second prototype will be delivered soon, introducing discourse structure relations and advanced collaboration features to the users.
• A third prototype will contain context-based retrieval.
Outlook
• Evaluate collaborative approach and context-based retrieval
• Apply COLLATE technology in other domains?
More information? http://www.collate.de