Two French Projects on Analysis of Cultural Heritage Documents. NAVIDOMASS: NAVigation In DOcument MASSes. MADONNE: MAsses de DONnées issues de la Numérisation du patrimoiNE. Project Leader: Jean-Marc OGIER, L3i Laboratory, La Rochelle University
Two French Projects on Analysis of Cultural Heritage Documents
NAVIDOMASS: NAVigation In DOcument MASSes
MADONNE: MAsses de DONnées issues de la Numérisation du patrimoiNE (masses of data from the digitization of heritage)
Project Leader: Jean-Marc OGIER, L3i Laboratory, La Rochelle University
Tel: 0033 5 46 45 82 15, jean-marc.ogier@univ-lr.fr
Mathieu Delalandre (CVC)
IDoc Meeting, Valencia (Spain), 22nd February 2007

Introduction
Scope of projects …
Cultural heritage documents represent a very large mass of data. The MADONNE and NaviDoMass projects develop document analysis systems for indexing and browsing this mass of data.
Diagram labels: Cultural Document Images; Processing; GUI; Strategy, Model, System; High-Level Meta-Data of images; Structured and Indexed Information
Calendar … (timeline in years: 2003, 2004, 2005, 2006, 2007, 2008, 2009)
• MADONNE (MAsses de DONnées issues de la Numérisation du patrimoiNE): French ANR program “Masse de données”; length 36 months; funding 110 000 €
• NAVIDOMASS (NAVigation In DOcument MASSes): French ANR program “Masse de données et connaissances”; length 36 months; funding 550 000 €

Consortium
• Centre de Recherche en Informatique de Paris 5 (Paris)
• Institut de Recherche en Informatique et Systèmes Aléatoires (Rennes)
• Laboratoire Lorrain de Recherche en Informatique et ses Applications (Nancy)
• Laboratoire d'Informatique de Traitement de l'Information (Rouen)
• Laboratoire d’informatique image et interaction (La Rochelle)
• Laboratoire Informatique (Tours)
• Centre d’Etude Supérieures de la Renaissance (Tours)
• Laboratoire d'InfoRmatique en Image et Systèmes d'information (Lyon)
Chart labels: 55 Project Members; 20 Project Partners; Permanent; over the last 3 years

Overview
Collection Modelling [Journet’06]
• Old printed books
• Features shown: text density, graphic density, directional rose

Overview
Document Layout [Ramel’05]
• 10 000 pages of old printed books
• Block segmentation into footnote, text zone, dropcap, figure, ..
• Steps shown: background analysis, foreground analysis, merging

Overview
Document Layout and Retrieval [Couasnon’05]
• 60 000 forms of the 19th century
• (1) Line extraction based on a Kalman filter
• (2) Positioning grammar to correct and build cells from the extracted lines
• Grapheme-based signature for handwritten patronymic retrieval
Interface labels: Query Text Field; Segmented Cells; Form viewer; Retrieved patronymic “access to form”
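
The slide names a Kalman filter for line extraction but does not specify it. As a minimal sketch of the general idea, and not the method of [Couasnon’05], the Python below assumes a scalar random-walk state model that follows the vertical position of one ruling line column by column and bridges columns where the ink is missing. The names kalman_track, process_var and meas_var are hypothetical and chosen for illustration only.

```python
def kalman_track(measurements, process_var=1e-3, meas_var=1.0):
    """Track the y-position of one candidate line across image columns.

    measurements: observed y-position per column; None marks a column
    where the line is broken and nothing was observed.
    Assumption: scalar random-walk state (the line drifts slowly).
    """
    estimate = None   # current estimate of the line's y-position
    variance = None   # uncertainty of that estimate
    track = []
    for z in measurements:
        if estimate is None:
            # Not initialised yet: wait for the first real observation.
            if z is None:
                track.append(None)
                continue
            estimate, variance = float(z), meas_var
            track.append(estimate)
            continue
        # Predict: the position is assumed unchanged, uncertainty grows.
        variance += process_var
        if z is not None:
            # Update: blend prediction and observation via the Kalman gain.
            gain = variance / (variance + meas_var)
            estimate += gain * (z - estimate)
            variance *= (1.0 - gain)
        # With no observation, the prediction alone bridges the gap.
        track.append(estimate)
    return track


if __name__ == "__main__":
    # A nearly horizontal line with a gap in the third column.
    print(kalman_track([10, 10, None, 10, 10]))
```

A real system would likely track slope and thickness as well, which needs a vector state; the scalar version only keeps the predict/update cycle easy to read.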

Dropcap Retrieval [Parreti’05] [Uttama’05] [Delalandre’06] [Salmon’05]
• 10 000 dropcap images
• Four retrieval tasks: style retrieval, structure retrieval, printing retrieval, letter retrieval
Figure labels: frequency; pattern rank; textures; MST; image; query; compactness; RLE; accuracy; combination of shape descriptors; image; capital letter

Overview
Document Layout [Nicolas’06]
• Handwritten pages of the 19th century
• Segmentation based on Markov Random Field
Region labels: text; erasure; interline
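
The dropcap retrieval slide lists RLE among its figure labels without further detail. Assuming RLE stands for run-length encoding, a common compact representation of binary document images, the sketch below encodes one row of pixels as (value, length) runs and decodes it back. The function names are hypothetical, and nothing here is taken from the cited works.

```python
def run_length_encode(row):
    """Encode a sequence of pixel values as (value, run_length) pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([pixel, 1])   # start a new run
    return [(value, length) for value, length in runs]


def run_length_decode(runs):
    """Rebuild the original pixel sequence from (value, run_length) pairs."""
    row = []
    for value, length in runs:
        row.extend([value] * length)
    return row


if __name__ == "__main__":
    row = [0, 0, 1, 1, 1, 0]          # 0 = background, 1 = ink
    runs = run_length_encode(row)
    print(runs)
    assert run_length_decode(runs) == row
```

Long uniform runs are typical of binarized pages, which is why such an encoding tends to be much shorter than the raw pixel row.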

Perspectives
NaviDoMass started in November 2007 …
5 Work Packages (WP):
• Document layout analysis and structure-based indexing
• Information spotting
• Structuring the feature space
• User needs, participative design and groundtruthing
• Interactive extraction and relevance feedback
Legend: WP related to MADONNE; new topics

Conclusion
Results
• Consortium: 8 laboratories, 55 members
• 76 publications: http://l3iexp.univ-lr.fr/madonne/publications.html
• 33 software packages: http://l3iexp.univ-lr.fr/madonne/ressources.html
• Project renewal: NaviDoMass