WP1: Segmentation and Redundancy Analysis

WP1.1, WP1.2, WP1.3
Images, Users
- Layout AGORA
- Segmentation of characters, words, lines, punctuation, accents, …
- Clustering techniques

WP1.1.1, WP1.3.1
- Physical layout (ALTO): text, graphics, lines, words, characters
- Redundancy (XML): clusters (1 XML file per cluster, Bbox) + page statistics

WP2: Segmentation and Redundancy Exploitation

WP2.1, WP2.4, WP2.2, WP2.3
- Cluster revision & typo analysis
- Visualization, interaction RETRO
- Cluster recognition (OCROpus, …)
- Spotting
- Knowledge (dictionary, lexicon, …)

WP2.5
- Transcription, indexation, meta-data (TEI, ALTO, …)
- Learning dataset & fonts (XML, images, …)
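
The WP1 redundancy output (one XML file per cluster with bounding boxes, plus page statistics) could be serialized roughly as in the sketch below. The slides do not specify a schema, so the element and attribute names (`cluster`, `occurrence`, `page`, `x`, `y`, `width`, `height`), the file naming, and the choice of "occurrences per page" as the page statistic are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def cluster_to_xml(cluster_id, occurrences):
    """Build an XML tree for one cluster.

    occurrences: list of (page, (x, y, width, height)) tuples.
    All element and attribute names are hypothetical, not a project schema.
    """
    root = ET.Element("cluster", id=str(cluster_id), size=str(len(occurrences)))
    for page, (x, y, w, h) in occurrences:
        ET.SubElement(
            root, "occurrence",
            page=str(page), x=str(x), y=str(y), width=str(w), height=str(h),
        )
    return ET.ElementTree(root)


def page_statistics(clusters):
    """Count clustered occurrences per page (an assumed, minimal statistic)."""
    counts = Counter()
    for occurrences in clusters.values():
        for page, _bbox in occurrences:
            counts[page] += 1
    return dict(counts)


def write_clusters(clusters, out_dir):
    """Write one XML file per cluster and return the list of written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for cluster_id, occurrences in clusters.items():
        path = out / f"cluster_{cluster_id}.xml"
        cluster_to_xml(cluster_id, occurrences).write(
            str(path), encoding="utf-8", xml_declaration=True
        )
        paths.append(path)
    return paths
```

Keeping each cluster in its own file means a cluster can be revised or re-recognized in WP2 without rewriting the others; whether the project lays its files out this way is not stated in the slides.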