Graphics Recognition – from Re-engineering to Retrieval Karl Tombre, Bart Lamiroy LORIA, France
Document Analysis in the IR era • Information is at the core of industrial strategies • A lot of digital or digitized information, but often in very “poor” formats • The challenge: not necessarily re-engineering of documents, but enriching poorly structured information, adding a (limited) amount of semantics, building indexes • Purposes: browsing, navigation, indexing • DAR methods and tools are useful, but must be adapted
Specific challenges of large-scale IR applications • Genericity: we cannot necessarily build a complete and exhaustive a priori model of contextual knowledge (ontology) • Adaptability: various input data – scanned paper, PDF, DXF, HTML, GIF… – various resolutions • Robustness: “back-office” applications • Efficiency: online searching in heterogeneous data • Scaling: methods have to scale to increasing number of symbols/features
DAR and IR • Media without (or with very little) contextual knowledge • Image-based indexing and retrieval, indexing of video sequences • Documents do explicitly convey information from one person to another person • Much more structure, syntax and semantics
DAR and IR – some examples • Indexing and/or searching scanned text without OCR • Similarities, signatures • Query or index on layout structure • Table spotting • Keyword spotting • …
What about Graphics Recognition? • Subfield of DAR, for graphics-rich documents • Numerous methods for various analysis and recognition problems • Raster-to-vector conversion • Text/graphics separation • Symbol recognition • Many specific technical areas: maps, architectural drawings, engineering drawings, diagrams and schematics, …
Graphics recognition methods • Text/graphics separation
Graphics recognition methods • Vectorization
Graphics recognition and IR applications • Usual text-based indexing and retrieval still useful • But need for access to other kinds of information: • Symbols • Text-drawing connections • Description-illustration connections
Some contributions • Syeda-Mahmood – maintenance drawings IEEE Trans. on PAMI 21(8):737-751, Aug. 1999
Some contributions • Arias et al., Najman et al. – use of information contained in legend / title block Proc. GREC’01, Kingston (Ontario, Canada), pp. 19-26, Sept. 2001
Some contributions • Samet & Soffer – symbols from legend IEEE Trans. on PAMI 18(8):783-798, Aug. 1996
Some contributions • Müller & Rigoll – graphical retrieval in database of engineering drawings Proc. ICDAR’99, Bangalore (India), pp. 697-700, Sept. 1999
Some contributions • Boose et al. (Boeing) – Generation of Layered Illustrated Parts Drawings Proc. GREC’03, Barcelona, pp. 139-144
Or even better… Wishful thinking? [Figure: Symbol DB]
Symbol recognition Before we move on: 1st contest on symbol recognition held last week See IAPR TC10 homepage for further details • Natural features for indexing and retrieval • Most methods work with known databases of reference symbols – what about interactive querying of arbitrary symbols? • From segmentation followed by recognition, to segmentation-free recognition, or segmenting while recognizing • Scalability • Efficiency / complexity • Discrimination power • Signatures
Image-based signatures • Compute invariant signatures on binary document image • F-signatures (ICDAR’01) • Radon transform: R-signatures [Tabbone & Wendling] • Ridgelets [Ramos Terrades & Valveny – GREC’03] – aka wavelet transform of Radon transform
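To make the idea of an image-based signature concrete, here is a minimal sketch of an R-signature computed on a small binary image. It assumes the signature is obtained, for each angle, by summing the squared Radon projection over the offset ρ; the nearest-bin projection, the normalisation, and the names `radon_projection` and `r_signature` are illustrative choices, not taken from the slides.

```python
import math

def radon_projection(image, theta, n_bins):
    """Project the foreground pixels of a binary image onto the axis at angle theta."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = math.hypot(cx, cy) + 0.5          # half-length of the projection axis
    c, s = math.cos(theta), math.sin(theta)
    bins = [0.0] * n_bins
    for y in range(h):
        for x in range(w):
            if image[y][x]:
                rho = (x - cx) * c + (y - cy) * s   # signed offset from the centre
                idx = int((rho + radius) / (2 * radius) * n_bins)
                bins[min(idx, n_bins - 1)] += 1.0
    return bins

def r_signature(image, n_angles=36, n_bins=33):
    """For each angle, sum the squared projection over rho, then normalise to sum 1."""
    sig = []
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        proj = radon_projection(image, theta, n_bins)
        sig.append(sum(v * v for v in proj))
    total = sum(sig)
    return [v / total for v in sig] if total else sig

# A 9x9 image containing one horizontal stroke.
image = [[1 if y == 4 else 0 for x in range(9)] for y in range(9)]
signature = r_signature(image)
```

For the horizontal stroke, the signature peaks at the angle whose projection axis is vertical, because all pixels then fall into the same bin. Production code would normally interpolate between bins rather than use nearest-bin accumulation.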
R-signatures Detection of arrowheads [Girardeau & Tabbone] DEA degree thesis, INPL, Nancy, Jul. 2002
R-signatures Another example [Girardeau & Tabbone]
Ridgelets [Ramos Terrades & Valveny – GREC’03] Proc. GREC’03, Barcelona, pp. 202-211
Vector-based signatures [Dosch & Lladós – GREC’03] • Based on set of basic graphical features: • Parallelism • Overlap • Collinearity • T- and V-junctions • Quality factor associated with the various relations • Match signatures of reference symbols with signatures of buckets
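As a rough illustration of a signature built from relations between vectors, the sketch below counts parallel and collinear pairs among line segments. It is a simplified stand-in, not the method of Dosch & Lladós: overlap, junctions and quality factors are left out, and the tolerances and function names are hypothetical.

```python
import math
from itertools import combinations

def direction(seg):
    """Undirected orientation of a segment, in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi

def angle_gap(a, b):
    """Smallest angle between the orientations of two segments."""
    d = abs(direction(a) - direction(b))
    return min(d, math.pi - d)

def point_line_distance(p, seg):
    """Distance from point p to the infinite line supporting seg."""
    (x1, y1), (x2, y2) = seg
    length = math.hypot(x2 - x1, y2 - y1)
    return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / length

def relation_signature(segments, angle_tol=math.radians(5), dist_tol=1.0):
    """Count parallel and collinear pairs; collinear pairs are not counted as parallel."""
    counts = {"parallel": 0, "collinear": 0}
    for a, b in combinations(segments, 2):
        if angle_gap(a, b) > angle_tol:
            continue
        if max(point_line_distance(p, a) for p in b) <= dist_tol:
            counts["collinear"] += 1
        else:
            counts["parallel"] += 1
    return counts

square = [((0, 0), (10, 0)), ((10, 0), (10, 10)),
          ((10, 10), (0, 10)), ((0, 10), (0, 0))]
print(relation_signature(square))
```

Two symbols could then be compared by comparing their count dictionaries, which is far coarser than matching signatures per bucket as the slide describes.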
Vector-based signatures Proc. GREC’03, Barcelona, pp. 159-169
Towards symbol spotting • Pre-compute – or compute on the spot – a set of basic signatures • Can be sufficient for symbol spotting and retrieval • Followed by classical symbol recognition if more discrimination is needed
Symbol spotting • [Jabari & Tabbone]: graph matching through probabilistic relaxation, with nodes = segments and edges = relations DEA degree thesis, INPL, Nancy, Jul. 2003
Symbol spotting • [Jabari & Tabbone] : another example
Combining Text and Graphics • Extracting Text/Graphics relationships within document • Using Text matching for inter-document relationships • Transitive inter-document Graphics matching • No need for complex graphics matching • Restricted to well-known document types
Example: continuation of Wiring Diagrams (Boeing) • [Baum et al. – GREC’03] Proc. GREC’03, Barcelona, pp. 132-138
Scan2XML Example Proc. GREC’01, Kingston (Ontario, Canada), pp. 312-325
Indexing and Semantics • Signature + metric • Semantics = measured distance to signature • Applies only to homogeneous contexts • Pre-segmented images • Pre-determined image classes • Implicit application of domain knowledge • ... • Semantics = Syntax
Example • Signature type A • Metric M • Signature value l • Semantics1 = (s1, m1) • Semantics2 = (s2, m2) • M(l,s1) < m1 ? M(l,s2) < m2 ? • semantics = measurement to reference value
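The threshold test in this example can be sketched as follows. The Euclidean metric, the reference values and the names `matching_labels` and `references` are assumptions made for illustration only; the slide does not specify them.

```python
def euclidean(a, b):
    """Stand-in for the metric M (an assumption, any metric would do)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def matching_labels(value, references, metric):
    """Return every label whose reference signature lies within its threshold."""
    return [label for label, (ref, threshold) in references.items()
            if metric(value, ref) < threshold]

# Each entry pairs a reference signature with its threshold.
references = {
    "semantics1": ((0.0, 0.0), 1.0),   # plays the role of (s1, m1)
    "semantics2": ((5.0, 5.0), 1.5),   # plays the role of (s2, m2)
}

print(matching_labels((0.3, 0.4), references, euclidean))
```

A signature value is given a meaning only by how close it lies to a reference value, which is why this scheme relies on a single, homogeneous context.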
Heterogeneous Document Bases • Semantics do not have a unique syntax anymore • Syntax metrics may be context sensitive • Semantics = Syntax + Context • Context needs to be considered
Example • Context 1: Signature type A, Metric M, (s1, m1) = Semantics1, (s2, m2) = Semantics2 • Context 2: Signature type B, Metric N, Semantics1 = (t1, n1), Semantics2 = (t2, n2) • Signature value l • What if M(l,s1) < m1 and N(l,t2) < n2 ?
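The conflict raised here can be reproduced with a small sketch in which the same signature value is tested in two contexts, each with its own metric and reference values. All numbers, both metrics and the name `interpret` are hypothetical.

```python
def absolute_gap(a, b):
    """Stand-in for metric M of context 1."""
    return abs(a - b)

def relative_gap(a, b):
    """Stand-in for metric N of context 2."""
    return abs(a - b) / max(abs(a), abs(b))

def interpret(value, contexts):
    """List every (context, label) pair whose threshold test succeeds."""
    hits = []
    for name, (metric, references) in contexts.items():
        for label, (ref, threshold) in references.items():
            if metric(value, ref) < threshold:
                hits.append((name, label))
    return hits

contexts = {
    "context1": (absolute_gap, {"semantics1": (9.5, 1.0), "semantics2": (20.0, 1.0)}),
    "context2": (relative_gap, {"semantics1": (30.0, 0.1), "semantics2": (10.5, 0.1)}),
}

print(interpret(10.0, contexts))
```

With these values the same measurement is read as semantics1 in context 1 and as semantics2 in context 2, so the signature alone cannot decide between them.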
A step towards taking context into account (while consolidating existing approaches) Component Algebra: • Image Analysis = Pipeline • Syntax + algorithm = semantics • Syntax and semantics need not be distinguished [Diagram labels: Data (syntax), Algorithm, Data (semantics), Algorithm, Data (semantics)]
Component Algebra • Components : Known and implemented document analysis algorithms, taking input data from one domain, and producing data into another domain. • Application Context : Set of all available Components. • Semantics : Data sets needed by or produced by Components.
Component Algebra is a Graph [Diagram: graph linking Data and Component elements]
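One way to picture the graph view is to treat each data domain as a node and each component as a labelled edge, then search for a chain of components leading from the available data to the desired data. The sketch below does this with a breadth-first search. The component names, the data domains and the single-input, single-output simplification are assumptions for illustration, since the slides say the formalism has no realization yet.

```python
from collections import deque

def find_pipeline(components, source, target):
    """Breadth-first search for a shortest chain of components from source to target."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        domain, path = queue.popleft()
        if domain == target:
            return path
        for name, (inp, out) in components.items():
            if inp == domain and out not in seen:
                seen.add(out)
                queue.append((out, path + [name]))
    return None  # no chain of components links the two domains

# Hypothetical application context: component name -> (input domain, output domain).
components = {
    "binarize": ("grey image", "binary image"),
    "text_graphics_separation": ("binary image", "graphics layer"),
    "vectorize": ("graphics layer", "vectors"),
    "spot_symbols": ("vectors", "symbols"),
    "r_signature_spotting": ("binary image", "symbols"),
}

print(find_pipeline(components, "grey image", "symbols"))
```

Adding or removing a component changes the set of reachable paths, which is one reading of the idea that different contexts give different paths in the graph.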
Advantages • Each node is a semantic concept, semantic relationships are explicitly expressed. • Structure may support automatic reasoning and knowledge inference. • Context is embedded in components, different contexts give different paths in the graph. • Highly scalable and open architecture. • Bridge between signal-level document analysis and high-level document representation.
However ... The formalism exists, the realization doesn't (yet) • What about parametrization? • How context-independent can you get? • What about “guessing” context appropriateness? • How to design fully interoperable components?
Conclusion • A lot of DA methods – and more specifically GR methods – can be of direct use in IR, indexing and browsing applications • Specific challenges • Scaling and efficiency • Heterogeneous sets of documents • Incomplete domain knowledge • Symbol spotting • On-the-fly symbol searching • Sketch of open framework for including document semantics when context can be heterogeneous