Document Store Architecture Presented to
Architecturally Significant Requirements
• Support storage and retrieval of a wide variety of XML formats, including bibliographic records, authorities, etc. (for example MARC XML, ONYX)
• Store XML and non-XML content in native format
• Support full-text and fielded search (on fields such as Title, Author, Publication Date, etc.)
• Support adding new document types by configuration
• Support linking of documents within the same document type and across document types
• Support check-out, check-in, locking and versioning of documents
• Scale up to 10 million documents on commercial hardware (vertically or horizontally)
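The check-out, check-in, locking and versioning requirement can be illustrated with a minimal sketch. This is plain in-memory Python, not the JCR API and not the POC's implementation; the class name `VersionedDocument`, its fields, and the single-holder lock rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VersionedDocument:
    """Hypothetical in-memory document with check-out/check-in versioning."""
    doc_id: str
    content: str
    versions: List[str] = field(default_factory=list)  # earlier checked-in contents
    locked_by: Optional[str] = None                     # user holding the check-out lock

    def check_out(self, user: str) -> None:
        # Assumed rule: only one user may hold the document at a time.
        if self.locked_by is not None:
            raise RuntimeError(f"{self.doc_id} is locked by {self.locked_by}")
        self.locked_by = user

    def check_in(self, user: str, new_content: str) -> int:
        # Only the lock holder may check in; the old content is kept as a version.
        if self.locked_by != user:
            raise PermissionError(f"{user} does not hold the lock on {self.doc_id}")
        self.versions.append(self.content)
        self.content = new_content
        self.locked_by = None
        return len(self.versions)  # number of stored earlier versions


doc = VersionedDocument("book-1", "<record>v1</record>")
doc.check_out("alice")
count = doc.check_in("alice", "<record>v2</record>")
```

Keeping the lock and the version history on the document itself makes the rule easy to see; a real repository would persist both and handle lock expiry.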
Content Hierarchy: Type Hierarchy
[Diagram: JCR node types nt:hierarchyNode, nt:folder, nt:file, nt:linkedFile, nt:resource]
Content Hierarchy
• ole:bibliographic > nt:folder
  • type (string)
  • created date (date)
• ole:license > nt:file
  • publisher (string)
  • valid to (date)
• ole:document > nt:file
  • artist (string)
  • release date (date)
• ole:link > nt:resource
  • type (string)
  • uuid (string)
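The custom types above can be read as data: each names a supertype and a set of typed properties. The sketch below captures that in a plain Python dictionary and checks a node's properties against it. It does not use the JCR API; the dictionary layout, the `validate` helper, and the `ole:authority` type added at the end are assumptions made for illustration, while the four `ole:` types and their properties come from the slide.

```python
from datetime import date

# Each entry: type name -> (supertype, {property name: Python type}).
# Type names and properties mirror the slide; the layout is an assumption.
NODE_TYPES = {
    "ole:bibliographic": ("nt:folder", {"type": str, "created date": date}),
    "ole:license": ("nt:file", {"publisher": str, "valid to": date}),
    "ole:document": ("nt:file", {"artist": str, "release date": date}),
    "ole:link": ("nt:resource", {"type": str, "uuid": str}),
}


def validate(node_type: str, properties: dict) -> list:
    """Return a list of problems; an empty list means the node is valid."""
    if node_type not in NODE_TYPES:
        return [f"unknown node type: {node_type}"]
    _supertype, schema = NODE_TYPES[node_type]
    problems = []
    for name, expected in schema.items():
        if name not in properties:
            problems.append(f"missing property: {name}")
        elif not isinstance(properties[name], expected):
            problems.append(f"{name} should be {expected.__name__}")
    for name in properties:
        if name not in schema:
            problems.append(f"unexpected property: {name}")
    return problems


# Adding a document type becomes a data change rather than a code change.
# "ole:authority" and its "source" property are hypothetical.
NODE_TYPES["ole:authority"] = ("nt:file", {"source": str})
```

For example, `validate("ole:link", {"type": "related"})` reports the missing `uuid` property, while a complete `ole:license` node returns an empty list.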
Content Hierarchy
[Diagram: example repository tree under ROOT. Node labels: Bib, Auth, Books, LOC, ONYX, MARC, Subject, Book 1, Book 2, History, License, ONYX]
Content Hierarchy
[Diagram: links between nodes across the Auth, Bib and License branches. Node labels: Book 1, Book 2, LOC, MARC, Wiley, Pearson, Edition 2; three edges labelled "link"]
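Linking across branches can be sketched by storing only the target's uuid and a link type on the source node, in the spirit of the ole:link properties (type, uuid). This is plain Python rather than the JCR API. The `Repository` and `Node` classes, the link type names, and the choice of which nodes are linked are assumptions; the original diagram's edges are not recoverable from the text.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    node_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    links: List[dict] = field(default_factory=list)  # each link: {"type", "uuid"}


class Repository:
    """Hypothetical flat index of nodes by uuid."""

    def __init__(self) -> None:
        self._by_id: Dict[str, Node] = {}

    def add(self, name: str) -> Node:
        node = Node(name)
        self._by_id[node.node_id] = node
        return node

    def link(self, source: Node, target: Node, link_type: str) -> None:
        # Store only the target's uuid, so the link does not depend on its path.
        source.links.append({"type": link_type, "uuid": target.node_id})

    def resolve(self, source: Node, link_type: str) -> List[Node]:
        # Follow links of one type; silently skip targets that no longer exist.
        return [self._by_id[l["uuid"]] for l in source.links
                if l["type"] == link_type and l["uuid"] in self._by_id]


repo = Repository()
book = repo.add("Book 1")    # a node in the Bib branch
marc = repo.add("MARC")      # a node in the Auth branch
wiley = repo.add("Wiley")    # a node in the License branch
repo.link(book, marc, "authority")   # link type names are hypothetical
repo.link(book, wiley, "license")
```

Because a link holds a uuid rather than a path, the target can be moved within its own branch without breaking the reference.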
So Far...
• Onboarded on 1st March
• What was accomplished
  • Analyzed document store requirements
  • Identified the architecturally significant requirements
  • Evaluated the options and defined the candidate architecture to be proofed
• POC development started on 14th March
  • Evaluated multiple UI components for the JCR front end
  • Set up the JCR repository
  • Tested ingestion and document hierarchy
• TODO
  • Complete functional POC (all modes of ingestion, event listeners, versioning, linking)
  • Determine JCR UI component of choice
  • Complete scalability POC (indexing, discovery and store)
  • Elaborate design
Recommendations
• Span multiple streams to bootstrap development
  • Index / Search Stream
  • Doc Store Configuration Stream
  • Ingestion Stream
  • Document Management Stream
  • 1 specialist developer per stream
• Identify Doc Store services required by other components
  • Define public services and develop proxies to address dependency issues
  • Implementations would be made available as development progresses
• Define 3rd party services to be consumed
  • Code test 3rd party services independently prior to integration with Doc Store
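The recommendation to define public services and develop proxies can be sketched as an interface plus a stand-in implementation, so that dependent components can be built before the real Doc Store exists. The names `DocStoreService`, `InMemoryDocStoreProxy` and `document_size`, and the two-method interface, are assumptions made for illustration and not the project's actual service contract.

```python
from typing import Dict, Protocol


class DocStoreService(Protocol):
    """Hypothetical public contract other components code against."""

    def ingest(self, doc_id: str, content: str) -> None: ...

    def retrieve(self, doc_id: str) -> str: ...


class InMemoryDocStoreProxy:
    """Stand-in used until the real implementation is available."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def ingest(self, doc_id: str, content: str) -> None:
        self._docs[doc_id] = content

    def retrieve(self, doc_id: str) -> str:
        return self._docs[doc_id]


def document_size(store: DocStoreService, doc_id: str) -> int:
    """Example consumer that depends only on the public interface."""
    return len(store.retrieve(doc_id))


store = InMemoryDocStoreProxy()
store.ingest("book-1", "<record/>")
size = document_size(store, "book-1")
```

When the real service is ready it replaces the proxy without changes to consumers such as `document_size`, since they depend only on the interface.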
Thank You
World Headquarters
3270 West Big Beaver Road
Troy, MI 48084, U.S.A.
Phone: 248.786.2500
Fax: 248.786.2515
Web: www.htcinc.com