Document Store - Pilot 001 Presented to
Objectives
• Index 5M+ MARC XML records
• Demonstrate the following features
  • Full-text search
  • Advanced search (fielded search)
  • Search results pagination
  • Sub-second query time on commercial hardware
• Set up a Jackrabbit repository (MySQL persistent store)
  • Load up to 5,000 documents
  • Analyze and optimize loading & storage
  • Generate UUIDs
  • Check-in, check-out and versioning
  • Establish links between documents
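To make the fielded-search objective concrete, here is a minimal sketch of how a single MARC XML record might be flattened into named fields before indexing. It uses only the Python standard library. The field mapping (100$a as author, 245$a as title, 650$a as subject), the sample record and the helper name extract_fields are illustrative assumptions, not the pilot's actual configuration; the deck does not say which search engine or schema was used.

```python
import xml.etree.ElementTree as ET

# Namespace used by the MARC 21 XML ("MARCXML") schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical mapping from index field names to (MARC tag, subfield code).
FIELD_MAP = {
    "author": ("100", "a"),
    "title": ("245", "a"),
    "subject": ("650", "a"),
}

# Made-up sample record, for illustration only.
SAMPLE_RECORD = """\
<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">example-0001</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Libraries</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Information retrieval</subfield>
  </datafield>
</record>
"""


def extract_fields(record_xml):
    """Flatten one MARC XML record into a dict of indexable fields."""
    record = ET.fromstring(record_xml)
    # Control field 001 holds the record's control number.
    doc = {
        "id": record.findtext("marc:controlfield[@tag='001']", namespaces=MARC_NS)
    }
    for name, (tag, code) in FIELD_MAP.items():
        path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
        # A tag may repeat (e.g. several subjects), so keep a list per field.
        doc[name] = [el.text for el in record.findall(path, MARC_NS) if el.text]
    # Catch-all field that a full-text search could target.
    doc["text"] = " ".join(value for name in FIELD_MAP for value in doc[name])
    return doc


doc = extract_fields(SAMPLE_RECORD)
print(doc)
```

Keeping a separate list per field is what allows an advanced search to target title, author or subject individually, while the combined text field serves plain full-text queries.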
Environment
• Hardware
  • CPU – Quad Core @ 2.93 GHz
  • Memory – 16 GB
  • Storage – 500 GB
• Software
  • 64-bit Windows 7 OS
Performance Metrics
• Indexing time for 5,362,832 (~5.4M) records: 1 hour 42 minutes
• Index size for 5,362,832 records: 14 GB
• Extrapolated indexing time for 10M records: ~3 hours
• Loading time for 3,569 records: 112 seconds
• Extrapolated loading time for 6M records: 55 hours (~2.3 days)
• Average response time for full-text search: 69 milliseconds
• Average response time for advanced search on 3+ fields: 200 milliseconds
• Note: Basic setup with minimal or no tuning
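The extrapolated figures above can be sanity-checked with a simple linear model. This is a rough sketch that assumes time grows in direct proportion to record count, which the deck does not state; the function name extrapolate_seconds is illustrative. Under that assumption the indexing estimate comes out a little over 3 hours and the loading estimate a little over 52 hours, slightly below the 55 hours reported, so the reported figure may include overhead that a purely linear model does not capture.

```python
def extrapolate_seconds(measured_seconds, measured_records, target_records):
    """Scale a measured duration linearly to a different record count."""
    return measured_seconds * target_records / measured_records


# Measured values taken from the metrics above.
INDEXING_SECONDS = (1 * 60 + 42) * 60   # 1 hour 42 minutes
INDEXED_RECORDS = 5_362_832
LOADING_SECONDS = 112
LOADED_RECORDS = 3_569

# Throughput implied by the measurements.
indexing_rate = INDEXED_RECORDS / INDEXING_SECONDS   # records per second
loading_rate = LOADED_RECORDS / LOADING_SECONDS      # records per second

# Linear extrapolations to the target volumes quoted in the deck.
indexing_hours = extrapolate_seconds(
    INDEXING_SECONDS, INDEXED_RECORDS, 10_000_000) / 3600
loading_hours = extrapolate_seconds(
    LOADING_SECONDS, LOADED_RECORDS, 6_000_000) / 3600

print(f"Indexing rate: {indexing_rate:.0f} records/s")
print(f"Loading rate: {loading_rate:.1f} records/s")
print(f"Indexing 10M records: {indexing_hours:.2f} hours")
print(f"Loading 6M records: {loading_hours:.1f} hours "
      f"({loading_hours / 24:.2f} days)")
```

The gap between the two throughput figures shows that repository loading, not indexing, is the step that dominates the total time at the target volumes.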
Work in Progress
• Faceted navigation and search suggestions
• Simultaneously index and search multiple document types
• Index and search new document types by configuration
• Batch and online index management (add, update, delete)
• Repository document load of 5M documents
• Discovery and Repository integration
• Bulk and online operations (load, update)
Thank You
World Headquarters
3270 West Big Beaver Road, Troy, MI 48084, U.S.A.
Phone: 248.786.2500
Fax: 248.786.2515
Web: www.htcinc.com