The Only Operational Database Technology for Mission-Critical Big Data Applications. Paul Preuveneers – Principal Technologist Lee Pollington – Principal Consultant. Agenda. Big Data and MarkLogic What is MarkLogic? MarkLogic in Financial Services
The Only Operational Database Technology for Mission-Critical Big Data Applications Paul Preuveneers – Principal Technologist Lee Pollington – Principal Consultant
Agenda • Big Data and MarkLogic • What is MarkLogic? • MarkLogic in Financial Services • MarkLogic Integration Points (Connectors / Toolkits)
Volume • Velocity • Complexity • Variety • Value • Variability
Petabyte / Exabyte • Billions of items • Social Media • Machine data • Data processes producing data • 10Ks of transactions per second • In & out • Streams • Bulk processing • Patterns • Inference • Unstructured • Disparate events • Relationships • Varied sources • Varied data types • Changing data types • Value from decision support • Value from operational efficiencies
Agenda • Big Data and MarkLogic • What is MarkLogic? • MarkLogic in Financial Services • MarkLogic Integration Points (Connectors / Toolkits)
What is MarkLogic Server? • Special Purpose DBMS for poly-structured information, with enterprise expectations • ACID transactions • Backup, Full/Partial Replication, Distributed Txns • Search Engine Kernel, with enterprise expectations • Full text • Faceted navigation, at massive scale • Boolean, proximity, stemming, tokenization, decompounding, case, diacritics, language… • Application Server • HTTP (including RESTful) • XCC Java/.NET • WebDAV
What makes MarkLogic DBMS Special? • Not Relational (RDBMS) • XML • The Only Data Model Required • Schema Agnostic • Text a First-class Citizen among Data Types • XQuery/XSLT • Optimized Search Engine Algorithms • Very Low DBA Overhead (0.5 FTE / 100 hosts) • 5-Minute Install • 5-Minute Scale-Out • Database and Search Engine are the same
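The "Schema Agnostic" point can be illustrated outside MarkLogic. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`, not XQuery and not MarkLogic's own API. Two hypothetical trade documents with different shapes are loaded without declaring any schema, and a single descendant path finds the notional wherever it sits. The URIs, element names, and values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical trade documents with different structures (illustrative only).
DOCS = {
    "/trades/swap-1.xml":
        "<trade><product>swap</product><notional>1000000</notional></trade>",
    "/trades/option-7.xml":
        "<trade><product>option</product><leg><notional>250000</notional></leg></trade>",
}

def notionals(docs):
    """Return {uri: [notional values]} without relying on any declared schema."""
    result = {}
    for uri, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        # ".//notional" matches the element at any depth below the root.
        result[uri] = [int(e.text) for e in root.findall(".//notional")]
    return result

print(notionals(DOCS))
```

The same query works on both documents even though one nests the notional inside a `leg` element, which is the practical meaning of not having to fix the structure up front.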
What makes MarkLogic Search Special? • Transactional: Enterprise Scale (no index latency) • Unicode (Internationalization) • Multiple Query Types • Analytics: Aggregation, Facets & Ranges, Co-occurrence, Geospatial • Text Search: Boolean, Stemming, Word Lexicons, Dictionary & Thesauri • Alerting: Profiles, Alerts, Filters, Tipping, Selectors, “Triggers” … • Powerful Search Combination (e.g. Text + Analytics + Alerting) • Processing Near the Data (fast search, low bandwidth) • Database and Search Engine are the same
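The "Text + Analytics" combination can be sketched in a few lines. This is a toy in plain Python under stated assumptions, not MarkLogic's search API: the documents, the `desk` field, and the function name `search_with_facets` are hypothetical. It filters documents by a word and, in the same pass over the hits, counts the values of a structured field to produce facets.

```python
from collections import Counter

# Hypothetical documents: free text plus one structured field.
DOCS = [
    {"text": "interest rate swap confirmed", "desk": "rates"},
    {"text": "credit default swap amended", "desk": "credit"},
    {"text": "equity option exercised", "desk": "equity"},
    {"text": "basis swap confirmed", "desk": "rates"},
]

def search_with_facets(docs, word, facet):
    """Return the documents containing `word` and value counts for `facet`."""
    # Whole-word match on whitespace-separated tokens.
    hits = [d for d in docs if word in d["text"].split()]
    # Facet counts are computed over the hits only, not the whole collection.
    counts = Counter(d[facet] for d in hits)
    return hits, counts

hits, counts = search_with_facets(DOCS, "swap", "desk")
print(len(hits), dict(counts))
```

A real engine would answer both parts from indexes rather than by scanning documents; the sketch only shows the shape of a combined text and facet result.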
Search: Universal Index • Term → Term List (Document References) • “accelerating”: 123, 127, 129, 152, 344, 791 . . . • “creation”: 122, 125, 126, 129, 130, 167 . . . • “content”: 123, 126, 130, 142, 143, 167 . . . • “application”: 123, 130, 131, 135, 162, 177 . . . • “agility”: 126, 130, 167, 212, 219, 377 . . . • <article> . . . • <article> / <title> . . . • 126, 130, 167, … • product: MarkLogic • Range Indexes • Geospatial
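The term lists on this slide behave like an inverted index: each term maps to the documents that contain it, and a multi-term query is resolved by intersecting those lists. The sketch below is a generic illustration in Python, not MarkLogic's implementation. It uses only the document IDs visible on the slide (the lists are truncated there), and the helper name `and_query` is hypothetical.

```python
# Term lists copied from the slide (only the IDs shown before ". . .").
TERM_LISTS = {
    "accelerating": [123, 127, 129, 152, 344, 791],
    "creation": [122, 125, 126, 129, 130, 167],
    "content": [123, 126, 130, 142, 143, 167],
    "application": [123, 130, 131, 135, 162, 177],
    "agility": [126, 130, 167, 212, 219, 377],
}

def and_query(term_lists, *terms):
    """Return the sorted IDs of documents that contain every given term."""
    # An unknown term has an empty list, so the intersection is empty.
    postings = [set(term_lists.get(t, [])) for t in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

print(and_query(TERM_LISTS, "creation", "content", "agility"))  # [126, 130, 167]
```

For the visible IDs, intersecting "creation", "content", and "agility" yields 126, 130, 167, the same sequence the slide shows as a result list.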
MarkLogic Can Scale • Scale Up: Typically 1TB+ XML per Server • Scale Out: Low Hundreds (++) of Servers in a Cluster • Commodity Hardware • 2-CPU x 6-core/hyperthreaded • 32+ GB RAM • 3x disk: local mount with failover • OS • Linux RHEL 5 • Solaris 10 • Windows 2003/8 (XP/Vista/7 for Dev)
Shared-Nothing Cluster • E hosts: E Host 1, E Host 2, E Host 3 (App Server, same codebase) • D hosts: D Host 4, D Host 5, D Host 6, … D Host k (Data) • Partitions: partition1, partition2, partition3, partition4, … partition m • HA & DR
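The shared-nothing layout can be sketched as a scatter-gather pattern: each D host owns its own partition, and an E host fans a query out to all of them and merges the partial results. The Python below is a toy under stated assumptions. The classes `DataHost` and `Cluster` are hypothetical, and choosing a partition by a CRC32 hash of the URI is an assumption made for illustration, not a statement of how MarkLogic assigns documents.

```python
import zlib

class DataHost:
    """Hypothetical D host: owns one partition and searches only its own data."""
    def __init__(self, name):
        self.name = name
        self.docs = {}  # uri -> text

    def search(self, word):
        return [uri for uri, text in self.docs.items() if word in text.split()]

class Cluster:
    """Hypothetical E host view: routes inserts and fans queries out to D hosts."""
    def __init__(self, hosts):
        self.hosts = hosts

    def host_for(self, uri):
        # Assumption: a stable hash of the URI picks the owning partition.
        return self.hosts[zlib.crc32(uri.encode("utf-8")) % len(self.hosts)]

    def insert(self, uri, text):
        self.host_for(uri).docs[uri] = text

    def search(self, word):
        # Scatter the query to every D host, then gather and merge the results.
        results = []
        for host in self.hosts:
            results.extend(host.search(word))
        return sorted(results)

cluster = Cluster([DataHost(f"D Host {i}") for i in (4, 5, 6)])
for i in range(9):
    cluster.insert(f"/trades/{i}.xml", "swap trade" if i % 3 == 0 else "option trade")
print(cluster.search("swap"))  # ['/trades/0.xml', '/trades/3.xml', '/trades/6.xml']
```

Because no host shares storage with another, adding a D host adds both capacity and query parallelism; the merged result is the same regardless of which host holds which document.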
Agenda • Big Data and MarkLogic • What is MarkLogic? • MarkLogic in Financial Services • MarkLogic Integration Points (Connectors / Toolkits)
Operational Data Store / Trade Store Example: JP Morgan Chase ODS • Live for 12+ months • 2.25 million OTC Derivatives (450+ million documents) • Strategic platform mandated for core transaction processing • Short-listed for Best Investment Banking Initiative at The Banking Technology Awards 2011 • Agile onboarding of new Derivatives products • Huge reduction in time to process FO XML messages • 20 Sybase systems replaced with 3-Node MarkLogic cluster
It's a Trade Processing Story • Started with Derivatives • Natural fit with documents • Complex instruments, “low volume” instruments • It’s a trade workflow engine • Enterprise Service Bus / Component architecture • New products • Modifications to existing products • Securities had a new challenge for us
Document Analysis (e.g. Sales Process, Financial Directives)