NoSQL and Oracle RDBMS – A perfect fit
Chao Huang, Senior Manager, Oracle NoSQL Database development
What is Oracle NoSQL DB – Elegant Simplicity
• A cloud-scale distributed key-value store with
• Multi-TB to PB capacity
• Transactional guarantees
• High availability
• Elastic scaling
• Dynamic partitioning
• Predictable performance
• Simple administration
Oracle NoSQL DB Building Blocks – Berkeley DB Java Edition
• Ideal storage for key-value pairs
• ACID transactions
• High availability
• High throughput
• Simple administration
• Already proven in
  • Amazon Dynamo
  • Voldemort (LinkedIn)
  • GenieDB
Oracle NoSQL DB Building Blocks – What’s added to BDB JE
• Dynamic partitioning (aka sharding)
• Intelligent routing of requests
• Load balancing
• Bounded latency
• Multi-node backup
• Monitoring
Where Can You Use It?
• Compatibility
  • Supported OS – Linux and Solaris 10
  • Oracle NoSQL DB is written in Java
  • Java and C APIs
  • Bindings available for Jython, JRuby, Clojure, Groovy, Rhino
What Versions Are Available?
• Oracle NoSQL DB Community Edition (AGPL)
• Oracle NoSQL Enterprise Edition
  • Extra features for enterprise use (External Tables, RDF/SPARQL, OEP, Coherence, SNMP)
  • Enterprise Support
Performance – Reads + Writes
• YCSB (Yahoo! Cloud Serving Benchmark)
• 1.25M ops/sec
• 2 billion records
• 2 TB of data
• 95% read, 5% update
• Low latency
• High scalability
Performance – Inserts
• YCSB
• 226K ops/sec
• 2 billion records
• Low latency
• Highly scalable
NoSQL for Fraud Scoring – Financial Services coordinated theft prevention
Benefits
• Simple data model, flexible transactions
• Scalable, low-latency data management
• Easy configuration and administration
• Enterprise Support
Objectives
• Combine data sources for complex scoring
• Detect, alert analyst with low latency
• Handle burst seasonal transaction volumes
Solution
• Oracle Coherence cluster for real-time transaction object management
• Oracle NoSQL Database for fraud model and customer profile management
• Oracle Database for statistics and fraud modeling-related data
[Diagram labels: Application, Data Ingestion, NoSQL DB Driver, Transaction Authorization Processor]
NoSQL for Customer Experience Management – Brand enhancement and loyalty enrichment
Benefits
• Simple, flexible data format
• Highly scalable with predictable performance
• Enterprise support, technology commitment and roadmap
Objectives
• Centralized view of customer data within federated database environment
• Dynamic, customer influence tactics
Solution
• Oracle NoSQL Database for central repository of metadata for customer activity, scheduling and “next generation experience” events
• Oracle Database for financial data, reservation and property management
[Diagram labels: Event Scheduling Application, Customer Care & End Customers, Staff & End Customers, NoSQL DB Driver, Reservation Systems, Customer Profiles]
NoSQL for Online Advertising – Platform for real-time marketing
Benefits
• Ease of management and administration
• Scalability and predictable performance
• Integrated storage and processing technologies
• Enterprise support
Objectives
• Effective segmented advertising platform
• Improve revenue by increasing granularity of market segmentation
Solution
• Oracle NoSQL Database for cookie management and ad content lookup
• Oracle Database and Hadoop/MapReduce for market segmentation analysis, ad generation and recommendation
• Oracle Database for complex analytics
[Diagram labels: Advertising Svr, End Customers, Business Users, NoSQL DB Driver, Multi-Dimensional Reporting, Web Click-stream, Content Delivery, "Acquire, Analyze, Prepare"]
Flexible Data Model – Key-Value Pairs
• Major key (strings): SKU
• Minor key (strings): Image, Brand, Rating, Price, Color
• Value (byte array), e.g. Macy’s; $59.00, $21.99; Red, Blue, Green
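To make the major/minor split concrete, here is a minimal sketch of a two-part key. `KeyPath` and its `/major/-/minor` rendering are hypothetical names chosen for illustration and are not the Oracle NoSQL API; the sketch only shows how one SKU (major key) could carry several attributes (minor keys).

```java
import java.util.List;

/** Toy model of a two-part key; hypothetical, not the Oracle NoSQL API. */
final class KeyPath {
    final List<String> major;
    final List<String> minor;

    KeyPath(List<String> major, List<String> minor) {
        this.major = major;
        this.minor = minor;
    }

    /** Renders the key as /major.../-/minor..., e.g. /Smith/Bob/-/phone. */
    String toPath() {
        StringBuilder sb = new StringBuilder();
        for (String part : major) {
            sb.append('/').append(part);
        }
        sb.append("/-"); // separator between the major and minor components
        for (String part : minor) {
            sb.append('/').append(part);
        }
        return sb.toString();
    }
}
```

Under this model, the Price and Color records of one SKU would differ only in the minor component, which is what lets them be grouped and fetched together.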
CRUD – Reads
Example keys (major path - minor path):
/Smith/Bob/ - /birthdate
/Smith/Bob/ - /phone
/Smith/Bob/ - /image
/Smith/Bob/ - /userID
• Define major key path
• Define minor key path
• Create the key
• Retrieve the record
• Use Avro bindings to deserialize in the application
kvstore.get(myKey);
kvstore.multiGet(myKey, null, null);
kvstore.multiGetIterator(Direction.FORWARD, 100, myKey, null, null);
kvstore.storeIterator(Direction.UNORDERED, 100, myKey, null, null);
KeyRange kr = new KeyRange("Bob", true, "Pat", true);
kvstore.storeIterator(Direction.UNORDERED, 100, myKey, kr, null);
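The difference between a single-record read and a read of everything under one major key can be sketched with a sorted in-memory map. `ToyReadStore` and its path-string keys are assumptions made for this illustration; the real store is distributed and its iterators take more parameters than shown here.

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Toy in-memory store keyed by path strings; hypothetical, not the Oracle NoSQL API. */
final class ToyReadStore {
    private final TreeMap<String, String> records = new TreeMap<>();

    void put(String path, String value) {
        records.put(path, value);
    }

    /** Single-record lookup by full key path. */
    String get(String path) {
        return records.get(path);
    }

    /** Returns every record stored under the given major key path. */
    SortedMap<String, String> multiGet(String majorPath) {
        String from = majorPath + "/-/";
        // '0' is the character right after '/', so [from, to) covers exactly
        // the paths that start with majorPath + "/-/".
        String to = majorPath + "/-0";
        return records.subMap(from, to);
    }
}
```

With the four /Smith/Bob/ keys from the slide loaded, `multiGet("/Smith/Bob")` in this toy returns all four in one call, while `get` needs the full path of each.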
CRUD – Writes
Example keys (major path - minor path):
/Smith/Bob/ - /birthdate
/Smith/Bob/ - /phone
/Smith/Bob/ - /image
/Smith/Bob/ - /userID
• Construct the key
• Construct the value – Avro format
• Use one of the put methods
kvstore.put(myKey, myValue);
kvstore.putIfAbsent(myKey, myValue);
kvstore.putIfPresent(myKey, myValue);
kvstore.putIfVersion(myKey, myValue, version);
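The four put variants differ only in the precondition they check before writing. The sketch below models that with a plain map; `ToyVersionedStore`, the `long` version counter, and the `-1` "not written" return value are all assumptions for illustration and do not mirror the real return types.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy versioned store; hypothetical, not the Oracle NoSQL API. */
final class ToyVersionedStore {
    private static final class Entry {
        final String value;
        final long version;

        Entry(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Entry> data = new HashMap<>();
    private long nextVersion = 1;

    /** Unconditional write; returns the new version. */
    long put(String key, String value) {
        long v = nextVersion++;
        data.put(key, new Entry(value, v));
        return v;
    }

    /** Writes only if the key has no record yet; returns -1 otherwise. */
    long putIfAbsent(String key, String value) {
        return data.containsKey(key) ? -1 : put(key, value);
    }

    /** Writes only if the key already has a record; returns -1 otherwise. */
    long putIfPresent(String key, String value) {
        return data.containsKey(key) ? put(key, value) : -1;
    }

    /** Writes only if the stored version matches the expected one (compare-and-set). */
    long putIfVersion(String key, String value, long expectedVersion) {
        Entry e = data.get(key);
        return (e != null && e.version == expectedVersion) ? put(key, value) : -1;
    }

    String get(String key) {
        Entry e = data.get(key);
        return e == null ? null : e.value;
    }
}
```

The version check is what lets two writers detect that they raced: the second writer's `putIfVersion` fails because the first write already advanced the version.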
CRUD – Deletes
Example keys (major path - minor path):
/Smith/Bob/ - /birthdate
/Smith/Bob/ - /phone
/Smith/Bob/ - /image
/Smith/Bob/ - /userID
• Construct the key
• Use delete method
• Delete a specified version
• Delete multiple records with same major key
kvstore.delete(myKey);
kvstore.deleteIfVersion(myKey, version);
kvstore.multiDelete(myKey, null, null);
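Deleting every record under one major key is a range operation over the sorted key space. A minimal sketch, assuming path-string keys in a sorted map (`ToyDeleteStore` is a hypothetical name, not part of the product):

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Toy store showing single and multi-record deletes; hypothetical, not the Oracle NoSQL API. */
final class ToyDeleteStore {
    private final TreeMap<String, String> records = new TreeMap<>();

    void put(String path, String value) {
        records.put(path, value);
    }

    /** Deletes one record; returns true if it existed. */
    boolean delete(String path) {
        return records.remove(path) != null;
    }

    /** Deletes all records under the major key path; returns how many were removed. */
    int multiDelete(String majorPath) {
        SortedMap<String, String> range =
            records.subMap(majorPath + "/-/", majorPath + "/-0");
        int removed = range.size();
        range.clear(); // subMap is a live view, so this removes from the backing map
        return removed;
    }

    int size() {
        return records.size();
    }
}
```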
CRUD – Sequence of Operations
• All records need to share the same major key
• Sequences only support write operations
• Sequence is performed in isolation
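The same-major-key rule can be illustrated with a small all-or-nothing batch. `ToySequence` and its `Write` type are hypothetical; the sketch validates the whole batch before touching the store, so a rejected sequence changes nothing.

```java
import java.util.List;
import java.util.Map;

/** Toy write sequence with a same-major-key check; hypothetical, not the Oracle NoSQL API. */
final class ToySequence {
    static final class Write {
        final String major;
        final String minor;
        final String value;

        Write(String major, String minor, String value) {
            this.major = major;
            this.minor = minor;
            this.value = value;
        }
    }

    /** Applies all writes, or none if the sequence spans more than one major key. */
    static void execute(Map<String, String> store, List<Write> ops) {
        if (ops.isEmpty()) {
            return;
        }
        String major = ops.get(0).major;
        // Validate everything first so a bad sequence leaves the store untouched.
        for (Write op : ops) {
            if (!op.major.equals(major)) {
                throw new IllegalArgumentException(
                    "all operations must share major key " + major);
            }
        }
        for (Write op : ops) {
            store.put(op.major + "/-/" + op.minor, op.value);
        }
    }
}
```

One plausible reading of the rule is that records sharing a major key live together, so the whole sequence can be applied in one place without cross-node coordination.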
Write Durability – Durability per Operation
[Diagram: quorum check]
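A quorum check comes down to counting how many nodes in a shard must hold a write before it is acknowledged. The sketch below is a simplified model under stated assumptions: `ToyDurability`, the `AckPolicy` labels, and counting the master toward the quorum are illustrative choices, not a description of the product's internals.

```java
/** Toy quorum arithmetic for per-operation durability; hypothetical, not the Oracle NoSQL API. */
final class ToyDurability {
    enum AckPolicy { NONE, SIMPLE_MAJORITY, ALL }

    /**
     * Number of nodes (master included) that must hold the write before the
     * operation returns, for a shard with the given replication factor.
     */
    static int requiredNodes(AckPolicy policy, int replicationFactor) {
        switch (policy) {
            case NONE:
                return 1; // the master alone
            case SIMPLE_MAJORITY:
                return replicationFactor / 2 + 1;
            case ALL:
                return replicationFactor;
            default:
                throw new IllegalArgumentException("unknown policy: " + policy);
        }
    }

    /** True if enough nodes have acknowledged for the chosen policy. */
    static boolean quorumMet(AckPolicy policy, int replicationFactor, int nodesAcked) {
        return nodesAcked >= requiredNodes(policy, replicationFactor);
    }
}
```

Because the policy is an argument rather than a store-wide setting, each write can trade latency against safety on its own terms, which is the point of "durability per operation".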
Read Consistency Consistency per Operation
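Per-operation consistency can be pictured as a filter over the nodes allowed to serve a read. In this sketch the caller passes a maximum tolerated replica lag; `ToyConsistency`, `Node`, and the lag-based rule are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy replica selection by tolerated lag; hypothetical, not the Oracle NoSQL API. */
final class ToyConsistency {
    static final class Node {
        final String name;
        final boolean master;
        final long lagMillis;

        Node(String name, boolean master, long lagMillis) {
            this.name = name;
            this.master = master;
            this.lagMillis = lagMillis;
        }
    }

    /**
     * Names of nodes that may serve a read. A bound of 0 means master only
     * (strongest consistency); Long.MAX_VALUE means any node may answer.
     */
    static List<String> eligible(List<Node> nodes, long maxLagMillis) {
        List<String> names = new ArrayList<>();
        for (Node n : nodes) {
            if (n.master || (maxLagMillis > 0 && n.lagMillis <= maxLagMillis)) {
                names.add(n.name);
            }
        }
        return names;
    }
}
```

A looser bound makes more replicas eligible, so reads that tolerate slightly stale data can spread across the shard instead of all landing on the master.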
Data Modeling
[Diagram: CDR and email records; labels: time0X, CDR01, emails, CDR0N, emails]
Storage – Terminology
• SN (Storage Node) – physical (or virtual) machine with CPU + disk
• Each SN serves 1 or more replication nodes
• Each replication node is part of a single shard
• Each shard has 1 master
[Diagram: Key Space, Hash Fn, Partitions; Shard 0 … Shard N, each with one master Rep Node and two replica Rep Nodes, spread across SN1, SN2, SN3]
Sharding – Auto-Sharding
• Provides linear scale-out of write ops/sec
• No need to develop sharding logic in the application
• Hash function maps a key to a partition
• Each partition is routed to a single shard
[Diagram: Hash Fn, Partitions, Shard 0 … Shard N, each with one master Rep Node and two replica Rep Nodes]
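The two-step mapping (key to partition by hash, partition to shard by lookup) can be sketched as follows. `ToyRouter` is a hypothetical name, `String.hashCode` is only a stand-in for whatever hash function the product uses, and hashing the major key path is an assumption made so that records sharing it stay together.

```java
/** Toy key-to-partition-to-shard routing; hypothetical, not the Oracle NoSQL API. */
final class ToyRouter {
    private final int[] partitionToShard;

    /** Spreads a fixed number of partitions round-robin over the shards. */
    ToyRouter(int numPartitions, int numShards) {
        partitionToShard = new int[numPartitions];
        for (int p = 0; p < numPartitions; p++) {
            partitionToShard[p] = p % numShards;
        }
    }

    /** Step 1: hash the major key path to a partition. */
    int partitionFor(String majorKeyPath) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(majorKeyPath.hashCode(), partitionToShard.length);
    }

    /** Step 2: look up which shard currently owns that partition. */
    int shardFor(String majorKeyPath) {
        return partitionToShard[partitionFor(majorKeyPath)];
    }

    /** Rebalancing moves a whole partition; no key is ever rehashed. */
    void movePartition(int partition, int newShard) {
        partitionToShard[partition] = newShard;
    }
}
```

Keeping the partition count fixed and moving whole partitions between shards is one way to add capacity without changing where any key hashes, which is why the application needs no sharding logic of its own.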
Replication
• Master-slave replication
• Supports heterogeneous platforms (hardware/OS/JVM)
• Dynamic group membership
• Configurable consistency/durability
• Linear scale-out for read ops/sec
• Logical replication
• High availability
[Diagram: Shard 0 with one master Rep Node and two replica Rep Nodes]
Data Center Support
• Primary Data Center (PDC)
  • Proximity to application(s) accessing data
  • SNs are preferred during elections
  • Holds simple majority
  • New HA feature required: ability to specify which nodes can participate in majority
• Secondary Data Center (SDC)
  • Holds current copy of data in case PDC dies
  • Minority of replicas
  • Assume sufficient bandwidth between PDC and SDC
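"Holds simple majority" is a constraint on how replicas are split between the two sites. A small worked check, with `ToyPlacement` as a hypothetical helper name:

```java
/** Toy check that the primary data center holds a simple majority; illustrative only. */
final class ToyPlacement {
    /** Smallest number of replicas that forms a simple majority of the total. */
    static int majority(int totalReplicas) {
        return totalReplicas / 2 + 1;
    }

    /** True if the primary site alone can form a majority of each shard's replicas. */
    static boolean primaryHoldsMajority(int primaryReplicas, int secondaryReplicas) {
        return primaryReplicas >= majority(primaryReplicas + secondaryReplicas);
    }
}
```

For example, 2 replicas in the primary and 1 in the secondary satisfies the rule (2 of 3), while a 2 + 2 split does not, because a majority of 4 needs 3 nodes.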
Administration
• Administration Service
  • Accessible from both command line (CLI) and web console
  • Highly available service
  • Configure database, start and stop services, monitor performance
• Automatic rebalancing
• SNMP and JMX support
• Roadmap – will be available via Oracle Enterprise Manager
Oracle NoSQL Extensions – External Table Support
[Diagram: numbered flow (1–5) between Database, External Table Queries, External Table Metadata, Data Dictionary, Access Driver, Data Formatter Layer, NoSQL Driver, Configuration File/table (publish) and *.dat Files; CSV output rows of the form Column1 | Column2 | ColumnN]
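The formatter step in the diagram turns a key-value record into one delimited row that an external table can read. A minimal sketch, assuming path-style keys and a pipe delimiter as in the diagram; `ToyFormatter` and `toLine` are hypothetical names, not the product's formatter interface.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy record-to-row formatter; hypothetical, not the Oracle NoSQL API. */
final class ToyFormatter {
    /**
     * Flattens a key path such as /Smith/Bob/-/phone plus its value into
     * one pipe-delimited row: Smith|Bob|phone|value.
     */
    static String toLine(String keyPath, String value) {
        List<String> columns = new ArrayList<>();
        for (String part : keyPath.split("/")) {
            // Skip the empty leading segment and the major/minor separator.
            if (!part.isEmpty() && !part.equals("-")) {
                columns.add(part);
            }
        }
        columns.add(value);
        return String.join("|", columns);
    }
}
```

A real formatter would also need to escape delimiter characters that appear inside values; this sketch leaves that out.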
Oracle NoSQL Extensions – Database Integration
[Diagram labels: ODI, OLH, External Tables, Oracle DB]
Oracle NoSQL Extensions – Hadoop Integration
[Diagram labels: KV Input Format, C or Java App, Hadoop Cluster]
Oracle NoSQL Extensions – Oracle Event Processing
Oracle NoSQL Extensions – RDF
Semantic Metadata Layer
• Unified content metadata for federated resources
• Validate semantic and structural consistency
Text Mining & Entity Analytics
• Find related content & relations by navigating connected entities
• “Reason” across entities
Social Media Analysis
• Analyze social relations using curated metadata
• Blogs, wikis, video
• Calendars, IM, voice
Why Oracle NoSQL Database
It’s Oracle
• A trusted vendor here for the long term
• Scalable, available, with predictable latency
Differentiating Features
• Always-on elastic processing
• Configurable ACID transactions at scale
• Easy to use, smart topology configuration
• Integration with the Oracle technology stack
• High performance
• Tested at scale
• Easier to manage
• We don’t lose transactions