Manage your Data Growth and Deliver Real-Time Information Insight. Oracle Engineered Systems for Geosciences. Eric Bezille CTO Oracle Systems - https://blogs.oracle.com/EricBezille. Safe Harbor Statement.
Manage your Data Growth and Deliver Real-Time Information Insight: Oracle Engineered Systems for Geosciences. Eric Bezille, CTO Oracle Systems - https://blogs.oracle.com/EricBezille
Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Typical Geoscience Architecture… and opportunities • Mapping more data points to provide better informed decisions: bringing local knowledge to geo-information infrastructures • In (near) Real-Time • Discovering weak signals with ALL data. Diagram labels: Citizens, Government Ministries, Open Data, …, Web Client, MapViewer Server, Spatial Database, Analytic Server, Discovery, Unstructured Data. Sources: Oracle Spatial Users Conference 2012, Journal of the American Water Resources Association, Australian Government – Geoscience Australia, Geological Survey of India,…
Geoscience Opportunities… and IT requirements. Opportunities: • Mapping more data points to provide better informed decisions • In (near) Real-Time • Discovering weak signals with ALL data. IT requirements: • Manage Storage Growth & Performance • Manage Processing Constraints • Provide new tools (& skills)
US Census Bureau (Oracle Spatial User Conference 2012) • Complex Spatial Database, quite large, mission critical • Growing at 10-15% annually • Demands from user community for spatial and temporal accuracy and quality • Stringent processing deadlines remain, so GEO is processing more data in shorter time • Oracle database on >100 nodes, scores of applications • Ability to handle larger loads on systems (Data Visualization,…) • Consolidation of Databases & Servers • Virtualization • Elasticity, Agility • Service Oriented Architecture • Reduction in storage. Source: Experiences with Exadata and Oracle Spatial at US Census Geography Division, May 2012 - Oracle Spatial User Conference
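To put the "growing at 10-15% annually" figure in perspective, here is a minimal sketch of the compound-growth arithmetic. It assumes a constant annual rate, which the slide does not state; the function names are illustrative only.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years for a dataset to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


def projected_size(initial_size: float, annual_growth_rate: float, years: int) -> float:
    """Size after `years` of compound growth, in the same unit as `initial_size`."""
    return initial_size * (1 + annual_growth_rate) ** years


if __name__ == "__main__":
    # The two rates are the bounds quoted on the slide; the 5-year horizon is an assumption.
    for rate in (0.10, 0.15):
        print(
            f"{rate:.0%} per year: doubles in about {doubling_time_years(rate):.1f} years, "
            f"grows by a factor of {projected_size(1.0, rate, 5):.2f} in 5 years"
        )
```

Under that assumption, a database growing at these rates doubles roughly every five to seven years, which is the planning horizon the storage and consolidation bullets respond to.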
Systems Architecture Matters • Standard Materials • Different Results
Simplification + Control = Agility • Engineered Systems: Integrated, Tuned, Optimized, Identical • 1 Engineered System • Unpack to production in hours • Other Vendors: 160+ Separate Parts • Months from start to production. Stack layers shown: Applications & Middleware, OS Layer, Compute & Virtualization Layer, I/O Layer, Storage Layer. Component labels shown: OS, Kernel, Patch Level?, RAM, Flash, Rack, CPU, Blade, InfiniBand, FC, Ethernet, SSD, NAS, HDD?, SAN
Engineered Systems Design Principles • Scalable Compute Grid Architecture • Scalable Storage Grid Architecture • No bottleneck: InfiniBand backbone • In-Memory Hierarchy • Oracle SW optimized for Oracle HW. Example: ¼ rack Exadata X2-2
Oracle SW optimized for Engineered Systems: Key Exadata Innovation
Seamless Scalability & Future Upgrades Start Small. No Limits. Easy Upgrade. Existing Data Multi-Rack Full Rack Half Rack
Engineered Systems: Hardware + Software engineered to work together. Exalytics, Exadata Database Machine, Exalogic Elastic Cloud, Big Data Appliance, SPARC SuperCluster • Reduced change management risk • Better reliability and one-stop support • Extreme performance • Expedited time to value • Easier to manage and upgrade • Lower cost of ownership
Engineered Systems in Geoscience Landscape. Diagram labels: Citizens, Government Ministries, Web Client, MapViewer Server, Spatial Database, Analytic Server, Discovery, Unstructured Data. Engineered systems overlaid on the diagram: Exadata / SuperCluster, Big Data Appliance, Exalytics
Validation Of Home Appraisals: Exadata X2-2 Results • Validate home appraisals for a Government Sponsored Enterprise • Requirement: find all the parcels touching parcels to validate appraisals • Processed 2,018,429 parcels • Exadata X2-2 Half Rack: • Serially: 38.25 minutes • In parallel on 48 cores: 50 seconds (45x faster) • Exadata X2-2 Full Rack (96 cores): about 90x faster • Exadata X2-8 (128 cores): even faster
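The speedup figures on this slide follow from simple arithmetic on the quoted timings. The sketch below reproduces that calculation and adds parallel efficiency (speedup divided by core count), a derived metric that is not on the slide; the function names are illustrative.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """How many times faster the parallel run is than the serial run."""
    return serial_seconds / parallel_seconds


def parallel_efficiency(observed_speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return observed_speedup / cores


if __name__ == "__main__":
    serial_seconds = 38.25 * 60   # 38.25 minutes, as quoted for the serial run
    parallel_seconds = 50.0       # 50 seconds, as quoted for 48 cores

    half_rack = speedup(serial_seconds, parallel_seconds)
    print(f"Half rack, 48 cores: {half_rack:.1f}x, "
          f"efficiency {parallel_efficiency(half_rack, 48):.0%}")

    # The full-rack figure is quoted only as "about 90x" on 96 cores.
    print(f"Full rack, 96 cores: efficiency about {parallel_efficiency(90, 96):.0%}")
```

The quoted timings work out to a little under 46x, consistent with the rounded "45x" on the slide, and both configurations stay above 90% of linear scaling.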
Customer Experience Results – In production for Spatial dataset • Santos Oil and Gas
Customer Experience: Oracle Spatial on Exadata • US Census Bureau • Out of the box solution helped advance schedule by months • One vendor, facilitated one comprehensive solution • Larger queue sizes for batch jobs without cache fusion • Reduction in overall calendar time for projects • DSF Refresh: 98% completed in 6 days versus 3 weeks • Benchmarking progressing at <50% legacy time. Source: Experiences with Exadata and Oracle Spatial at US Census Geography Division, May 2012 - Oracle Spatial User Conference
Big Data Weak Signals Detection: Text Analysis Example Results • Document Text Analysis • Addressing a cost of up to $400M for unstructured document management at big institutions • Result of using Oracle Big Data Appliance with Synthesis software vs. 300 employees to manage 4 to 5 million unstructured documents per year
New Tools for Information Discovery: Oracle Exalytics & Endeca. Rapid, intuitive exploration and analysis of data from any combination of structured and unstructured sources • Benefits • Unprecedented Information Visibility • Leverage Existing BI Investments • Self-Service Data Discovery • Reduced IT Costs, Better Business Decisions • Unique Features • Contextual Search, Navigation, Analytics • Dynamic Data and Metadata • Content Acquisition and Text Enrichment • In-Memory Performance. Diagram labels: Unstructured Analytics, Visualization
Building your big data architecture Gradually Extending your Existing Architecture for Big Data: • Step 1: Further Analyze Current Data • Step 2: Architect for Data Variety and Volume • Step 3: Discover New Patterns • Step 4: Architect for Data Velocity Agility & Value Increase
Step 0: Data Foundation • Dashboard • Ad-Hoc Query. Diagram labels: Oracle Database, Oracle BI Enterprise Edition, High Density Data. Stages: Acquire, Organize, Analyze, Decide
Step 1: Deep Analysis of Current Data • Dashboard • Ad-Hoc Query • Trends • Locality. Diagram labels: Oracle Database, Oracle BI Enterprise Edition, Spatial and Graph, Advanced Analytics, High Density Data. Stages: Acquire, Organize, Analyze, Decide
Step 2: Architect for Volume and Variety • Dashboard • Ad-Hoc Query • Trends • Locality • Relationship • Sensors. Diagram labels: Oracle Database, Oracle BI Enterprise Edition, Spatial and Graph, Advanced Analytics, Hadoop, Data Storm, Aggregate / Pre-Analyze, Low Density Variable Data, High Density Data. Stages: Acquire, Organize, Analyze, Decide
Step 3: Discover New Information • Dashboard • Ad-Hoc Query • Trends • Locality • Relationship • Sensors • Discover. Diagram labels: Endeca Information Discovery, Oracle Database, Oracle BI Enterprise Edition, Spatial and Graph, Advanced Analytics, Hadoop, Data Storm, Aggregate / Pre-Analyze, Low Density Variable Data, High Density Data. Stages: Acquire, Organize, Analyze, Decide
Step 4: Architect for Velocity • Dashboard • Ad-Hoc Query • Trends • Locality • Relationship • Sensors • Discover • Recommend • Act. Diagram labels: Event Processing, Real Time Decisions, Endeca Information Discovery, Oracle Database, Oracle BI Enterprise Edition, Hadoop, Spatial and Graph, Advanced Analytics, Aggregate / Pre-Analyze, Model, Streaming Data, Data Storm, High Density Data, Low Density Batch Data. Stages: Acquire, Organize, Analyze, Decide, Act
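The Acquire, Organize, Analyze and Decide stages that run through the step slides can be pictured as a chain of small functions. The sketch below is purely illustrative: the "sensor_id,value" record format, the two-standard-deviation rule for spotting a weak signal, and every function name are assumptions for the example, not part of any Oracle product.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, Iterator, List, Tuple

Reading = Tuple[str, float]  # (sensor_id, value): hypothetical record shape


def acquire(raw_lines: Iterable[str]) -> Iterator[Reading]:
    """Acquire: parse raw 'sensor_id,value' lines, skipping malformed ones."""
    for line in raw_lines:
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue
        try:
            yield parts[0], float(parts[1])
        except ValueError:
            continue


def organize(readings: Iterable[Reading]) -> Dict[str, List[float]]:
    """Organize: group the values by sensor."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for sensor_id, value in readings:
        grouped[sensor_id].append(value)
    return dict(grouped)


def analyze(grouped: Dict[str, List[float]], threshold: float = 2.0) -> List[Reading]:
    """Analyze: flag values more than `threshold` standard deviations from the sensor mean."""
    flagged: List[Reading] = []
    for sensor_id, values in grouped.items():
        if len(values) < 2:
            continue
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # constant signal, nothing stands out
        flagged.extend((sensor_id, v) for v in values if abs(v - mu) / sigma > threshold)
    return flagged


def decide(flagged: Iterable[Reading]) -> List[str]:
    """Decide: turn flagged readings into recommended actions."""
    return [f"inspect {sensor_id}: unusual reading {value}" for sensor_id, value in flagged]


if __name__ == "__main__":
    raw = ["s1,10"] * 9 + ["s1,50", "s2,7", "s2,7", "not a record"]
    for action in decide(analyze(organize(acquire(raw)))):
        print(action)
```

Keeping each stage a separate function mirrors the way the slides add one capability per step without reworking the earlier ones.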
In-House AND Cloud: BOTH. Diagram labels: Internal GIS (structured data); Google Map API; Open Data; internal scientific reports, documents, … (unstructured data); CyberGIS (yours?); Weak Signals Analysis platform (recurrent); Weak Signals Analysis (once)
Oracle Engineered Systems: Simplify IT – Simplify Big Data • Dashboard • Ad-Hoc Query • Trends • Locality • Relationship • Sensors • Discover • Recommend • Act. Diagram labels: Oracle Big Data Appliance, Oracle Exadata & SuperCluster, Oracle Exalytics, Oracle RTD, InfiniBand links between the systems. Stages: Acquire, Organize, Analyze, Decide
Parallel Query And Spatial Operators: Exadata X2-2 Results • On Exadata Half Rack: • 34.75 hours serially vs. 41.1 minutes in parallel • 48 database cores: 47x faster • On Exadata X2-2 Full Rack: • 96 database cores: about 94x faster • Exadata X2-8 (128 cores): even faster
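One way to read the "47x on 48 cores" and "about 94x on 96 cores" figures is through Amdahl's law, which relates speedup to the fraction of work that cannot be parallelized. The slides do not use this model; it is brought in here as an assumption, and the function names are illustrative.

```python
def serial_fraction(observed_speedup: float, cores: int) -> float:
    """Serial fraction implied by an observed speedup under Amdahl's law.

    Solves speedup = 1 / (f + (1 - f) / cores) for f.
    """
    return (cores / observed_speedup - 1) / (cores - 1)


def amdahl_speedup(serial_frac: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_frac + (1.0 - serial_frac) / cores)


if __name__ == "__main__":
    # 47x on 48 cores is the half-rack figure quoted on the slide.
    f = serial_fraction(observed_speedup=47, cores=48)
    print(f"Implied serial fraction: {f:.5f}")
    print(f"Predicted speedup on 96 cores: {amdahl_speedup(f, 96):.1f}x")
```

Under this model the half-rack result implies a serial fraction well below 0.1%, and the prediction for 96 cores comes out at roughly 92x, in the same range as the "about 94x" reported for the full rack.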
Parallel Pipelined Table Function: Exadata X2-2 Results • Exadata X2-2 ¼ Rack (24 cores) • Batch geocoding: 1365/second • Batch reverse geocoding: 3388/second • Batch DEM get_cell_value raster lookups: 8951/second
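The idea behind a parallel pipelined function (split the input rows across workers and hand results back as they are produced instead of materializing everything first) can be sketched outside the database. This is an analogy in plain Python, not Oracle's implementation: `fake_geocode` is a hypothetical stand-in for a real geocoder, and the worker count and batch size are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Coordinate = Tuple[float, float]


def chunks(rows: Iterable[str], size: int) -> Iterator[List[str]]:
    """Split an input stream into batches of at most `size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def fake_geocode(address: str) -> Coordinate:
    """Hypothetical geocoder: derives a deterministic pseudo-coordinate from the text."""
    h = sum(ord(c) for c in address)
    return (h % 180 - 90.0, h % 360 - 180.0)


def geocode_batch(batch: List[str]) -> List[Tuple[str, Coordinate]]:
    """Work done by one worker on one batch."""
    return [(address, fake_geocode(address)) for address in batch]


def parallel_pipelined(rows: Iterable[str], workers: int = 4,
                       batch_size: int = 100) -> Iterator[Tuple[str, Coordinate]]:
    """Yield geocoded rows batch by batch, in input order, as each batch finishes.

    Note: Executor.map submits all batches up front, so only the output side streams.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(geocode_batch, chunks(rows, batch_size)):
            yield from result


if __name__ == "__main__":
    addresses = [f"{i} Main St" for i in range(250)]
    for address, coord in islice(parallel_pipelined(addresses), 3):
        print(address, coord)
```

The consumer starts receiving rows after the first batch completes, which is the property that lets a pipelined function feed a downstream query without waiting for the whole batch job.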
Exadata Hybrid Columnar Compression and Spatial • Point data
Exadata Hybrid Columnar Compression and Spatial • Line / Polygon data
Discovery Application Lifecycle: Building applications in days, not months • Diverse and changing information (structured, semi-structured, unstructured) integrated and enriched via ETL • Automatically unified in Oracle Endeca Server – no predefined model required • Interactive search, navigation and visualization for exploration and analysis • Drag-and-drop application composition in Studio • Iterate