User-mode I/O in Oracle 10g with ODM and DAFS

Session id: 36777. Jeff Silberman, Systems Architect, Network Appliance. Margaret Susairaj, Server Technologies, Oracle Corp.

Presentation Transcript


  1. Session id: 36777 User-mode I/O in Oracle 10g with ODM and DAFS Jeff Silberman Systems Architect Network Appliance Margaret Susairaj Server Technologies Oracle Corp

  2. Agenda • The Transportation Revolution • Concepts: RDMA, DAT, DAPL, DAFS • RDMA and Oracle 10g • The DAFS API: User-mode I/O and OS bypass • ODM : The File I/O API for Oracle 10g • Oracle 10g RAC and InfiniBand • Performance • Summary, Q&A

  3. The Transportation Revolution • “dumb” networks vs. reliable data movers • Data copies vs. RDMA • Ethernet vs. InfiniBand • Kernel mode I/O vs. User-mode I/O • Unix I/O vs. ODM

  4. Concepts • Remote Direct Memory Access (RDMA) • Direct Access Transports (DAT) • Direct Access Provider Library (DAPL) • Direct Access File System (DAFS)

  5. RDMA • Memory to memory access over a network • Requires both intelligent transports and intelligent network interface cards (NICs) • Cannot be done over “standard” Gigabit Ethernet • Operations defined with respect to the server • Examples: • FC/VI, GbE/VI, DAPL/IB

  6. Direct Access Transports (DAT) • Both RDMA read and RDMA write operations supported • Multiple concurrent virtual connections • Asynchronous I/O • Direct Data Placement • Kernel Bypass • DAT is transport agnostic

  7. Direct Access Provider Library (DAPL) • Standards-based API for DAT • DAT Collaborative: Over 40 companies including both Oracle and IBM • Designed to facilitate higher-level RDMA protocols • Examples: DAFS, Oracle RAC • DAPL “providers” are typically the NIC providers • A portable API for RDMA transports • uDAPL for user-level access • kDAPL for kernel-based access

  8. Direct Access File System (DAFS) • DAFS is a remote file access protocol • DAFS derives heavily from NFSv4 • Target is local data-center file sharing • Ideal cluster file system for RAC • Rich set of Oracle-inspired semantics • Will always perform better than TOEs (TCP offload engines) • Zero touch, zero data copy

  9. Oracle 10g and RDMA [Diagram labels: 10g SGA, Buffers; Oracle Disk Manager (Oracle File I/O API); InfiniBand Adapter; DAFS File Server with Buffers, DAFS Engine, InfiniBand Adapter; RDMA NIC (RNIC); paths labeled "Control" and "Direct Data"]

  10. Oracle 10g and RDMA [Same diagram, adding: DAFS API (DAFS user-level I/O library)]

  11. Oracle 10g and RDMA [Same diagram, adding: DAT (DAT library vector)]

  12. Oracle 10g and RDMA [Same diagram, adding: Direct Access Provider Libraries (several DAPL Providers)]

  13. Oracle 10g and RDMA [Same diagram, adding: Transport-specific Device Drivers (several HCA Drivers)]

  14. Oracle 10g and RDMA • Low latency • High Bandwidth • Memory to memory transfer • Minimal CPU intervention • User-mode I/O • Storage I/O requests • Data block transfers for cache fusion • Lock request messages • Parallel Query internode messages

  15. DAFS API: User-Mode I/O • Memory Registration • Asynchronous I/O • Security / Authentication • I/O Fencing • I/O Completion Groups • Multi-path I/O
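The slide lists asynchronous I/O and I/O completion groups only as feature names. As a rough illustration of the idea, the sketch below groups asynchronous requests under a tag and waits for one group at a time. It is not the DAFS API: `CompletionGroups`, `fake_read`, and the group names are hypothetical, and a thread pool stands in for the asynchronous transport.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor, wait


class CompletionGroups:
    """Hypothetical helper: tags async requests and waits per tag."""

    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._groups = defaultdict(list)

    def submit(self, group, fn, *args):
        # Issue the request without blocking and remember which group owns it.
        future = self._pool.submit(fn, *args)
        self._groups[group].append(future)
        return future

    def wait_group(self, group):
        # Block until every request in this group has completed,
        # then return the results in submission order.
        futures = self._groups.pop(group, [])
        wait(futures)
        return [f.result() for f in futures]

    def close(self):
        self._pool.shutdown()


def fake_read(data, offset, length):
    # Stand-in for a storage read (assumption: an in-memory "file").
    return bytes(data[offset:offset + length])


datafile = bytes(range(256))
groups = CompletionGroups()
for offset in (0, 64, 128):
    groups.submit("datafile", fake_read, datafile, offset, 4)
groups.submit("redo", fake_read, datafile, 250, 4)

datafile_blocks = groups.wait_group("datafile")
redo_blocks = groups.wait_group("redo")
groups.close()
```

Waiting per group lets a caller collect the completions it cares about without polling each request individually.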

  16. DAFS Implementation Models [Diagram: three models compared along the axes "Application Transparency" and "Performance". Kernel File System: Application (unchanged), Buffers, File I/O Syscalls, File System, DAFS Layer, DA Provider Library, Adapter Driver, HBA/HCA. Raw Device Driver: Application (unchanged), Buffers, Disk I/O Syscalls, Device Driver, DAFS Layer, DA Provider Library, Adapter Driver, HBA/HCA. User Library: Application (modified), Buffers, uDAFS Library, DA Provider Library, Adapter Driver, HBA/HCA. A boundary separates User Space from the OS Kernel.]

  17. Oracle 10g • Grid-based computing • Easily scale the number of servers • Easily scale the storage • Easily share all resources • Ease of Manageability • Improved Performance Capability • Support for new technologies

  18. Oracle Disk Manager (ODM) • The File I/O API for Oracle • Performance of Raw Disk with the Manageability of Files

  19. Oracle Disk Manager (ODM): Problem / Solution • Problem: No consistent standard I/O interfaces. I/O interfaces vary with each operating system variant. Solution: The ODM API semantics are invariant across all OS platforms, including Windows. • Problem: No standard asynchronous I/O model for regular files. Asynchronous I/O, if it was provided, relied on special kernel-based device drivers. Solution: ODM supports both synchronous and asynchronous I/O for any regular files in an ODM file system. • Problem: No standard for batching I/O requests within a single I/O call. Solution: The odm_io() function provides batch I/O capability, which minimizes the number of system calls and kernel traps. • Problem: Excess system resources consumed when each process in an Oracle instance must open each datafile in the instance. Solution: ODM provides shared file identifiers. A given file-id can be used by any process in the instance, thereby reducing the number of opens, instance wide.
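The batching point can be made concrete with a small simulation. This is only a sketch of why batching reduces boundary crossings. It does not reproduce the real odm_io() signature, which the slides do not give. `IoRequest`, `CountingBackend`, and both submit helpers are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class IoRequest:
    file_id: int
    offset: int
    length: int


@dataclass
class CountingBackend:
    """Hypothetical stand-in for the user/kernel boundary; counts crossings."""
    calls: int = 0
    completed: list = field(default_factory=list)

    def submit(self, requests):
        # One crossing per call, however many requests the call carries.
        self.calls += 1
        self.completed.extend(requests)


def submit_one_by_one(backend, requests):
    # Unbatched model: each request costs its own call.
    for request in requests:
        backend.submit([request])


def submit_batched(backend, requests):
    # Batched model: all requests travel in a single call.
    backend.submit(list(requests))


requests = [IoRequest(file_id=7, offset=i * 8192, length=8192) for i in range(16)]

unbatched = CountingBackend()
submit_one_by_one(unbatched, requests)

batched = CountingBackend()
submit_batched(batched, requests)

print(unbatched.calls, batched.calls)  # prints "16 1"
```

Both backends complete the same 16 requests; only the number of calls differs.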

  20. ODM Advanced File Semantics • Open with ‘share’ key • Files not visible until file is initialized • Files cannot be deleted if open references exist

  21. ODM version 2 • Zero data copy • Zero touch of data, from storage to SGA • Memory registration • User-mode I/O : Reduced context switches • NIC provisioning • I/O hints and priorities • Non-shared file ids • Same semantics as with Unix file descriptors • Portability • Advanced semantics are invariant across platforms
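To illustrate the difference between copying data through a temporary buffer and placing it directly into a buffer that was set up once in advance, here is a loose Python analogy. It is not ODM or RDMA code: the in-memory `storage`, the `buffer_cache` pool, and both read functions are assumptions made for the example, and `readinto` still performs one copy inside the interpreter.

```python
import io

BLOCK = 8192
NUM_BLOCKS = 4

# Assumption: an in-memory byte stream stands in for remote storage.
storage = io.BytesIO(bytes(i % 251 for i in range(NUM_BLOCKS * BLOCK)))

# Allocated once up front, loosely analogous to registering memory once.
buffer_cache = bytearray(NUM_BLOCKS * BLOCK)
slots = memoryview(buffer_cache)


def read_block_with_copy(block_no):
    # Data lands in a temporary bytes object, then is copied into the cache.
    storage.seek(block_no * BLOCK)
    temp = storage.read(BLOCK)
    slots[block_no * BLOCK:(block_no + 1) * BLOCK] = temp
    return len(temp)


def read_block_direct(block_no):
    # Data is placed straight into the pre-allocated slot, with no temporary.
    storage.seek(block_no * BLOCK)
    return storage.readinto(slots[block_no * BLOCK:(block_no + 1) * BLOCK])


copied = read_block_with_copy(0)
placed = read_block_direct(1)
```

The direct path avoids allocating and filling an intermediate object per read, which is the cost that the "zero data copy" bullet is concerned with.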

  22. Oracle 10g RAC [Diagram: a Data Center in which Application Servers (facing the Internet), Oracle 10g RAC Servers, and File Storage are connected through InfiniBand Switches. Caption: Redundant paths for high availability or load balancing]

  23. Performance • Thanks to Ariel Cohen from Topspin* Communications • One client / one server • 1.8 GHz Xeon CPU • 133 MHz PCI-X bus • 4x IB HCA (10 Gb/s) • Gigabit Ethernet w/ checksum offload support • Jumbo frame size of 9000 • Red Hat Linux 7.3 • *Ariel Cohen. “A Performance Analysis of 4X InfiniBand Data Transfer Operations”. Proceedings of the International Parallel and Distributed Processing Symposium, Workshop on Communication Architecture for Clusters, April 2003
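The test-bed figures can be put side by side with some back-of-the-envelope arithmetic. This is a sketch under stated assumptions: the 10 Gb/s figure for a 4x link is the signalling rate of four 2.5 Gb/s lanes with 8b/10b encoding, and the PCI-X bus is assumed to be 64 bits wide, which the slide does not state.

```python
# 4x InfiniBand link: four lanes at 2.5 Gb/s signalling each.
IB_LANES = 4
IB_LANE_SIGNAL_GBPS = 2.5
ib_signal_gbps = IB_LANES * IB_LANE_SIGNAL_GBPS   # raw signalling rate

# 8b/10b encoding carries 8 data bits in every 10 bits on the wire.
ib_data_gbps = ib_signal_gbps * 8 / 10

# PCI-X at 133 MHz; the 64-bit width is an assumption, not from the slide.
PCIX_MHZ = 133
PCIX_WIDTH_BITS = 64
pcix_peak_gbps = PCIX_MHZ * PCIX_WIDTH_BITS / 1000

# Gigabit Ethernet data rate, for comparison.
gbe_gbps = 1.0

print(f"4x IB signalling : {ib_signal_gbps:.1f} Gb/s")
print(f"4x IB data rate  : {ib_data_gbps:.1f} Gb/s")
print(f"PCI-X peak       : {pcix_peak_gbps:.2f} Gb/s")
print(f"IB data / GbE    : {ib_data_gbps / gbe_gbps:.0f}x")
```

Under these assumptions the host bus peak sits close to the link's usable data rate, so the bus is worth keeping in mind when reading throughput results from this configuration.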

  24. Performance

  25. Performance

  26. NFS and RDMA

  27. Evolution and Revolution • Hungry apps and databases must look elsewhere for extra CPU power • OS bypass for I/O • High performance transports are here today • InfiniBand offers 10 Gb/s w/ 10 µs latency • Unix and Windows do not provide user-level I/O • The DAFS API does • Oracle 10g RAC w/ a single pipe • Both RAC/IPC and user-level file I/O over one IB pipe • “Please keep your seatbelts fastened …”

  28. Next Steps: High Availability Sessions from Oracle • Tuesday in Moscone Room 304 • 11:00 AM How Oracle Database 10g Revolutionizes Availability and Enables the Grid • 3:30 PM Oracle Recovery Manager (RMAN) 10g: Reloaded • 5:00 PM Proven Techniques for Maximizing Availability • Wednesday in Moscone Room 304 • 8:30 AM Oracle Database 10g - RMAN and ATA Storage in Action • 11:00 AM Oracle Data Guard: Maximum Data Protection at Minimum Cost • 1:00 PM Oracle Database 10g Time Navigation: Human-Error Correction • 4:30 PM Data Guard SQL Apply: Back to the Future • For More Info On Oracle HA Go To http://otn.oracle.com/deploy/availability/

  29. Next Steps: High Availability Sessions from Oracle • Thursday • 8:30 AM in Moscone Room 304: Oracle Database 10g Data Warehouse Backup and Recovery: Automatic, Simple, Reliable • 8:30 AM in Moscone Room 104: Building RAC Clusters over InfiniBand • Database HA Demos All Four Days In The Oracle Demo Campground • Real Application Clusters • Data Guard • Database Backup & Recovery • Flashback Recovery • LogMiner, Online Redefinition, and Cross Platform Transportable Tablespaces • For More Info On Oracle HA Go To http://otn.oracle.com/deploy/availability/

  30. Reminder: please complete the OracleWorld online session survey. Thank you.

  31. Questions & Answers