IO Memory Management Hardware Goes Mainstream

IO Memory Management Hardware Goes Mainstream. Mark Hummel, AMD Fellow, Computation Products Group, AMD (Mark.Hummel@amd.com). Session overview: benefits and function; topology and features; translation data structures; software interface.


Presentation Transcript


  1. IO Memory Management Hardware Goes Mainstream. Mark Hummel, AMD Fellow, Computation Products Group, AMD. Mark.Hummel@amd.com

  2. Session Overview • Benefits and Function • Topology and Features • Translation Data Structures • Software Interface

  3. Function Of An IOMMU: What does it do? • Translates requests that come from all devices regardless of target • Enforces access rights of devices to system address space • Page-granular protection • Separate read and write access rights • Maintains a cache of translations • Root of a distributed caching hierarchy of address translations

  4. Function Of An IOMMU: What does it not do? • Does not translate CPU-originated traffic • The processor's traffic is translated by the CPU's MMU • Does not directly support demand-paged IO • Devices and drivers are not designed to deal with arbitrary delays • Devices and drivers don't understand the concept of an "IO Page Fault" • Support for remote address translation extensions enables indirect, device-specific demand paging

  5. System Topology: Where is the IOMMU? [Figure: system topology diagram showing CPUs, DRAM, HT links, an HT device, a tunnel, an IO hub, IOMMUs, PCIe bridges, PCI Express devices and switches, PCI, LPC, etc., and ATCs, including an optional remote ATC.] Legend: ATC = Address Translation Cache; HT = HyperTransport; PCIe = PCI Express

  6. System Topology • IOMMUs are at the edge of the system interconnection fabric • Full source identification is available • IOMMUs are distributed and independent • Creates scalable caching structures • IOMMU supports remote address translation caching extensions • Allows tuning of the caching hierarchy

  7. Benefits Of An IOMMU: Why have one? • Enhanced virtualization capabilities • Direct device assignment to Guest OS • Improved performance and scalability • Enables direct device access by user-mode applications

  8. Benefits Of An IOMMU: Direct Device Assignment Example. [Figure: two configurations. In one, the virtual device drivers in each Guest OS reach the device controllers through a virtual device emulator and device drivers in the Virtual Machine Monitor. In the other, the device drivers in each Guest OS reach the device controllers through the IOMMU.] Overhead is reduced in the path between Guest and Device.

  9. Benefits Of An IOMMU: Why have one? (continued) • Enhanced security capabilities • Adds precise device access control of address space • Creates IO protection domains • Enhanced system reliability • Isolation between devices • Protects system memory from errant device writes

  10. Benefits Of An IOMMU: Security and Isolation Example. [Figure: device controllers writing to I/O buffers in system memory, shown without and with an IOMMU. With the IOMMU, system memory is divided into Protection Domain 1 and Protection Domain 2, and a malicious or errant write from a device controller is blocked.]

  11. Benefits Of An IOMMU: Why have one? (continued) • Support for Trusted Input and Output • Creates a protected channel between a device and its driver

  12. Benefits Of An IOMMU: Trusted I/O Example. [Figure: applications and device drivers exchanging data with disk and graphics controllers through I/O buffers in system memory, shown without and with an IOMMU. Without it, content is captured by a 3rd party. With it, protected channels are formed and 3rd party access is blocked.]

  13. Benefits Of An IOMMU: Why have one? (continued) • Supports legacy 32-bit devices in large-memory systems • Eliminates bounce buffers

  14. Benefits Of An IOMMU: Bounce Buffer Example. [Figure: a disk controller limited to 32-bit addressing, shown without and with an IOMMU. Without it, the controller transfers data to a bounce buffer in the 0 - 4 GB range of system memory and the CPU must move the data to the I/O buffer above 4 GB. With it, the IOMMU translates the address so data can be placed directly in the I/O buffer above 4 GB.]

  15. Benefits Of An IOMMU: Why have one? (continued) • Synergy with PCI-SIG virtualization efforts • Address Translation Services (ATS) • Single-root device virtualization • Multi-root shared I/O fabric

  16. IOMMU Features • Variable per-device virtual address range • Variable per-device physical page size • Flexible virtual address space sharing options • Devices can have their own virtual address space • Devices can share a virtual address space • Can be utilized natively by an enhanced OS • Can be utilized by a virtual machine monitor

  17. Translation Data Structures: Definitions • Requester ID (RID): label identifying the source of a transaction • Address Translation Cache (ATC): local or remote coherent copy of address translations • I/O Translation Lookaside Buffer (IOTLB): a remote ATC that exists in a device associated with an IOMMU • Address Translation Services (ATS): extensions supporting remote caching of address translations

  18. Translation Data Structures: Definitions (continued) • Page Directory Entry (PDE): translation table entry that points at a table • Page Table Entry (PTE): translation table entry that contains a translation • Root translation table: translation table at the top of the translation hierarchy • Device Table: maps Requester ID to root translation table

  19. Translation Data Structures: Device Requests • Contain a Requester ID • Bus/Device/Function (BDF) used for PCI Express • Unit ID or BDF (with SRC ID extension) for HyperTransport • Extensions to support remote ATC • Untranslated (device virtual address) read or write • Default case • IOMMU will translate the address of the request • Translated (system physical address) read or write • IOMMU uses the address provided without translation • Translation request

  20. Translation Data Structures: Device Table • Single contiguous block of system memory • Maps Requester ID to a root translation table • Per-device virtual address space supported • Many-to-one mappings supported • Each device is assigned a Domain ID • Devices may share a Domain ID • IOMMU invalidations managed on a per-Domain basis
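
The lookup described on this slide can be sketched as simple index arithmetic. This is a minimal illustration, not the specification's definition: it assumes the Requester ID is the standard 16-bit PCI bus/device/function packing, that it indexes the table directly, and that each entry is 16 bytes (the 128-bit layout shown on the next slide). The names `requester_id` and `dte_address` are hypothetical.

```python
DTE_SIZE_BYTES = 16  # assumption: 128-bit entries, as drawn on the Device Table slide


def requester_id(bus: int, device: int, function: int) -> int:
    """Pack a PCI bus/device/function triple into a 16-bit Requester ID."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    return (bus << 8) | (device << 3) | function


def dte_address(device_table_base: int, rid: int) -> int:
    """Address of the Device Table entry for a Requester ID.

    Assumes the table is one contiguous block indexed directly by the RID.
    """
    if not 0 <= rid <= 0xFFFF:
        raise ValueError("Requester ID must fit in 16 bits")
    return device_table_base + rid * DTE_SIZE_BYTES
```

Under these assumptions a full table covering every possible 16-bit Requester ID occupies 65536 x 16 bytes = 1 MB, which is consistent with the table being a single contiguous block.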

  21. Translation Data Structures: Device Table entry format (128 bits)
      Bits 127:96: Control Bits and Reserved (bit labels on the slide: 127, 104, 103, 96)
      Bits 95:80: Reserved
      Bits 79:64: Domain ID [15:0]
      Bit 63: Res
      Bit 62: IW
      Bit 61: IR
      Bits 60:52: Reserved
      Bits 51:32: Page Table Root Pointer [51:32]
      Bits 31:12: Page Table Root Pointer [31:12]
      Bits 11:9: NL
      Bits 8:1: Reserved
      Bit 0: V
      Legend: V – valid bit; IW – I/O Write protection; IR – I/O Read protection; NL – next Level; Res – reserved
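
As a reading aid for the entry layout on this slide, the following sketch extracts the fields whose bit positions are legible in the transcript (V, NL, the root pointer, IR, IW, and the Domain ID). The control bits in the top word are left undecoded because their exact positions are not recoverable here. The class and function names are hypothetical, and the entry is treated as a single 128-bit integer purely for convenience.

```python
from typing import NamedTuple

ADDR_MASK_51_12 = 0x000F_FFFF_FFFF_F000  # bits 51:12


class DeviceTableEntry(NamedTuple):
    valid: bool        # V, bit 0
    next_level: int    # NL, bits 11:9
    root_pointer: int  # Page Table Root Pointer, bits 51:12 (4K aligned)
    io_read: bool      # IR, bit 61
    io_write: bool     # IW, bit 62
    domain_id: int     # Domain ID, bits 79:64


def decode_dte(raw: int) -> DeviceTableEntry:
    """Decode the legible fields of a 128-bit Device Table entry."""
    if not 0 <= raw < (1 << 128):
        raise ValueError("entry must be a 128-bit value")
    return DeviceTableEntry(
        valid=bool(raw & 1),
        next_level=(raw >> 9) & 0x7,
        root_pointer=raw & ADDR_MASK_51_12,
        io_read=bool((raw >> 61) & 1),
        io_write=bool((raw >> 62) & 1),
        domain_id=(raw >> 64) & 0xFFFF,
    )
```

Because the root pointer occupies bits 51:12, masking the low word returns the 4K-aligned address of the root translation table directly, with no shift needed.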

  22. Translation Data Structures: Page tables • Translation tables are always 4K-byte blocks in system memory • Root translation table base address comes from the Device Table • May point to either a table of PDEs or a table of PTEs • Intermediate translation tables • Point to either a table of PDEs or a table of PTEs

  23. Translation Data Structures: Page tables (entry formats)
      PDE format:
        Bit 63: Res; Bit 62: IW; Bit 61: IR; Bits 60:52: Res; Bits 51:32: Next Table Addr [51:32]
        Bits 31:12: Next Table Addr [31:12]; Bits 11:9: NL; Bits 8:1: Reserved; Bit 0: P
      PTE format:
        Bits 63:52: S, NS, U, IW, IR and Res bits (bit labels on the slide: 63, 62, 61, 60, 59, 58, 57, 52); Bits 51:32: Page Address [51:32]
        Bits 31:12: Page Address [31:12]; Bits 11:9: 000; Bits 8:1: Reserved; Bit 0: P
      Legend: IW – I/O Write protection; IR – I/O Read protection; NL – next Level; P – present; S – size; U – ATS attribute bit; NS – ATS attribute bit; Res – reserved

  24. Translation Data Structures: Simplified View 1) The IOMMU receives a request, but the translation is not cached in the ATC. 2) The Requester ID from the device request is used to select the root translation table. 3) The address from the device request is used to walk the page tables. 4) The ATC is refilled and the device request is satisfied. [Figure: a device request carrying a Requester ID and a "virtual" address enters the IOMMU; the Device Table Base and the Requester ID index the Device Table, whose entry points to the page tables; the ATC supplies the "translated" address.]
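
The four steps above can be sketched as a small cache-in-front-of-a-walker model. This is only an illustration under stated assumptions: the ATC is modeled as a dictionary keyed by (Requester ID, virtual page number), the Device Table is reduced to a mapping from Requester ID to root table address, and the page walk is passed in as a callable. `SimpleIommu` and its members are hypothetical names, and real hardware would also check permissions and handle faults.

```python
from typing import Callable, Dict, Tuple

PAGE_SHIFT = 12  # 4K base page


class SimpleIommu:
    def __init__(self, device_table: Dict[int, int],
                 walk: Callable[[int, int], int]) -> None:
        self.device_table = device_table  # Requester ID -> root table address
        self.walk = walk                  # (root, virtual page) -> physical page
        self.atc: Dict[Tuple[int, int], int] = {}
        self.walk_count = 0               # number of page walks performed

    def translate(self, requester_id: int, vaddr: int) -> int:
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        key = (requester_id, vpn)
        if key not in self.atc:                      # 1) ATC miss
            root = self.device_table[requester_id]   # 2) select root table
            ppn = self.walk(root, vpn)               # 3) walk page tables
            self.walk_count += 1
            self.atc[key] = ppn                      # 4) refill the ATC
        return (self.atc[key] << PAGE_SHIFT) | offset
```

A second request to the same page from the same device is served from the ATC without another walk, which is the behavior the `walk_count` counter makes visible.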

  25. Translation Data Structures: Advanced capabilities • Support for 64-bit device virtual addresses • Requires a 6-level lookup • Support for variable page sizes • All powers of 2 from 4K up • IOMMU tables may be shared with the CPU MMU • Can be efficiently virtualized

  26. Translation Data Structures: Advanced capabilities (continued) • Configurable maximum table depth • If the virtual address has a group of leading zeros, the lookup depth may be reduced • Level Skipping • If the virtual address has interior groups of zeros, lookup levels may be skipped • Early Exit • Exit is possible at any level, with the remaining untranslated address bits used as an offset within a "super page" • Base is 4K; super pages are at 2M (2^21), 1G (2^30), 512G (2^39), and 2^48

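  27. Translation Data Structures: Example with level skipping
      Virtual address fields, high to low (bit labels on the slide: 63, 58, 57, 48, 47, 39, 38, 30, 29, 21, 20, 0):
        0000000b (1) | 000000000b (1) | Level-4 Page Table Offset (9 bits) | 000000000b (1) | Level-2 Page Table Offset (9 bits) | Physical Page Offset (21 bits)
      Walk:
        Device Table entry (bit labels 63, 52, 51, 12, 11, 9, 8, 0): Starting Level = 4h, Level 4 Page Table Address
        Level-4 Table: PDE with next level 2h gives the 52-bit physical address of the Level-2 Table (levels skipped)
        Level-2 Table: PTE with next level 0h gives the 52-bit physical address of a 2 MB page (final; level 1 skipped; 2M super page)
      (1) The virtual address bits associated with all skipped levels must be zero.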
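
The example on this slide can be traced with a small software model of the walk. This is a sketch under assumptions, not the hardware algorithm: each level above the 12-bit page offset is taken to index 9 bits of the virtual address, entries are 8 bytes with P in bit 0, the next-level field in bits 11:9, and the address in bits 51:12, and system memory is modeled as a dictionary from address to 64-bit entry. `TranslationFault` and `walk` are hypothetical names, and permission bits are ignored.

```python
from typing import Dict

ADDR_MASK = 0x000F_FFFF_FFFF_F000  # entry bits 51:12


class TranslationFault(Exception):
    """Raised when the modeled walk cannot produce a translation."""


def level_shift(level: int) -> int:
    """Lowest virtual address bit indexed by a table at this level."""
    return 12 + 9 * (level - 1)


def walk(memory: Dict[int, int], root: int, start_level: int, vaddr: int) -> int:
    # Leading levels above the starting level are skipped: their bits must be zero.
    if vaddr >> level_shift(start_level + 1):
        raise TranslationFault("non-zero bits above the starting level")
    table, level = root, start_level
    while True:
        index = (vaddr >> level_shift(level)) & 0x1FF
        entry = memory.get(table + 8 * index, 0)
        if not entry & 1:
            raise TranslationFault("entry not present")
        next_level = (entry >> 9) & 0x7
        addr = entry & ADDR_MASK
        if next_level == 0:
            # Early exit: remaining untranslated bits are the offset in the page.
            return addr | (vaddr & ((1 << level_shift(level)) - 1))
        if next_level >= level:
            raise TranslationFault("next level must be lower than current level")
        # Interior levels between here and next_level are skipped: bits must be zero.
        skipped = (vaddr >> level_shift(next_level + 1)) & (
            (1 << (level_shift(level) - level_shift(next_level + 1))) - 1)
        if skipped:
            raise TranslationFault("non-zero bits in a skipped level")
        table, level = addr, next_level
```

With a starting level of 4, a PDE whose next level is 2, and a PTE whose next level is 0, the model consumes bits 47:39 and 29:21 as table indexes and uses bits 20:0 as the offset into a 2 MB page, matching the slide's example.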

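  28. Software Interface: Control Structures • Command Queue • Event Queue [Figure: three IOMMU registers (Cmd Buffer base register, Device Table base register, Event Log base register) locate the Command Queue, the Device Table, and the Event Log in system memory; the Device Table in turn refers to the I/O Page Tables.]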

  29. Software Interface: Control Structures (continued) • Command Queue: circular ring buffer in system memory; low insertion overhead; processed at IOMMU service rate; 16-byte command entries; maximum size is 512 KB • Event Log: circular ring buffer in system memory; low removal overhead; processed at CPU service rate; 16-byte log entries; maximum size is 512 KB

  30. Software Interface: Command queue • Tail Pointer is incremented by the CPU after writing a command • Tail Pointer write signals the IOMMU that a new command is ready • Head Pointer is incremented by the IOMMU after reading a command [Figure: system software (producer) writes 16-byte entries (offsets +0 through +112 shown) into the circular command buffer in system memory and writes the IOMMU registers; the IOMMU (consumer) reads the buffer. IOMMU registers: buffer base and buffer size (MMIO Offset 0008h), head pointer (MMIO Offset 2000h), tail pointer (MMIO Offset 2008h), status register (MMIO Offset 2020h).]
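
The producer/consumer protocol on this slide can be modeled in a few lines. This is a software sketch only: head and tail are kept as byte offsets into the buffer, the MMIO tail pointer write is represented by a plain assignment, and the queue is treated as full when advancing the tail would make it equal to the head (an assumption made here so that "empty" and "full" stay distinguishable). `CommandRing` and its methods are hypothetical names. With 16-byte entries, the 512 KB maximum from the previous slide works out to 32768 slots.

```python
from typing import Optional

ENTRY_SIZE = 16              # bytes per command entry
MAX_RING_BYTES = 512 * 1024  # maximum size stated in the talk


class CommandRing:
    def __init__(self, size_bytes: int) -> None:
        if size_bytes <= 0 or size_bytes % ENTRY_SIZE or size_bytes > MAX_RING_BYTES:
            raise ValueError("size must be a multiple of 16 bytes, at most 512 KB")
        self.buf = bytearray(size_bytes)
        self.head = 0  # advanced by the consumer (the IOMMU)
        self.tail = 0  # advanced by the producer (system software)

    def submit(self, command: bytes) -> bool:
        """Producer side: write one command, then advance the tail pointer."""
        if len(command) != ENTRY_SIZE:
            raise ValueError("commands are 16 bytes")
        next_tail = (self.tail + ENTRY_SIZE) % len(self.buf)
        if next_tail == self.head:
            return False  # ring full (assumed convention: one slot stays empty)
        self.buf[self.tail:self.tail + ENTRY_SIZE] = command
        self.tail = next_tail  # stands in for the MMIO tail pointer write
        return True

    def fetch(self) -> Optional[bytes]:
        """Consumer side: read one command, then advance the head pointer."""
        if self.head == self.tail:
            return None  # ring empty
        command = bytes(self.buf[self.head:self.head + ENTRY_SIZE])
        self.head = (self.head + ENTRY_SIZE) % len(self.buf)
        return command
```

Writing the entry before moving the tail mirrors the ordering in the bullets: the tail pointer update is what tells the consumer that a complete command is available.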

  31. Software Interface: Commands • Invalidate Device Table Entry: indexed by Device ID • Invalidate IOMMU Pages: power-of-2, naturally aligned number of 4K pages; indexed by Domain • Invalidate IOTLB Pages: power-of-2, naturally aligned number of 4K pages; indexed by Device ID • Completion Wait: may be used as a fence; may be used to signal an interrupt; may be used to write a flag in system memory
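
Because the page invalidation commands take a power-of-2, naturally aligned number of 4K pages, software that wants to invalidate an arbitrary range has to round it to such a block. One simple way to do that is sketched below; `covering_block` is a hypothetical helper, and it picks a single block that covers the whole range rather than splitting the range across several commands.

```python
from typing import Tuple


def covering_block(first_page: int, page_count: int) -> Tuple[int, int]:
    """Smallest naturally aligned power-of-2 block of pages covering a range.

    Returns (base_page, block_pages), both in units of 4K pages.
    """
    if first_page < 0 or page_count < 1:
        raise ValueError("need a non-negative start and at least one page")
    last_page = first_page + page_count - 1
    size = 1
    # Grow the block until the first and last page fall in the same aligned block.
    while first_page // size != last_page // size:
        size *= 2
    return (first_page // size) * size, size
```

A single covering block can be much larger than the requested range when the range straddles a large alignment boundary, so issuing several smaller aligned invalidations may be preferable in that case.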

  32. Software Interface: Command ordering and semantics • IOMMU manages ordering interlocks • Invalidate Device Table commands will complete before subsequent Invalidate IOMMU Pages commands • Invalidate IOMMU Pages commands will complete before subsequent Invalidate IOTLB Pages commands • Completion semantics • Invalidation commands are complete when all overlapping DMA transactions that are in flight to system memory are either complete or visible • Completion is signaled when a Completion Wait command is executed: by interrupt or by a memory-based flag

  33. Software Interface: Event log • Tail Pointer is incremented by the IOMMU after writing an event • IOMMU can be configured to signal an interrupt when the event log is written • Head Pointer is incremented by the CPU after reading an event • Head Pointer write signals the IOMMU that the event has been consumed [Figure: the IOMMU (producer) writes 16-byte entries (offsets +0 through +112 shown) into the circular event log in system memory; system software (consumer) reads them. IOMMU registers: tail pointer [MMIO Offset 2018h], buffer base and buffer size [MMIO Offset 0010h], head pointer [MMIO Offset 2010h], status register.]
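
On the event log the roles are reversed: the IOMMU produces and system software consumes. A minimal sketch of the consumer side follows, assuming head and tail are byte offsets into a log whose size is a multiple of the 16-byte entry size. `drain_event_log` is a hypothetical helper; the returned head value stands in for the head pointer write that tells the IOMMU the entries have been consumed.

```python
from typing import List, Tuple

EVENT_SIZE = 16  # bytes per event log entry


def drain_event_log(log: bytes, head: int, tail: int) -> Tuple[List[bytes], int]:
    """Read every pending event between head and tail, wrapping at the end.

    Returns the events in order and the new head offset to write back.
    """
    if len(log) == 0 or len(log) % EVENT_SIZE:
        raise ValueError("log size must be a non-zero multiple of 16 bytes")
    if head % EVENT_SIZE or tail % EVENT_SIZE:
        raise ValueError("head and tail must be entry aligned")
    if not (0 <= head < len(log) and 0 <= tail < len(log)):
        raise ValueError("head and tail must lie inside the log")
    events = []
    while head != tail:
        events.append(bytes(log[head:head + EVENT_SIZE]))
        head = (head + EVENT_SIZE) % len(log)
    return events, head
```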

  34. Software Interface: Events • Translation events: Invalid Device Table Entry; IO Page Fault; Device Table HW Error; Page Table HW Error; Invalid Device Request • Command processing events: Command HW Error; Illegal Command; IOTLB Invalidate Timeout

  35. Software Interface: Exception Handling • Translation failure for any reason (e.g., I/O page faults or memory errors during page table walks): request is aborted; Completer Abort (CA) returned to the device where possible; details logged; interrupt is optionally generated • Command queue failure: processing is halted; details logged; interrupt is optionally generated

  36. Software Interface: OS/Hypervisor Interactions • Initialization: done via configuration and MMIO transactions; clear caches, set base address and size of domain tables, etc. • Runtime operations: device table updates and translation cache invalidations; combination of MMIO and DRAM accesses; MP support requires software-managed sharing of the command buffer; each IOMMU has a separate command and event queue • Virtualization of IOMMU: intercept MMIO pointer writes to the virtual IOMMU; process the virtual IOMMU command queue and update shadow tables; forward Invalidate commands to the real IOMMU

  37. Call To Action • Read the “AMD I/O Virtualization (IOMMU) Technology” specification to understand hardware assisted virtualization, available at http://developer.amd.com/documentation.aspx • Driver writers should consider the effects of the change from physical to virtual address assignment • Device vendors should consider the impact on their devices when used with I/O memory management hardware • Sign up for AMD’s development center at http://devcenter.amd.com

  38. Additional Resources • Web Resources • Main Page http://www.amd.com • Developer Center http://devcenter.amd.com • PCI-SIG http://www.pcisig.com • Related Sessions • PCIe Address Translation Services and I/O Virtualization • Windows Virtualization Best Practices and Future Hardware Directions

  39. Questions?
