
A look at memory issues


nairi

Presentation Transcript


  1. A look at memory issues
     Data-transfers must occur between system memory and the network interface controller

  2. Typical Chipset Layout
     [Diagram: block layout of the chipset components]
     CPU = Central Processing Unit
     MCH = Memory Controller Hub (Northbridge)
     DRAM = Dynamic Random Access Memory
     NIC = Network Interface Controller
     AC = Audio Controller
     ICH = I/O Controller Hub (Southbridge)
     HDC = Hard Disk Controller
     Other blocks: Graphics Controller, Multimedia Controller, USB controller, Firmware Hub, Timer, Keyboard, Mouse, Clock

  3. Typical Chipset Layout
     [Diagram: the same chipset components as on slide 2 (CPU, MCH, DRAM, Graphics Controller, NIC, AC, ICH, Multimedia Controller, HDC, USB controller, Firmware Hub, Timer, Keyboard, Mouse, Clock), with a ‘DMA’ label added]

  4. PCI Bus Master DMA
     [Diagram with two regions linked by DMA]
     82573L i/o-memory: on-chip RX descriptors, on-chip TX descriptors, RX and TX FIFOs (32-KB total)
     Host’s Dynamic Random Access Memory: Descriptor Queue, packet-buffers

  5. Memory-mapped I/O
     • We mentioned that Intel’s x86 architecture originally was designed with two separate address-spaces, one for memory and the other for I/O ports, unlike the designs for CPUs by many of Intel’s competitors in which I/O access was “memory-mapped”
     • But the newer Intel processors can support memory-mapped I/O as well

  6. Address-bus widths
     • The Intel Core-2 Quad processors in our classroom and Lab machines potentially could address 2^36 physical memory cells (i.e., 64GB), although only 4GB of RAM actually are installed at the present time
     • Some PCI-compliant hardware devices were designed for a 32-bit address-bus, thus they must be “mapped” below 4GB
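
The arithmetic behind these two bullets can be checked with a small user-space sketch. The function names `address_space_bytes` and `reachable_by_32bit_device` are our own, chosen for illustration; they are not part of any course file or kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of distinct byte addresses reachable with an address bus of the
 * given width.  Valid for widths 0..63 (a 64-bit shift would overflow). */
uint64_t address_space_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* True when the whole region [base, base+len) lies below the 4GB boundary,
 * i.e. every byte of it can be named with a 32-bit address. */
bool reachable_by_32bit_device(uint64_t base, uint64_t len)
{
    const uint64_t limit = address_space_bytes(32);   /* 4GB */

    return len <= limit && base <= limit - len;
}
```

With a 36-bit bus, `address_space_bytes(36)` equals 64 times 2^30, which is the 64GB figure quoted above.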

  7. Physical-address assignments
     [Diagram: the CPU’s physical address-space, showing Dynamic Random Access Memory and the devices’ registers]
     Devices’ registers must be mapped to addresses in the bottom 4GB

  8. Virtual addresses
     • Software running on the x86 processor does not use physical memory addresses directly, but instead uses ‘virtual’ addresses that map to physical addresses by means of mapping-tables which Linux dynamically defines for each different process it runs
     • This complicates the steps software must take to arrange for the DMA to take place

  9. Our ‘dram.c’ module
     • To help us confirm that our hardware-level network software is working as we intend, or to diagnose our ‘bugs’ if it isn’t, we can use an LKM we’ve written that implements a character-mode device-driver for system memory, allowing us to view the contents of physical memory as if it were a file; for example, by using our ‘fileview.cpp’ tool

  10. How to view system memory
      • Download ‘dram.c’ from course website
      • Compile it using our ‘mmake’ utility
      • Install ‘dram.ko’ by using ‘/sbin/insmod’
      • Ensure the ‘/dev/dram’ device-node exists
      • Download ‘fileview.cpp’ from website and compile it with ‘g++’ (or with ‘make’)
      • Execute ‘fileview /dev/dram’ and use the arrow-keys to navigate (or hit <ENTER>)

  11. “canonical” addresses
      [Diagram: the 64-bit “virtual” address space]
      0xFFFF800000000000 … 0xFFFFFFFFFFFFFFFF: “canonical” addresses
      (everything in between): “non-canonical” (invalid) virtual addresses
      0x0000000000000000 … 0x00007FFFFFFFFFFF: “canonical” addresses
      Analogy using 5-bit values: the 32 values 00000 through 11111
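
The two “canonical” ranges above share one property: bits 63 through 47 are either all 0 or all 1. A minimal sketch of that test follows; the helper name `is_canonical` is our own, and the code assumes the 48-bit virtual addresses described in this deck.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 64-bit virtual address is "canonical" (for 48-bit addressing) when
 * bits 63..47 are all copies of bit 47, i.e. all zeros or all ones. */
bool is_canonical(uint64_t vaddr)
{
    uint64_t upper = vaddr >> 47;          /* the 17 bits 63..47 */

    return upper == 0 || upper == 0x1FFFF;
}
```

The boundary values from the diagram make convenient test cases: 0x00007FFFFFFFFFFF and 0xFFFF800000000000 pass, while 0x0000800000000000 does not.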

  12. 4-Levels of mapping
      64-bit ‘canonical’ virtual address:
        bits 63-48: sign-extension
        bits 47-39: PML4
        bits 38-30: PDPT
        bits 29-21: PDIR
        bits 20-12: PTBL
        bits 11-0:  offset
      Lookup chain: CR3 → Page Map Level-4 Table → Page Directory Pointer Table → Page Directory → Page Table → Page Frame (4KB)
      Each mapping-table contains up to 512 quadword-size entries

  13. 4-level address-translation
      • The CPU examines any virtual address it encounters, subdividing it into five fields
        bits 47-39 (9-bits):  index into level 4 page-map table
        bits 38-30 (9-bits):  index into page-directory pointer table
        bits 29-21 (9-bits):  index into page-directory
        bits 20-12 (9-bits):  index into page-table
        bits 11-0  (12-bits): offset into page-frame
      The remaining bits 63-48 (16-bits) hold the sign-extension
      Any 48-bit virtual-address is sign-extended to a 64-bit “canonical” address
      Only “canonical” 64-bit virtual-addresses are legal in 64-bit mode
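
The subdivision can be tried out in user space with shifts and masks. This is only a sketch: the type `struct va_fields` and the functions `split_vaddr` and `sign_extend_48` are names we made up for illustration, and the code performs no actual table lookups.

```c
#include <stdint.h>

/* The five fields of a 48-bit virtual address (names follow slide 12). */
struct va_fields {
    unsigned pml4;     /* bits 47..39: index into level 4 page-map table  */
    unsigned pdpt;     /* bits 38..30: index into page-directory pointers */
    unsigned pdir;     /* bits 29..21: index into page-directory          */
    unsigned ptbl;     /* bits 20..12: index into page-table              */
    unsigned offset;   /* bits 11..0 : offset into the 4KB page-frame     */
};

/* Split a virtual address into its five fields; bits 63..48 are ignored. */
struct va_fields split_vaddr(uint64_t vaddr)
{
    struct va_fields f;

    f.pml4   = (unsigned)((vaddr >> 39) & 0x1FF);   /* 9 bits  */
    f.pdpt   = (unsigned)((vaddr >> 30) & 0x1FF);   /* 9 bits  */
    f.pdir   = (unsigned)((vaddr >> 21) & 0x1FF);   /* 9 bits  */
    f.ptbl   = (unsigned)((vaddr >> 12) & 0x1FF);   /* 9 bits  */
    f.offset = (unsigned)(vaddr & 0xFFF);           /* 12 bits */
    return f;
}

/* Sign-extend a 48-bit address to its 64-bit "canonical" form by copying
 * bit 47 into bits 63..48. */
uint64_t sign_extend_48(uint64_t addr48)
{
    addr48 &= 0x0000FFFFFFFFFFFFULL;
    if (addr48 & (1ULL << 47))
        addr48 |= 0xFFFF000000000000ULL;
    return addr48;
}
```

Each 9-bit index selects one of the 512 entries in its table, which matches the table size given on slide 12.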

  14. Format of 64-bit table-entries
      bit 63:     EXB
      bits 62-52: avl
      bits 51-40: Reserved (must be 0)
      bits 39-32: Page-frame physical base-address [39..32]
      bits 31-12: Page-frame physical base-address [31..12]
      bits 11-9:  avl
      bits 8-6:   Meaning of these bits varies with the table
      bit 5:      A
      bit 4:      PCD
      bit 3:      PWT
      bit 2:      U
      bit 1:      W
      bit 0:      P
      Legend:
        P = Present (1=yes, 0=no)
        W = Writable (1=yes, 0=no)
        U = User-page (1=yes, 0=no)
        PWT = Page Write-Through (1=yes, 0=no)
        PCD = Page Cache Disable (1=yes, 0=no)
        A = Accessed (1=yes, 0=no)
        avl = available for user-defined purposes
        EXB = Execution-disabled Bit (if EFER.NXE=1)
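
A table-entry laid out this way can be picked apart with a few masks. The sketch below follows the bit positions shown on this slide (including the 28-bit base-address field in bits 39..12); the macro and function names are our own inventions, not kernel definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits, numbered as on the slide. */
#define PTE_P     (1ULL << 0)     /* Present                */
#define PTE_W     (1ULL << 1)     /* Writable               */
#define PTE_U     (1ULL << 2)     /* User-page              */
#define PTE_PWT   (1ULL << 3)     /* Page Write-Through     */
#define PTE_PCD   (1ULL << 4)     /* Page Cache Disable     */
#define PTE_A     (1ULL << 5)     /* Accessed               */
#define PTE_EXB   (1ULL << 63)    /* Execution-disabled Bit */

/* Page-frame physical base-address, bits 39..12 as drawn on the slide. */
#define PTE_FRAME_MASK  0x000000FFFFFFF000ULL

bool pte_present(uint64_t e)    { return (e & PTE_P)   != 0; }
bool pte_writable(uint64_t e)   { return (e & PTE_W)   != 0; }
bool pte_user(uint64_t e)       { return (e & PTE_U)   != 0; }
bool pte_accessed(uint64_t e)   { return (e & PTE_A)   != 0; }
bool pte_no_execute(uint64_t e) { return (e & PTE_EXB) != 0; }

/* Physical address at which the referenced page-frame (or table) begins. */
uint64_t pte_frame_base(uint64_t e)
{
    return e & PTE_FRAME_MASK;
}
```

For instance, an entry of 0x8000000012345067 would describe a present, writable, user-accessible, non-executable page whose frame begins at physical address 0x12345000.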

  15. Our ‘mem64.c’ module
      • We wrote an LKM to create a pseudo-file that will let us see how the virtual memory is being utilized by an application program
      • Download this file, compile it with ‘mmake’ and install ‘mem64.ko’ in the Linux kernel
      • Then view the virtual-memory mapping that is being used by the ‘cat’ program:
        $ cat /proc/mem64

  16. The NIC’s PCI ‘resources’
      PCI Configuration Space: 16 doublewords, shown two per row (each doubleword spans bits 31..0)
      Dwords 1-0:   Status Register, Command Register | DeviceID 0x109A, VendorID 0x8086
      Dwords 3-2:   BIST, Header Type, Latency Timer, Cache Line Size | Class Code (Class/SubClass/ProgIF), Revision ID
      Dwords 5-4:   Base Address 1 | Base Address 0
      Dwords 7-6:   Base Address 3 | Base Address 2
      Dwords 9-8:   Base Address 5 | Base Address 4
      Dwords 11-10: Subsystem Device ID, Subsystem Vendor ID | CardBus CIS Pointer
      Dwords 13-12: reserved, capabilities pointer | Expansion ROM Base Address
      Dwords 15-14: Maximum Latency, Minimum Grant, Interrupt Pin, Interrupt Line | reserved

  17. Mechanisms compared
      CPU’s ‘virtual’ address-space (kernel memory-space, containing the NIC i/o-memory ‘io’, and user memory-space): Each NIC register has its own address in memory (allows one-step access)
      CPU’s ‘I/O’ address-space (‘addr’ and ‘data’ ports): Access to all of the NIC’s registers is multiplexed through a pair of I/O-ports (requires multiple instructions)

  18. ‘nicstatus.c’
      • Here’s an LKM that creates a pseudo-file (called ‘/proc/nicstatus’) which will allow a user to view the current value in our Intel 82573L Network Interface Controller’s ‘DEVICE_STATUS’ register
      • It uses the I/O-port interface to the NIC’s registers, rather than a ‘memory-mapped’ interface to those device-registers

  19. Device Status (0x0008)
      82573L register layout:
        bit 31:     ? (some undocumented functionality?)
        bits 30-20: 0
        bit 19:     GIO Master EN
        bits 18-11: 0
        bit 10:     PHYRA
        bits 9-8:   ASDV
        bits 7-6:   SPEED
        bit 5:      0
        bit 4:      TXOFF
        bits 3-2:   Function ID
        bit 1:      LU
        bit 0:      FD
      Legend:
        FD = Full-Duplex
        LU = Link Up
        TXOFF = Transmission Paused
        SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
        ASDV = Auto-negotiation Speed Detection Value
        PHYRA = PHY Reset Asserted
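
Once a value has been read from this register (for example, the number shown by ‘/proc/nicstatus’), its fields can be decoded as sketched below. The bit positions assumed here (FD = bit 0, LU = bit 1, TXOFF = bit 4, SPEED = bits 7..6) are our reading of the diagram, and the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions within the Device Status register. */
#define STATUS_FD           (1u << 0)   /* Full-Duplex         */
#define STATUS_LU           (1u << 1)   /* Link Up             */
#define STATUS_TXOFF        (1u << 4)   /* Transmission Paused */
#define STATUS_SPEED_SHIFT  6           /* SPEED is bits 7..6  */
#define STATUS_SPEED_MASK   0x3u

bool status_full_duplex(uint32_t status) { return (status & STATUS_FD)    != 0; }
bool status_link_up(uint32_t status)     { return (status & STATUS_LU)    != 0; }
bool status_tx_paused(uint32_t status)   { return (status & STATUS_TXOFF) != 0; }

/* Link speed in Mbps per the legend; returns 0 for the reserved code 11. */
unsigned status_speed_mbps(uint32_t status)
{
    switch ((status >> STATUS_SPEED_SHIFT) & STATUS_SPEED_MASK) {
    case 0:  return 10;
    case 1:  return 100;
    case 2:  return 1000;
    default: return 0;      /* 11 = reserved */
    }
}
```

Under these assumptions a status value of 0x00080783 would decode as link up, full-duplex, 1000Mbps, with transmission not paused.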

  20. ‘82573.c’
      • This is a more elaborate example of an LKM which not only creates a pseudo-file (i.e., ‘/proc/82573’) that we can view using the Linux ‘cat’ command and that lets us see our NIC’s PCI Configuration Space, but also implements some device-driver functions that let us view the NIC’s device registers by using our ‘fileview.cpp’ tool

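  21. Linux PCI helper-functions

      #include <linux/pci.h>

      struct pci_dev  *devp;
      unsigned int    mmio_base;
      unsigned int    mmio_size;
      void            *io;

      /* locate the NIC by its PCI vendor and device IDs */
      devp = pci_get_device( VENDOR_ID, DEVICE_ID, NULL );
      if ( devp == NULL ) return -ENODEV;

      /* resource 0 describes the NIC's memory-mapped register region */
      mmio_base = pci_resource_start( devp, 0 );
      mmio_size = pci_resource_len( devp, 0 );

      /* map that region into the kernel's virtual address-space
         (ioremap() replaces ioremap_nocache() on Linux 5.6 and later) */
      io = ioremap_nocache( mmio_base, mmio_size );
      if ( io == NULL ) return -ENOSPC;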

  22. In-class exercise
      • Two of the NIC’s 32-bit device registers are used to hold its 48-bit Ethernet MAC address, which will be a different value for each of the hosts in our classroom
      • These two registers are located at offsets 0x5400 and 0x5404 in device-memory
      • The six bytes occur in network byte-order
      • Write code to show the MAC address!
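
One possible starting point for the formatting half of this exercise is sketched below in plain user-space C. It assumes the two 32-bit values have already been read from offsets 0x5400 and 0x5404, and that the first byte of the address sits in the low-order byte of the first register (which is how six consecutive bytes appear when read as little-endian 32-bit values). The function names `mac_from_registers` and `format_mac` are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Unpack the six MAC-address bytes from the two register values.
 * Assumption: 'lo' was read from offset 0x5400 and 'hi' from 0x5404, and
 * the first address byte is the least-significant byte of 'lo'. */
void mac_from_registers(uint32_t lo, uint32_t hi, uint8_t mac[6])
{
    mac[0] = (uint8_t)(lo >>  0);
    mac[1] = (uint8_t)(lo >>  8);
    mac[2] = (uint8_t)(lo >> 16);
    mac[3] = (uint8_t)(lo >> 24);
    mac[4] = (uint8_t)(hi >>  0);
    mac[5] = (uint8_t)(hi >>  8);   /* upper 16 bits of 'hi' are not address bytes */
}

/* Write the address as "XX:XX:XX:XX:XX:XX" into buf (18 bytes are enough).
 * Returns the number of characters snprintf would have written. */
int format_mac(const uint8_t mac[6], char *buf, size_t len)
{
    return snprintf(buf, len, "%02X:%02X:%02X:%02X:%02X:%02X",
                    (unsigned)mac[0], (unsigned)mac[1], (unsigned)mac[2],
                    (unsigned)mac[3], (unsigned)mac[4], (unsigned)mac[5]);
}
```

In an LKM the two inputs would presumably come from reading the mapped registers at ‘io + 0x5400’ and ‘io + 0x5404’; that step is left out here so the sketch can be compiled and tried without the hardware.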
