
Prelude to Multiprocessing



Presentation Transcript


  1. Prelude to Multiprocessing: Detecting CPU and system-board capabilities with CPUID and the MP Configuration Table

  2. CPUID • Recent Intel processors provide a ‘cpuid’ instruction (opcode 0x0F, 0xA2) to assist software in detecting a CPU’s capabilities • If it’s implemented, this instruction can be executed in any of the processor modes, and at any of its four privilege levels • But this ‘cpuid’ instruction might not be implemented (e.g., 8086, 80286, 80386)

  3. Intel x86 EFLAGS register
       Bits 31-22: 0 (reserved)
       Bit 21: ID   Bit 20: VIP   Bit 19: VIF   Bit 18: AC   Bit 17: VM   Bit 16: RF
       Bit 15: 0    Bit 14: NT    Bits 13-12: IOPL   Bit 11: OF   Bit 10: DF   Bit 9: IF   Bit 8: TF
       Bit 7: SF    Bit 6: ZF     Bit 5: 0   Bit 4: AF   Bit 3: 0   Bit 2: PF   Bit 1: 1   Bit 0: CF
     Software can ‘toggle’ the ID-bit (bit #21) in the 32-bit EFLAGS register if the processor is capable of executing the ‘cpuid’ instruction

  4. But what if there’s no EFLAGS? • The early Intel processors (8086, 80286) did not implement any 32-bit registers • The FLAGS register was only 16-bits wide • So there was no ID-bit that software could try to ‘toggle’ (to see if ‘cpuid’ existed) • How can software be sure that the 32-bit EFLAGS register exists within the CPU?

  5. Detecting 32-bit processors • There’s a subtle difference in the way the shift/rotate instructions work when register CL contains the ‘shift-factor’ • On the 80186 and later CPUs (including the 80286 and the 32-bit 80386+) the value in CL is truncated to 5 bits, but not so on the original 8086/8088 • Software can exploit this distinction to rule out an 8086; an 80286 must then be ruled out separately (in real mode its FLAGS bits 12-15 always read as zero) before software can assume that EFLAGS is implemented

  6. Detecting EFLAGS
        # Here’s a test for the presence of EFLAGS
        mov   $-1, %ax      # a nonzero value
        mov   $32, %cl      # shift-factor of 32
        shl   %cl, %ax      # do logical shift
        or    %ax, %ax      # test result in AX
        jnz   is32bit       # count was masked to 0, so not an 8086 (an 80286 also takes this branch)
        jmp   is16bit       # AX was shifted to 0: an 8086, so EFLAGS absent

  7. Testing for ID-bit ‘toggle’
        # Here’s a test for the presence of the CPUID instruction
        pushfl              # copy EFLAGS contents
        pop   %eax          # to accumulator register
        mov   %eax, %edx    # save a duplicate image
        btc   $21, %eax     # toggle the ID-bit (bit 21)
        push  %eax          # copy revised contents
        popfl               # back into EFLAGS
        pushfl              # copy EFLAGS contents
        pop   %eax          # back into accumulator
        xor   %edx, %eax    # do XOR with prior value
        bt    $21, %eax     # did ID-bit get toggled?
        jc    y_cpuid       # yes, can execute ‘cpuid’
        jmp   n_cpuid       # else ‘cpuid’ unimplemented

  8. How does CPUID work? • Step 1: load value 0 into register EAX • Step 2: execute ‘cpuid’ instruction • Step 3: Verify ‘GenuineIntel’ character-string in registers (EBX,EDX,ECX) • Step 4: Find maximum CPUID input-value in the EAX register
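
Step 3 can be sketched in C as a pure function over the three register values, assuming they have already been captured after executing ‘cpuid’ with EAX=0. The function names here are illustrative and are not taken from the course's ‘cpuid.cpp’.

```c
#include <stdint.h>
#include <string.h>

/* Assemble the 12-character vendor string that CPUID leaf 0 returns in
 * EBX, EDX, ECX (in that order). 'out' must hold at least 13 bytes.
 * Hypothetical helper, for illustration only. */
void cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++)
        for (int i = 0; i < 4; i++)   /* the low byte holds the first character */
            out[4 * r + i] = (char)((regs[r] >> (8 * i)) & 0xFF);
    out[12] = '\0';
}

/* Returns 1 if the three registers spell "GenuineIntel", else 0. */
int is_genuine_intel(uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    char vendor[13];
    cpuid_vendor_string(ebx, edx, ecx, vendor);
    return strcmp(vendor, "GenuineIntel") == 0;
}
```

For example, EBX=0x756e6547, EDX=0x49656e69, ECX=0x6c65746e are the little-endian encodings of "Genu", "ineI", "ntel".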

  9. Version and Features • load 1 into EAX and execute CPUID • Processor model and stepping information is returned in register EAX
       Bits 27-20: Extended Family ID
       Bits 19-16: Extended Model ID
       Bits 13-12: Type
       Bits 11-8:  Family ID
       Bits 7-4:   Model
       Bits 3-0:   Stepping ID
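
The field layout above can be unpacked with shifts and masks. This is a minimal sketch, assuming the EAX value from ‘cpuid’ with EAX=1 is already in hand. The struct and function names are made up for illustration. The "display" rules for combining the extended fields come from Intel's CPUID documentation, not from these slides.

```c
#include <stdint.h>

/* Raw fields of the CPUID leaf-1 signature (hypothetical struct name). */
struct cpu_signature {
    unsigned stepping, model, family, type, ext_model, ext_family;
};

struct cpu_signature decode_signature(uint32_t eax)
{
    struct cpu_signature s;
    s.stepping   =  eax        & 0xF;    /* bits 3-0   */
    s.model      = (eax >> 4)  & 0xF;    /* bits 7-4   */
    s.family     = (eax >> 8)  & 0xF;    /* bits 11-8  */
    s.type       = (eax >> 12) & 0x3;    /* bits 13-12 */
    s.ext_model  = (eax >> 16) & 0xF;    /* bits 19-16 */
    s.ext_family = (eax >> 20) & 0xFF;   /* bits 27-20 */
    return s;
}

/* Extended Family is added only when Family ID is 0xF. */
unsigned display_family(struct cpu_signature s)
{
    return s.family == 0xF ? s.family + s.ext_family : s.family;
}

/* Extended Model is prepended only when Family ID is 6 or 0xF. */
unsigned display_model(struct cpu_signature s)
{
    if (s.family == 0x6 || s.family == 0xF)
        return (s.ext_model << 4) + s.model;
    return s.model;
}
```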

  10. Some Feature Flags in EDX
       Bit 28: HTT = HyperThreading Technology (1 = yes, 0 = no)
       Bit 13: PGE = Page Global Entries (1 = yes, 0 = no)
       Bit 9:  APIC = Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
       Bit 3:  PSE = Page-Size Extensions (1 = yes, 0 = no)
       Bit 2:  DE = Debugging Extensions (1 = yes, 0 = no)
       Bit 1:  VME = Virtual-8086 Mode Enhancements (1 = yes, 0 = no)
       Bit 0:  FPU = Floating-Point Unit on-chip (1 = yes, 0 = no)

  11. Some Feature Flags in ECX
       Bit 5: VMX = Virtual Machine Extensions (1 = yes, 0 = no)
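
Testing any of these flags is a single shift-and-mask. The sketch below assumes the EDX and ECX values returned by ‘cpuid’ with EAX=1 have been saved. The enum and function names are illustrative only.

```c
#include <stdint.h>

/* Bit positions of the feature flags listed on the slides. */
enum { EDX_FPU = 0, EDX_VME = 1, EDX_DE = 2, EDX_PSE = 3,
       EDX_APIC = 9, EDX_PGE = 13, EDX_HTT = 28 };
enum { ECX_VMX = 5 };

/* Returns 1 if bit number 'bit' is set in the feature register, else 0. */
int has_feature(uint32_t reg, unsigned bit)
{
    return (int)((reg >> bit) & 1u);
}
```

A caller would write, for instance, `has_feature(edx, EDX_HTT)` to check for HyperThreading.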

  12. Multiprocessor Specification • It’s an industry standard, allowing OS software to use multiple processors in a uniform way • OS software searches in three regions of the physical address-space below 1-megabyte for a “paragraph-aligned” data-structure of length 16-bytes called the MP Floating Pointer Structure: • Search in lowest KB of Extended Bios Data Area • Search in topmost KB of conventional 640K RAM • Search in the 128KB ROM-BIOS (0xE0000-0xFFFFF)
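
The search itself can be sketched as a scan over a copy of one of those memory regions. The slides do not say how the structure is recognized. The "_MP_" signature, the length byte (in 16-byte units) and the zero-sum checksum used below come from the MP Specification 1.4. The function name, and the assumption that the region has already been copied into an ordinary buffer, are illustrative.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan 'len' bytes at 'mem' on 16-byte ("paragraph") boundaries for an
 * MP Floating Pointer Structure. Returns its byte offset, or -1 if none.
 * Hypothetical helper: 'mem' is assumed to be a copy of one search region. */
long find_mp_floating_pointer(const uint8_t *mem, size_t len)
{
    for (size_t off = 0; off + 16 <= len; off += 16) {
        const uint8_t *p = mem + off;
        if (memcmp(p, "_MP_", 4) != 0)
            continue;                 /* wrong signature */
        if (p[8] != 1)
            continue;                 /* length must be one 16-byte paragraph */
        uint8_t sum = 0;
        for (int i = 0; i < 16; i++)
            sum = (uint8_t)(sum + p[i]);
        if (sum == 0)                 /* all 16 bytes must add up to 0 mod 256 */
            return (long)off;
    }
    return -1;
}
```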

  13. MP Floating Pointer Structure • This structure may contain an ID-number for one of a small number of standard SMP system architectures, or may contain the memory address of a more extensive MP Configuration Table having entries that specify a “customized” system architecture • The machines in our classroom employ the latter of these two options
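
Telling the two options apart is a matter of reading two fields. In the sketch below the byte offsets (table address in bytes 4-7, default-configuration type in byte 11) are taken from the MP Specification 1.4 rather than from the slides, and the function names are hypothetical.

```c
#include <stdint.h>

/* Nonzero: ID-number of a standard (default) system configuration.
 * Zero: no default applies, so the MP Configuration Table must be consulted. */
unsigned mp_default_config_type(const uint8_t fps[16])
{
    return fps[11];
}

/* Physical address of the MP Configuration Table (little-endian, bytes 4-7). */
uint32_t mp_config_table_address(const uint8_t fps[16])
{
    return (uint32_t)fps[4]
         | (uint32_t)fps[5] << 8
         | (uint32_t)fps[6] << 16
         | (uint32_t)fps[7] << 24;
}
```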

  14. An example record • The MP Configuration Table will contain a record for each logical processor
       Processor entry (20 bytes; byte offsets as given in the MP Specification):
         byte 0        Entry Type (= 0 for a processor)
         byte 1        Local-APIC ID
         byte 2        Local-APIC version
         byte 3        CPU Flags: BP (bit 1), EN (bit 0)
         bytes 4-7     CPU signature (stepping, model, family)
         bytes 8-11    Feature Flags
         bytes 12-19   reserved (=0)
       BP = Bootstrap Processor (1=yes, 0=no), EN = Enabled (1=yes, 0=no)
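
A processor record can be decoded field by field. This is a sketch only: it assumes the 20-byte entry has been copied into a byte array and uses the byte offsets from the MP Specification 1.4. The struct and function names are invented for illustration and are not those of ‘mpinfo.cpp’.

```c
#include <stdint.h>

/* Decoded view of one processor entry (hypothetical struct name). */
struct mp_processor_entry {
    uint8_t  apic_id, apic_version;
    int      enabled, bootstrap;
    uint32_t signature, features;
};

/* Read a 32-bit little-endian value from a byte array. */
uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Returns 1 and fills 'out' if 'e' is a processor entry (type 0), else 0. */
int decode_processor_entry(const uint8_t e[20], struct mp_processor_entry *out)
{
    if (e[0] != 0)
        return 0;                      /* some other entry type */
    out->apic_id      = e[1];
    out->apic_version = e[2];
    out->enabled      = e[3] & 1;          /* EN, bit 0 */
    out->bootstrap    = (e[3] >> 1) & 1;   /* BP, bit 1 */
    out->signature    = le32(e + 4);
    out->features     = le32(e + 8);
    return 1;
}
```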

  15. Our ‘mpinfo.cpp’ utility • We created a Linux utility that will display the system-information contained in the MP Configuration Table (in hex format) • You can refer to the ‘MP Specification 1.4’ document (online) to interpret this display • This utility needs a device-driver ‘dram.c’ to be pre-installed, so that it can directly access the system’s memory

  16. A processor’s Local-APIC • The purpose of each processor’s APIC is to allow the CPUs in a multiprocessor system to send messages to one another and to manage the delivery of the interrupt-requests from the various peripheral devices to one (or more) of the CPUs in a dynamically programmable way • Each processor’s Local-APIC has a variety of registers, all ‘memory mapped’ to paragraph-aligned addresses within the 4KB page at physical-address 0xFEE00000

  17. Local-APIC’s register-space [Diagram: the 4GB physical address-space, with RAM starting at 0x00000000 and the Local-APIC’s registers at 0xFEE00000]

  18. Analogies with the PIC • Among the registers in a Local-APIC are these (which had analogues in the older 8259 PIC’s design): • IRR: Interrupt Request Register (256-bits) • ISR: In-Service Register (256-bits) • TMR: Trigger-Mode Register (256-bits) • For each of these, its 256-bits are divided among eight 32-bit register addresses
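
Locating the bit for a given interrupt vector is a small calculation. The starting offsets used below (ISR at 0x100, TMR at 0x180, IRR at 0x200, with successive 32-bit registers 16 bytes apart) are taken from Intel's Local-APIC register map, not from these slides. The names are illustrative.

```c
#include <stdint.h>

#define LAPIC_BASE 0xFEE00000u

/* Offset of the first of the eight 32-bit registers in each 256-bit set. */
enum { LAPIC_ISR = 0x100, LAPIC_TMR = 0x180, LAPIC_IRR = 0x200 };

/* Physical address of the 32-bit register holding the bit for 'vector'.
 * Registers are paragraph-aligned, so consecutive ones are 0x10 apart. */
uint32_t lapic_bitmap_reg(uint32_t first_offset, unsigned vector)
{
    return LAPIC_BASE + first_offset + (vector / 32) * 0x10;
}

/* Mask selecting the bit for 'vector' within that register. */
uint32_t lapic_bitmap_mask(unsigned vector)
{
    return 1u << (vector % 32);
}
```

For vector 0x20, the IRR bit is bit 0 of the register at 0xFEE00210.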

  19. New way to do ‘EOI’ • Instead of using a special End-Of-Interrupt command-byte, the Local-APIC contains a dedicated ‘write-only’ register (named the EOI Register) which an Interrupt Handler writes to when it is ready to signal an EOI
        # issuing EOI to the Local-APIC
        mov   $0xFEE00000, %ebx       # address of the cpu’s Local-APIC
        movl  $0, %fs:0xB0(%ebx)      # write any value into EOI register
        # Here we assume segment-register FS holds the selector for a segment-descriptor
        # for a ‘writable’ 4GB-size expand-up data-segment whose base-address equals 0

  20. Each CPU has its own timer! • Four of the Local-APIC registers are used to implement a programmable timer • It can privately deliver a periodic interrupt (or one-shot interrupt) just to its own CPU • 0xFEE00320: Timer Vector register • 0xFEE00380: Initial Count register • 0xFEE00390: Current Count register • 0xFEE003E0: Divider Configuration register

  21. Timer’s Local Vector Table (0xFEE00320)
       Bit 17:   MODE (0=one-shot, 1=periodic)
       Bit 16:   MASK (0=unmasked, 1=masked)
       Bit 12:   BUSY (0=not busy, 1=busy)
       Bits 7-0: Interrupt ID-number
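
A value for this register can be composed from the fields above. This is a minimal sketch with hypothetical function names. It leaves the BUSY bit alone, on the assumption that software only reads that bit.

```c
#include <stdint.h>

/* Build a value for the timer's LVT register at 0xFEE00320. */
uint32_t make_lvt_timer(uint8_t vector, int periodic, int masked)
{
    uint32_t value = vector;          /* bits 7-0: interrupt ID-number */
    if (masked)
        value |= 1u << 16;            /* MASK */
    if (periodic)
        value |= 1u << 17;            /* MODE */
    return value;
}

/* Returns 1 if the BUSY bit (bit 12) is set in a value read from the register. */
int lvt_timer_busy(uint32_t lvt)
{
    return (int)((lvt >> 12) & 1u);
}
```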

  22. Timer’s ‘Divide-Configuration’ (0xFEE003E0)
       Bits 31-4: reserved (=0)    Bit 2: 0
       Divider-Value field (bits 3, 1, and 0):
         000 = divide by 2      100 = divide by 32
         001 = divide by 4      101 = divide by 64
         010 = divide by 8      110 = divide by 128
         011 = divide by 16     111 = divide by 1
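
Because the three-bit Divider-Value code is split across bits 3, 1 and 0, it has to be gathered before it can be interpreted. The sketch below does that with hypothetical helper names.

```c
#include <stdint.h>

/* Translate a Divide Configuration register value into the actual divisor. */
unsigned divider_from_dcr(uint32_t dcr)
{
    /* gather bit 3 into position 2, keep bits 1 and 0 where they are */
    unsigned code = ((dcr >> 1) & 0x4) | (dcr & 0x3);
    return code == 7 ? 1u : 2u << code;   /* 000 -> 2, 001 -> 4, ..., 111 -> 1 */
}

/* Spread a three-bit code (0..7) back into register bits 3, 1 and 0. */
uint32_t dcr_from_code(unsigned code)
{
    return ((code & 0x4) << 1) | (code & 0x3);
}
```

For instance, code 110 (divide by 128) is written to the register as 0xA.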

  23. Initial and Current Counts • 0xFEE00380: Initial Count Register (read/write) • 0xFEE00390: Current Count Register (read-only) • When the timer is programmed for ‘periodic’ mode, the Current Count is loaded from the Initial Count register and counts down at the CPU’s bus-clock rate divided by the Divide Configuration value; when it reaches zero, an interrupt is generated and the count is automatically reloaded
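
The resulting interrupt rate in ‘periodic’ mode is roughly the bus-clock frequency divided by the product of the divisor and the Initial Count. The function below is a sketch of that arithmetic. Its name is hypothetical, and the bus frequency must be supplied by the caller, since the slides do not state it for any particular machine.

```c
#include <stdint.h>

/* Approximate periodic interrupt rate in Hz. Returns 0 for invalid input. */
uint32_t apic_timer_hz(uint64_t bus_hz, unsigned divider, uint32_t initial_count)
{
    if (divider == 0 || initial_count == 0)
        return 0;
    return (uint32_t)(bus_hz / ((uint64_t)divider * initial_count));
}
```

With an assumed 100 MHz bus clock, a divisor of 16 and an Initial Count of 62500 give about 100 interrupts per second. Doubling either the divisor or the Initial Count halves that rate.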

  24. Using the timer’s interrupts • Set up your desired Initial Count value • Select your desired Divide Configuration • Set up the APIC-timer’s LVT register with your desired interrupt-ID number and counting mode (‘periodic’ or ‘one-shot’), and clear the LVT register’s ‘Mask’ bit to initiate the automatic countdown operation
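
The three steps can be sketched as three register writes, in the order the slide lists them. The pointer ‘lapic’ is an assumption: it stands for wherever the 4KB register page at physical address 0xFEE00000 is mapped, which on real hardware requires privileged code. The function name is hypothetical, and this is not the course's ‘apictick.s’ code.

```c
#include <stdint.h>

/* Program the Local-APIC timer through a pointer to its register page.
 * Registers are indexed as 32-bit words, hence the division by 4. */
void lapic_timer_start(volatile uint32_t *lapic, uint32_t initial_count,
                       uint32_t dcr, uint8_t vector, int periodic)
{
    lapic[0x380 / 4] = initial_count;            /* Initial Count register */
    lapic[0x3E0 / 4] = dcr;                      /* Divide Configuration register */
    lapic[0x320 / 4] = (uint32_t)vector          /* LVT: interrupt ID-number, */
                     | (periodic ? 1u << 17 : 0u);   /* mode bit, Mask bit left clear */
}
```

Passing an ordinary 1024-word array as ‘lapic’ lets the write sequence be checked without touching hardware.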

  25. In-class exercise #1 • Run the ‘cpuid.cpp’ Linux application (on our course website) to see if the CPUs in our classroom implement HyperThreading (i.e., multiple logical processors in a cpu) • Then run the ‘mpinfo.cpp’ application, to see if the MP Base Configuration Table has entries for more than one processor • If both results hold true, then we can write our own multiprocessing software in H235!

  26. In-class exercise #2 • Run the ‘apictick.s’ demo (on our CS 630 website) to observe the APIC’s ‘periodic’ interrupt-handler drawing ‘T’s onscreen • It executes for ten-milliseconds (the 8254 is used here to create that timed delay) • Try reprogramming the APIC’s Divider Configuration register, to cut the interrupt frequency in half (or perhaps to double it)
