
Prelude to Multiprocessing

This presentation explains how to detect a CPU's capabilities with the CPUID instruction and a system board's capabilities with the MP Configuration Table. It covers testing for the presence of the EFLAGS register, toggling the ID-bit to test for CPUID, and the Local-APIC register space and its timer.

davidmking

Presentation Transcript


  1. Prelude to Multiprocessing Detecting cpu and system-board capabilities with CPUID and the MP Configuration Table

  2. CPUID • Recent Intel processors provide a ‘cpuid’ instruction (opcode 0x0F, 0xA2) to assist software in detecting a CPU’s capabilities • If it’s implemented, this instruction can be executed in any of the processor modes, and at any privilege level • But it may not be implemented (e.g., 8086, 80286, 80386)

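  3. Pentium EFLAGS register • Bits 31-22: reserved (0) • Bit 21: ID • Bit 20: VIP • Bit 19: VIF • Bit 18: AC • Bit 17: VM • Bit 16: RF • Bit 15: reserved (0) • Bit 14: NT • Bits 13-12: IOPL • Bit 11: OF • Bit 10: DF • Bit 9: IF • Bit 8: TF • Bit 7: SF • Bit 6: ZF • Bit 5: reserved (0) • Bit 4: AF • Bit 3: reserved (0) • Bit 2: PF • Bit 1: reserved (1) • Bit 0: CF • Software can ‘toggle’ the ID-bit (bit #21) in the 32-bit EFLAGS register if the processor is capable of executing the ‘cpuid’ instruction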

  4. But what if there’s no EFLAGS? • The early Intel processors (8086, 80286) did not implement 32-bit registers • The FLAGS register was only 16 bits wide • So there was no ID-bit that software could try to ‘toggle’ • How can software be sure that the 32-bit EFLAGS register exists within the CPU?

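  5. Detecting 32-bit processors • There’s a subtle difference in the way the logical shift/rotate instructions work when register CL contains the shift-factor • On processors that mask the count, the value in CL is truncated to 5 bits, so a shift-factor of 32 behaves like 0; the 8086 applies the full count • Software can exploit this distinction as a first step in telling whether EFLAGS is implemented • Caveat: Intel documents the 5-bit truncation as starting with the 80286, so this test rules out the 8086 but does not by itself separate the 80286 from the 80386+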

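  6. Detecting EFLAGS
        # Here’s a test for the presence of EFLAGS
        mov     $-1, %ax        # a nonzero value
        mov     $32, %cl        # shift-factor of 32
        shl     %cl, %ax        # do logical shift
        or      %ax, %ax        # test result in AX
        jnz     is32bit         # count was masked to 0: EFLAGS present
        jmp     is16bit         # all bits shifted out: EFLAGS absent
        # caveat: a CPU that masks the count but lacks EFLAGS
        # (Intel documents the 80286 as one) also reaches is32bit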
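The effect this test relies on can be modelled in portable C. This is only a sketch: `shl16` and its `mask_count` parameter are hypothetical names introduced here, and the function imitates the two hardware behaviours rather than probing a real CPU.

```c
#include <stdint.h>

/* Model of a 16-bit SHL whose count comes from CL.
 * mask_count != 0 imitates a CPU that keeps only the low 5 bits of CL;
 * mask_count == 0 imitates a CPU that shifts by the full count. */
uint16_t shl16(uint16_t value, uint8_t cl, int mask_count)
{
    unsigned count = mask_count ? (cl & 0x1Fu) : cl;

    if (count >= 16)
        return 0;               /* every bit has been shifted out */
    return (uint16_t)((unsigned)value << count);
}
```

With a count of 32, the masking model leaves 0xFFFF untouched (32 & 31 is 0), while the non-masking model returns 0. That difference is what the `jnz` in the slide's test observes.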

  7. Testing for ID-bit ‘toggle’
        # Here’s a test for the presence of the CPUID instruction
        pushfl                  # copy EFLAGS contents
        pop     %eax            # to accumulator register
        mov     %eax, %edx      # save a duplicate image
        btc     $21, %eax       # toggle the ID-bit (bit 21)
        push    %eax            # copy revised contents
        popfl                   # back into EFLAGS
        pushfl                  # copy EFLAGS contents
        pop     %eax            # back into accumulator
        xor     %edx, %eax      # do XOR with prior value
        bt      $21, %eax       # did ID-bit get toggled?
        jc      y_cpuid         # yes, can execute ‘cpuid’
        jmp     n_cpuid         # else ‘cpuid’ unimplemented

  8. How does CPUID work? • Step 1: load value 0 into register EAX • Step 2: execute ‘cpuid’ instruction • Step 3: Verify ‘GenuineIntel’ character-string in registers (EBX,EDX,ECX) • Step 4: Find maximum CPUID input-value in the EAX register
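Step 3 can be sketched in C as a check on the three register values after the instruction has run. The function names `vendor_string` and `is_genuine_intel` are hypothetical, and the code assumes the caller has already obtained EBX, EDX and ECX from a ‘cpuid’ with EAX = 0; it does not execute the instruction itself.

```c
#include <stdint.h>
#include <string.h>

/* Copy the 12 vendor-ID bytes out of the register values returned for
 * CPUID input-value 0, in the order EBX, EDX, ECX (low byte first). */
void vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };

    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
    out[12] = '\0';
}

/* Nonzero if the three registers spell "GenuineIntel". */
int is_genuine_intel(uint32_t ebx, uint32_t edx, uint32_t ecx)
{
    char id[13];

    vendor_string(ebx, edx, ecx, id);
    return strcmp(id, "GenuineIntel") == 0;
}
```

For example, EBX = 0x756E6547, EDX = 0x49656E69 and ECX = 0x6C65746E hold the bytes "Genu", "ineI" and "ntel" respectively.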

  9. Version and Features • Load 1 into EAX and execute CPUID • Processor model and stepping information is returned in register EAX: • Bits 27-20: Extended Family ID • Bits 19-16: Extended Model ID • Bits 13-12: Type • Bits 11-8: Family ID • Bits 7-4: Model • Bits 3-0: Stepping ID
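The fields of this EAX layout can be pulled apart with shifts and masks. The sketch below uses a hypothetical `struct cpu_signature` and `decode_signature` function, and it only extracts the raw fields; it does not attempt to combine the extended fields with the base ones.

```c
#include <stdint.h>

/* Raw fields of the value returned in EAX for CPUID input-value 1. */
struct cpu_signature {
    unsigned stepping;      /* bits  3-0  */
    unsigned model;         /* bits  7-4  */
    unsigned family;        /* bits 11-8  */
    unsigned type;          /* bits 13-12 */
    unsigned ext_model;     /* bits 19-16 */
    unsigned ext_family;    /* bits 27-20 */
};

struct cpu_signature decode_signature(uint32_t eax)
{
    struct cpu_signature s;

    s.stepping   =  eax        & 0xF;
    s.model      = (eax >> 4)  & 0xF;
    s.family     = (eax >> 8)  & 0xF;
    s.type       = (eax >> 12) & 0x3;
    s.ext_model  = (eax >> 16) & 0xF;
    s.ext_family = (eax >> 20) & 0xFF;
    return s;
}
```

As an illustration, the value 0x000306A9 decodes to stepping 9, model 0xA, family 6, type 0, extended model 3 and extended family 0.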

  10. Some Feature Flags in EDX • Bit 28: HTT = HyperThreading Technology (1 = yes, 0 = no) • Bit 9: APIC = Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no) • Bit 3: PSE = Page-Size Extensions (1 = yes, 0 = no) • Bit 2: DE = Debugging Extensions (1 = yes, 0 = no) • Bit 1: VME = Virtual-8086 Mode Enhancements (1 = yes, 0 = no) • Bit 0: FPU = Floating-Point Unit on-chip (1 = yes, 0 = no)

  11. Some Feature Flags in ECX • Bit 5: VMX = Virtual Machine Extensions (1 = yes, 0 = no)
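Testing any of these flags is a single-bit check on the register value returned for CPUID input-value 1. In the sketch below the enumerator names and the `has_feature` helper are hypothetical; only the bit positions come from the two slides above.

```c
#include <stdint.h>

/* Bit positions in EDX (CPUID input-value 1) */
enum {
    EDX_FPU  = 0,
    EDX_VME  = 1,
    EDX_DE   = 2,
    EDX_PSE  = 3,
    EDX_APIC = 9,
    EDX_HTT  = 28
};

/* Bit position in ECX (CPUID input-value 1) */
enum { ECX_VMX = 5 };

/* Returns 1 if the given bit is set in the register value, else 0. */
int has_feature(uint32_t reg, unsigned bit)
{
    return (int)((reg >> bit) & 1u);
}
```

For instance, an EDX value of 0x10000201 has the HTT, APIC and FPU bits set and the PSE bit clear.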

  12. Multiprocessor Specification • It’s an industry standard, allowing OS software to use multiple processors in a uniform way • Software searches in three regions of the physical address-space below 1 megabyte for a “paragraph-aligned” data-structure of length 16 bytes called the MP Floating Pointer Structure: • Search in lowest KB of Extended BIOS Data Area • Search in topmost KB of conventional 640K RAM • Search in the 64KB ROM-BIOS (0xF0000-0xFFFFF)

  13. MP Floating Pointer Structure • This structure may contain an ID-number for one of a small number of standard SMP system architectures, or may contain the memory address for a more extensive MP Configuration Table whose entries specify a “more customized” system architecture • Our classroom machines employ the latter of these two options
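The search for the MP Floating Pointer Structure can be sketched as a scan over a byte buffer in 16-byte steps. Several details here are not in the slides and are taken from the MultiProcessor Specification as assumptions: the structure begins with the signature "_MP_", its bytes sum to zero modulo 256, and bytes 4-7 hold the little-endian physical address of the MP Configuration Table. The function names are hypothetical, and the buffer stands in for a copy of one of the three search regions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the offset of a valid 16-byte MP Floating Pointer Structure
 * inside buf, or -1 if none is found. Only paragraph-aligned
 * (multiple-of-16) offsets are examined. */
long find_mp_floating_pointer(const uint8_t *buf, size_t len)
{
    for (size_t off = 0; off + 16 <= len; off += 16) {
        const uint8_t *p = buf + off;
        uint8_t sum = 0;

        if (memcmp(p, "_MP_", 4) != 0)
            continue;                       /* no signature here */
        for (int i = 0; i < 16; i++)
            sum = (uint8_t)(sum + p[i]);
        if (sum == 0)                       /* checksum must be zero */
            return (long)off;
    }
    return -1;
}

/* Bytes 4-7: address of the MP Configuration Table (0 if absent). */
uint32_t mp_config_table_address(const uint8_t *fps)
{
    return (uint32_t)fps[4] | ((uint32_t)fps[5] << 8) |
           ((uint32_t)fps[6] << 16) | ((uint32_t)fps[7] << 24);
}

/* Demo: plant a structure at offset 32 of a fake region and find it. */
long mp_search_demo(void)
{
    uint8_t region[64] = { 0 };
    uint8_t *fps = region + 32;
    uint8_t sum = 0;

    memcpy(fps, "_MP_", 4);
    fps[5] = 0x10;                          /* table at 0x000F1000 */
    fps[6] = 0x0F;
    fps[8] = 1;                             /* length in 16-byte units */
    fps[9] = 4;                             /* specification revision */
    for (int i = 0; i < 16; i++)
        sum = (uint8_t)(sum + fps[i]);
    fps[10] = (uint8_t)(0u - sum);          /* make the bytes sum to 0 */
    return find_mp_floating_pointer(region, sizeof region);
}
```

Real code would have to map the three physical regions first (on Linux, typically from kernel code or through a privileged memory device), which this sketch deliberately leaves out.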

  14. The processor’s Local-APIC • The purpose of each processor’s APIC is to allow CPUs in a multiprocessor system to transmit messages among one another and to manage the delivery of interrupts from the various peripheral devices to one or more CPUs in a dynamically determined way • The Local-APIC has a variety of registers which are ‘memory mapped’ to paragraph-aligned addresses in the 4KB page at 0xFEE00000

  15. Local-APIC’s register-space • [Diagram: the 4GB physical address-space, with RAM starting at 0x00000000 and the Local-APIC registers at 0xFEE00000]

  16. Each CPU has its own timer! • Four of the Local-APIC registers are used to implement a programmable timer • It can privately deliver a periodic interrupt just to its own CPU • 0xFEE00320: Timer Vector register • 0xFEE00380: Initial Count register • 0xFEE00390: Current Count register • 0xFEE003E0: Divider Configuration register

  17. Timer’s Local Vector Table • Register at 0xFEE00320 • Bits 7-0: Interrupt ID-number • Bit 12: BUSY (0=not busy, 1=busy) • Bit 16: MASK (0=unmasked, 1=masked) • Bit 17: MODE (0=one-shot, 1=periodic)
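The register addresses and the Timer Vector register's bit layout can be written down as C constants. This is a sketch with hypothetical macro and function names; it only computes addresses and register values, and does not touch the memory-mapped registers, which would require privileged code with the page at 0xFEE00000 mapped.

```c
#include <stdint.h>

#define APIC_BASE            0xFEE00000u  /* Local-APIC register page */
#define APIC_TIMER_VECTOR    0x320u       /* Timer Vector register */
#define APIC_INITIAL_COUNT   0x380u       /* Initial Count register */
#define APIC_CURRENT_COUNT   0x390u       /* Current Count register */
#define APIC_DIVIDER_CONFIG  0x3E0u       /* Divider Configuration register */

#define LVT_BUSY      (1u << 12)          /* 1 = busy (status bit) */
#define LVT_MASK      (1u << 16)          /* 1 = masked */
#define LVT_PERIODIC  (1u << 17)          /* 1 = periodic, 0 = one-shot */

/* Physical address of a Local-APIC register, given its page offset. */
uint32_t apic_register_address(uint32_t offset)
{
    return APIC_BASE + offset;
}

/* Build a value for the Timer Vector register. */
uint32_t lvt_timer_value(uint8_t vector, int periodic, int masked)
{
    uint32_t value = vector;              /* bits 7-0: interrupt ID-number */

    if (periodic)
        value |= LVT_PERIODIC;
    if (masked)
        value |= LVT_MASK;
    return value;
}
```

For example, an unmasked periodic timer delivering interrupt 0x20 corresponds to the register value 0x00020020.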

  18. In-class exercise • Run the ‘cpuid.cpp’ Linux application (on our course website) to see if the CPUs in our classroom implement HyperThreading (i.e., multiple logical processors within one CPU) • Then run the ‘smpinfo.cpp’ application, to see if the MP Base Configuration Table has entries for more than one processor • If both results hold true, then we can write our own multiprocessing software in here!

  19. In-class exercise #2 • Run the ‘apictick.s’ demo (on our website) to observe the APIC’s periodic interrupt drawing bytes onto the screen • It executes for ten milliseconds (the 8254 is used to create this timed delay) • Try reprogramming the APIC’s Divider Configuration register, to cut the interrupt frequency in half (or to double it)
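As a starting point for the last step, the sketch below maps a divisor to the value written to the Divider Configuration register. The encoding is an assumption taken from Intel's Local-APIC documentation rather than from the slides (the divisor is selected by bits 3, 1 and 0, with bit 2 left as zero), and `divide_config_encoding` is a hypothetical name.

```c
#include <stdint.h>

/* Value for the Divider Configuration register that selects the given
 * divisor, or -1 if the divisor is not one the timer supports. */
int divide_config_encoding(unsigned divisor)
{
    switch (divisor) {
    case 2:   return 0x0;
    case 4:   return 0x1;
    case 8:   return 0x2;
    case 16:  return 0x3;
    case 32:  return 0x8;
    case 64:  return 0x9;
    case 128: return 0xA;
    case 1:   return 0xB;
    default:  return -1;
    }
}
```

With the Initial Count left unchanged, doubling the divisor halves the interrupt frequency, and halving the divisor doubles it.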
