Prelude to Multiprocessing

Detecting CPU and system-board capabilities with CPUID and the MP Configuration Table

CPUID
• Recent Intel processors provide a ‘cpuid’ instruction (opcode 0x0F, 0xA2) to assist software in detecting a CPU’s capabilities
• If it’s implemented, this instruction can be executed in any of the processor modes, and at any privilege level
• But it may not be implemented (e.g., 8086, 80286, 80386)

Pentium EFLAGS register
bits 31-16:  0 0 0 0 0 0 0 0 0 0 | ID | VIP | VIF | AC | VM | RF
bits 15-0:   0 | NT | IOPL | OF | DF | IF | TF | SF | ZF | 0 | AF | 0 | PF | 1 | CF
Software can ‘toggle’ the ID-bit (bit #21) in the 32-bit EFLAGS register if the processor is capable of executing the ‘cpuid’ instruction

But what if there’s no EFLAGS?
• The early Intel processors (8086, 80286) did not implement 32-bit registers
• The FLAGS register was only 16 bits wide
• So there was no ID-bit that software could try to ‘toggle’
• How can software be sure that the 32-bit EFLAGS register exists within the CPU?

Detecting 32-bit processors
• There’s a subtle difference in the way the logical shift/rotate instructions work when register CL contains the shift-factor
• On the 32-bit processors (e.g., 80386+) the value in CL is truncated to 5 bits, but not so on the original 8086
• Software can exploit this distinction as a first step in telling whether EFLAGS is implemented
• Caution: Intel’s manuals state that the 80286 also truncates the shift-factor to 5 bits, so this test by itself reliably identifies only the 8086/8088

Detecting EFLAGS
    # Here’s a test for the presence of EFLAGS
    mov   $-1, %ax      # a nonzero value
    mov   $32, %cl      # shift-factor of 32
    shl   %cl, %ax      # do logical shift
    or    %ax, %ax      # test result in AX
    jnz   is32bit       # EFLAGS present
    jmp   is16bit       # EFLAGS absent
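The effect this test relies on can be modeled in ordinary C++. The sketch below is only a simulation of a 16-bit `shl %cl, %ax`, not a way to probe real hardware; the function names `shl16` and `count_is_truncated` are invented here for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simulate `shl %cl, %ax` on a 16-bit register.
// truncate == true  models a CPU that keeps only the low 5 bits of CL;
// truncate == false models a CPU that uses the whole shift-factor.
std::uint16_t shl16(std::uint16_t ax, std::uint8_t cl, bool truncate)
{
    unsigned count = truncate ? (cl & 0x1Fu) : cl;
    if (count >= 16) return 0;   // every bit has been shifted out of AX
    return static_cast<std::uint16_t>(ax << count);
}

// Mirrors the assembly test: AX = -1, CL = 32, then check AX for nonzero.
bool count_is_truncated(bool truncate)
{
    return shl16(0xFFFF, 32, truncate) != 0;
}

// Small usage example.
void example()
{
    assert(count_is_truncated(true));    // 32 & 31 == 0, so AX is unchanged
    assert(!count_is_truncated(false));  // 32 real shifts clear AX
}
```

With truncation the shift-factor 32 becomes 0 and AX keeps its nonzero value, which is exactly the case the `jnz` branch detects.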

Testing for ID-bit ‘toggle’
    # Here’s a test for the presence of the CPUID instruction
    pushfl                  # copy EFLAGS contents
    pop   %eax              # to accumulator register
    mov   %eax, %edx        # save a duplicate image
    btc   $21, %eax         # toggle the ID-bit (bit 21)
    push  %eax              # copy revised contents
    popfl                   # back into EFLAGS
    pushfl                  # copy EFLAGS contents
    pop   %eax              # back into accumulator
    xor   %edx, %eax        # do XOR with prior value
    bt    $21, %eax         # did ID-bit get toggled?
    jc    y_cpuid           # yes, can execute ‘cpuid’
    jmp   n_cpuid           # else ‘cpuid’ unimplemented

How does CPUID work?
• Step 1: load value 0 into register EAX
• Step 2: execute ‘cpuid’ instruction
• Step 3: Verify ‘GenuineIntel’ character-string in registers (EBX, EDX, ECX)
• Step 4: Find maximum CPUID input-value in the EAX register
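The four steps can be sketched in C++ as follows. This is a minimal sketch, assuming a GCC or Clang toolchain whose `<cpuid.h>` header provides `__get_cpuid`; the names `vendor_from_regs`, `Leaf0` and `query_leaf0` are made up for this example and are not taken from the course's ‘cpuid.cpp’.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   // GCC/Clang helper header (assumed toolchain)
#endif

// Step 3: assemble the 12-character vendor string from EBX, EDX, ECX,
// in that order, taking the lowest byte of each register first.
std::string vendor_from_regs(std::uint32_t ebx, std::uint32_t edx, std::uint32_t ecx)
{
    const std::uint32_t regs[3] = { ebx, edx, ecx };
    std::string text;
    for (std::uint32_t r : regs)
        for (int i = 0; i < 4; ++i)
            text += static_cast<char>((r >> (8 * i)) & 0xFF);
    return text;
}

struct Leaf0 {
    std::uint32_t max_leaf;   // Step 4: maximum CPUID input-value
    std::string   vendor;     // Step 3: vendor character-string
};

// Steps 1 and 2: execute CPUID with EAX = 0.
// Returns false when not built for x86 or when the leaf is unavailable.
bool query_leaf0(Leaf0& out)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx)) return false;
    out.max_leaf = eax;
    out.vendor   = vendor_from_regs(ebx, edx, ecx);
    return true;
#else
    (void)out;
    return false;
#endif
}

// Usage example with the register values that spell the Intel string.
void example()
{
    assert(vendor_from_regs(0x756E6547u, 0x49656E69u, 0x6C65746Eu) == "GenuineIntel");
}
```

Keeping the string assembly separate from the instruction itself makes the byte ordering easy to check without running on any particular CPU.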

Version and Features
• load 1 into EAX and execute CPUID
• Processor model and stepping information is returned in register EAX
    bits 27-20: Extended Family ID
    bits 19-16: Extended Model ID
    bits 13-12: Type
    bits 11-8:  Family ID
    bits 7-4:   Model
    bits 3-0:   Stepping ID
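Extracting these fields is a matter of shifting and masking. The sketch below decodes a raw EAX value into the six fields listed above; the struct and function names are illustrative, and the sample value in `example` was chosen only to exercise every field.

```cpp
#include <cassert>
#include <cstdint>

// Raw fields of the EAX value returned by CPUID with EAX = 1.
struct CpuSignature {
    std::uint32_t stepping;     // bits 3-0
    std::uint32_t model;        // bits 7-4
    std::uint32_t family;       // bits 11-8
    std::uint32_t type;         // bits 13-12
    std::uint32_t ext_model;    // bits 19-16
    std::uint32_t ext_family;   // bits 27-20
};

CpuSignature decode_signature(std::uint32_t eax)
{
    CpuSignature s;
    s.stepping   =  eax        & 0xF;
    s.model      = (eax >> 4)  & 0xF;
    s.family     = (eax >> 8)  & 0xF;
    s.type       = (eax >> 12) & 0x3;
    s.ext_model  = (eax >> 16) & 0xF;
    s.ext_family = (eax >> 20) & 0xFF;
    return s;
}

// Usage example on an illustrative signature value.
void example()
{
    CpuSignature s = decode_signature(0x000306A9u);
    assert(s.stepping == 0x9 && s.model == 0xA && s.family == 0x6);
    assert(s.type == 0 && s.ext_model == 0x3 && s.ext_family == 0);
}
```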

Some Feature Flags in EDX
    bit 28: HTT = HyperThreading Technology (1 = yes, 0 = no)
    bit 9:  APIC = Advanced Programmable Interrupt Controller on-chip (1 = yes, 0 = no)
    bit 3:  PSE = Page-Size Extensions (1 = yes, 0 = no)
    bit 2:  DE = Debugging Extensions (1 = yes, 0 = no)
    bit 1:  VME = Virtual-8086 Mode Enhancements (1 = yes, 0 = no)
    bit 0:  FPU = Floating-Point Unit on-chip (1 = yes, 0 = no)

Some Feature Flags in ECX
    bit 5:  VMX = Virtual Machine Extensions (1 = yes, 0 = no)
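Testing the EDX and ECX feature flags comes down to one mask per bit. The following sketch uses the bit positions given on the two slides above; the constant and function names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bit masks for the EDX flags returned by CPUID with EAX = 1.
const std::uint32_t EDX_FPU  = 1u << 0;
const std::uint32_t EDX_VME  = 1u << 1;
const std::uint32_t EDX_DE   = 1u << 2;
const std::uint32_t EDX_PSE  = 1u << 3;
const std::uint32_t EDX_APIC = 1u << 9;
const std::uint32_t EDX_HTT  = 1u << 28;
// Bit mask for the ECX flag.
const std::uint32_t ECX_VMX  = 1u << 5;

bool has_flag(std::uint32_t reg, std::uint32_t mask)
{
    return (reg & mask) != 0;
}

// List the names of the EDX flags that are set, lowest bit first.
std::string describe_edx(std::uint32_t edx)
{
    struct Entry { std::uint32_t mask; const char* name; };
    static const Entry table[] = {
        { EDX_FPU, "FPU" }, { EDX_VME, "VME" }, { EDX_DE, "DE" },
        { EDX_PSE, "PSE" }, { EDX_APIC, "APIC" }, { EDX_HTT, "HTT" },
    };
    std::string out;
    for (const Entry& e : table) {
        if (has_flag(edx, e.mask)) {
            if (!out.empty()) out += ' ';
            out += e.name;
        }
    }
    return out;
}

// Usage example on hand-built register values.
void example()
{
    assert(describe_edx(EDX_FPU | EDX_APIC | EDX_HTT) == "FPU APIC HTT");
    assert(has_flag(0x20u, ECX_VMX));
}
```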

Multiprocessor Specification
• It’s an industry standard, allowing OS software to use multiple processors in a uniform way
• Software searches in three regions of the physical address-space below 1-megabyte for a “paragraph-aligned” data-structure of length 16 bytes called the MP Floating Pointer Structure:
    • Search in lowest KB of Extended BIOS Data Area
    • Search in topmost KB of conventional 640K RAM
    • Search in the 64KB ROM-BIOS (0xF0000-0xFFFFF)
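The search itself is a simple loop over 16-byte boundaries. The sketch below scans a copy of one memory region held in an ordinary buffer; how that copy is obtained is left out. Two details are not on the slide and come from the Intel MultiProcessor Specification: the structure begins with the signature "_MP_", and its 16 bytes must sum to zero modulo 256. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scan a buffer holding one search region for the MP Floating Pointer
// Structure. The buffer is assumed to start at a paragraph-aligned
// physical address, so candidates lie at multiples of 16 bytes.
// Returns the byte offset of the first valid structure, or -1 if none.
long find_mp_floating_pointer(const std::uint8_t* region, std::size_t length)
{
    for (std::size_t off = 0; off + 16 <= length; off += 16) {
        const std::uint8_t* p = region + off;
        if (std::memcmp(p, "_MP_", 4) != 0) continue;   // signature check
        std::uint8_t sum = 0;
        for (int i = 0; i < 16; ++i)
            sum = static_cast<std::uint8_t>(sum + p[i]);
        if (sum == 0) return static_cast<long>(off);    // checksum check
    }
    return -1;
}

// Usage example: plant a structure at offset 32 of a zeroed buffer.
void example()
{
    std::uint8_t buf[64] = { 0 };
    std::memcpy(buf + 32, "_MP_", 4);
    buf[32 + 8]  = 1;      // length, in 16-byte units
    buf[32 + 9]  = 4;      // specification revision
    buf[32 + 10] = 0xA0;   // checksum byte making the total zero
    assert(find_mp_floating_pointer(buf, sizeof buf) == 32);
}
```

The same function would be called once per region, in the order the slide lists them.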

MP Floating Pointer Structure
• This structure may contain an ID-number for one of a small number of standard SMP system architectures, or may contain the memory address for a more extensive MP Configuration Table whose entries specify a “more customized” system architecture
• Our classroom machines employ the latter of these two options
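Telling the two cases apart takes two fields of the 16-byte structure. The field offsets used in this sketch are taken from the Intel MultiProcessor Specification rather than from the slide: the table's physical address sits at offset 4 (little-endian), and the first feature byte at offset 11 is zero when a configuration table is present and otherwise holds the ID-number of a default configuration. The struct and function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

struct MpFloatingInfo {
    bool          uses_default_config;   // true: a standard architecture ID is given
    unsigned      default_config_type;   // that ID-number, or 0
    std::uint32_t config_table_address;  // physical address of the table, or 0
};

// Decode the two fields of interest from a 16-byte structure image.
MpFloatingInfo decode_mp_floating_pointer(const std::uint8_t* s)
{
    MpFloatingInfo info;
    // Offset 4: 32-bit little-endian physical address pointer.
    info.config_table_address =
          static_cast<std::uint32_t>(s[4])
        | (static_cast<std::uint32_t>(s[5]) << 8)
        | (static_cast<std::uint32_t>(s[6]) << 16)
        | (static_cast<std::uint32_t>(s[7]) << 24);
    // Offset 11: feature byte 1.
    info.default_config_type = s[11];
    info.uses_default_config = (s[11] != 0);
    return info;
}

// Usage example: a structure that points at a configuration table.
void example()
{
    std::uint8_t s[16] = { '_', 'M', 'P', '_', 0x34, 0x12, 0x0F, 0x00 };
    MpFloatingInfo info = decode_mp_floating_pointer(s);
    assert(!info.uses_default_config);
    assert(info.config_table_address == 0x000F1234u);
}
```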

The processor’s Local-APIC
• The purpose of each processor’s APIC is to allow CPUs in a multiprocessor system to transmit messages among one another and to manage the delivery of interrupts from the various peripheral devices to one or more CPUs in a dynamically determined way
• The Local-APIC has a variety of registers which are ‘memory mapped’ to paragraph-aligned addresses in the 4KB page at 0xFEE00000

Local-APIC’s register-space
[Diagram: the 4GB physical address-space, with RAM beginning at 0x00000000 and the APIC registers at 0xFEE00000]

Each CPU has its own timer!
• Four of the Local-APIC registers are used to implement a programmable timer
• It can privately deliver a periodic interrupt just to its own CPU
    • 0xFEE00320: Timer Vector register
    • 0xFEE00380: Initial Count register
    • 0xFEE00390: Current Count register
    • 0xFEE003E0: Divider Configuration register

Timer’s Local Vector Table
Register at 0xFEE00320:
    bit 17:   MODE (0 = one-shot, 1 = periodic)
    bit 16:   MASK (0 = unmasked, 1 = masked)
    bit 12:   BUSY (0 = not busy, 1 = busy)
    bits 7-0: Interrupt ID-number
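The register addresses and bit positions above translate directly into constants. The sketch below only computes the value that would be written to the Timer Vector register; it does not touch the memory-mapped registers themselves, which requires kernel-level code. All names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Addresses of the four timer registers in the Local-APIC page.
const std::uint32_t APIC_BASE          = 0xFEE00000u;
const std::uint32_t APIC_TIMER_VECTOR  = APIC_BASE + 0x320;
const std::uint32_t APIC_INITIAL_COUNT = APIC_BASE + 0x380;
const std::uint32_t APIC_CURRENT_COUNT = APIC_BASE + 0x390;
const std::uint32_t APIC_DIVIDER_CFG   = APIC_BASE + 0x3E0;

// Compose a Timer Vector register value from its fields.
std::uint32_t make_timer_lvt(std::uint8_t vector, bool periodic, bool masked)
{
    std::uint32_t value = vector;        // bits 7-0: interrupt ID-number
    if (masked)   value |= 1u << 16;     // bit 16: MASK
    if (periodic) value |= 1u << 17;     // bit 17: MODE
    return value;
}

// Read back the BUSY bit (bit 12) from a register value.
bool timer_busy(std::uint32_t lvt)
{
    return ((lvt >> 12) & 1u) != 0;
}

// Usage example: periodic, unmasked interrupt with ID-number 0x20.
void example()
{
    assert(make_timer_lvt(0x20, true, false) == 0x00020020u);
    assert(APIC_TIMER_VECTOR == 0xFEE00320u);
}
```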

In-class exercise
• Run the ‘cpuid.cpp’ Linux application (on our course website) to see if the CPUs in our classroom implement HyperThreading (i.e., multiple logical processors within one CPU)
• Then run the ‘smpinfo.cpp’ application, to see if the MP Base Configuration Table has entries for more than one processor
• If both results hold true, then we can write our own multiprocessing software in here!

In-class exercise #2
• Run the ‘apictick.s’ demo (on our website) to observe the APIC’s periodic interrupt drawing bytes onto the screen
• It executes for ten milliseconds (the 8254 is used to create this timed delay)
• Try reprogramming the APIC’s Divider Configuration register, to cut the interrupt frequency in half (or to double it)
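For the last step it helps to know how a divisor is encoded. The encoding in this sketch comes from Intel’s Software Developer’s Manual, not from the slides: the divisor is selected by a 3-bit code stored in bits 3, 1 and 0 of the register, with bit 2 left at zero. The function name is hypothetical, and the demo’s own register setup may differ.

```cpp
#include <cassert>

// Translate a timer divisor into a Divider Configuration register value.
// Returns -1 for a divisor the register cannot express.
int encode_timer_divider(unsigned divisor)
{
    unsigned code;
    switch (divisor) {
        case 2:   code = 0; break;
        case 4:   code = 1; break;
        case 8:   code = 2; break;
        case 16:  code = 3; break;
        case 32:  code = 4; break;
        case 64:  code = 5; break;
        case 128: code = 6; break;
        case 1:   code = 7; break;
        default:  return -1;
    }
    // Low two code bits go to bits 1-0; the high code bit goes to bit 3.
    return static_cast<int>(((code & 4u) << 1) | (code & 3u));
}

// Usage example.
void example()
{
    assert(encode_timer_divider(1) == 0xB);
    assert(encode_timer_divider(16) == 0x3);
    assert(encode_timer_divider(3) == -1);
}
```

Since the timer counts down at the input clock rate divided by this divisor, choosing the next larger divisor halves the interrupt frequency and the next smaller one doubles it.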
