COS2014 IA-32 Processor Architecture
Overview Goal: Understand IA-32 architecture • Basic Concepts of Computer Organization • Instruction execution cycle • Basic computer organization • Data storage in memory • How programs run • IA-32 Processor Architecture • IA-32 Memory Management • Components of an IA-32 Microcomputer • Input-Output System
Recall: Computer Model for ASM • Goal: x ← a + b • Assembly code: MOV AX, a; ADD AX, b; MOV x, AX • [Figure: memory holding a, b, and x; CPU with registers AX, BX, …, PC and an ALU (+, −)]
Meanings of the Code (assumed) • Machine code format: opcode, then two operands (register address or memory address) • Opcodes: 01 = MOV, 11 = ADD • 01 0000001 1000010 = MOV AX, a (take the data stored in memory address ‘a’ and move it to register AX) • 11 0000001 1000110 = ADD AX, b (take the data stored in memory address ‘b’ and add it to register AX) • 01 1000011 0000001 = MOV x, AX (take the data stored in register AX and move it to memory address ‘x’)
Another Computer Model for ASM • Stored-program architecture: memory holds both the data (a, b, x) and the program • Program in memory: 01 0000001 1000010 (MOV AX, a), 11 0000001 1000110 (ADD AX, b), 01 1000011 0000001 (MOV x, AX) • Processor: registers (AX, BX, …), ALU, IR, and PC • PC: program counter • IR: instruction register
Step 1: Fetch (MOV AX, a) • PC = 0000111 holds the address of the instruction • The instruction 01 0000001 1000010 is copied from memory into IR
Step 2: Decode (MOV AX, a) • The controller, driven by the clock, decodes the instruction held in IR (01 0000001 1000010) • PC = 0000111
Step 3: Execute (MOV AX, a) • The value of a (00000000 00000001) is copied from memory into AX • AX = 00000000 00000001
Step 1: Fetch (ADD AX, b) • PC = 0001000 holds the address of the instruction • The instruction 11 0000001 1000110 is copied from memory into IR • AX still holds 00000000 00000001
Step 2: Decode (ADD AX, b) • The controller, driven by the clock, decodes the instruction held in IR (11 0000001 1000110) • PC = 0001000
Step 3a: Execute (ADD AX, b) • The ALU adds AX (00000000 00000001) and b (00000000 00000010) • ALU output: 00000000 00000011
Step 3b: Write Back (ADD AX, b) • The ALU output (00000000 00000011) is written to AX • a in memory still holds 00000000 00000001
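The fetch, decode, execute, and write-back steps above can be sketched as a small simulator. This is only an illustration of the slides' assumed encoding (2-bit opcode followed by two 7-bit operands). The rule that an operand with its top bit set is a memory address, and the names `encode`, `is_memory`, and `run`, are assumptions made for this sketch.

```python
# Toy stored-program machine modeled on the slides' assumed encoding.
MOV, ADD = 0b01, 0b11                       # opcodes from the slides
AX = 0b0000001                              # register code from the slides
A, B, X = 0b1000010, 0b1000110, 0b1000011   # memory addresses of a, b, x


def encode(opcode, dst, src):
    """Pack opcode (2 bits), destination (7 bits), and source (7 bits)."""
    return (opcode << 14) | (dst << 7) | src


def is_memory(operand):
    # Assumption for this sketch: top bit set means "memory address".
    return operand & 0b1000000 != 0


def run(memory, pc, count):
    """Execute `count` instructions starting at address `pc`."""
    registers = {AX: 0}
    for _ in range(count):
        ir = memory[pc]                     # fetch into the instruction register
        pc += 1
        opcode = ir >> 14                   # decode the three fields
        dst = (ir >> 7) & 0x7F
        src = ir & 0x7F
        value = memory[src] if is_memory(src) else registers[src]
        if opcode == ADD:                   # execute in the "ALU"
            current = memory[dst] if is_memory(dst) else registers[dst]
            value = (current + value) & 0xFFFF
        elif opcode != MOV:
            raise ValueError(f"unknown opcode {opcode:02b}")
        if is_memory(dst):                  # write back
            memory[dst] = value
        else:
            registers[dst] = value
    return registers, memory, pc


memory = {
    A: 1, B: 2, X: 0,
    0b0000111: encode(MOV, AX, A),
    0b0001000: encode(ADD, AX, B),
    0b0001001: encode(MOV, X, AX),
}
registers, memory, pc = run(memory, 0b0000111, 3)
print(registers[AX], memory[X])
```

After the three instructions run, both AX and memory location x hold a + b.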
Basic Computer Organization • Clock synchronizes CPU operations • Control unit (CU) coordinates execution sequence • ALU performs arithmetic and bitwise processing
Clock • Operations in a computer are triggered, and thus synchronized, by a clock • The clock tells “when” (no need for components to ask each other!): • When to put data on output lines • When to read data from input lines • A clock cycle measures the time of a single operation • It must be long enough to allow signal propagation
Instruction/Data for Operations • Where do the instructions needed for computer operations come from? • Stored-program architecture: • The whole program is stored in main memory, including program instructions (code) and data • CPU loads the instructions and data from memory for execution • Don’t worry about the disk for now • Where are the data needed for execution? • Registers (inside the CPU, discussed later) • Memory • Constants encoded in the instructions
Memory • Organized like mailboxes, numbered 0, 1, 2, 3, …, 2^n − 1 • Each box can hold 8 bits (1 byte) • So it is called byte-addressing • Address of mailboxes: • A 16-bit address is enough for up to 64 KB • 20-bit for 1 MB • 32-bit for 4 GB • Most servers need more than 4 GB! That’s why we need 64-bit CPUs like Alpha (DEC/Compaq/HP) or Merced (Intel) …
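The address sizes listed above follow from the fact that n address bits can number 2^n byte-sized mailboxes. A minimal sketch (the function name `addressable_bytes` is made up for this example):

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** address_bits


for bits in (16, 20, 32):
    count = addressable_bytes(bits)
    print(f"{bits}-bit address: {count} bytes, highest address {count - 1:#x}")
```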
Storing Data in Memory • Character string: • So how are strings like “Hello, World!” stored in memory? • ASCII code! (or Unicode, etc.) • Each character is stored as a byte • Review: how is “1234” stored in memory? • Integer: • A byte can hold an integer number: • between 0 and 255 (unsigned) or • between –128 and 127 (2’s complement) • How to store a bigger number? • Review: how is 1234 stored in memory?
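The two review questions have different answers, which a short sketch can make concrete. The string “1234” takes one ASCII byte per character, while the integer 1234 needs two bytes. The variable names here are chosen for illustration only.

```python
# The string "1234": one ASCII byte per character.
text_bytes = "1234".encode("ascii")
print([hex(b) for b in text_bytes])

# The integer 1234: count how many whole bytes its binary form needs.
number = 1234
min_bytes = (number.bit_length() + 7) // 8
print(min_bytes)

# The same byte pattern read as unsigned or as 2's complement.
as_unsigned = int.from_bytes(b"\xff", "big", signed=False)
as_signed = int.from_bytes(b"\xff", "big", signed=True)
lowest_signed = int.from_bytes(b"\x80", "big", signed=True)
print(as_unsigned, as_signed, lowest_signed)
```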
Big or Little Endian? • Example: 1234 is stored in 2 bytes • 1234 = 100 1101 0010 in binary = 04 D2 in hexadecimal • Do you store 04 or D2 first? • Big Endian: 04 first • Little Endian: D2 first (Intel’s choice) • Reason: more consistent for variable length (e.g., 2 bytes, 4 bytes, 8 bytes, etc.)
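The two byte orders can be checked directly with Python's built-in `int.to_bytes`. The last line hints at the consistency argument: in little-endian order the low byte D2 stays at the lowest address no matter how wide the value is.

```python
value = 1234  # 0x04D2

big = value.to_bytes(2, "big")          # most significant byte first
little = value.to_bytes(2, "little")    # least significant byte first
little_4 = value.to_bytes(4, "little")  # widened to 4 bytes

print(big.hex(), little.hex(), little_4.hex())
```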
Cache Memory • High-speed expensive static RAM both inside and outside the CPU. • Level-1 cache: inside the CPU chip • Level-2 cache: often outside the CPU chip • Cache hit: when data to be read is already in cache memory • Cache miss: when data to be read is not in cache memory
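Cache hits and misses can be illustrated with a toy model. This sketch assumes a tiny cache that holds whole addresses and evicts the least recently used entry. The slides do not specify any replacement policy, and the class `ToyCache` is hypothetical.

```python
from collections import OrderedDict


class ToyCache:
    """Tiny cache in front of a slower backing store (a dict here)."""

    def __init__(self, backing, capacity):
        self.backing = backing
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:            # cache hit: data already cached
            self.hits += 1
            self.lines.move_to_end(address)  # mark as most recently used
            return self.lines[address]
        self.misses += 1                     # cache miss: go to main memory
        value = self.backing[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)   # evict the least recently used entry
        return value


main_memory = {address: address * 2 for address in range(16)}
cache = ToyCache(main_memory, capacity=2)
values = [cache.read(address) for address in (3, 3, 5, 3, 7, 5)]
print(values, cache.hits, cache.misses)
```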
Load and Execute Process • OS searches for program’s filename in current directory and then in directory path • If found, OS reads information from directory • OS loads file into memory from disk • OS allocates memory for program information • OS executes a branch to cause CPU to execute the program. A running program is called a process • Process runs by itself. OS tracks execution and responds to requests for resources • When the process ends, its handle is removed and memory is released How? OS is only a program!
Multitasking • OS can run multiple programs at the same time • Multiple threads of execution within the same program • Scheduler utility assigns a given amount of CPU time to each running program • Rapid switching of tasks • Gives the illusion that all programs are running at the same time • Processor must support task switching • What support is needed from the hardware?
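The idea of giving each running program a fixed amount of CPU time and switching rapidly can be sketched as round-robin scheduling. Round-robin is one common policy and is assumed here; the slides do not name a specific one. The function `round_robin` and the task names are made up for illustration.

```python
from collections import deque


def round_robin(tasks, time_slice):
    """tasks maps a name to its remaining time units; returns (name, units_run) pairs."""
    queue = deque(tasks.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(time_slice, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            # Not finished: go to the back of the queue (a task switch).
            queue.append((name, remaining - ran))
    return timeline


timeline = round_robin({"editor": 3, "compiler": 5}, time_slice=2)
print(timeline)
```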
What's Next • General Concepts • IA-32 Processor Architecture • Modes of operation • Basic execution environment • Floating-point unit • Intel microprocessor history • IA-32 Memory Management • Components of an IA-32 Microcomputer • Input-Output System
Modes of Operation • Protected mode • native mode (Windows, Linux) • Programs are given separate memory areas named segments • Real-address mode • native MS-DOS • System management mode • power management, system security, diagnostics • Virtual-8086 mode • hybrid of protected mode and real-address mode • each program has its own 8086 computer
Basic Execution Environment Address space: • Protected mode • 4 GB • 32-bit address • Real-address and Virtual-8086 modes • 1 MB space • 20-bit address
Basic Execution Environment • Program execution registers: named storage locations inside the CPU, optimized for speed • [Figure: CPU with registers, controller, ALU with N and Z flags, IR, PC, and clock, connected to memory]
General Purpose Registers • Used for arithmetic and data movement • Addressing: • AX, BX, CX, DX: 16 bits • Split into H and L parts, 8 bits each • Extended into E?X to become 32-bit register (i.e., EAX, EBX,…etc.)
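The way a 32-bit register contains its 16-bit and 8-bit parts can be shown with bit masks. This is a sketch of the layout only; `split_register` and `write_al` are names invented for the example.

```python
def split_register(eax):
    """Return the AX, AH, and AL views of a 32-bit EAX value."""
    ax = eax & 0xFFFF        # low 16 bits
    ah = (ax >> 8) & 0xFF    # high byte of AX
    al = ax & 0xFF           # low byte of AX
    return ax, ah, al


def write_al(eax, value):
    """Writing AL changes only the lowest 8 bits of EAX."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


ax, ah, al = split_register(0x12345678)
print(hex(ax), hex(ah), hex(al), hex(write_al(0x12345678, 0xFF)))
```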
Index and Base Registers • Some registers have only a 16-bit name for their lower half: • ESI: SI • EDI: DI • EBP: BP • ESP: SP
Some Specialized Register Uses • General purpose registers • EAX: accumulator, automatically used by multiplication and division instructions • ECX: loop counter • ESP: stack pointer • ESI, EDI: index registers (source, destination) for memory transfer, e.g. a[i,j] • EBP: frame pointer to reference function parameters and local variables on stack • EIP: instruction pointer (i.e., program counter); not a general purpose register
Some Specialized Register Uses • Segment registers • In real-address mode: indicate base addresses of preassigned memory areas named segments • In protected mode: hold pointers to segment descriptor tables • CS: code segment • DS: data segment • SS: stack segment • ES, FS, GS: additional segments • EFLAGS • Status and control flags (single binary bits) • Control the operation of the CPU or reflect the outcome of some CPU operation
Status Flags (EFLAGS) • Reflect the outcomes of arithmetic and logical operations performed by the CPU • Carry: unsigned arithmetic out of range • Overflow: signed arithmetic out of range • Sign: result is negative • Zero: result is zero • Auxiliary Carry: carry from bit 3 to bit 4 • Parity: the number of 1 bits in the low byte of the result is even
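The flag definitions above can be tried out on a small case. This sketch assumes an 8-bit addition (as with AL) and computes each status flag from the definitions on the slide; the function `add8_flags` is hypothetical.

```python
def add8_flags(a, b):
    """Add two 8-bit values and report the resulting status flags."""
    raw = a + b
    result = raw & 0xFF
    flags = {
        "CF": raw > 0xFF,                                  # unsigned out of range
        "OF": ((a ^ result) & (b ^ result) & 0x80) != 0,   # signed out of range
        "SF": (result & 0x80) != 0,                        # result is negative
        "ZF": result == 0,                                 # result is zero
        "AF": ((a & 0x0F) + (b & 0x0F)) > 0x0F,            # carry from bit 3 to bit 4
        "PF": bin(result).count("1") % 2 == 0,             # even number of 1 bits
    }
    return result, flags


wrap_result, wrap_flags = add8_flags(0xFF, 0x01)    # 255 + 1 wraps to 0
over_result, over_flags = add8_flags(0x7F, 0x01)    # 127 + 1 overflows signed range
print(wrap_result, wrap_flags)
print(over_result, over_flags)
```

The first case sets Carry but not Overflow, and the second sets Overflow but not Carry, which is the distinction between unsigned and signed range errors.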
System Registers Application programs cannot access system registers • IDTR (Interrupt Descriptor Table Register) • GDTR (Global Descriptor Table Register) • LDTR (Local Descriptor Table Register) • Task Register • Debug Registers • Control registers CR0, CR2, CR3, CR4 • Model-Specific Registers
Floating-Point, MMX, XMM Registers • Eight 80-bit floating-point data registers • ST(0), ST(1), . . . , ST(7) • arranged in a stack • used for all floating-point arithmetic • Eight 64-bit MMX registers • Eight 128-bit XMM registers for single-instruction multiple-data (SIMD) operations
Intel Microprocessors • Early microprocessors: • Intel 8080: • 64K addressable RAM, 8-bit registers • CP/M operating system • S-100 BUS architecture • 8-inch floppy disks! • Intel 8086/8088 • IBM-PC used 8088 • 1 MB addressable RAM, 16-bit registers • 16-bit data bus (8-bit for 8088) • separate floating-point unit (8087) This is where “real-address mode” comes from!
Intel Microprocessors • The IBM-AT • Intel 80286 • 16 MB addressable RAM • Protected memory • Introduced IDE bus architecture • 80287 floating point unit • Intel IA-32 Family • Intel386: 4 GB addressable RAM, 32-bit registers, paging (virtual memory) • Intel486: instruction pipelining • Pentium: superscalar, 32-bit address bus, 64-bit internal data path
Intel Microprocessors • Intel P6 Family • Pentium Pro: advanced optimization techniques in microcode • Pentium II: MMX (multimedia) instruction set • Pentium III: SIMD (streaming extensions) instructions • Pentium 4 and Xeon: Intel NetBurst micro-architecture, tuned for multimedia
What’s Next • General Concepts of Computer Architecture • IA-32 Processor Architecture • IA-32 Memory Management • Real-address mode • Calculating linear addresses • Protected mode • Multi-segment model • Paging • Components of an IA-32 Microcomputer • Input-Output System Understand it from the view point of the processor
Real-address Mode • Programs are assigned 1 MB (2^20 bytes) of memory • Programs can access any area of memory • Can run only one program at a time • Segmented memory scheme • 16-bit segment * 10h + 16-bit offset = 20-bit linear (or absolute) address • Segment value in CS, SS, DS, ES • Offset value in IP, SP, BX, SI, DI
Example • Accessing a variable in the data segment • DS (data segment) = 0A43h (16 bits) • BX (offset) = 0030h (16 bits) • 0A43h * 10h = 0A430h (20 bits) • 0A430h + 0030h = 0A460h (linear address)
Segmented Memory • Segmented memory addressing: the absolute (linear) address is a 16-bit segment value (in CS, DS, SS, or ES), shifted left by 4 bits, added to a 16-bit offset • [Figure: one segment within the linear address range, represented as segment value 8000 and offset 0000]
Calculating Linear Addresses • Given a segment address, multiply it by 16 (append a hexadecimal zero) and add it to the offset • All done by the processor • Example: convert 08F1:0100 to a linear address • Adjusted segment value: 08F10 • Add the offset: 0100 • Linear address: 09010
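The segment-times-16-plus-offset rule can be checked against the two worked examples (0A43:0030 and 08F1:0100). A minimal sketch; the function name `linear_address` is made up, and the 20-bit mask reflects the 8086 having only 20 address lines.

```python
def linear_address(segment, offset):
    """Real-address mode: shift the segment left 4 bits (times 10h) and add the offset."""
    return ((segment << 4) + offset) & 0xFFFFF  # keep 20 bits


print(f"{linear_address(0x0A43, 0x0030):05X}")
print(f"{linear_address(0x08F1, 0x0100):05X}")
```

Note that different segment:offset pairs can name the same linear address, for example 08F1:0100 and 0901:0000.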
Protected Mode • Designed for multitasking • Each process (running program) is assigned a total of 4 GB of addressable RAM • Two parts: • Segmentation: provides a mechanism for isolating individual code, data, and stack so that multiple programs can run without interfering with one another • Paging: provides demand-paged virtual memory, where sections of a program’s execution environment are moved into physical memory as needed • Paging gives segmentation the illusion that it has 4 GB of physical memory
Segmentation in Protected Mode • Segment: a logical unit of storage (not the same as the “segment” in real-address mode) • e.g., code/data/stack of a program, system data structures • Variable size • Processor hardware provides protection • All segments in the system are in the processor’s linear address space (physical space if without paging) • Need to specify: base address, size, type, … → segment descriptor and descriptor table • linear address = base address + offset
Flat Segment Model • Use a single global descriptor table (GDT) • All segments (at least 1 code and 1 data) mapped to entire 32-bit address space
Multi-Segment Model • Local descriptor table (LDT) for each program • One descriptor for each segment located in a system segment of LDT type
Segmentation Addressing • Program references a memory location with a logical address: segment selector + offset • Segment selector: provides an offset into the descriptor table • CS/DS/SS points to descriptor table for code/data/stack segment
Convert Logical to Linear Address Segment selector points to a segment descriptor, which contains base address of the segment. The 32-bit offset from the logical address is added to the segment’s base address, generating a 32-bit linear address
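The conversion described above can be sketched with a simplified descriptor table. Real IA-32 descriptors pack many more fields; this sketch keeps only a base and a limit. It relies on the standard selector layout (table index in bits 3 to 15, table indicator in bit 2). The names `SegmentDescriptor` and `to_linear`, and the example table contents, are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class SegmentDescriptor:
    base: int    # linear address where the segment starts
    limit: int   # highest valid offset within the segment


def to_linear(selector, offset, gdt, ldt):
    """Translate a logical address (selector:offset) into a 32-bit linear address."""
    index = selector >> 3                       # descriptor index within the table
    table = ldt if selector & 0b100 else gdt    # table indicator bit picks LDT or GDT
    descriptor = table[index]
    if offset > descriptor.limit:
        # Real hardware raises a protection fault; an exception stands in for it here.
        raise ValueError("offset exceeds segment limit")
    return (descriptor.base + offset) & 0xFFFFFFFF


gdt = [
    None,                                       # entry 0 is the unused null descriptor
    SegmentDescriptor(base=0x00000000, limit=0xFFFFFFFF),   # flat 4 GB segment
    SegmentDescriptor(base=0x00400000, limit=0x0000FFFF),   # small 64 KB segment
]
flat = to_linear(0x08, 0x1234, gdt, [])
small = to_linear(0x10, 0x1234, gdt, [])
print(hex(flat), hex(small))
```

With the flat descriptor, the linear address equals the offset, which is why the flat segment model makes segmentation nearly invisible to programs.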
Paging • Supported directly by the processor • Divides each segment into 4096-byte blocks called pages • Part of running program is in memory, part is on disk • Sum of all programs can be larger than physical memory • Virtual memory manager (VMM): An OS utility that manages loading and unloading of pages • Page fault: issued by processor when a page must be loaded from disk
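With 4096-byte pages, the low 12 bits of a linear address are the offset within a page and the remaining bits select the page. The sketch below uses a single dictionary as the page table, which is a simplification: IA-32 hardware walks a two-level structure (page directory, then page table). The names `split_address` and `translate` and the table contents are hypothetical.

```python
PAGE_SIZE = 4096  # bytes per page, as on the slide


def split_address(linear):
    """Split a linear address into (page number, offset within the page)."""
    return linear // PAGE_SIZE, linear % PAGE_SIZE


def translate(linear, page_table):
    """Map a linear address to a physical one, or signal a page fault."""
    page, offset = split_address(linear)
    frame = page_table.get(page)
    if frame is None:
        # The page is not in physical memory; the OS would now load it from disk.
        raise LookupError(f"page fault on page {page:#x}")
    return frame * PAGE_SIZE + offset


page_table = {0x401: 0x7}   # page 0x401 currently lives in physical frame 7
physical = translate(0x00401234, page_table)
print(split_address(0x00401234), hex(physical))
```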
What's Next • General Concepts • IA-32 Processor Architecture • IA-32 Memory Management • Components of an IA-32 Microcomputer • Skipped … • Input-Output System
What's Next • General Concepts • IA-32 Processor Architecture • IA-32 Memory Management • Components of an IA-32 Microcomputer • Input-Output System • How to access I/O systems?