Real instruction set architectures Part 2: internal CPU storage, overview of Intel architectures
Big-Endian vs. Little-Endian: quick recap • In a big-endian machine, the bytes used to store a data item are arranged left to right, so that the most significant byte is found at the leftmost position (the lowest address, the “big end”) • Little-endian is just the opposite; bytes are arranged right to left, so the least significant byte is at the lowest address (the “little end”) and the most significant byte comes last • Note that, in either case, the bits within each byte are arranged left to right, so a little-endian integer is the big-endian integer with its bytes reversed, not with all of its bits reversed
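As a small illustration of the recap above, the following Python sketch lays out one integer in both byte orders and shows that reversing bytes is not the same as reversing bits. The value `0x12345678` and the helper `reverse_bits` are arbitrary choices made for this example, not part of the original slides.

```python
# A minimal sketch: the same 32-bit integer in big- and little-endian layouts.
value = 0x12345678  # arbitrary example value (an assumption for illustration)

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print(big.hex(" "))     # 12 34 56 78
print(little.hex(" "))  # 78 56 34 12

# Reversing the order of the bytes converts one layout into the other.
bytes_reversed = big[::-1]


def reverse_bits(data: bytes) -> bytes:
    """Reverse every bit of the whole item (hypothetical helper)."""
    width = 8 * len(data)
    bits = format(int.from_bytes(data, "big"), f"0{width}b")
    return int(bits[::-1], 2).to_bytes(len(data), "big")


# Reversing all the bits yields a different pattern from the little-endian form.
bit_reversed = reverse_bits(big)
print(bit_reversed.hex(" "))  # 1e 6a 2c 48
```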
Byte ordering & data movement • Network protocols such as TCP/IP are big endian (“network byte order”): • Little endian machines must convert integers (e.g. network device addresses) to big endian before they can be passed over the network • Little endian machines must also convert integers retrieved from the network to the machine’s native byte order
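The conversion described above can be sketched with Python's standard `struct` module, where the `!` prefix requests network (big-endian) byte order. The port number and address below are made-up values for illustration only.

```python
import struct
import sys

# Hypothetical values: a 16-bit port and a 32-bit IPv4 address (192.168.0.1).
port = 8080
addr = 0xC0A80001

# "!" selects network byte order (big-endian) with no padding.
wire = struct.pack("!HI", port, addr)
print(wire.hex(" "))  # 1f 90 c0 a8 00 01

# The receiver unpacks with the same format, whatever its native order is.
received_port, received_addr = struct.unpack("!HI", wire)

# On a little-endian host the native layout differs from the wire layout,
# which is why the explicit conversion is needed.
native = struct.pack("=HI", port, addr)
print(sys.byteorder, native == wire)
```

Packing with an explicit byte-order prefix means the same code produces the same bytes on any host, so no separate "is this machine little endian?" check is needed.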
Byte ordering & data movement • Any program that reads/writes file data must be aware of byte ordering • For example, Windows BMPs were developed on a little endian machine; an application on a big endian machine that reads a BMP must reverse the byte order of multi-byte fields • Photoshop, JPEG, MacPaint, Sun raster files: big endian • GIF, PC Paintbrush, RTF: little endian
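To make the BMP example concrete, here is a hedged sketch that builds a synthetic 14-byte BMP file header in memory and reads it back with the right and the wrong byte order. The field values (a 70-byte file with pixel data at offset 54) are invented for the example; no real file is read.

```python
import struct

# Synthetic BMP file header (values are assumptions for illustration):
# 2-byte signature, 4-byte file size, two 2-byte reserved fields,
# 4-byte offset to the pixel data, all stored little-endian.
header = struct.pack("<2sIHHI", b"BM", 70, 0, 0, 54)

# Correct: interpret the fields as little-endian.
sig, size_le, _, _, offset_le = struct.unpack("<2sIHHI", header)

# Incorrect: a big-endian reader that forgets to swap bytes.
_, size_be, _, _, offset_be = struct.unpack(">2sIHHI", header)

print(sig, size_le, offset_le)  # b'BM' 70 54
print(size_be == 70 << 24)      # True: the size is misread as a huge number
```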
Internal CPU storage • 3 choices for data storage in CPU: • Stack architecture: • Use stack to execute instructions; operands stored at top of stack • No random access • Accumulator architecture: • Minimum of internal complexity; short instructions • One (implicit) operand stored in accumulator • Involves high volume of memory traffic • General Purpose register: see next slide
General Purpose Register (GPR) • Set (>1) of GPRs • Most common architecture in use today • Registers are faster than memory, and register data is easier for other parts of the CPU to handle than data from memory • As hardware gets cheaper, CPUs tend to include more registers • GPRs mean longer instructions, because the register(s) must be specified; longer instructions take more time to fetch and decode
Classification of GPR architectures • Memory to memory (VAX): • Instruction uses 2-3 operands, stored in memory • Instructions can perform operations without involving registers • Register to memory (Intel, Motorola): at least one operand must be in a register • Load-store (SPARC, MIPS, Alpha, PowerPC): Requires movement of data to registers before any operations performed
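The difference between a stack architecture and a load-store GPR architecture can be sketched with two toy interpreters that compute the same expression, `(a + b) * c`. The instruction names, register names, and memory layout here are hypothetical and do not correspond to any real instruction set.

```python
# Toy memory shared by both machines (values are arbitrary).
memory = {"a": 2, "b": 3, "c": 4, "result": 0}


def run_stack(program, mem):
    """Stack machine: operands are implicit, taken from the top of the stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(mem[args[0]])
        elif op == "pop":
            mem[args[0]] = stack.pop()
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "mul":
            stack.append(stack.pop() * stack.pop())
    return mem


def run_load_store(program, mem):
    """Load-store machine: arithmetic only touches explicitly named registers."""
    regs = {}
    for op, *args in program:
        if op == "load":      # load reg, variable
            regs[args[0]] = mem[args[1]]
        elif op == "store":   # store reg, variable
            mem[args[1]] = regs[args[0]]
        elif op == "add":     # add dst, src1, src2
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "mul":     # mul dst, src1, src2
            regs[args[0]] = regs[args[1]] * regs[args[2]]
    return mem


stack_program = [("push", "a"), ("push", "b"), ("add",),
                 ("push", "c"), ("mul",), ("pop", "result")]
load_store_program = [("load", "r1", "a"), ("load", "r2", "b"),
                      ("load", "r3", "c"), ("add", "r1", "r1", "r2"),
                      ("mul", "r1", "r1", "r3"), ("store", "r1", "result")]

stack_result = run_stack(stack_program, dict(memory))["result"]
load_store_result = run_load_store(load_store_program, dict(memory))["result"]
print(stack_result, load_store_result)  # 20 20
```

Note how the stack program's arithmetic instructions carry no operands, while every load-store instruction must name its registers, which is the source of the longer instructions mentioned for GPR machines.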
Operand number / instruction length • Instructions can be formatted 2 ways: • Fixed-length: fast to decode, but wastes space • Variable-length: more complex to decode, but saves space • Real-life compromise often involves 2-3 instruction lengths (each format has a fixed length, but the length varies between formats)
Some historical architectures • VAX: Digital’s line of midsize computers, dominant in academia in the 70s and 80s • Characteristics: • Variable-length instructions; anywhere from 2 to 5 operands • Full set of addressing modes: operands can be anywhere; single instruction could take up to 31 bytes • “High level” instructions: complexity built into instruction set to make programmers’ task easier • Extensive set of data types at machine level
Some historical architectures • Motorola’s 68000 series • Initial Apple Macintosh, early Sun workstations • Variable-length instructions: 0-2 operands • Wide variety of addressing modes (but not as many as VAX) • Could not start an instruction until previous one was completed
Intel architectures • 8086 chip: introduced in 1978 • Handled 16-bit data, 20-bit addresses • Could address 2^20 = 1,048,576 bytes (1 MB) of memory • CPU split into 2 parts: • Execution unit: contained GPRs & ALU • Bus interface unit: included instruction queue, segment registers, instruction pointer (SR & IP are special-purpose registers)
8086 GPRs • AX: accumulator • BX: base register: could be used to extend addressing • CX: count register • DX: data register • Some 8086 instructions require use of specific GPR, but in general, could use any of these to hold data
Byte-level addressing • Each GPR addressable at word or byte level • For example, AX divided into: • AH (contains the high-order, most significant byte) • AL (contains the low-order, least significant byte) • Same for BX, CX, DX
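The relationship between AX and its two halves can be sketched with ordinary shifts and masks. The helpers `set_al` and `set_ah` are hypothetical names used only to model the effect of writing one half of the register.

```python
# A minimal model of AX as a 16-bit value with byte-addressable halves.
ax = 0x12AB  # arbitrary example contents

ah = (ax >> 8) & 0xFF  # high-order byte
al = ax & 0xFF         # low-order byte
print(hex(ah), hex(al))  # 0x12 0xab


def set_al(ax: int, value: int) -> int:
    """Model a write to AL: the high byte (AH) is left unchanged."""
    return (ax & 0xFF00) | (value & 0xFF)


def set_ah(ax: int, value: int) -> int:
    """Model a write to AH: the low byte (AL) is left unchanged."""
    return (ax & 0x00FF) | ((value & 0xFF) << 8)


ax_after_al = set_al(ax, 0xFF)
ax_after_ah = set_ah(ax, 0x00)
print(hex(ax_after_al), hex(ax_after_ah))  # 0x12ff 0xab
```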
Other registers in 8086 • Pointer registers: • SP: stack pointer: used as offset into stack • BP: base pointer: used to reference parameters pushed on stack; indicates lowest value SP can reach • IP: holds address of next instruction (like Pep/8’s PC) • Index registers: • SI: source index; used as source pointer for string operations • DI: destination index; used as destination pointer for string operations • Both SI & DI sometimes used to supplement GPRs
Other registers in 8086 • Status flags register: bits indicate CPU status & results (overflow, carry, negative, etc.) • Segment registers • 8086 assembly language programs divided into specialized blocks of code called segments • Each segment holds specific types of information
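As an illustration of how status flags summarize a result, the sketch below models a 16-bit addition and derives four flags from it. The function `add16` and the dictionary layout are assumptions made for this example; the "negative" indicator mentioned above is modeled as the sign flag, the name Intel uses for it.

```python
def add16(a: int, b: int):
    """Add two 16-bit values and report the resulting status flags (a sketch)."""
    full = a + b
    result = full & 0xFFFF  # keep only the low 16 bits
    flags = {
        "carry": full > 0xFFFF,           # unsigned result did not fit
        "zero": result == 0,
        "sign": bool(result & 0x8000),    # top bit set: negative if signed
        # Signed overflow: operands share a sign that the result does not.
        "overflow": bool(~(a ^ b) & (a ^ result) & 0x8000),
    }
    return result, flags


# Largest positive signed value plus one: signed overflow, no carry.
r1, f1 = add16(0x7FFF, 0x0001)
# Largest unsigned value plus one: wraps to zero with a carry.
r2, f2 = add16(0xFFFF, 0x0001)
print(hex(r1), f1)
print(hex(r2), f2)
```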
8086 Segments • Code segment: program itself (instructions) • Data segment: program data • Stack segment: program’s runtime stack (for procedure calls)
8086 segments • To access information in a segment, had to specify item’s offset from segment start • The starting addresses of the segments had to be stored somewhere; they were kept in segment registers: • CS: code segment • DS: data segment • SS: stack segment • ES: extra segment (used by some string operations to handle memory addressing) • Addresses specified in segment/offset form: XXX:YYY where XXX is the value stored in a segment register, and YYY is the offset from the start of the segment
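The slide gives the segment/offset notation but not the arithmetic behind it. On the 8086, the 16-bit segment value is shifted left by 4 bits (multiplied by 16) and added to the 16-bit offset to form a 20-bit physical address. The sketch below models that calculation; the function name and the sample addresses are chosen for illustration.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and 16-bit offset into a 20-bit address."""
    # Shift the segment left by 4 bits, add the offset, and keep 20 bits,
    # since the 8086 has only 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF


# Two different segment:offset pairs can name the same physical byte.
first = physical_address(0x1234, 0x0010)
second = physical_address(0x1235, 0x0000)
print(hex(first), hex(second))  # 0x12350 0x12350

# Addresses past the top of the 1 MB space wrap around to zero.
wrapped = physical_address(0xFFFF, 0x0010)
print(hex(wrapped))  # 0x0
```

Because each segment starts on a 16-byte boundary and spans 64 KB, segments overlap, which is why many segment:offset pairs map to the same location.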
Evolution of Intel platform • Basic 8086 ISA used in many successor chips: • 8087 • Introduced in 1980 • Added floating-point instructions, 80-bit stack • 80286 • Introduced 1982 • Could address up to 16 MB of memory
Evolution of Intel platform • 80386 • Could address 4 GB of RAM • 32-bit chip, with 32-bit bus, 32-bit word • To achieve backward compatibility, Intel kept same basic architecture, register sets • Used new naming convention in registers: EAX, EBX, etc. were 32-bit (extended) versions of AX, BX, etc.; could still access original 16-bit registers (and their byte components) using original names
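The memory limits quoted for these chips follow directly from the width of their addresses: an n-bit address can name 2^n bytes. The sketch below works through the arithmetic. The 20-bit figure for the 8086 comes from the slides; the 24-bit and 32-bit widths for the 80286 and 80386 are added here as the address sizes that correspond to the stated 16 MB and 4 GB limits.

```python
def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


# Address widths: 8086 from the slides; 80286 and 80386 added for this sketch.
chips = [("8086", 20), ("80286", 24), ("80386", 32)]

MB = 2 ** 20
for name, bits in chips:
    size = address_space_bytes(bits)
    print(f"{name}: {bits}-bit addresses -> {size:,} bytes ({size // MB:,} MB)")
```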
Evolution of Intel platform • 80486 • Added high-speed cache memory for performance improvement • Integrated math co-processor • Pentium™ series • Intel quit using numbers: couldn’t trademark them • 32-bit registers, 64-bit bus • Employed superscalar design, with multiple ALUs; could run instructions in parallel, handling more than one instruction per clock cycle
Pentium™ series • Pentium Pro added branch prediction • Pentium II added MMX • Pentium III added increased support for 3D graphics using floating-point instructions • Pentium 4: 1.4 GHz and higher clock rates; 42 million transistors per CPU; 400 MHz (and faster) system bus; refinements to cache & floating-point operations
Pentium™ series • Itanium: Intel’s first 64-bit chip • Employs hardware emulator to maintain backward compatibility with x86 • 4 integer ALUs, 4 floating-point ALUs, 4 cache levels, 128 registers each for integers and floating-point numbers • Multiple miscellaneous registers for dealing with efficient instruction loading for branching • Addresses up to 16 GB of RAM
CISC vs. RISC • CISC: complex instruction set computing • Employed by Intel up through Pentium Pro • Pentium II and III used combined CISC/RISC: CISC architecture with RISC core that could translate CISC instructions to RISC • RISC: reduced instruction set computing • CISC emphasizes complexity in hardware, simplicity in software; RISC is opposite • RISC is generally considered superior in performance