ECE 371 Microprocessors
Chapter 10 Alpha Processor Architecture
Herbert G. Mayer, PSU
Status 10/1/2015
For use at CCUT Fall 2015
Created by Jessica Bare, Summer 2015 at PSU
ECE 587, August 4, 2015
Alpha AXP Architecture
R. Sites, 'Alpha AXP architecture', Digital Technical Journal, Vol. 4, No. 4, Special Issue 1992
Presented by Jessica Bare
Introduction
• Originally known as Alpha AXP
• Design and development of the Alpha microprocessor was started in the late 1980s by Digital Equipment Corporation (DEC), as a general-purpose 64-bit successor to its long line of 32-bit VAX computers
• Specification published in 1992 in the Digital Technical Journal, the technical journal of Digital Equipment Corporation (DEC)
• Republished in 1993 in “Communications of the ACM” (CACM), where it reached a wider audience; now more frequently cited from CACM
• IP rights sold to Compaq in 1998, then acquired by Intel Corp. in 2001
• Since Alpha would compete with Intel’s newly developed Itanium 64-bit architecture, Intel effectively let the Alpha product die, for the benefit of its (and HP’s) Itanium architecture
Introduction
• HP purchased Compaq in 2001, with a commitment to support existing Alpha customers until 2006, later extended to 2007
• At that point a beautiful, regular, 64-bit RISC processor was sent to die
Alpha Processor AXP 21064
[Figure: DEC Alpha AXP 21064 microprocessor die photo]
Paper Background
• Published in 1992
• Digital Technical Journal
• Journal that documented what was being developed at Digital Equipment Corporation
[Figure: Cover of Digital Technical Journal, Volume 4, Number 4, Special Edition 1992]
Alpha AXP Architecture - Summer 2015
Who is Richard L. Sites?
• BS Mathematics at MIT
• Computer Architecture at University of North Carolina
• PhD Computer Science at Stanford
• IBM, HP, Burroughs, UC San Diego
• At the time the paper was published (1992):
  ○ Senior Consultant in the Semiconductor Engineering Group at Digital
  ○ VAX implementations
  ○ Binary Translation
  ○ Alpha AXP Architecture
• Currently works at Google
• As of March 2015 holds 38 patents
Why develop Alpha AXP?
• Grew out of a task force to explore ways to preserve the VAX VMS customer base through the 1990s
• RISC instruction set developed by Digital Equipment, designed to replace their VAX CISC instruction set
• Most prominently used in a variety of DEC workstations and servers, which eventually formed the basis for almost all of their mid-to-upper-scale lineup
[Figure: DEC VAX-11/780]
Paper Abstract
The Alpha AXP 64-bit computer architecture is designed for high performance and longevity. Because of the focus on multiple instruction issue, the architecture does not contain facilities such as branch delay slots, byte writes, and precise arithmetic exceptions. Because of the focus on multiple processors, the architecture does contain a careful shared-memory model, atomic-update primitive instructions, and relaxed read/write ordering. The first implementation of the Alpha AXP architecture is the world's fastest single-chip microprocessor. The DECchip 21064 runs multiple operating systems and runs native compiled programs that were translated from the VAX and MIPS architectures.
Paper Abstract
• Designed for high performance and longevity
• Multiple instruction issue
• Does not contain
  ○ Branch delay slots
  ○ Byte writes
  ○ Precise arithmetic exceptions
• Does contain
  ○ Careful shared-memory model
  ○ Atomic-update primitive instructions
  ○ Relaxed read/write ordering
• Runs multiple operating systems
• Runs native compiled programs that were translated from the VAX and MIPS architectures
[Figure: Henri Matisse, The Yellow Curtain]
“THUS IN ALL THESE CASES THE ROMANS DID WHAT ALL WISE PRINCES OUGHT TO DO; NAMELY, NOT ONLY TO LOOK TO ALL PRESENT TROUBLES, BUT ALSO TO THOSE IN THE FUTURE, AGAINST WHICH THEY PROVIDED WITH THE UTMOST PRUDENCE.”
- NICCOLÒ MACHIAVELLI, THE PRINCE
MAJOR GOALS
• High Performance
• Longevity
• Capability to run VMS and UNIX OS
• Easy Migration from MIPS and VAX
Architecture distinct from implementation
• “Computer architecture is defined as the attributes and behavior of a computer as seen by a machine-language programmer. This definition includes the instruction set, instruction formats, operation codes, addressing modes, and all registers and memory locations that may be directly manipulated by a machine-language programmer.”
• “Implementation is defined as the actual hardware structure, logic design, and data-path organization of a particular embodiment of the architecture.”
Instruction Set Architecture
• Class of ISA
• Memory Addressing
• Addressing Modes
• Types and Sizes of Operands
• Operations
• Control Flow Instructions
• Encoding
Class of ISA: Load / Store
• All data is moved between registers and memory without computation, and all computation is done between values in registers
• Provides a separate address unit so that loads and stores can execute alongside operate instructions
• Relaxed read/write ordering
Memory Addressing: Aligned
• Assumed that most memory operands would be aligned
• Alpha implementations impose a significant performance penalty when accessing quadword operands that are not naturally aligned
• Normal load or store instructions that specify an unaligned address take a precise data alignment trap to PALcode (which may do the access using two aligned accesses or report a fatal error, depending on the operating system design)
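The "two aligned accesses" idea on this slide can be sketched as follows. This is a minimal illustration, assuming memory is modeled as a little-endian Python `bytes` object; the function names are hypothetical and it is not actual PALcode.

```python
QUADWORD = 8  # bytes in an Alpha quadword

def is_naturally_aligned(addr: int, size: int = QUADWORD) -> bool:
    """An operand is naturally aligned when its address is a multiple of its size."""
    return addr % size == 0

def load_aligned_quad(mem: bytes, addr: int) -> int:
    """Model of an ordinary aligned quadword load (little-endian)."""
    assert is_naturally_aligned(addr)
    return int.from_bytes(mem[addr:addr + QUADWORD], "little")

def load_unaligned_quad(mem: bytes, addr: int) -> int:
    """Assemble an unaligned quadword from the two aligned quadwords covering it."""
    base = addr & ~(QUADWORD - 1)      # aligned address at or below addr
    shift = (addr - base) * 8          # bit offset of the operand in the low quadword
    low = load_aligned_quad(mem, base)
    if shift == 0:
        return low                     # already aligned: one access is enough
    high = load_aligned_quad(mem, base + QUADWORD)
    combined = (high << 64) | low      # 128-bit window over both quadwords
    return (combined >> shift) & (2**64 - 1)
```

The unaligned path costs two memory accesses plus shifting and masking, which is one way to see why the slide describes unaligned quadword accesses as significantly more expensive.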
PALCODE: PRIVILEGED ARCHITECTURE LIBRARY CODE
• PALcode
  ○ Set of privileged software subroutines
  ○ Access to hardware implementation
  ○ Different sets for different OS
• PALcode holds the underpinning of
  ○ Interrupt delivery and return
  ○ Exceptions
  ○ Context switching
  ○ Memory management
  ○ Error handling
PALCODE: PRIVILEGED ARCHITECTURE LIBRARY CODE
• When an event occurs that invokes PALcode, the Alpha microprocessor does the following:
  ○ Drains the pipeline
  ○ Loads the current PC into the EXC_ADDR internal processor register
  ○ Dispatches to the appropriate PALcode routine
• PALcode is standard machine code with implementation-specific extensions that allow access to the PAL_TEMP registers and the control and status registers of Alpha microprocessors
Addressing Modes
• Assumes little-endian byte addressing but can be set for big-endian data
• A 64-bit program counter (PC) contains a longword-aligned virtual byte address (i.e., the low 2 bits of the PC are always zero)
• 16-bit displacement in load/store instructions cannot span more than 64KB
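The displacement and PC rules above can be sketched in a few lines. This is a minimal model, assuming the 16-bit displacement is sign-extended and added to a 64-bit base register (background on the memory format that the slide does not spell out); the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1

def sign_extend16(disp: int) -> int:
    """Interpret the low 16 bits of disp as a signed two's-complement value."""
    disp &= 0xFFFF
    return disp - 0x10000 if disp & 0x8000 else disp

def effective_address(base: int, disp16: int) -> int:
    """Base register plus sign-extended 16-bit displacement, wrapped to 64 bits."""
    return (base + sign_extend16(disp16)) & MASK64

def is_valid_pc(pc: int) -> bool:
    """A longword-aligned PC has its low two bits equal to zero."""
    return pc & 0b11 == 0
```

Under this assumption a single load or store reaches from 32768 bytes below the base to 32767 bytes above it, which is the 64KB span mentioned on the slide.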
Types and Size of Operands
• Defined canonical forms for 3 fundamental data types
• Supported integer, IEEE floating point and VAX floating point
Operations
• Load/Store, Byte Manipulation
• Floating Point Load/Store
• Address/Constant
• Integer Computation and Conditional Move
• Integer Branch
• Floating Point Branch
• Floating Point Computation and Conditional Move
• System
• Load Locked / Store Conditional
  ○ Atomic-update primitive instructions
• Conditional Move
Conditional move
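As an illustration of what a conditional move buys, here is a small model of computing a maximum without a branch. The helpers `cmplt` and `cmovne` are hypothetical Python stand-ins for a compare instruction and a conditional-move instruction (assuming the form "if Ra is nonzero, copy Rb into Rc"); they model the data flow, not the hardware.

```python
def cmplt(a: int, b: int) -> int:
    """Compare: produce 1 in a register if a < b, else 0 (no condition codes)."""
    return 1 if a < b else 0

def cmovne(ra: int, rb: int, rc: int) -> int:
    """Conditional move: the new value of rc is rb if ra != 0, otherwise rc is unchanged."""
    return rb if ra != 0 else rc

def branch_free_max(a: int, b: int) -> int:
    """max(a, b) as a straight-line sequence: compare, then conditionally move."""
    result = a                          # start by assuming a is the maximum
    a_is_less = cmplt(a, b)             # 1 when b should replace a
    result = cmovne(a_is_less, b, result)
    return result
```

The sequence has no control transfer, so there is nothing for a branch predictor to get wrong; the comparison result lives in an ordinary register rather than in condition codes.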
CONTROL FLOW
• The Alpha architecture facilitates pipelining multiple instances of the same operations because there are no special registers and no condition codes
• Uses conditional move alongside the usual conditional branch, unconditional jump, and call/return instructions
Encoding
• RISC - Reduced instruction set computer
  ○ Characterized by fixed-length instructions
  ○ Simple memory addressing modes
  ○ Strict decoupling of load/store memory access instructions from register-to-register arithmetic instructions
• RISC designs express computation as many simple steps
Instruction Formats
• Four formats
  ○ Operate
  ○ Memory
  ○ Branch
  ○ Call_PAL
• 32 bits wide
• Register Rules
  ○ RB is never written
  ○ RC is never read
  ○ RC = RA operate RB
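The four fixed-width formats can be illustrated with a small field decoder. The bit positions below follow the published Alpha instruction formats as best recalled (6-bit opcode in bits 31:26, 5-bit register fields); they are not given on the slide, so treat them as an assumption. The literal form of the operate format is ignored, and all function names are hypothetical.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Extract the inclusive bit field word[hi:lo]."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def decode_operate(word: int) -> dict:
    """Operate format (register form): RC = RA operate RB."""
    return {"opcode": bits(word, 31, 26), "ra": bits(word, 25, 21),
            "rb": bits(word, 20, 16), "function": bits(word, 11, 5),
            "rc": bits(word, 4, 0)}

def decode_memory(word: int) -> dict:
    """Memory format: RA is the data register, RB the base, plus a 16-bit displacement."""
    return {"opcode": bits(word, 31, 26), "ra": bits(word, 25, 21),
            "rb": bits(word, 20, 16), "disp": bits(word, 15, 0)}

def decode_branch(word: int) -> dict:
    """Branch format: RA plus a 21-bit displacement."""
    return {"opcode": bits(word, 31, 26), "ra": bits(word, 25, 21),
            "disp": bits(word, 20, 0)}

def decode_call_pal(word: int) -> dict:
    """Call_PAL format: opcode plus a 26-bit PALcode function number."""
    return {"opcode": bits(word, 31, 26), "function": bits(word, 25, 0)}

def encode_operate(opcode: int, ra: int, rb: int, function: int, rc: int) -> int:
    """Pack register-form operate fields into one 32-bit word."""
    return (opcode << 26) | (ra << 21) | (rb << 16) | (function << 5) | rc
```

Because every format is 32 bits wide with the opcode and register fields in fixed positions, a decoder can pull out register numbers before it knows which format it is looking at, which helps fast multiple-issue decode.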
What else did they define as part of the architecture?
Hardware State
Register File
• Better supports aggressive multiple issue
• Better supports two-chip implementations
• 64-bit Program Counter (PC)
• Specify IEEE rounding mode
Hardware State
Lock Flag
• Careful shared memory model
• Used for load-locked/store-conditional
• The underlying primitive for safe updating of a multiprocessor-shared memory location is a sequence of RISC instructions:
  ○ Load-locked
  ○ In-register modify
  ○ Store-conditional
  ○ Test
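The load-locked / modify / store-conditional / test sequence can be modeled in software. This is a simplified sketch, assuming one lock flag and one locked address per processor and that any other processor's store to the locked address clears the flag; the classes `SharedMemory` and `Cpu` and the `interfere` hook are hypothetical and exist only to make the retry visible.

```python
class SharedMemory:
    """Shared memory that clears other CPUs' lock flags on conflicting stores."""
    def __init__(self):
        self.cells = {}
        self.cpus = []

    def store(self, addr, value, writer=None):
        self.cells[addr] = value
        for cpu in self.cpus:
            if cpu is not writer and cpu.locked_addr == addr:
                cpu.lock_flag = False   # someone else touched the locked location

class Cpu:
    def __init__(self, mem):
        self.mem = mem
        self.lock_flag = False
        self.locked_addr = None
        mem.cpus.append(self)

    def load_locked(self, addr):
        self.lock_flag = True           # remember that we are watching addr
        self.locked_addr = addr
        return self.mem.cells.get(addr, 0)

    def store_conditional(self, addr, value):
        ok = self.lock_flag and self.locked_addr == addr
        if ok:
            self.mem.store(addr, value, writer=self)
        self.lock_flag = False          # the flag is consumed either way
        return ok

def atomic_increment(cpu, addr, interfere=None):
    """Retry loop: load-locked, modify in register, store-conditional, test."""
    attempts = 0
    while True:
        attempts += 1
        value = cpu.load_locked(addr)
        value += 1                      # in-register modify
        if interfere is not None and attempts == 1:
            interfere()                 # simulate another CPU writing in between
        if cpu.store_conditional(addr, value):
            return attempts
```

The point of the design is that no bus lock or special atomic memory operation is needed: the update either succeeds as a unit or is retried.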
Hardware State
Prefetch (optional)
• Two optional registers to allow prefetching
• Optional interrupt flag for use only by translated VAX OpenVMS AXP programs that reproduce complex instruction set computer (CISC*)
Hardware State
PALcode State
• Can vary from implementation to implementation
• Process control block base for context switching
• Processor State Word
• Process unique value for threads
• Processor number for multiprocessor dispatching
• Translation look-aside buffers for mapping instruction-stream and data-stream virtual addresses
Virtual Addressing
• Virtual addresses are a full 64 bits
• Implementations may support fewer virtual address bits, but no fewer than 43
• Fixed page sizes
  ○ Each implementation may have a page size of 8KB, 16KB, 32KB, or 64KB
• In a multiprocessor implementation, shared main memory locations have the same physical address on all processors
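The page-size options translate directly into how a virtual address splits into page number and offset. The sketch below is illustrative only; `three_level_va_bits` additionally assumes a three-level page table with 8-byte entries, which is background not stated on the slide but which would account for the 43-bit figure at the 8KB page size.

```python
PAGE_SIZES = (8 * 1024, 16 * 1024, 32 * 1024, 64 * 1024)

def offset_bits(page_size: int) -> int:
    """Number of low address bits that select a byte within a page."""
    if page_size not in PAGE_SIZES:
        raise ValueError("page size must be 8KB, 16KB, 32KB, or 64KB")
    return page_size.bit_length() - 1

def split_virtual_address(va: int, page_size: int) -> tuple:
    """Return (virtual page number, byte offset within the page)."""
    return va >> offset_bits(page_size), va & (page_size - 1)

def three_level_va_bits(page_size: int, pte_bytes: int = 8, levels: int = 3) -> int:
    """Address bits covered if each level of the table fills exactly one page (assumption)."""
    index_bits = (page_size // pte_bytes).bit_length() - 1
    return offset_bits(page_size) + levels * index_bits
```

With 8KB pages this gives 13 offset bits plus three 10-bit indexes; larger pages widen both the offset and each index, so the covered address range grows quickly.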
Multiple Instruction Issue
• Sought to eliminate any mechanism that would hinder aggressive multiple instruction issue implementations
• Tried to avoid all special or hidden processor resources
• restricts the instruction pairs for multiple issue
DOES NOT CONTAIN…
• Branch Delay Slot
  ○ With multiple issue, 1 slot is not that helpful
• Byte Writes
  ○ Must do 64-bit writes to maximize performance. Byte writes would involve a lot of extra manipulation
• Precise Arithmetic Exceptions
  ○ They’re exceptions, not the rule
  ○ Better performance is possible without them
  ○ Still has precise paging exceptions
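Without byte writes, storing a single byte becomes a read-modify-write of the enclosing aligned quadword. The sketch below models that sequence in software, assuming little-endian memory held in a Python `bytearray`; the function name is hypothetical, and the steps only loosely mirror the load, mask, insert, store style of sequence used on the original Alpha.

```python
MASK64 = (1 << 64) - 1

def store_byte_via_quadword(mem: bytearray, addr: int, value: int) -> None:
    """Write one byte using only aligned 64-bit loads and stores."""
    base = addr & ~7                              # enclosing aligned quadword
    shift = (addr - base) * 8                     # bit position of the target byte
    quad = int.from_bytes(mem[base:base + 8], "little")   # load quadword
    quad &= ~(0xFF << shift) & MASK64             # mask out the old byte
    quad |= (value & 0xFF) << shift               # insert the new byte
    mem[base:base + 8] = quad.to_bytes(8, "little")       # store quadword
```

The memory system only ever sees full-width aligned writes, which is the simplification the slide refers to; the cost moves into a few extra register instructions.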
The first implementation was the DEC21064
DEC21x64
• The 21064 microprocessor is designed in the CMOS4 process
• The 21064 microprocessor is the first implementation of the Alpha AXP architecture
• 1.68 million transistors
DEC21064
Pipeline
• The integer and floating point pipelines are 7 and 10 stages deep, respectively
• The chip uses 45 different bypass paths
• All register conflict checking is done in hardware
• Up to 22 operations can thus be in various stages of completion
Branch Prediction
• Static branch prediction
  ○ Forward = not taken
  ○ Backward = taken
• 2K by 1-bit branch history table for dynamic prediction
• 80% accuracy of branch prediction
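Both schemes on this slide are simple enough to model directly. In the sketch below, the table size (2K one-bit entries) comes from the slide, while the indexing by low PC bits, the class name, and the `accuracy` helper are assumptions made for illustration.

```python
def static_predict(pc: int, target: int) -> bool:
    """Static rule: backward branches (loops) predicted taken, forward not taken."""
    return target < pc

class OneBitHistoryTable:
    """2K-entry table remembering only the last outcome of each branch."""
    def __init__(self, entries: int = 2048):
        self.entries = entries
        self.table = [False] * entries

    def _index(self, pc: int) -> int:
        # Assumption: index with the low bits of the longword-aligned PC.
        return (pc >> 2) % self.entries

    def predict(self, pc: int) -> bool:
        return self.table[self._index(pc)]

    def update(self, pc: int, taken: bool) -> None:
        self.table[self._index(pc)] = taken

def accuracy(predictor: OneBitHistoryTable, trace) -> float:
    """Fraction of (pc, taken) outcomes in the trace that were predicted correctly."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)
```

A one-bit entry mispredicts twice per loop execution (once on exit and once on re-entry), which is a known weakness of single-bit history compared with later two-bit counters.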
HOW’D THEY DO WITH THOSE GOALS?
MAJOR GOALS
• High Performance
• Longevity
• Capability to run VMS and UNIX OS
• Easy Migration from MIPS and VAX
High Performance & Longevity
• 1992
  ○ 15-25 year design horizon
  ○ 2007-2017
  ○ That’s now!
• 64-bit architecture with 32-bit compatibility
• Estimated that raw clock rates would improve by a factor of 10
  ○ Alpha AXP unveiled at 150 MHz
  ○ Intel i7 runs at 1.2 GHz to 4.0 GHz
• Estimated multiple issue would sustain up to 10 instructions started per clock
  ○ Out-of-order instruction execution
• Estimated 10 processors with shared memory
  ○ Intel i7 has 4 physical cores and 8 logical cores
Capability to run VMS and UNIX OS
• Also ran Windows NT
• Used all that PALcode to make it operating-system independent
Easy migration from VAX and MIPS architectures
• Very little discussion of this goal in the AXP architecture paper
• Used binary translation, which got its own paper
  ○ A software technique to change an executable program written for one architecture/operating-system pair into an equivalent program for a different architecture/operating-system pair
SO WHY AREN’T WE ALL USING ALPHA AXP MACHINES?