Pentium Pro Processor Overview Elijah W. Bass December 6, 2000
Pentium Pro Roadmap • Pentium Pro Overview • Instruction Set Format • Process Stages • Processing Units • Branch Prediction • Pentium Pro/II Performance • Pentium Pro/II Cache Performance
Pentium Pro Overview • The main goal in the design of the P6 family micro-architecture was to exceed the Pentium processor's performance while still using the existing 0.6-micrometer, four-layer-metal BiCMOS manufacturing process. • The Pentium Pro processor has a three-way superscalar architecture, permitting the execution of up to three instructions per clock cycle. • The P6 superscalar implementation uses dynamic execution, i.e., micro-data flow analysis, out-of-order execution, superior branch prediction, and speculative execution. • Object code is decoded by three instruction decode units working in parallel.
Pentium Pro Instruction Set Format • Pentium Pro executes the x86 CISC instruction set • Pentium Pro decodes and translates each Intel x86 instruction into micro-operations (micro-ops) • The Pentium Pro processes instructions in three stages
Pentium Pro Process Stages • The first stage consists of the instruction being fetched, decoded, and converted into micro-ops • The reservation station (RS) is the buffer between the first and second stages • The second stage consists of the micro-ops being executed in the out-of-order core • The third stage retires the micro-ops in original program order • Completed micro-ops wait in the reorder buffer until all of the preceding instructions have been retired
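The in-order retirement rule on this slide can be illustrated with a small Python sketch. It is a simplified model, not the P6 hardware: the function name `retire_in_order` and the numbering of micro-ops by program position are assumptions made for illustration, and per-cycle retirement limits are not modeled.

```python
def retire_in_order(completion_order):
    """Model a reorder buffer that retires micro-ops in program order.

    Micro-ops are numbered 0, 1, 2, ... in program order (an illustrative
    convention). `completion_order` lists them in the order the
    out-of-order core finishes executing them. Returns, for each
    completion event, the list of micro-ops that retire at that point.
    """
    completed = set()      # executed, but possibly still waiting to retire
    next_to_retire = 0     # oldest micro-op that has not retired yet
    batches = []
    for op in completion_order:
        completed.add(op)
        batch = []
        # Retire only while the oldest outstanding micro-op is complete.
        while next_to_retire in completed:
            batch.append(next_to_retire)
            next_to_retire += 1
        batches.append(batch)
    return batches


# Micro-op 2 finishes first but must wait for 0 and 1 to retire.
print(retire_in_order([2, 0, 1, 3]))
```

In this run micro-op 2 completes first yet retires only after micro-ops 0 and 1, which is the waiting behavior the last bullet describes.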
Pentium Pro Processing Units • The P Pro micro-architecture pipeline is divided into four sections • the first- and second-level caches (8K L1 caches; 256KB, 512KB, 1MB, or 2MB L2 caches) • the front end • the out-of-order execution core • the retire section
P Pro Branch Prediction • Each branch is tracked by a counter with states 0 through 3 • A branch is predicted to jump the next time if its counter is in state 2 or 3 • It is predicted to not jump if the counter is in state 0 or 1 • The P Pro branch prediction mechanism can also learn to recognize repetitive patterns
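A minimal Python sketch of the counter-based prediction rule on this slide follows. The slide gives only the mapping from state to prediction; the update rule used here (step up on a taken branch, step down on a not-taken branch, saturating at 0 and 3) is the textbook two-bit saturating counter and is an assumption, as are the names `TwoBitCounter` and `count_mispredictions`.

```python
class TwoBitCounter:
    """Counter with states 0..3; states 2 and 3 predict 'jump'."""

    def __init__(self, state=0):
        self.state = state

    def predict(self):
        # Slide rule: predict jump in state 2 or 3, no jump in state 0 or 1.
        return self.state >= 2

    def update(self, taken):
        # Assumed update rule: saturating increment/decrement.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, state=0):
    """Count wrong predictions over a sequence of actual branch outcomes."""
    counter = TwoBitCounter(state)
    misses = 0
    for taken in outcomes:
        if counter.predict() != taken:
            misses += 1
        counter.update(taken)
    return misses


loop_branch = [True] * 9 + [False]    # taken nine times, then falls through
alternating = [True, False] * 4       # strictly repeating pattern
print(count_mispredictions(loop_branch))
print(count_mispredictions(alternating))
```

Under these assumptions the counter alone handles the loop branch well but mispredicts every taken branch in the alternating sequence, which suggests why a mechanism that recognizes repetitive patterns is useful on top of it.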
Pentium Pro/II Performance • The Pentium II L2 cache utilizes standard commodity SRAM • The faster core frequencies and larger Pentium II L1 cache size often compensate for the slower L2 cache speed
Pentium Pro/II Cache Performance • P II L2 Cache is located off the processor die • P II L2 Cache operates at ½ core frequency • P II cacheability limit is 512 MB • P Pro cacheability limit is 4 GB
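The half-speed L2 cache figure can be turned into concrete clock rates with a short Python sketch. The 0.5 ratio for the P II comes from the slide; the full-speed (1.0) ratio for the P Pro's L2 cache is not stated on the slide and is included here as a labeled assumption for comparison. The function name `l2_clock_mhz` is illustrative.

```python
PII_L2_RATIO = 0.5    # from the slide: P II L2 runs at 1/2 core frequency
PPRO_L2_RATIO = 1.0   # assumption (not on the slide): P Pro L2 at core speed


def l2_clock_mhz(core_mhz, l2_ratio):
    """Return the L2 cache clock for a given core clock and ratio."""
    return core_mhz * l2_ratio


for name, core, ratio in [
    ("P Pro 200", 200, PPRO_L2_RATIO),
    ("P II 233", 233, PII_L2_RATIO),
    ("P II 266", 266, PII_L2_RATIO),
]:
    print(f"{name}: L2 cache at {l2_clock_mhz(core, ratio)} MHz")
```

Under these assumptions a 266 MHz P II drives its L2 cache at 133 MHz, below the L2 clock of a 200 MHz P Pro even though the P II core is faster.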
Pentium Pro/II Memory Access • Pentium Pro 200/256 outperforms P II 233/512 • Pentium Pro 200/1024 performs similarly to P II 266/512
Pentium Pro/II Applicability • P II outperforms P Pro in CPU-intensive apps • P Pro outperforms P II in memory-intensive apps