phones off (please) CSCI1412 Lecture 3 Hardware 3 More Architecture Dr John Cowell
Overview • How it works! • the fetch / execute cycle in detail • Measuring speed • system clock, GHz, MIPS and FLOPS • Advanced concepts • cache, pipelining, parallelism • memory issues • dynamic and static RAM, • SIMMS, DIMMS, and specialist memory • motherboards • component layout CSCI1412-HW-3
The Fetch / Execute Cycle [Diagram labels: control unit, arithmetic / logic unit, RAM; cycle steps: fetch, decode, execute, (store)]
Buses • Computer memory is made up of a set of locations. Each has a unique address. • The address bus specifies the location. The data bus transfers the data. • The control bus determines the operation, e.g. read or write.
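The division of labour between the three buses can be sketched in a few lines of Python. This is only a toy model: the `Memory` class, its `access` method and the "read" / "write" strings are names invented for illustration, and the 8-bit location width is an assumption.

```python
class Memory:
    """Toy RAM: a set of locations, each with a unique address."""

    def __init__(self, size):
        self.cells = [0] * size

    def access(self, address, control, data=None):
        # address -> plays the role of the address bus (which location)
        # control -> plays the role of the control bus (read or write)
        # data / return value -> plays the role of the data bus
        if control == "write":
            self.cells[address] = data & 0xFF  # assume 8-bit locations
            return None
        if control == "read":
            return self.cells[address]
        raise ValueError("unknown control signal: " + control)


ram = Memory(65536)
ram.access(0x1234, "write", 0xC6)   # put a value on the data bus and store it
value = ram.access(0x1234, "read")  # read the same location back
print(hex(value))
```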
Registers • A CPU contains special purpose registers (typically 32) • Very high speed memory within the processor chip • each register contains a fixed number of bits • e.g. each register in a 32-bit processor has 32 bits • Registers contain instructions to be executed, data being operated on, etc. • Typically there are several named registers • SCR sequence control register • holds the location of the next piece of information to be fetched • controls the sequence of instructions • each time it is accessed, it is automatically incremented (increased) by one • CIR current instruction register • holds the instruction about to be processed
More Registers • Registers, continued ... • MAR memory address register • holds the location (the address) of information about to be read from or written to RAM • MDR memory data register • holds the value of information just read from or about to be written to RAM • ACC accumulator(s) • hold result(s) of processing • Sometimes a processor also has one or more • STO general purpose store(s) • hold temporary data value(s) for processing
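A register's fixed width can be illustrated with a small sketch. The `Register` class below is hypothetical, and the wrap-around on overflow is an assumption about typical hardware behaviour rather than something stated on the slides.

```python
class Register:
    """Toy register holding a fixed number of bits."""

    def __init__(self, bits):
        self.bits = bits
        self.mask = (1 << bits) - 1  # e.g. 0xFFFF for 16 bits
        self.value = 0

    def load(self, value):
        # Only the lowest `bits` bits fit; anything above is discarded.
        self.value = value & self.mask

    def increment(self):
        # Like the SCR, step on by one (wrapping at the top, by assumption).
        self.load(self.value + 1)


scr = Register(16)
scr.load(0x1234)
scr.increment()
print(hex(scr.value))

scr.load(0xFFFF)
scr.increment()      # one past the largest 16-bit value
print(hex(scr.value))
```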
Machine Code • Very simple low level instructions. • A single high level language instruction (e.g. in Visual Basic) may require many machine code instructions. • The instruction set is an integral part of the processor. • An instruction has an operation code (opcode), followed by zero or more items of data (operands)
Machine Code • For example • in Zilog Z80 machine code (8-bit processor) • instruction C6₁₆ in hexadecimal means add the data held at the following location to the current accumulator • suppose that the SCR currently holds 1234₁₆, ACC holds 5₁₆ and the contents of memory are as shown below. • What is the sequence the registers are used in?
location   value
1234₁₆     C6₁₆   (operation code)
1235₁₆     10₁₆   (operand)
Adding data to the Acc.
location   value
1234₁₆     C6₁₆   (opcode)
1235₁₆     10₁₆   (operand)
Sequence of Actions
• Fetch
SCR → MAR: put address of next instruction into the MAR
SCR + 1 → SCR: point to the next memory location
MAR → RAM → MDR → CIR: read from RAM address (MAR), into the MDR, into the CIR
• Decode
Contents of CIR: instruction number C6₁₆ means ... data required ...
• Execute
SCR → MAR: put address of data into the MAR
SCR + 1 → SCR: point to the next instruction
MAR → RAM → MDR: read from RAM address (MAR), into the MDR
• Store
MDR + ACC → ACC: add the MDR and ACC contents
• in this case, the result is stored in the accumulator
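The register transfers for this one instruction can be traced with a short Python sketch. It models only the add-the-following-byte instruction C6₁₆ from the example; the dictionary layout, the `read_ram` helper and the 8-bit wrap of the accumulator are assumptions made for illustration.

```python
# Memory and register contents taken from the worked example.
ram = {0x1234: 0xC6, 0x1235: 0x10}
reg = {"SCR": 0x1234, "MAR": 0, "MDR": 0, "CIR": 0, "ACC": 0x05}


def read_ram():
    # MAR -> RAM -> MDR
    reg["MDR"] = ram[reg["MAR"]]


# Fetch
reg["MAR"] = reg["SCR"]      # SCR -> MAR
reg["SCR"] += 1              # SCR + 1 -> SCR
read_ram()
reg["CIR"] = reg["MDR"]      # MDR -> CIR

# Decode: C6 means "add the following byte to the accumulator"
needs_operand = reg["CIR"] == 0xC6

if needs_operand:
    # Execute: fetch the operand
    reg["MAR"] = reg["SCR"]  # SCR -> MAR
    reg["SCR"] += 1          # SCR + 1 -> SCR
    read_ram()
    # Store: MDR + ACC -> ACC (kept to 8 bits, by assumption)
    reg["ACC"] = (reg["ACC"] + reg["MDR"]) & 0xFF

print(f"ACC = {reg['ACC']:02X}, SCR = {reg['SCR']:04X}")  # ACC = 15, SCR = 1236
```

After the instruction, the accumulator holds 5₁₆ + 10₁₆ = 15₁₆ and the SCR points at 1236₁₆, ready for the next fetch.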
The System Clock • What controls the fetch / execute cycle? • the system clock • this is a quartz crystal oscillator that provides pulses at a regular, rapid rate, like a metronome • n.b. not the same as the real date / time clock • The first microprocessors ran at well under 1 MHz; the Pentium 4 runs at roughly 1.3 – 3.8 GHz • A clock tick starts the fetch / execute cycle • it may take several (perhaps tens of) clock ticks to complete one complex instruction
Gigahertz • The ‘simplest’ measure of speed is just the rate at which the system clock ticks • usually quoted in Gigahertz (GHz) • 1 Hertz = 1 cycle per second • 1 Megahertz = 1 million cycles per second • 1 Gigahertz = 1 billion cycles per second • This is meaningful only within one type of processor • e.g. a 2.4 GHz Pentium is roughly twice as quick as a 1.2 GHz Pentium • But it is not meaningful for comparing different processor types • different processors may take different numbers of cycles to fetch / execute the ‘same’ instruction • e.g. a Pentium takes X cycles to load a number into the accumulator, whereas a 68040 takes Y cycles
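Why clock rate alone cannot compare different processor types can be shown with a little arithmetic. In the sketch below the two processors and their cycle counts (6 and 2) are made-up figures standing in for the X and Y on the slide; they are not measurements of any real chip.

```python
def cycle_time_ns(clock_hz):
    """Length of one clock tick in nanoseconds."""
    return 1e9 / clock_hz


def instruction_time_ns(clock_hz, cycles):
    """Time for an instruction that needs `cycles` clock ticks."""
    return cycles * cycle_time_ns(clock_hz)


# Hypothetical processor A: fast clock, but 6 cycles for the instruction.
time_a = instruction_time_ns(2.4e9, 6)
# Hypothetical processor B: half the clock rate, but only 2 cycles.
time_b = instruction_time_ns(1.2e9, 2)

print(f"A: {time_a:.2f} ns, B: {time_b:.2f} ns")
```

With these assumed figures the processor with the slower clock finishes the instruction first, which is the point the slide is making.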
MIPS • In order to overcome the limitations of GHz, some manufacturers prefer to use MIPS • millions of instructions per second • found by counting the number of cycles (on average) that a processor takes to execute an instruction • However, this is still not very helpful • which instructions !? • some instructions may be very short: LOAD ACC,0 • some instructions may be very long • store value zero into RAM from location 0x1000 to 0x1FFF • MIPS ratings can be measured with standard benchmarks
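The relationship between clock rate, average cycles per instruction and MIPS can be sketched as follows. The instruction mix is entirely hypothetical (the names, fractions and cycle counts are invented), and it shows how strongly the rating depends on which instructions are counted.

```python
def average_cpi(mix):
    """Weighted average cycles per instruction.

    `mix` maps an instruction name to (fraction of program, cycles needed).
    """
    return sum(fraction * cycles for fraction, cycles in mix.values())


def mips(clock_hz, cpi):
    """Millions of instructions per second for a given clock and CPI."""
    return clock_hz / (cpi * 1_000_000)


# Hypothetical mix: mostly short instructions plus a few long ones.
mix = {
    "load": (0.5, 1),
    "add": (0.3, 2),
    "block_fill": (0.2, 10),
}

cpi = average_cpi(mix)
rating = mips(2.4e9, cpi)
short_only = mips(2.4e9, 1)  # the same clock, counting only 1-cycle instructions

print(f"CPI = {cpi:.2f}, MIPS = {rating:.0f}, short-only MIPS = {short_only:.0f}")
```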
FLOPS • Perhaps, as computers are often used for mathematical calculations, a better measure would be the number of floating point operations that can be carried out per second • FLOPS: floating point operations per second • found by running standard mathematical benchmarks • However, what use are FLOPS to • a business person using a spreadsheet? • a secretary writing letters on a word processor? • a computer scientist compiling programs in C++?
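The idea of a floating point benchmark can be sketched with a timing loop. This is not one of the standard benchmarks; the function name and the loop body are invented, and in Python the figure is dominated by interpreter overhead, so it will be far below what the hardware can achieve.

```python
import time


def measure_flops(n=200_000):
    """Very rough estimate: time n loop passes of 2 floating point ops each."""
    x = 1.0
    start = time.perf_counter()
    for _ in range(n):
        x = x * 1.0000001 + 0.5  # one multiply and one add
    elapsed = time.perf_counter() - start
    return 2 * n / elapsed


print(f"roughly {measure_flops():.3g} floating point operations per second")
```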
Benchmarking • There is no satisfactorily agreed single method of measuring the speed of computers • actual system speed also depends on RAM speed, bus speeds, video performance, hard disk speeds, etc. • Many magazines set up standard tasks simulating general office / scientific use • e.g. Excel / Word running under Windows Vista • these may provide a good comparison of systems, but may only be applicable to one type of computer (Windows PC) for a short amount of time • what happens when Windows Vista becomes obsolete!?
Caching • Intermediate storage - uses high-speed SRAM • Holds recently accessed instructions/data • high probability that these will be re-used • Different types of cache: • primary cache (Level 1) - in the processor • 8 KB - 32 KB • fastest type of cache • secondary (Level 2) – also now in the processor • 512 KB - 1 MB • (used to be called cache-on-a-stick - COAST) • disk cache (Level 3) - section of RAM • specified by the user (or automatically by operating system)
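Why holding recently accessed items pays off can be seen in a small simulation. The sketch below assumes a least-recently-used replacement policy, which the slides do not specify, and the `hit_rate` function and address list are invented for illustration.

```python
from collections import OrderedDict


def hit_rate(addresses, capacity):
    """Fraction of accesses served from a cache of `capacity` entries (LRU)."""
    cache = OrderedDict()
    hits = 0
    for addr in addresses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True             # fetch from slower memory
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(addresses)


# A program looping over the same four locations 25 times.
loop = [0x100, 0x101, 0x102, 0x103] * 25

big_enough = hit_rate(loop, capacity=8)
too_small = hit_rate(loop, capacity=2)
print(big_enough, too_small)  # 0.96 0.0
```

When the cache can hold the whole loop, only the first pass misses; when it is too small for the loop, this access pattern never finds its data in the cache.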
Pipelining • Technique used to increase processing speed • Processor begins to execute a second instruction before the first has been completed • Therefore several instructions are in the pipeline • up to six instructions in the Pentium • The pipeline is divided into segments • segments are processed concurrently • Also used in RAM to preload the next requested memory content
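The speed gain from pipelining can be estimated with a simple cycle count. The sketch assumes an idealised pipeline in which every segment takes one clock tick and nothing ever stalls; the five-segment figure is an arbitrary example, not a property of any particular processor.

```python
def cycles_unpipelined(instructions, segments):
    """Each instruction passes through every segment before the next starts."""
    return instructions * segments


def cycles_pipelined(instructions, segments):
    """The first instruction fills the pipeline; then one completes per tick."""
    if instructions == 0:
        return 0
    return segments + (instructions - 1)


n, k = 100, 5  # 100 instructions, hypothetical 5-segment pipeline
slow = cycles_unpipelined(n, k)
fast = cycles_pipelined(n, k)
print(slow, fast, round(slow / fast, 2))  # 500 104 4.81
```

For long runs of instructions the speed-up approaches the number of segments, which is why real processors work hard to keep the pipeline full.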
Parallelism • Intel Pentium processors have a form of parallelism called: • single instruction multiple data (SIMD) • The same instruction is run on multiple data at the same time • improves the speed at which sets of data requiring the same operation can be processed • most of these extensions are for floating-point operations • Typically used for complex co-ordinate transforms • found in e.g. 3-D games graphics when a picture is being updated to form the next frame in a motion
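The SIMD idea can be imitated in plain Python by treating a short list as a multi-lane register. This only counts how many instructions would be issued; it does not run in parallel. The four-lane width and the names `simd_add` and `translate` are assumptions for illustration.

```python
LANES = 4  # assumed number of values handled by one instruction


def simd_add(a, b):
    """Stand-in for one SIMD instruction: add every lane at once."""
    assert len(a) == len(b) == LANES
    return [x + y for x, y in zip(a, b)]


def translate(xs, dx):
    """Shift every co-ordinate by dx, LANES values per instruction."""
    out, issued = [], 0
    for i in range(0, len(xs), LANES):
        chunk = xs[i:i + LANES]
        pad = LANES - len(chunk)              # fill unused lanes with zeros
        result = simd_add(chunk + [0.0] * pad, [dx] * LANES)
        out.extend(result[:len(chunk)])       # discard the padding lanes
        issued += 1
    return out, issued


moved, issued = translate([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 10.0)
print(moved, issued)  # [11.0, 12.0, 13.0, 14.0, 15.0, 16.0] 2
```

Six co-ordinates need only two add instructions here, instead of the six a one-value-at-a-time loop would issue.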
RAM • Random Access Memory • Volatile memory which loses its data when the power is switched off. • Two main types: • SRAM. Static RAM • DRAM. Dynamic RAM
SRAM and DRAM Differences between static and dynamic RAM: • Dynamic RAM must be refreshed or it will lose its data • Static RAM only needs current to be applied – bits do not need to be refreshed. • Both SRAM and DRAM are volatile. • Most modern computers use some form of DRAM for the main memory.
SRAM • Used in small amounts in computers where very fast RAM is required, such as in the cache of many CPUs. • DRAM is much less expensive than SRAM, but is usually slower and must constantly be refreshed in order to preserve its contents. Types of SRAM include: • Asynchronous Static RAM • Synchronous Burst Static RAM • Pipeline Burst Static RAM
DRAM • DRAM: each data bit is stored in a separate capacitor, which keeps each memory cell small and cheap. • Dynamic because the capacitors leak charge, so the data must be refreshed regularly to avoid corruption. Types of DRAM include: • SDRAM Synchronous Dynamic Random Access Memory • DDR SDRAM Double Data Rate SDRAM
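The need for refreshing can be illustrated with a toy model of a single DRAM cell. The leak rate, the read threshold and the `DramCell` class are all invented for the sketch; real cells behave electrically, not in neat ticks.

```python
class DramCell:
    """Toy capacitor that loses a fixed share of its charge every tick."""

    LEAK = 0.8       # fraction of charge kept per tick (made-up value)
    THRESHOLD = 0.5  # charge above this reads as a 1

    def __init__(self, bit):
        self.charge = 1.0 if bit else 0.0

    def tick(self):
        self.charge *= self.LEAK

    def read(self):
        return 1 if self.charge > self.THRESHOLD else 0

    def refresh(self):
        # Read the bit and write it back at full strength.
        self.charge = 1.0 if self.read() else 0.0


neglected = DramCell(1)
for _ in range(4):
    neglected.tick()          # never refreshed

refreshed = DramCell(1)
for t in range(1, 11):
    refreshed.tick()
    if t % 2 == 0:
        refreshed.refresh()   # refreshed every second tick

print(neglected.read(), refreshed.read())  # 0 1
```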
SDRAM • SDRAM - Synchronous Dynamic Random Access Memory. • Dynamic because it must be refreshed to maintain data integrity. • Synchronous because it synchronises itself with the computer's system bus and processor. The computer's internal clock drives the entire mechanism. • Can accept more than one write command at a time (pipelining).
DDR SDRAM DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory) • Achieves nearly twice the bandwidth of single data rate SDRAM by double pumping (transferring data on the rising and falling edges of the clock signal) without increasing the clock frequency.
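The effect of double pumping on peak bandwidth is simple arithmetic. The sketch below assumes a 64-bit data bus and a 200 MHz memory clock as example figures; the function name is invented, and real modules achieve somewhat less than these theoretical peaks.

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_bits=64):
    """Theoretical peak transfer rate in megabytes per second."""
    bytes_per_transfer = bus_bits // 8
    return clock_mhz * transfers_per_clock * bytes_per_transfer


# Same 200 MHz clock, one transfer per tick versus two (both clock edges).
sdr = peak_bandwidth_mb_s(200, 1)
ddr = peak_bandwidth_mb_s(200, 2)
print(sdr, ddr)  # 1600 3200
```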
DDR2 and DDR3 • An evolution of DDR, with higher internal bus speeds. • DDR2 bus runs at twice the speed of DDR memory. • DDR3 runs at even higher speeds. • Most modern computers use DDR, DDR2 or DDR3 packaged in DIMMs (Dual In-line Memory Modules) – electrical contacts plug directly into the main board. • DIMMs have a 64 bit data bus (as do Pentium processors) • SIMMs (now obsolete) have a 32 bit bus
Mainboard Layout • Intel D945GNT • Dual-channel DDR2 667 / 533 / 400 memory support • PCI Express x16 graphics connector • Two PCI Express x1 connectors • Four Serial ATA ports (3.0 Gb/s) • Integrated Intel® PRO 10/100 Network Connection • Intel® High Definition Audio with 5.1 Surround Sound • Eight Hi-Speed USB 2.0 ports • Intel® Precision Cooling Technology
Mainboard Layout
A Auxiliary fan connector (optional)
B Speaker
C PCI Express x1 bus add-in card connectors [2]
D Audio codec
E Front panel audio connector
F Ethernet device
G PCI Conventional bus add-in card connectors [2]
H PCI Express x16 bus add-in card connector
I Back panel connectors
J +12V power connector (ATX12V)
K Rear chassis fan connector
L LGA775 processor socket
M Intel 82945G GMCH
N Processor fan connector
O DIMM Channel A sockets [2]
P DIMM Channel B sockets [2]
Q SCSI LED connector (optional)
R Legacy I/O controller
S Power connector
T Diskette drive connector
U Parallel ATA IDE connector
V Battery
W Front chassis fan connector
X BIOS Setup configuration jumper block
Y Serial ATA connectors [4]
Z Auxiliary front panel power LED connector
AA Front panel connector
BB Front panel USB connectors [2]
CC Chassis intrusion connector
DD Intel 82801G I/O Controller Hub (ICH7)
EE SPI flash device
FF IEEE-1394a controller (optional)
GG Front panel IEEE-1394a connectors (optional) [2]
HH PCI Conventional bus add-in card connectors
Motherboard in Situ Cooling can be a problem...
Summary • How it works! • the fetch / execute cycle in detail • Measuring speed • system clock, GHz, MIPS and FLOPS • Advanced concepts • cache, pipelining, parallelism • memory issues • dynamic and static RAM, • SIMMS and DIMMS • motherboards • component layout