COMPUTER ARCHITECTURE (P175B125)

COMPUTER ARCHITECTURE (P175B125)
Assoc. Prof. Stasys Maciulevičius
Computer Dept.
stasys.maciulevicius@ktu.lt

Virtual memory
• Modern computers can run several programs simultaneously (in parallel or pseudo-parallel mode)
• Each such program (process) has its own code and data areas
• Virtual memory is the mechanism that lets several processes run simultaneously: it shares memory among them correctly, addresses information correctly, and translates logical addresses to physical addresses
©S.Maciulevičius

Virtual memory
• Moreover, virtual memory ensures that the information a process needs (program code and data) is loaded into main memory at the appropriate time, and it protects the memory space of a process from other processes
• From the physical point of view, virtual memory is the main memory plus part of the external memory, together with the tools for address translation and for exchanging information between these levels
• From the logical point of view, virtual memory is an extended memory space with contiguous addressing

Virtual memory
• Two main principles are used to implement virtual memory:
  • Segmentation: an application's virtual address space is divided into variable-length segments. A virtual address consists of a segment number and an offset within the segment. A task may have several segments: code (program), data, stack
  • Paging: an application's virtual address space is divided into fixed-size pages; a page is a block of contiguous virtual memory addresses, usually at least 4 KB in size. The pages do not have to be contiguous in physical memory

Main and external memory
[Figure: memory hierarchy of CPU registers, cache, main memory, and external memory; a segment (page) is swapped in from external memory to main memory and swapped out in the opposite direction]

Segmented virtual memory
[Figure: four snapshots of main memory, each holding the operating system and a changing set of process segments (processes 1 to 6) as processes are swapped in and out]

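Segmenting
[Figure: a 32-bit address is split into a segment number (bits 31-24, 8 bits) and a byte address (bits 23-0, 24 bits); the segment table maps the segment number to a 32-bit base that locates the segment in memory]
Segment table:
Segm. No.   32-bit base
0           20000H
1           4F000H
2           8000H
…           …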
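The lookup in the figure can be sketched in a few lines. This is only an illustration: it assumes the physical address is the segment base plus the 24-bit byte address, and the names `SEGMENT_TABLE` and `translate` are hypothetical.

```python
# Segment table from the figure: segment number -> 32-bit base address.
SEGMENT_TABLE = {0: 0x20000, 1: 0x4F000, 2: 0x8000}

OFFSET_BITS = 24  # bits 23-0 hold the byte address inside the segment


def translate(virtual_address):
    """Split a 32-bit address into segment number and offset, then add the base."""
    segment = (virtual_address >> OFFSET_BITS) & 0xFF      # bits 31-24
    offset = virtual_address & ((1 << OFFSET_BITS) - 1)    # bits 23-0
    return SEGMENT_TABLE[segment] + offset                 # assumed: base + offset


# Segment 1, byte address 0x10 -> 0x4F000 + 0x10
print(hex(translate(0x01000010)))  # 0x4f010
```

A real segmentation unit would also compare the offset with the segment length before adding the base; that check is left out here.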

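Simple mechanism of segmentation
[Figure: the address consists of a segment field and an offset. The segment field drives two multiplexers (MUX): one selects the program, data, or stack length, the other selects the program, data, or stack base. A comparator checks the offset against the selected length and signals a fault (labelled "Page fault" in the figure); an adder sums the selected base and the offset to produce the physical address]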

Segmentation principles in IA-32
[Figure: a 16-bit selector (bits 15-0) picks a segment descriptor from the descriptor table; the base address from the descriptor is added to the 32-bit effective address (bits 31-0) to form the 32-bit linear (physical) address]

Segmentation mechanism in IA-32
• In the IA-32 architecture, segmentation is supported by the following tools:
  • a mechanism for calculating the physical address
  • segment descriptor tables:
    • local descriptor table (LDT)
    • global descriptor table (GDT)
    • interrupt descriptor table (IDT)
  • a privilege system
• Each table has a processor register assigned to it, which holds:
  • a 16-bit limit (size of the table)
  • a 32-bit base address (location of the table in memory)

Address spaces in IA-32
• The IA-32 architecture has three address spaces:
  • logical address space: a logical (or virtual) address consists of two integers, a 16-bit segment selector and a 32-bit offset; the space size is 2^14 selectors × 4 GB = 64 TB
  • linear address space: a linear address appears at the output of the segmentation unit, as the result of logical address translation
  • physical address space: a physical address appears at the output of the paging unit; when paging is not used, the physical address equals the linear address; this address (bits BE7-BE0 and A31-A3) goes to main memory
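The 64 TB figure can be checked with a short calculation. The sketch assumes that 14 of the 16 selector bits pick a descriptor (a 13-bit index plus one table-indicator bit), with the other two carrying the requested privilege level; the function name is hypothetical.

```python
SELECTOR_BITS_USED = 14  # assumption: 13-bit descriptor index + 1 table-indicator bit
OFFSET_BITS = 32         # 32-bit offset, i.e. up to 4 GB per segment

TB = 1 << 40


def logical_space_bytes(selector_bits=SELECTOR_BITS_USED, offset_bits=OFFSET_BITS):
    """Number of selectable segments times the maximum segment size."""
    return (1 << selector_bits) * (1 << offset_bits)


print(logical_space_bytes() // TB, "TB")  # 64 TB
```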

Simple mechanism of paging
[Figure: the logical page number indexes the page table, whose entry holds protection bits and the page frame number; the byte offset is passed through unchanged and joined with the page frame number]

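Segmentation with paging in IA-32
[Figure: the logical address consists of a 16-bit selector (bits 15-0), of which 14 bits form the descriptor's index, and a 32-bit effective address (offset, bits 31-0). The segmentation unit produces a 32-bit linear address; the optional paging unit converts it into a 32-bit physical address (bits 31-0)]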

Segmentation mechanism in IA-32

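Paging in IA-32
[Figure: the 32-bit linear address is split into a 10-bit directory field, a 10-bit page number, and a 12-bit byte offset. Control register CR3 (PDBR, the page directory physical base address) locates the page directory; the directory field selects a PDE, which points to a page table; the page number selects a PTE, which points to a 4 KB page; the byte offset selects the target within the page]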
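The 10/10/12 split of the linear address shown in the figure can be illustrated as follows. This is a minimal sketch; the function name and the sample address are made up for the example.

```python
def split_linear_address(linear):
    """Split a 32-bit linear address into (directory, page, offset) fields."""
    directory = (linear >> 22) & 0x3FF  # top 10 bits: index into the page directory
    page = (linear >> 12) & 0x3FF       # next 10 bits: index into the page table
    offset = linear & 0xFFF             # low 12 bits: byte within the 4 KB page
    return directory, page, offset


# Hypothetical address: directory entry 1, page table entry 3, byte 0x25
print(split_linear_address(0x00403025))  # (1, 3, 37)
```

Ten bits per index give 1024 entries in the directory and in each page table, and twelve offset bits match the 4 KB page size.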

Address translation
• Address translation is the transformation of a virtual address into a physical address
• The problem is the extra step: access to the page table. How can memory access be sped up?
  • Storing the whole page table in the processor is unrealistic, because the page table takes a lot of space: megabytes. For example, if the page size is 4 KB, 4 GB of memory takes 4 GB / 4 KB = 1024 K pages!
  • Instead, part of the page table is stored in a special cache in the processor. Each entry in this cache gives fast access to a whole page, that is, to about 1000 words
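The page count on this slide, and the "megabytes" estimate for the table, can be reproduced with a short calculation. The 4-byte entry size is an assumption made for the example; the slide does not state it.

```python
KB = 1024
MB = 1024 ** 2
GB = 1024 ** 3

ENTRY_BYTES = 4  # assumption: one 32-bit page table entry per page


def page_count(memory_bytes, page_bytes):
    """Number of pages needed to cover the given amount of memory."""
    return memory_bytes // page_bytes


pages = page_count(4 * GB, 4 * KB)
table_bytes = pages * ENTRY_BYTES

print(pages // 1024, "K pages")        # 1024 K pages
print(table_bytes // MB, "MB of table")  # 4 MB of table
```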

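Translation lookaside buffer (TLB)
[Figure: the effective address from the CPU consists of a logical page number and a byte offset. The logical page number is looked up in the TLB. On a hit, the TLB supplies the page frame number. On a miss, the page table is consulted and the TLB is loaded; if the page is not in main memory, a page swap with the hard disk takes place. The page frame number and the unchanged byte offset form the physical address sent to main memory]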
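The hit/miss flow in the figure can be modelled roughly as below. Everything here is a simplification: the TLB and the page table are plain dictionaries with made-up contents, the page size is assumed to be 4 KB, and a page that is not in main memory simply raises an exception instead of being swapped in.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages

page_table = {0: 7, 1: 3}  # hypothetical: logical page number -> page frame number
tlb = {}                   # small cache of recently used page table entries


class PageFault(Exception):
    """Raised when the page is not in main memory (swap-in not modelled)."""


def translate(address):
    """Translate an effective address to a physical address via the TLB."""
    page, offset = divmod(address, PAGE_SIZE)
    if page in tlb:                # TLB hit: no page table access needed
        frame = tlb[page]
    elif page in page_table:       # TLB miss: read the page table, load the TLB
        frame = page_table[page]
        tlb[page] = frame
    else:                          # page not in main memory
        raise PageFault(page)
    return frame * PAGE_SIZE + offset  # byte offset passes through unchanged


print(translate(4100))  # page 1, offset 4 -> frame 3 -> 12292
```

A second call for any address in page 1 is served from `tlb` without touching `page_table`, which is the point of the buffer.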

TLB and cache
[Figure: lookup of an effective address (logical page number, byte offset) through the TLB and the cache; labels: TLB, tag, index, cache, comparators (=?), hit, physical address, byte no., data line]

Memory management unit
• A memory management unit (MMU) is a computer hardware component responsible for handling accesses to memory requested by the CPU
• Its functions include:
  • translation of virtual addresses to physical addresses (i.e., virtual memory management)
  • memory protection
  • cache control
  • bus arbitration
  • in simpler computer architectures (especially 8-bit systems), bank switching

AMD K7 microarchitecture

Parity checking
• Parity checking refers to the use of parity bits to check that data has been written and read accurately
• A parity bit is added to every data unit (typically a byte)
• The parity bit for each unit is set so that the unit, including the parity bit, has either:
  • an odd number of set bits (odd parity), or
  • an even number of set bits (even parity)

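Parity checking
k = b0 ⊕ b1 ⊕ … ⊕ b7 (even parity) or k = 1 ⊕ b0 ⊕ b1 ⊕ … ⊕ b7 (odd parity)
[Figure: memory with address register (AR) and data register (DR) on an n-bit data bus; a unit for generating and checking parity bits stores the parity bits alongside the data bytes and raises an error signal when a check fails; example byte 0 1 0 1 0 1 1 0 with parity bit k]
Usually the number of parity bits is k = n/8 (one per byte)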
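The two formulas for k can be tried out on the example byte from the figure. This is a small sketch; the function names are chosen for the illustration.

```python
def even_parity_bit(byte):
    """k = b0 xor b1 xor ... xor b7: makes the total number of set bits even."""
    return bin(byte & 0xFF).count("1") % 2


def odd_parity_bit(byte):
    """k = 1 xor b0 xor ... xor b7: makes the total number of set bits odd."""
    return 1 ^ even_parity_bit(byte)


def check_even_parity(byte, k):
    """Return True if the stored parity bit k still matches the byte."""
    return even_parity_bit(byte) == k


data = 0b01010110                # example byte from the figure: four set bits
print(even_parity_bit(data))     # 0
print(odd_parity_bit(data))      # 1

corrupted = data ^ 0b00001000    # flip a single bit
print(check_even_parity(corrupted, even_parity_bit(data)))  # False: error detected
```

A single parity bit detects any odd number of flipped bits in the unit but cannot tell which bit is wrong.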

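Error-checking and correcting
[Figure: memory with address register (AR) and data register (DR) on an n-bit data bus; on a write, a Hamming code generator produces k ECC bits that are stored with the n data bits (m bits in total); on a read, a Hamming check unit signals an error and a correction unit delivers the corrected n-bit data]
• Hamming code: an error-correcting code that can detect up to two simultaneous bit errors or correct single-bit errors; with an additional parity bit it can correct single-bit errors and still detect double-bit errors
• Code length is usually k = log2 n + 1; with additional parity, k = log2 n + 2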

Hamming code
Bit   Binary   Contr.   Data
12    1100              D8
11    1011              D7
10    1010              D6
9     1001              D5
8     1000     K8
7     0111              D4
6     0110              D3
5     0101              D2
4     0100     K4
3     0011              D1
2     0010     K2
1     0001     K1
Hamming code can detect up to two simultaneous bit errors or correct single-bit errors

Generating of Hamming code
K1 = D1 ⊕ D2 ⊕ D4 ⊕ D5 ⊕ D7
K2 = D1 ⊕ D3 ⊕ D4 ⊕ D6 ⊕ D7
K4 = D2 ⊕ D3 ⊕ D4 ⊕ D8
K8 = D5 ⊕ D6 ⊕ D7 ⊕ D8
Let the data byte be 00111001, with bit D1 at the right. Then:
K1 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
K2 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
K4 = 0 ⊕ 0 ⊕ 1 ⊕ 0 = 1
K8 = 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0
The data byte is saved in memory together with the control bits: 001101001111
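The encoding on this slide can be reproduced with a short sketch. It follows the bit layout of the 12-bit code used in this lecture (positions 12 down to 1: D8 D7 D6 D5 K8 D4 D3 D2 K4 D1 K2 K1); the function name is hypothetical.

```python
def hamming_encode(byte):
    """Return the 12-bit Hamming code of a data byte as a string, position 12 first."""
    # d[1]..d[8] are data bits D1..D8, with D1 the rightmost (least significant) bit.
    d = {i: (byte >> (i - 1)) & 1 for i in range(1, 9)}

    k1 = d[1] ^ d[2] ^ d[4] ^ d[5] ^ d[7]
    k2 = d[1] ^ d[3] ^ d[4] ^ d[6] ^ d[7]
    k4 = d[2] ^ d[3] ^ d[4] ^ d[8]
    k8 = d[5] ^ d[6] ^ d[7] ^ d[8]

    # Positions 12 ... 1, as in the bit layout of the code.
    bits = [d[8], d[7], d[6], d[5], k8, d[4], d[3], d[2], k4, d[1], k2, k1]
    return "".join(str(b) for b in bits)


print(hamming_encode(0b00111001))  # 001101001111
```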

Error correction
Let an error occur, e.g., instead of 001101001111 we have 001101101111
Recalculate the Hamming code:
K1 = 1 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 = 1
K2 = 1 ⊕ 1 ⊕ 1 ⊕ 1 ⊕ 0 = 0
K4 = 0 ⊕ 1 ⊕ 1 ⊕ 0 = 0
K8 = 1 ⊕ 1 ⊕ 0 ⊕ 0 = 0
Add the stored and the recalculated control bits modulo 2:
     K8 K4 K2 K1
      0  1  1  1   (stored)
⊕     0  0  0  1   (recalculated)
      0  1  1  0   = 6, the number of the faulty bit
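The correction procedure can be sketched as follows, under the assumption that at most one bit is wrong. The code word is a 12-character string with position 12 at the left; `CHECK_GROUPS` lists, for each control bit, the positions of the data bits it covers, and all names are chosen for the illustration.

```python
# Control bit position -> positions of the data bits it protects.
CHECK_GROUPS = {
    1: (3, 5, 7, 9, 11),    # K1 = D1 ^ D2 ^ D4 ^ D5 ^ D7
    2: (3, 6, 7, 10, 11),   # K2 = D1 ^ D3 ^ D4 ^ D6 ^ D7
    4: (5, 6, 7, 12),       # K4 = D2 ^ D3 ^ D4 ^ D8
    8: (9, 10, 11, 12),     # K8 = D5 ^ D6 ^ D7 ^ D8
}


def faulty_position(code):
    """Return the position (1-12) of the faulty bit, or 0 if the checks agree."""
    bits = {12 - i: int(ch) for i, ch in enumerate(code)}  # position -> bit value
    syndrome = 0
    for k, positions in CHECK_GROUPS.items():
        recalculated = 0
        for p in positions:
            recalculated ^= bits[p]
        if recalculated != bits[k]:   # stored and recalculated control bits differ
            syndrome += k
    return syndrome


def correct(code):
    """Flip the bit named by the syndrome (single-bit errors only)."""
    pos = faulty_position(code)
    if pos == 0:
        return code
    if pos > 12:
        raise ValueError("syndrome points outside the code word")
    i = 12 - pos                      # string index of that position
    flipped = "1" if code[i] == "0" else "0"
    return code[:i] + flipped + code[i + 1:]


print(faulty_position("001101101111"))  # 6
print(correct("001101101111"))          # 001101001111
```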

Error correction: redundancy

External memory
• The following are used for long-term storage in computers:
  • hard drives
  • CD-ROMs and other CDs (optical compact discs)
  • DVDs
  • flash memory
  • floppy disks (outdated)
  • streamers (tape drives)

External memory 2009-2013

External memory
Access modes:
• direct access
• sequential access
Parameters:
• capacity
• access time
• data transfer speed
• relative price

First HD
• IBM announced the IBM 350 storage unit as a component of the RAMAC 305 computer system on September 13, 1956
• Assembled with covers, the 350 was 60 inches long, 68 inches high and 29 inches deep
• It was configured with 50 magnetic disks containing 50,000 sectors, each of which held 100 alphanumeric characters, for a capacity of 5 million characters

First HD
• 50 platters
• 1 head

First HD
• Disks rotated at 1,200 rpm; tracks (20 to the inch) were recorded at up to 100 bits per inch, and typical head-to-disk spacing was 800 microinches
• The execution of a "seek" instruction positioned a read-write head to the track that contained the desired sector and selected the sector for a later read or write operation
• Seek time averaged about 600 milliseconds

Hard disk drive

Hard disk drive
• Platters vary in size, and hard disk drives come in two form factors, 5.25 in or 3.5 in
• Typically two, three, or more platters are stacked on top of each other on a common spindle
• The head flies a fraction of a millimetre above the disk. On early hard disk drives this distance was around 0.2 mm. In modern-day drives this has been reduced to 0.07 mm or less
• There is a read/write head for each side of each platter, mounted on arms

Hard disk drive
• The disk controller controls the drive's servo-motors and translates the fluctuating voltages from the head into digital data for the CPU
• More often than not, the next set of data to be read is located sequentially on the disk. For this reason, hard drives contain between 256 KB and 8 MB of cache buffer in which to store all the information in a sector or cylinder in case it is needed. This is very effective in speeding up both throughput and access times

Technical specifications
• Capacity: the amount of data that can be stored on a hard drive
• Transfer rate: the quantity of data that can be read from or written to the disk per unit of time. It is expressed in megabits per second (Mb/s)
• Rotational speed: the speed at which the platters turn, expressed in rotations per minute (rpm). Hard drive speeds are on the order of 7200 to 15000 rpm. The faster a drive rotates, the higher its transfer rate. On the other hand, a hard drive that rotates quickly tends to be louder and heats up more easily

Technical specifications
• Latency (also called rotational delay): the length of time that passes between the moment when the disk finds the track and the moment it finds the data
• Average access time: the average amount of time it takes the read head to find the right track and access the data. In other words, it represents the average length of time it takes the disk to provide data after having received the order to do so. It must be as short as possible
• Radial density: the number of tracks per inch (tpi)
• Linear density: the number of bits per inch (bpi) on a given track
• Surface density: the product of the linear density and the radial density (expressed in bits per square inch)

Technical specifications
• Cache memory: the amount of memory located on the hard drive. Cache memory is used to store the drive's most frequently accessed data, in order to improve overall performance
• Interface: the connection used by the hard drive. The main hard drive interfaces are IDE/ATA, SATA, and SCSI

Information on disk
• The data is organised in concentric circles called "tracks"
• The tracks are divided into areas called sectors, which contain the data (generally at least 512 bytes per sector)
• The term cylinder refers to all data found on the same track of different platters
• The term cluster (also called allocation unit) refers to the minimum area that a file can take up on the hard drive

Hard disk
Formatted and unformatted disk capacity
Capacity = Number_of_cylinders × Number_of_surfaces × Number_of_sectors_per_track × Sector_size
Modern disk capacity is at least 500 GB; advanced disks reach even 4 TB
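The capacity formula can be evaluated directly. The geometry below (16383 cylinders, 16 surfaces, 63 sectors per track, 512-byte sectors) is only an illustrative assumption and does not come from the slide.

```python
def disk_capacity_bytes(cylinders, surfaces, sectors_per_track, sector_size):
    """Capacity = cylinders x surfaces x sectors per track x sector size."""
    return cylinders * surfaces * sectors_per_track * sector_size


# Hypothetical geometry, chosen only to illustrate the formula.
capacity = disk_capacity_bytes(cylinders=16383, surfaces=16,
                               sectors_per_track=63, sector_size=512)
print(capacity)                   # 8455200768 bytes
print(round(capacity / 1e9, 2))   # 8.46 (decimal gigabytes)
```

The formula assumes the same number of sectors on every track, which is a simplification for drives that record more sectors on the outer tracks.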

Hard disk
• Access time depends on the following parameters:
  • cylinder seek time
  • rotational delay
  • transfer time
• Information transfer time depends on:
  • recording density
  • disk rotational speed
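The three components of access time can be combined in a small estimate. The sketch assumes that the average rotational delay is half a revolution; the seek and transfer times in the example are made-up values.

```python
def average_rotational_delay_ms(rpm):
    """Average rotational delay: half of one revolution, in milliseconds."""
    revolution_ms = 60_000.0 / rpm
    return revolution_ms / 2


def average_access_time_ms(seek_ms, rpm, transfer_ms):
    """Seek time + rotational delay + transfer time."""
    return seek_ms + average_rotational_delay_ms(rpm) + transfer_ms


print(average_rotational_delay_ms(15000))                 # 2.0
print(round(average_rotational_delay_ms(7200), 2))        # 4.17
# Hypothetical drive: 9 ms average seek, 7200 rpm, 0.5 ms transfer
print(round(average_access_time_ms(9.0, 7200, 0.5), 2))   # 13.67
```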

Recording methods
• Traditional recording method: horizontal recording
[Figure: orientation of the N and S magnetic poles of the recorded bits]
• A new recording method is now in use: vertical (perpendicular) recording. The bits are arranged vertically instead of horizontally in order to take up less space. By 2010, perpendicular densities are expected to exceed 500 Gb/sq. in.

Disk density
• Disk density is also called areal density
• How is this density calculated? For the most part, density is measured in bits per inch (BPI) along a track and in tracks per inch (TPI)
• Multiplying the TPI by the BPI gives the areal density
• RAMAC had an areal density of 2,000 bit/in²
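The RAMAC figure follows from the track and bit densities quoted for the first hard disk earlier in the lecture (20 tracks per inch, up to 100 bits per inch). A minimal check, with a function name chosen for the illustration:

```python
def areal_density(tpi, bpi):
    """Areal density in bits per square inch = tracks per inch x bits per inch."""
    return tpi * bpi


# RAMAC: 20 tracks per inch, up to 100 bits per inch along a track
print(areal_density(tpi=20, bpi=100))  # 2000
```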

Disk density - 2008
• In 2012 the highest areal density was around 625 Gb/inch²
• HDD areal densities measuring data-storage capacities are projected to climb to a maximum of 1800 Gb/inch² per platter by 2016, up from 744 Gb/inch² in 2011, as shown in the figure below
• This means that from 2011 to 2016, the five-year compound annual growth rate (CAGR) for HDD areal densities will be equivalent to 19%
• For this year, HDD areal densities are estimated to reach 780 Gb/inch² per platter, and then rise to 900 Gb/inch² next year
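The 19% growth rate can be checked from the two end points given above. The sketch uses the usual compound annual growth rate formula, (end / start)^(1 / years) - 1; the function name is chosen for the illustration.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end / start) ** (1.0 / years) - 1.0


# 744 Gb/inch² in 2011 to a projected 1800 Gb/inch² in 2016
rate = cagr(start=744, end=1800, years=5)
print(round(rate * 100))  # 19 (percent per year)
```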
