680 likes | 815 Views
CSC 2400 Computer Systems I. Lecture 2 Representations. The Machine. Why Don’t Computers Use Base 10?. Base 10 Number Representation That’s why fingers are known as “digits” Natural representation for financial transactions Floating point number cannot exactly represent $1.20
E N D
CSC 2400Computer Systems I Lecture 2 Representations
Why Don’t Computers Use Base 10? • Base 10 Number Representation • That’s why fingers are known as “digits” • Natural representation for financial transactions • Floating point number cannot exactly represent $1.20 • Even carries through in scientific notation • 1.5213 X 104 • Implementing Electronically • Hard to store • ENIAC (First electronic computer) used 10 vacuum tubes / digit • Hard to transmit • Need high precision to encode 10 signal levels on single wire • Messy to implement digital logic functions • Addition, multiplication, etc.
How do we represent data in a computer? • At the lowest level, a computer is an electronic machine. • works by controlling the flow of electrons • Easy to recognize two conditions: • presence of a voltage – we’ll call this state “1” • absence of a voltage – we’ll call this state “0” Could base state on value of voltage, but control and detection circuits more complex. • compare turning on a light switch tomeasuring or regulating voltage
Computer is a binary digital system. • Basic unit of information is the binary digit, or bit. • Values with more than two states require multiple bits. • A collection of two bits has four possible states:00, 01, 10, 11 • A collection of three bits has eight possible states:000, 001, 010, 011, 100, 101, 110, 111 • A collection of n bits has 2n possible states. Digital system: • finite number of symbols Binary (base two) system: • has two states: 0 and 1
What kinds of data do we need to represent? • Numbers – signed, unsigned, integers, floating point,complex, rational, irrational, … • Text – characters, strings, … • Images – pixels, colors, shapes, … • Sound • Logical – true, false • Instructions • … • Data type: • representation and operations within the computer
Machine Words • Machine Has “Word Size” • Nominal size of integer-valued data • Including addresses • Most current machines are 32 bits (4 bytes) • Limits addresses to 4GB • Becoming too small for memory-intensive applications • High-end systems are 64 bits (8 bytes) • Potentially address 1.8 X 1019 bytes • Machines support multiple data formats • Fractions or multiples of word size • Always integral number of bytes
Word-Oriented Memory Organization 64-bit Words 32-bit Words Bytes Addr. • Addresses Specify Byte Locations • Address of first byte in word • Addresses of successive words differ by 4 (32-bit) or 8 (64-bit) 0000 Addr = ?? 0001 0002 0000 Addr = ?? 0003 0004 0000 Addr = ?? 0005 0006 0004 0007 0008 Addr = ?? 0009 0010 0008 Addr = ?? 0011 0008 0012 Addr = ?? 0013 0014 0012 0015
Data Representations • Sizes of C Objects (in Bytes) • C Data Type Sparc/Unix Typical 32-bit Intel IA32 • int 4 4 4 • long int 8 4 4 • char 1 1 1 • short 2 2 2 • float 4 4 4 • double 8 8 8 • long double 8 8 10/12 • char * 8 4 4 • Or any other “pointer”
Pointers and Arrays • Pointer • Address of a variable in memory • Allows us to indirectly access variables • in other words, we can talk about its addressrather than its value • Array • A list of values arranged sequentially in memory • Example: a list of telephone numbers • Expression a[4] refers to the 5th element of the array a
Address vs. Value • Sometimes we want to deal with the addressof a memory location,rather than the value it contains. • Adding a column of numbers. • R2 contains address of first location. • Read value, add to sum, andincrement R2 until all numbershave been processed. • R2 is a pointer -- it contains theaddress of data we’re interested in. address value x3107 x3100 x2819 x3101 x3100 R2 x0110 x3102 x0310 x3103 x0100 x3104 x1110 x3105 x11B1 x3106 x0019 x3107
Byte Ordering • How should bytes within multi-byte word be ordered in memory? • Conventions • Sun’s, Mac’s are “Big Endian” machines • Least significant byte has highest address • Alphas, PC’s are “Little Endian” machines • Least significant byte has lowest address
0x100 0x101 0x102 0x103 01 23 45 67 0x100 0x101 0x102 0x103 67 45 23 01 Byte Ordering Example • Big Endian • Least significant byte has highest address • Little Endian • Least significant byte has lowest address • Example • Variable x has 4-byte representation 0x01234567 • Address given by &x is 0x100 Big Endian 01 23 45 67 Little Endian 67 45 23 01
Machine-Level Code Representation • Encode Program as Sequence of Instructions • Each simple operation • Arithmetic operation • Read or write memory • Conditional branch • Instructions encoded as bytes • Alpha’s, Sun’s, Mac’s use 4 byte instructions • Reduced Instruction Set Computer (RISC) • PC’s use variable length instructions • Complex Instruction Set Computer (CISC) • Different instruction types and encodings for different machines • Most code not binary compatible • Programs are Byte Sequences Too!
Big Idea: Information is Bits + Context • Computer stores the bits • You decide how to interpret them • Example (binary) 00000000 00000000 00000000 00001101 • Decimal = 13 • Float = 1.82169E-44 Chapter 1
329 101 102 22 101 21 100 20 Unsigned Integers • Non-positional notation • could represent a number (“5”) with a string of ones (“11111”) • problems? • Weighted positional notation • like decimal numbers: “329” • “3” is worth 300, because of its position, while “9” is only worth 9 most significant least significant 3x100 + 2x10 + 9x1 = 329 1x4 + 0x2 + 1x1 = 5
Unsigned Integers (cont.) • An n-bit unsigned integer represents 2n values:from 0 to 2n-1.
Unsigned Binary Arithmetic • Base-2 addition – just like base-10! • add from right to left, propagating carry carry 10010 10010 1111 + 1001 + 1011 + 1 11011 11101 10000 10111 + 111
Signed Integers • With n bits, we have 2n distinct values. • assign about half to positive integers (1 through 2n-1)and about half to negative (- 2n-1 through -1) • that leaves two values: one for 0, and one extra • Positive integers • just like unsigned – zero in most significant bit00101 = 5 • Negative integers • sign-magnitude – set top bit to show negative, other bits are the same as unsigned10101 = -5 • one’s complement – flip every bit to represent negative11010 = -5 • in either case, MS bit indicates sign: 0=positive, 1=negative
Two’s Complement • Problems with sign-magnitude and 1’s complement • two representations of zero (+0 and –0) • arithmetic circuits are complex • How to add two sign-magnitude numbers? • e.g., try 2 + (-3) • How to add to one’s complement numbers? • e.g., try 4 + (-3) • Two’s complement representation developed to makecircuits easy for arithmetic. • for each positive number (X), assign value to its negative (-X),such that X + (-X) = 0 with “normal” addition, ignoring carry out 00101 (5)01001 (9) + 11011(-5)+(-9) 00000 (0) 00000(0)
Two’s Complement Representation • If number is positive or zero, • normal binary representation, zeroes in upper bit(s) • If number is negative, • start with positive number • flip every bit (i.e., take the one’s complement) • then add one 00101 (5)01001 (9) 11010 (1’s comp)(1’s comp) + 1+ 1 11011 (-5)(-9)
Two’s Complement Signed Integers • MS bit is sign bit – it has weight –2n-1. • Range of an n-bit number: -2n-1 through 2n-1 – 1. • The most negative number (-2n-1) has no positive counterpart.
Text: ASCII Characters • ASCII: Maps 128 characters to 7-bit code. (type man ascii) • both printable and non-printable (ESC, DEL, …) characters
Interesting Properties of ASCII Code • What is relationship between a decimal digit ('0', '1', …)and its ASCII code? • What is the difference between an upper-case letter ('A', 'B', …) and its lower-case equivalent ('a', 'b', …)? • Given two ASCII characters, how do we tell which comes first in alphabetical order? • Are 128 characters enough?(http://www.unicode.org/)
Other Data Types • Text strings • sequence of characters, terminated with NULL (0) • typically, no special hardware support • Image • array of pixels • monochrome: one bit (1/0 = black/white) • color: red, green, blue (RGB) components (e.g., 8 bits each) • other properties: transparency • hardware support: • typically none, in general-purpose processors • MMX -- multiple 8-bit operations on 32-bit word • Sound • sequence of fixed-point numbers
Pointers in C • C lets us talk about and manipulate “pointers” (addresses) as variables and in expressions. • Declaration • int *p;/* p is a pointer to an int */ • A pointer in C is always a pointer to a particular data type:int*, double*, char*, etc. • Operators • *p-- returns the value pointed to by p (“dereference”) • &z-- returns the address of variable z (“address of”)
Example • int i; • int *ptr; • i = 4; • ptr = &i; • *ptr = *ptr + 1; store the value 4 into the memory location associated with i store the address of i into the memory location associated with ptr read the contents of memoryat the address stored in ptr store the result into memory at the address stored in ptr
Converting Binary (2’s C) to Decimal • If leading bit is one, take two’s complement to get a positive number. • Add powers of 2 that have “1” in thecorresponding bit positions. • If original number was negative,add a minus sign. X = 01101000two = 26+25+23 = 64+32+8 = 104ten Assuming 8-bit 2’s complement numbers.
More Examples X = 00100111two = 25+22+21+20 = 32+4+2+1 = 39ten X = 11100110two -X = 00011010 = 24+23+21 = 16+8+2 = 26ten X = -26ten Assuming 8-bit 2’s complement numbers.
Converting Decimal to Binary (2’s C) • First Method: Division • Divide by two – remainder is least significant bit. • Keep dividing by two until answer is zero,writing remainders from right to left. • Append a zero as the MS bit;if original number negative, take two’s complement. X = 104ten104/2 = 52 r0 bit 0 52/2 = 26 r0 bit 1 26/2 = 13 r0 bit 2 13/2 = 6 r1 bit 3 6/2 = 3 r0 bit 4 3/2 = 1 r1 bit 5 X = 01101000two 1/2 = 0 r1 bit 6
Converting Decimal to Binary (2’s C) • Second Method: Subtract Powers of Two • Change to positive decimal number. • Subtract largest power of two less than or equal to number. • Put a one in the corresponding bit position. • Keep subtracting until result is zero. • Append a zero as MS bit;if original was negative, take two’s complement. X = 104ten104 - 64 = 40 bit 6 40 - 32 = 8 bit 5 8 - 8 = 0 bit 3 X = 01101000two
Hexadecimal Notation • It is often convenient to write binary (base-2) numbersas hexadecimal (base-16) numbers instead. • fewer digits -- four bits per hex digit • less error prone -- easy to corrupt long string of 1’s and 0’s
Converting from Binary to Hexadecimal • Every four bits is a hex digit. • start grouping from right-hand side 011101010001111010011010111 3 A 8 F 4 D 7 This is not a new machine representation,just a convenient way to write the number.
Addition • 2’s comp. addition is just binary addition. • assume all integers have the same number of bits • ignore carry out • for now, assume that sum fits in n-bit 2’s comp. representation 01101000(104) 11110110(-10) + 11110000(-16) + (-9) 01011000(88) (-19) Assuming 8-bit 2’s complement numbers.
Subtraction • Negate subtrahend (2nd no.) and add. • assume all integers have the same number of bits • ignore carry out • for now, assume that difference fits in n-bit 2’s comp. representation 01101000 (104)11110110 (-10) - 00010000(16) +(-9) 01011000(88) (-19) 01101000 (104) 11110110 (-10) + 11110000(-16) +(9) 01011000 (88)(-1) Assuming 8-bit 2’s complement numbers.
Sign Extension • To add two numbers, we must represent themwith the same number of bits. • If we just pad with zeroes on the left: • Instead, replicate the MS bit -- the sign bit: 4-bit8-bit 0100 (4)00000100 (still 4) 1100 (-4)00001100 (12, not -4) 4-bit8-bit 0100 (4)00000100 (still 4) 1100 (-4)11111100 (still -4)
Overflow • If operands are too big, then sum cannot be represented as an n-bit 2’s comp number. • We have overflow if: • signs of both operands are the same, and • sign of sum is different. • Another test -- easy for hardware: • carry into MS bit does not equal carry out 01000 (8)11000 (-8) + 01001(9)+ 10111(-9) 10001 (-15) 01111(+15)
Logical Operations • Operations on logical TRUE or FALSE • two states -- takes one bit to represent: TRUE=1, FALSE=0 • View n-bit number as a collection of n logical values • operation applied to each bit independently
Examples of Logical Operations • AND • useful for clearing bits • AND with zero = 0 • AND with one = no change • OR • useful for setting bits • OR with zero = no change • OR with one = 1 • NOT • unary operation -- one argument • flips every bit 11000101 AND 00001111 00000101 11000101 OR 00001111 11001111 NOT11000101 00111010
Bit-Level Operations in C • Operations &, |, ~, ^ Available in C • Apply to any “integral” data type • long, int, short, char • View arguments as bit vectors • Arguments applied bit-wise • Examples (Char data type) • ~0x41 --> 0xBE ~010000012 --> 101111102 • ~0x00 --> 0xFF ~000000002 --> 111111112 • 0x69 & 0x55 --> 0x41 011010012 & 010101012 --> 010000012 • 0x69 | 0x55 --> 0x7D 011010012 | 010101012 --> 011111012
Relations Between Operations • DeMorgan’s Laws • Express & in terms of |, and vice-versa • A & B = ~(~A | ~B) • A and B are true if and only if neither A nor B is false • A | B = ~(~A & ~B) • A or B are true if and only if A and B are not both false • Exclusive-Or using Inclusive Or • A ^ B = (~A & B) | (A & ~B) • Exactly one of A and B is true • A ^ B = (A | B) & ~(A & B) • Either A is true, or B is true, but not both
General Boolean Algebras • Operate on Bit Vectors • Operations applied bitwise • All of the Properties of Boolean Algebra Apply 01101001 & 01010101 01000001 01101001 | 01010101 01111101 01101001 ^ 01010101 00111100 ~ 01010101 10101010 01000001 01111101 00111100 10101010
Contrast: Logic Operations in C • Contrast to Logical Operators • &&, ||, ! • View 0 as “False” • Anything nonzero as “True” • Always return 0 or 1 • Early termination • Examples (char data type) • !0x41 --> 0x00 • !0x00 --> 0x01 • !!0x41 --> 0x01 • 0x69 && 0x55 --> 0x01 • 0x69 || 0x55 --> 0x01 • p && *p (avoids null pointer access)
Shift Operations • Left Shift: x << y • Shift bit-vector x left y positions • Throw away extra bits on left • Fill with 0’s on right • Right Shift: x >> y • Shift bit-vector x right y positions • Throw away extra bits on right • Logical shift • Fill with 0’s on left • Arithmetic shift • Replicate most significant bit on right • Useful with two’s complement integer representation Argument x 01100010 << 3 00010000 00010000 00010000 Log. >> 2 00011000 00011000 00011000 Arith. >> 2 00011000 00011000 00011000 Argument x 10100010 << 3 00010000 00010000 00010000 Log. >> 2 00101000 00101000 00101000 Arith. >> 2 11101000 11101000 11101000