790 likes | 1.5k Views
Data Representation. Winter 2013 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University. Course Objectives. The better knowledge of computer systems, the better programing. Course Contents. Introduction to computer systems: B&O 1
E N D
Data Representation Winter 2013 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University
Course Objectives • The better knowledge of computer systems, the better programing. Data Representation
Course Contents • Introduction to computer systems: B&O 1 • Introduction to C programming: K&R 1 – 4 • Data representations: B&O 2.1 – 2.4 • C: advanced topics: K&R 5.1 – 5.10, 6 – 7 • Introduction to IA32 (Intel Architecture 32): B&O 3.1 – 3.8, 3.13 • Compiling, linking, loading, and executing: B&O 7 (except 7.12) • Dynamic memory management – Heap: B&O 9.9.1 – 9.9.2, 9.9.4 – 9.9.5, 9.11 • Code optimization: B&O 5.1 – 5.6, 5.13 • Memory hierarchy, locality, caching: B&O 5.12, 6.1 – 6.3, 6.4.1 – 6.4.2, 6.5, 6.6.2 – 6.6.3, 6.7 • Virtual memory (if time permits): B&O 9.4 – 9.5 Data Representation
Unit Learning Objectives • Convert a decimal number to binary number. • Convert a decimal number to hexadecimal number. • Convert a binary number to decimal number. • Convert a binary number to hexadecimal number. • Convert a hexadecimal number to binary number . • Distinguish little endian byte order and big endian byte order. • Compute binary addition. • Compute binary subtraction using 2’s complement. • Determine the 2’s complement representation of a signed integer. • Understand the overflow of unsigned integers and signed integers. • Trace and fix faulty code. Data Representation
Add two binary numbers. • Compute the 1’s complement of a binary number. • Compute the 2’s complement of a binary number. • Understand the 2’s complement representation for negative integers. • Subtract a binary number by using the 2’s complement addition. • Multiply two binary numbers. • Use of left shift and right shift. • Binary division Data Representation
Unit Contents • Information Storage • Integer Representations • Integer Arithmetic • Floating Point Data Representation
1. Information Storage • Virtual memory, address, and virtual address space • Virtual address space is a conceptual image presented to the machine-level program. • Partitioned into more manageable units to store the different program objects, i.e., instructions, program data, and control information. • The actual machine level program simply treats each program object as a block of bytes, and the program itself as a sequence of bytes. • Example • int number = 28; • 4 bytes, i.e., 32 bits, will be allocated to the variable number. • The decimal number 28 will be stored in the 32 bits? How? • Binary number, not the decimal, 00000000 00000000 00000000 00011100 will be stored in the 32 bits. • Do programmers have to convert 28 to it’s binary number? Data Representation
The Decimal System • Uses 10 digits 0, 1, 2, ..., 9. • Decimal expansion: • 83 = 8×10 + 3 • 4728 = 4×103 + 7×102 + 2×101 + 8×100 • 84037 = ??? • 43.087 = ??? • Do you know addition, subtraction, multiplication and division? • 1234 + 435.78 • 1234 – 435.78 • 1234 × 435.78 • 1234 / 435.78 Number Systems
The Binary System • In computer systems, the most basic memory unit is a bit that contains 0 and 1. • The data unit of 8 bits is referred as a byte that is the basic memory unit used in main memories and hard disks. • All data are represented by using binary numbers. Data types such as text, voice, image and video have no meaning in the data representation. • 8 bits are usually used to express English alphabets. • A collection of nbits has 2npossible states. Is it true? • E.g., • How many different numbers can you express using 2 bits? • How many different numbers can you express using 4 bits? • How many different numbers can you express using 8 bits? • How many different numbers can you express using 32 bits? Number Systems
How can we store integers(i.e., positive numbers only) in a computer? • E.g., • A decimal number 329? • Is it okay to store 3 characters ‘3’, ‘2’, and ‘9’ for 329? • How are characters stored? • 32910 = ???2 Number Systems
Uses two digits 0 and 1. How to expand binary numbers? • 02 = 0×20 = 010 • 12 = 1×20 = 110 • 102 = 1×21 + 0×20 = 210 • 112 = 1×21 + 1×20 = 310 • 1002 = 1×22 + 0×21 + 0×20 = 410 • 1012 = 1×22 + 0×21 + 1×20 = 510 • 1102 = 1×22 + 1×21 + 0×20 = 610 • 1112 = 1×22 + 1×21 + 1×20 = 710 • 10002 = 1×23 + 0×22 + 0×21 + 0×20 = 810 • 10012 = 1×23 + 0×22 + 0×21 + 1×20 = 910 • ... Number Systems
Powers of 2 • 12 = 1×20 = 110 • 102 = 1×21 + 0×20 = 210 • 1002 = 1×22 + 0×21 + 0×20 = 410 • 10002 = 1×23 + 0×22 + 0×21 + 0×20 = 810 • 1 00002 = 1610 • 10 00002 = 3210 • 100 00002 = 6410 Number Systems
1000 00002 = ???10 • 1 0000 00002 = ???10 • 10 0000 00002 = ???10 • 100 0000 00002 = ???10 • 1000 0000 00002 = ???10 • 1 0000 0000 00002 = ???10 • Can you memorize the above powers of 2? • Converting to decimals • 11012 = ???10 • 1011 00102 = ???10 • 1011.00102 = ???10 Number Systems
Converting Decimal to Binary Quotient Remainder • 21 / 2 10 1 10 / 2 5 0 5 / 2 2 1 2 / 2 1 0 1 / 2 0 1 => 2110 = 1 01012 • 27110 = ???2 • 607110 = ???2 Number Systems
Another similar idea • 27110 = ???2 256 < 271 < 512 -> 271 = 256 + 15 = 1 0000 00002 + 15 8< 15 < 16 -> 15 = 8 + 7 = 10002 + 7 => 271 = 1 0000 00002 + 15 = 1 0000 00002 + 10002 + 7 = 1 0000 00002 + 10002 + 1112 = 1 0000 11112 • 127110 = ???2 Number Systems
Hexadecimal Number System • 010 = 00002 = 016 = 0x0 • 110 = 00012 = 116 = 0x1 • 210 = 00102 = 216 = 0x2 • 310 = 00112 = 316 = 0x3 • 410 = 01002 = 416 = 0x4 • 510 = 01012 = 516 = 0x5 • 610 = 01102 = 616 = 0x6 • 710 = 01112 = 716 = 0x7 • 810 = 10002 = 816 = 0x8 • 910 = 10012 = 916 = 0x9 • 1010 = 10102 = A16 = 0xA • 1110 = 10112 = B16 = 0xB • 1210 = 11002 = C16 = 0xC • 1310 = 11012 = D16 = 0xD • 1410 = 11102 = E16 = 0xE • 1510 = 11112 = F16 = 0xF • 14816= ???1014816= ???2 • 23c9d6ef = ???1023c9d6ef = ???2 4 bitscan be used for a hexadecimal number, 0, ..., F. Please memorize it! Number Systems
Converting Decimal to Hexadecimal Quotient Remainder • 328 / 16 = 20 × 16 + 8 20 / 16 = 1 × 16 + 4 1 / 16 = 0 × 16 + 1 => 32810 = 14816 = ???2 • 14816 = (1 × 162 + 4 × 161 + 8 × 160)10 • 19210 = ???16 Number Systems
Converting Binary to Hexadecimal • 4DA916= ???2 • 1001101101010012 = ???16 = 100 1101 1010 1001 = 4DA9 • 10 11102 = 0x??? = ???10 • 0100 1110 1011 1001 01002 = 0x??? = ???10 Number Systems
Format specifiers in printf() for hexadecimal and decimal numbers? for (i = 0; i < num; i++) printf(“%d = 0x%x\n”, data[i], data[i]); • But there is no printf format specifier to print an integer in the binary form. • Write a function to make a string of 0’s and 1’s for a given integer. Data Representation
Words • A word size – the nominal size of integer and pointer data. A pointer variable contains an address in the virtual address space. We will discuss pointer variable in the next unit. • The word size determines the maximum size of the virtual address space. • 32bit operating systems? • 64bit operating systems? Data Representation
Data Sizes • printf(“%lu\n”, sizeof(long)); //lu: long unsigned // integer Data Representation
Addressing and Byte Ordering • A variable x of type int • Address of x: 0x100 • This means the 4 bytes of x would be stored in memory locations 0x100, 0x101, 0x102, and 0x103. • How to interpret the bytes in memory locations 0x100, 0x101, 0x102, and 0x103? • Let’s assume x has 0x1234567. There are two conventions to store the values in the 4 consecutive byte memory locations. 0x01, 0x23, 0x45, and 0x67, or 0x67, 0x45, 0x23, and 0x01, depending on CPU architecture. • Little endian byte order – Intel-compatible machines 0x103 0x102 0x101 0x100 address 0x01 0x23 0x45 0x67 value • Big endian byte order – machines from IBM and Sun Microsystems 0x103 0x102 0x101 0x100 0x67 0x45 0x23 0x01 Data Representation
The byte orderings are totally invisible for most application programmers. • Why are the byte orderings important? • Think of data exchange between two machines through a network. • Assembly programming • When type casting is used Data Representation
Representing Strings • A string is encoded by an array of characters terminated by the null (having 0) character ‘\0’; • The ASCII character set • Unicode – Some libraries are available for C. Data Representation
Representing Code • Different machine types use different and incompatible instructions and encodings. • Even identical processors running different OSes have differences in their coding conventions and hence are no binary compatible. Data Representation
Comparison and Logical Operations • Comparison operators ??? • Logical operators ??? Data Representation
Bit-Level Operations • ??? • &, |, ~, ^, >>, << • >>: Logical right shift and arithmetic right shift. • Logical right shit for unsigned integers: filled with 0 • Arithmetic right shit for signed integers: filled with the MSB • Examples char x = -128; unsigned char y = 128; x = x >> 1; y = y >> 1; printf (“%d, %d\n”, x, y); • We will discuss about this example later again. • Some examples a ^ a = ??? a ^ 0 = ??? x = 10; y = 20; y = x ^ y; x = x ^ y; y = x ^ y; Data Representation
Some more examples 11110000 & 11001100 = ??? 11110000 | 11001100 = ??? 11110000 ^ 11001100 = ??? 0x3B & 0x33 = ??? 0x3B | 0x33 = ??? 0x3B ^ 0x33 = ??? 0x3B >> 2 = ??? 0x33 << 2 = ??? • Consult with programming assignments Data Representation
2. Integer Representations C Java Size char, unsigned char byte 1B short, unsigned short short 2Bs char in Java uses 2Bs. int, unsigned int int 4Bs long, unsigned long long 8Bs // there is no unsigned in Java float float 4Bs double double 8Bs Data Representation
Unsigned Encodings • unsigned char All 8 bits are used. No sign bit. • The smallest number is 0 • The maximum number is 0xff. • unsigned short 16 bits • The smallest number is ??? • The maximum number is ??? • unsigned int 32 bits • The smallest number is ??? • The maximum number is ??? • unsigned long 64 bits • The smallest number is ??? • The maximum number is ??? Data Representations
Representation of Unsigned Integers • 8-bit representation of unsigned char 255 11111111 254 11111110 ... ... 128 10000000 127 01111111 126 01111110 ... ... 2 00000010 1 00000001 0 00000000 • The maximum number is ? • The minimum number is ? • What if we add the maximum number by 1 ??? • What if we subtract the minimum number by 1 ??? +1 +1 +1 overflow Data Representations
unsigned char x, y; x = 128; y = 128; printf(“x = %d, y = %d\n”, x, y); printf(“x + y = %d\n”, x + y); // int (not char) addition x = x + y; // 256 -> 100000000 -> truncation printf(“x = %d, y = %d\n”, x, y); X = 128; x = x >> 1; // logical right shift for unsigned printf(“x = %d\n”, x); • The output is ??? • 128, 128 256 0, 128 64 • 16-bit, 32-bit, 64-bit representations have the same overflowproblem. • How to represent signed integers? Data Representation
Binary Addition • We will discuss binary addition and binary subtraction, before we discuss the representation of signed integers. • How to add two binary numbers? Let’s consider only unsigned integers(i.e., positive numbers only) for a while. • Just like the addition of two decimal numbers. • E.g., 10010 10010 1111 + 1001 + 1011 + 1 11011 11101 ??? 10111 + 111 ??? carry Data Representations
Binary Subtraction • How to subtract a binary number? • Just like the subtraction of decimal numbers. • E.g., 0112 02 02 1000 10 10 10010 10010 10010 -1-1 -11 -11 -11 1 ?1 ?11 1111 Try: 101010 How to do? 1 -101-10 Data Representations
In the previous slide, 10010 – 11 = 1111 • What if we add 00010010 + 11111100 1 00001110 + 1 00001111 • Is there any relationship between 112 and 111002? • The 1’s complement of 112 is ??? Switching 0 1 • This type of addition is called 1’s complement addition. • Find the 8-bit one’s complements of the followings. • 11011 -> 00011011 -> • 10 -> 00000010 -> • 101 -> 00000101 -> Data Representations
In the previous slide, 10010 – 11 = 1111 • What if we add 00010010 + 11111101 1 00001111 • Is there any relationship between 11 and 11101? • The 2’s complement of 11 is ??? • 2’s complement ≡ 1’s complement + 1 -> 11100 + 1 = 11101 • This type of addition is called 2’s complement addition. • Find the 16-bit two’s complements of the followings. • 11011 -> 0000000000011011 -> • 10 • 101 Data Representations
Another example 101010 - 101 ??? • What if we use 1’s complement addition or 2’s complement addition instead as follow? Let’s use 8-bit representation. 00101010 00101010 + 11111010+ 11111011 1 00100100 1 00100101 + 1 00100101 • What does this mean? • A – B = A + (–B), where A and B are positive • Is the 1’s complement or the 2’s complement of B sort of equal to –B? Data Representations
Can we use 8-bit 1’s complement addition for 12 – 102 = –12? 1 00000001 - 10+ 11111101<- 8-bit 1’s complement of 10 11111110 <- Is this correct? (1’s complement of 1?) • Let’s use 8-bit 2’s complement addition for 12 – 102. 00000001 +11111110<- 2’s complement of 10 11111111 <- Correct? (2’s complement of 1?) • 12 – 102 = 12 + (–102) = –12 • How to represent negative binary numbers, i.e., signed integers? Data Representations
Representation of Negative Binaries • Representation of signed integers • 8 or 16 or 32 bits are usually used for integers. • Let’s use 8 bits for examples. • The left most bit (called most significant bit) is used as sign. • When the MSB is 0, non-negative integers. • When the MSB is 1, negative integers. • The other 7 bits are used for integers. • How to represent positive integer 9? • 00001001 • How about -9? • 10001001 is really okay? • 00001001 (9) + 10001001 (-9) = 10010010 (-18) It is wrong! • We need a different representation for negative integers. Data Representations
How about -9? • 10001001 is really okay? • 00001001 (9) + 10001001 (-9) = 10010010 (-18) It is wrong! • We need a different representation for negative integers. • What is the 8-bit 1’s complement of 9? • 11110110 <- 8-bit 1’s complement of 9 • 00001001 + 11110110 <- 9 + 8-bit 1’s complement of 9 = 11111111 <- Is it zero? (1’s complement of 0?) • What is the 2’s complement of 9? • 11110111 <- 8-bit 2’s complement of 9 • 00001001 + 11110111 <- 9 + 8-bit 2’s complement of 9 = 1 00000000 <- It looks like zero. • 2’s complement representation is used for negative integers. Data Representations
12 – 102 = 12 + (–102) ??? What is the result in decimal? 00000001 + 11111110<- 2’s complement of 10, i.e., -102 11111111 <- 2’s complement of 1, i.e., -1 (= 1 – 2) • 1010102 – 1012 = 1010102 + (–1012) ??? • 100102 – 112 ??? • 102 – 12 ??? • -102 – 12 ??? • Is the two’s complement of the two’s complement of an integer the same integer? • What is x when the 8-bit 2’s complement of x is • 11111111 11110011 10000001 Data Representations
8-bit representation of signed char with 2’s complement 127 01111111 126 01111110 ... ... 2 00000010 1 00000001 0 00000000 -1 11111111 -2 11111110 -3 11111101 ... ... -127 10000001 -128 10000000 • The maximum number is ? • The minimum number is ? • What if we add the maximum number by 1 ??? • What if we subtract the minimum number by 1 ??? overflow +1 +1 -1 overflow +1 -1 -1 Data Representations
16-bit representation signed short with 2’s complement ... 01111111 11111111 ... 01111111 11111110 ... ... 3 00000000 00000011 2 00000000 00000010 1 00000000 00000001 0 00000000 00000000 -1 11111111 11111111 -2 11111111 11111110 -3 11111111 11111101 ... ... ... 10000000 00000001 ... 10000000 00000000 • The maximum number is ? What if we add the maximum number by 1 ??? • The minimum number is ? What if we subtract the minimum number by 1 ??? overflow +1 +1 +1 -1 -1 -1 overflow Data Representations
Note that computers use the 8-bit representation, the 16-bit representation, the 32-bit representation and the 64-bit representation with 2’complement for negative integers. • In programming languages • char, unsigned char 8-bits • short, unsigned short 16-bits • int, unsigned int 32-bits • long, unsigned long 64-bits • When we use the 32-bit representation with 2’s complement, • The maximum number is ? • What if we add the maximum number by 1 ??? • The minimum number is ? • What if we subtract the minimum number by 1 ??? Data Representations
Now we know how to represent negative integers. • 2’ complement addtion A + (–B) is computed for subtraction A – B. • Let’s suppose B is negative. Then –B is really a positive integer? For example, let’s consider 1 byte signed integer. 127 01111111 126 01111110 ... ... 2 00000010 1 00000001 0 00000000 -1 11111111 -2 11111110 -3 11111101 2’s complement of -3 is 00000011, i.e., 3. ... ... -127 10000001 -128 10000000 2’s complement of -128 is 10000000 again. • For any -127 < x < 127, x – x = 0. But (-128) – (-128) = ??? Data Representation
char x, y; x = 128; // 128 -> 10000000 This is -128. y = 128; printf(“x = %d, y = %d\n”, x, y); printf(“x – y = %d\n”, x - y); x = x – y; printf(“x = %d, y = %d\n”, x, y); x = 128; x = x >> 1; // arithmetic shift for signed printf(“x = %d\n”, x); • The output is ??? • -128, -128 0 0, -128 -64 • -128 – (-128) = -128 + (-(-128)) = -128 + (-128) = 10000000 + 10000000 = 00000000 = 0 Data Representation
char a = 127, b = 127, c; unsigned char d; c = a + b; d = a + b; printf(“a=%d, b=%d, c=%d, d=%d\n”, a, b, c, d); • The output is ??? • a=127, b=127, c=-2, d=254 01111111 + 01111111 = 11111110 ??? • We have to be very careful with overflow for integer values. Data Representation
Advice on Signed vs. Unsigned • Practice Problem 2.25 float sum_elements (float a[], unsigned int length) { int i; float result = 0; for (i = 0; i <= length-1; i++) result += a[i]; return result; } • When run with length equal to 0, this code should return 0.0. Instead it encounters a memory error. Why??? • unsigned int length length-1 • How to fix this code? Data Representations
short x = -5; unsigned short y = 128; printf(“%x, %x\n”, x, y); printf(“%x, %x, %x, %x\n”, x<<3, x>>3, y<<3, y>>3); The output is ??? fffffffb, 80 ffffffd8, ffffffff, 400, 10 • Singed integers: • Arithmetic right shift: Left part is filled with the most significant bit of the original value. • Unsigned integers: • Logical right shift: Left part is filled with 0. Data Representations
Practice Problem 2.26 size_t strlen(const char* s); // defined in string.h int strlonger(char s[], char t[]) { return strlen(s) – strlen(t) > 0; } Note that size_t is defined in stdio.h to be unsigned int. • For what cases will this function produce an incorrect result? • 0 – 1 > 0 -> 1 (true value) • Explain how this incorrect result comes about. • Unsigned int of 0 – 1 is 0xffffffff that is greater than 0. • Show how to fix the code so that it will work reliably. Data Representations