Basic data types and their representation CS101 2012.1

Announcements
• If biometric ID does not work, write your roll number and sign your name on a piece of paper
• All lab batches should be stable now
• Lab this week will continue with familiarization and small programs
• No tutorials this week either
• But we will begin posting homework on Moodle
  • Will give you an impression of exam questions
  • Will not be graded
  • Will be discussed in tutorials

Layers of abstraction
• Three layers
  • Implement fixed-size primitive types by mapping possible/supported values to bit patterns
  • Add collection types on top of primitive types to assist writing complicated programs
  • Collections usually change size and memory layout during program execution
[Figure: three stacked layers, top to bottom: collection types (arrays, matrices, lists, maps, strings); primitive data types (character, integer, float, double); memory as an array of bytes]

Memory, values, variables
• Unit of storage: bit (0/1)
  • Because such computers are easier to implement by switching transistors off and on
• A byte is 8 bits wide
  • Values range from 00000000 to 11111111
  • 2^8 = 256 possible bit configurations
  • Can be interpreted as integers from 0 to 255 (“unsigned char”)
• Electronic and magnetic memory is allocated in units of bytes

Binary arithmetic
• Byte value in binary: 00000000 (8 bits)
  • Corresponding decimal value = 0
  • Written as 0dec to avoid confusion
• In decimal, to increment a number, increment the unit position, unless there is overflow, in which case carry over… etc.
• Same in binary
• Next few values are 00000001 (=1dec), 00000010 (=2dec), 00000011 (=3dec), 00000100 (=4dec), 00000101 (=5dec) etc.
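The counting sequence above can be reproduced with a few lines of C++. This is a minimal sketch; the helper name toBinary is made up for illustration and is not part of the lecture.

```cpp
#include <bitset>
#include <string>

// Return the 8-bit binary representation of a byte value (0..255) as text.
// toBinary is a hypothetical helper name chosen for this sketch.
std::string toBinary(unsigned char value) {
    return std::bitset<8>(value).to_string();
}
```

Calling toBinary on 0, 1, 2, … in turn gives the same bit patterns as the list above, with carries appearing exactly where the decimal analogy predicts.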

Character (char)
• To a first approximation, a character is the same as an 8-bit byte
  • (More recently, multi-byte characters have been designed to support all the world’s languages)
• The key difference is in how the byte is interpreted and processed (e.g., printed)
  • E.g., 97 means 'a', 98 = 'b', 65 = 'A', 66 = 'B' etc.
• C++ lets you compare characters using the corresponding integer
  • Useful for sorting strings in dictionary order
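A short sketch of the character/integer correspondence, assuming an ASCII-compatible character set (true on practically all current systems, but not guaranteed by C++). The function names are hypothetical.

```cpp
// Return the integer code stored in a char (97 for 'a' on ASCII systems).
int codeOf(char c) {
    return static_cast<int>(c);
}

// True if c1 sorts before c2 when characters are compared by their codes.
bool comesBefore(char c1, char c2) {
    return c1 < c2;
}
```

Note that comparing by code puts every uppercase letter before every lowercase letter, since 'A' to 'Z' occupy 65 to 90 and 'a' to 'z' occupy 97 to 122.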

Hexadecimal notation (hex)
• Byte (8 bits) consists of two “nibbles” (4 bits)
• Nibble ranges between 0 and 15
• Expressed in hexadecimal, 0 to 9, a to f
  • a=10, b=11, c=12, d=13, e=14, f=15
• So a byte is written as two hexadecimal digits, e.g. 0a or c5
• Note that 23 hex is not 23 decimal!
  • To make clear, written as 0x23
• printf demo
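The slide only mentions a printf demo without showing it, so the following is a guess at what such a demo might look like, not the lecturer's code. It uses the standard "%02x" format to print a byte as two hex digits; toHex is a hypothetical helper name.

```cpp
#include <cstdio>
#include <string>

// Format a byte as two lowercase hexadecimal digits, e.g. 197 -> "c5".
std::string toHex(unsigned char value) {
    char buffer[3];  // two digits plus the terminating null byte
    std::snprintf(buffer, sizeof buffer, "%02x", static_cast<unsigned>(value));
    return std::string(buffer);
}
```

In the other direction, a literal written as 0x23 in C++ source is the number 35, which is why the 0x prefix matters.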

Fixed size integer types
• “Short integers” (short) are 16 bits wide
  • 65536 possible values
• Standard integers (int) are 32 bits wide
  • 4,294,967,296 possible values
  • Adequate for most purposes except governments bailing out banks and airlines
• A long long int is 64 bits wide
  • We will sometimes call it long for brevity (as in Java)
• Real numbers are represented using float and double (“double precision”) … later
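The value counts above follow directly from the bit widths. The sketch below computes them; numValues is a hypothetical helper. One caveat worth stating: the C++ standard only guarantees minimum widths for short, int and long long, so the widths on the slide are the typical ones rather than universal. The fixed-width types in <cstdint> (std::int16_t, std::int32_t, std::int64_t) have exactly the named width where they exist.

```cpp
#include <cstdint>
#include <limits>

// Number of distinct values an n-bit integer can hold (valid for 0 <= bits < 64).
std::uint64_t numValues(int bits) {
    return std::uint64_t(1) << bits;
}

// Largest value of a 32-bit signed integer, taken from the standard library.
std::int32_t maxInt32() {
    return std::numeric_limits<std::int32_t>::max();
}
```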

Two’s complement representation
• Want to represent both positive and negative integers with a bit sequence (say 4 bits)
• Trivial: use one bit for sign
  • Wastes one configuration (plus and minus zero)
• 0000 (0) through 0111 (7) are non-negative
• 8 more values (1000 through 1111), so assign them to -8 through -1

The wrap-around
[Figure: bit patterns laid out in increasing order of value, from Min = 10…0, through -1 = 1…1 and 0 = 0…0, up to Max = 01…1. Zero is one position to the right of center.]

Two’s complement, continued
• One sudden “wrap-around” from 7 to -8
• Works exactly the same for short, int and long int, with corresponding wrap, max, min values
• Most programming systems will not detect if the wrap happens
• If your program uses values near the edges, be careful in doing arithmetic and check the result!
• Library packages exist to support arbitrarily large integers, not as efficient as fixed length
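To make the 4-bit scheme concrete, here is a small sketch that decodes a 4-bit pattern and shows the wrap. The functions decode4 and increment4 are hypothetical names; the arithmetic is done on unsigned values on purpose, because overflowing a real signed int is undefined behaviour in C++.

```cpp
// Interpret the low 4 bits of `pattern` as a 4-bit two's complement number.
int decode4(unsigned pattern) {
    pattern &= 0xFu;                           // keep only 4 bits
    int value = static_cast<int>(pattern);
    return (pattern & 0x8u) ? value - 16 : value;  // top bit set means negative
}

// Add 1 using 4-bit arithmetic and return the decoded result.
int increment4(int value) {
    return decode4(static_cast<unsigned>(value) + 1u);
}
```

Incrementing 7 with increment4 lands on -8, which is the single sudden jump described above; everywhere else, adding one behaves as expected.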

Real number representations
• “Floating (decimal) point”
  • In decimal we write 0.314 × 10^11
  • 0.314 is the mantissa, 11 is the exponent
  • Mantissa has decimal point at beginning
• Same approach in computers, with radix 2 instead of 10
• In a float
  • 1 sign, 8 exponent, 23 mantissa bits
• In a double
  • 1 sign, 11 exponent, 52 mantissa bits
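The mantissa/exponent split with radix 2 can be observed using the standard function std::frexp, which returns a mantissa m with 0.5 <= |m| < 1 and an exponent e such that x = m × 2^e. This matches the "point at the beginning" convention on the slide. The Parts struct and split function are hypothetical names for this sketch; the bits actually stored in a float or double use a slightly different normalization, so treat this as an illustration of the idea only.

```cpp
#include <cmath>

// Mantissa and binary exponent of a number, with x == mantissa * 2^exponent.
struct Parts {
    double mantissa;
    int exponent;
};

// Split x using std::frexp from <cmath>.
Parts split(double x) {
    Parts p;
    p.mantissa = std::frexp(x, &p.exponent);
    return p;
}
```

For example, 8 is represented as 0.5 × 2^4, so split(8.0) yields mantissa 0.5 and exponent 4.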

Floating point numbers
• Finite bits cannot represent all real values
• Gaps between numbers that can be represented
• Need care in writing expressions that combine values to avoid errors, minimize loss of precision

Some finite precision pitfalls
• Some 32- and 64-bit patterns have been set aside to represent
  • Positive and negative infinity
  • Not a number or NaN (e.g. result of 0/0)
• Most systems will detect overflow but not underflow
  • float a = 3.3e38 / 0.01; correctly results in a being “inf”
  • But 3.3e38 + 5 silently equals 3.3e38 (not enough bits in mantissa)
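The three pitfalls can be checked in code. This sketch assumes IEEE 754 floating point (the norm on current hardware) and default compiler settings without aggressive floating-point optimization flags; the function names are hypothetical.

```cpp
#include <cmath>
#include <limits>

// Multiply by ten in float arithmetic; the largest float overflows to infinity.
float timesTen(float x) {
    float result = x * 10.0f;
    return result;
}

// Divide two doubles; 0.0 / 0.0 produces NaN rather than a number.
double ratio(double a, double b) {
    return a / b;
}

// True when adding `tiny` to `big` leaves `big` unchanged because the
// mantissa has too few bits to record the difference.
bool absorbed(float big, float tiny) {
    float sum = big + tiny;
    return sum == big;
}
```

Results can be tested with std::isinf and std::isnan from <cmath>. Comparing against NaN with == never succeeds, not even against itself, so std::isnan is the reliable check.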

Operations on numeric types
• All integers support +, -, *, /, % (remainder)
• Even characters support + and -
  • E.g., 'a' + 1 = 'b'; what is 'Z' + 1? (Try it)
• Float and double support +, -, *, /
• More complicated operations like log, exp, sine, etc. are implemented as functions
• You can compare numbers using comparison operators <, <=, ==, >=, >, !=
  • The result is a Boolean (0/1) value (next)
  • cout << (5 > 7);
  • cout << (4 != 3);
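A few of these operators wrapped in tiny functions, as a sketch. The names nextChar and intRemainder are hypothetical, and the character arithmetic assumes an ASCII-compatible character set.

```cpp
// Character arithmetic: the character whose code is one higher than c.
char nextChar(char c) {
    return static_cast<char>(c + 1);
}

// Integer remainder, i.e. the % operator (b must not be zero).
int intRemainder(int a, int b) {
    return a % b;
}

// Comparisons yield a Boolean that converts to 0 or 1.
int isGreater(int a, int b) {
    return a > b;
}
```

With these, isGreater(5, 7) gives 0 and isGreater(7, 5) gives 1, matching what the two cout statements on the slide print for a false and a true comparison.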

Boolean values and operations
• In C++, int can be reused as Boolean (0 = false, anything else is true)
• Binary operator && (and)
• Binary operator || (or)
• Short-circuit evaluation
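Short-circuit evaluation means the right operand of && is skipped when the left one is false (and the right operand of || is skipped when the left one is true). A common use is guarding an operation that would otherwise fail. The function below is a hypothetical example, not from the lecture.

```cpp
// True if divisor is non-zero and divides value with no remainder.
// The % on the right is evaluated only when divisor != 0, so the
// remainder operation never sees a zero divisor.
bool dividesEvenly(int value, int divisor) {
    return divisor != 0 && value % divisor == 0;
}
```

Swapping the two operands would break the guard, because value % divisor would then run first even when divisor is zero.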

Not and ex(clusive) or
• Unary operator ! (not)
• Binary operator exor is not available on single Booleans but instead on bit vectors (next)

The bool type
• Old C++ used int to store Boolean values
• But ANSI standard C++ does offer a type called bool
  • bool tval = true, fval = false;
  • int ival = int(tval);
• However, old bad habits still allowed
  • if (37) { … }
  • bool bval = 37;
• Overall value unclear

Bit array manipulation
• Fixed size integers are arrays of bits
• C++ lets you do bitwise Boolean algebra
• a & b (and), a | b (or), a^b (exor), ~b (not)
• Examples:
  • 10110110 & 10010101 = 10010100
  • 10110110 | 10010101 = 10110111
  • 10110110 ^ 10010101 = 00100011
  • ~00100011 = 11011100
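The four operators can be tried on the slide's own bit patterns. In this sketch the operands are 8-bit unsigned values written in hex (10110110 is 0xB6, 10010101 is 0x95); the function names are hypothetical.

```cpp
#include <cstdint>

// Bitwise and: a result bit is 1 only where both inputs have a 1.
std::uint8_t bitAnd(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a & b);
}

// Bitwise or: a result bit is 1 where either input has a 1.
std::uint8_t bitOr(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a | b);
}

// Bitwise exor: a result bit is 1 where the inputs differ.
std::uint8_t bitXor(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a ^ b);
}

// Bitwise not: every bit flipped. The cast trims the result back to
// 8 bits, because ~ first widens its operand to int.
std::uint8_t bitNot(std::uint8_t b) {
    return static_cast<std::uint8_t>(~b);
}
```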

Bit shift operations
• int c = 5; cout << (c << 2);
• Bits lost from the left (msb)
• Zero bits inserted from the right (lsb)
• Result is 20 (= 5 × 2^2)
• Cheap way to multiply by powers of two
• Before: 00000000,00000000,00000000,00000101
• After: 00000000,00000000,00000000,00010100

Right shift
• c >> 2
• Bits discarded to the right (lsb)
• If msb of c was 0, then 0 bits injected from left (msb)
  • 5 >> 2 gives 1
• If msb of c was 1 (c was negative), then 1 bits injected from left
  • -5 >> 2 gives -2 (work it out)
  • As bit patterns: 0xfffffffb shifted right by 2 gives 0xfffffffe
  • Preserves sign of number
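Both shift directions in one sketch, with hypothetical function names. One assumption to flag: before C++20 the result of right-shifting a negative int was implementation-defined, although mainstream compilers inject sign bits as the slide describes; C++20 makes that behaviour standard.

```cpp
// Shift left by n positions: zeros enter from the right, which
// multiplies a small non-negative value by 2^n.
int shiftLeft(int c, int n) {
    return c << n;
}

// Shift right by n positions: low bits are discarded; for negative
// values, sign bits enter from the left (assumed, see note above).
int shiftRight(int c, int n) {
    return c >> n;
}
```

Note that -5 >> 2 is -2 while -5 / 4 is -1 in C++: the shift rounds toward minus infinity, whereas integer division rounds toward zero.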

Some applications of bit operations
• Is an int x odd or even?
  • int isOdd = (x & 1);
• Remainder when divided by 8
  • int remain = (x & 7);
  • Faster than x % 8 (and equal to it for non-negative x)
• How many one bits in a 32-bit int?
  • Repeat 32 times:
  • numOnes = numOnes + ((x & 0x80000000) != 0);
  • x = x << 1;
  • In binary, 0x80000000 looks like a one followed by 31 zeros
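The bit-counting recipe written out as a complete function, as a sketch. It takes an unsigned 32-bit value so that the left shifts are well defined, and it tests the top bit against zero so that each set bit adds exactly 1 to the count. The names countOnes and isOdd are hypothetical.

```cpp
#include <cstdint>

// Count the one bits in a 32-bit value by inspecting the top bit
// and shifting left, 32 times in total.
int countOnes(std::uint32_t x) {
    int numOnes = 0;
    for (int i = 0; i < 32; ++i) {
        if (x & 0x80000000u) {  // mask: a one followed by 31 zeros
            ++numOnes;
        }
        x <<= 1;
    }
    return numOnes;
}

// True if the lowest bit of x is set, i.e. x is odd.
bool isOdd(int x) {
    return (x & 1) != 0;
}
```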

Primitive variable declaration and literals
• float fahrenheit;
  • Uninitialized, may get garbage on read
• float fahrenheit = 95;
• const float fahrenheit = 9.52e14;
  • Value will never change
  • Scientific notation saves typing lots of zeros
• int x = 3, y = x/2;
  • Can initialize variables based on others already initialized

Why bother to declare
• Variable names
  • What if you type it incorrectly later?
  • To initialize before any use
• Types
  • To check all assignments to the variable
  • To interpret a bit sequence as intended in your program (e.g. float and int are both 32 bits)
• There are languages that do not enforce variable name and type declarations
  • Can be lazy, but generally a Bad Idea

Type conversions
• Some conversions are implicit
  • short x = 20000; int y = x;
  • int x = 40000; short y = x; (compiles, but 40000 does not fit in a 16-bit short)
• Others may result in overflow
  • double x = 5e40; float y = 2*x;
• Some are errors
  • float x = (float) "hello world";
• Implicit typing
  • float x = 7/3; (integer division, so x is 2)
  • float x = 7/3.; (floating-point division, so x is about 2.333)
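What happens when a value does not fit can be seen with the fixed-width types from <cstdint>. This sketch assumes two's complement wrap-around on narrowing, which C++20 guarantees and which earlier compilers implement in practice; toShort is a hypothetical helper.

```cpp
#include <cstdint>

// Narrow a 32-bit integer to 16 bits. Values outside -32768..32767
// wrap around (assumed two's complement, guaranteed since C++20).
std::int16_t toShort(std::int32_t value) {
    return static_cast<std::int16_t>(value);
}
```

Here 20000 survives unchanged, while 40000 comes out as 40000 - 65536 = -25536, with no error reported at run time.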

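Polymorphic operators and literals
• 7/3 vs 7/3.
• / represents division for int, float, double
• Which one is invoked depends on the (inferred) type of arguments
[Figure: two expression trees. For 7/3: the literals '7' and '3' each pass through toInt and feed intDiv, whose result passes through intToFloat. For 7/3.: '7' passes through toInt and then toFloat, '3.' passes through toFloat, and both feed floatDiv.]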
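The two cases can be compared directly. In this sketch (function names are hypothetical) the only difference is the type of the second argument, which decides whether integer or floating-point division is performed before the result is converted to float.

```cpp
// Both operands are int, so / is integer division; the truncated
// result is then converted to float.
float quotientInt(int a, int b) {
    return static_cast<float>(a / b);
}

// One operand is double, so the int is promoted and / is
// floating-point division.
float quotientMixed(int a, double b) {
    return static_cast<float>(a / b);
}
```

So quotientInt(7, 3) yields 2, while quotientMixed(7, 3.) yields roughly 2.333.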

The string data type
• When we said cout << "Hello world\n", the text "Hello world\n" was stored as an array of characters
  • Bytes corresponding to H, e, …, \n, and finally a “null byte” or 00000000 (in binary) to mark the end of the string
• A more modern and better way is to use the string data type
  • string message("Hello world");

Common string operations
• Get the number of characters in the string
  • message.size() (this is calling a method on a string object)
• Get the character at a specific position
  • message.at(5) or message[5]
• Get a substring of the given string
  • message.substr(1, 3)
• Index out of bound?
  • Some operations throw exceptions
  • Some silently truncate
  • Some may return garbage
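The three out-of-bound behaviours map onto specific std::string operations: at() throws std::out_of_range, substr() silently truncates a length that runs past the end (though it throws if the start position itself is past the end), and operator[] performs no check at all. A sketch with hypothetical helper names:

```cpp
#include <stdexcept>
#include <string>

// Return the character at pos, or `fallback` if pos is out of range.
// at() throws std::out_of_range; operator[] would not check.
char charAtOr(const std::string& text, std::size_t pos, char fallback) {
    try {
        return text.at(pos);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}

// Return up to len characters starting at start. A len that runs past
// the end is silently shortened; start must not exceed text.size().
std::string middle(const std::string& text, std::size_t start, std::size_t len) {
    return text.substr(start, len);
}
```

With message holding "Hello world", message.size() is 11, position 5 holds the space, and substr(1, 3) is "ell".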

More string operations
• Find the first (leftmost) or last (rightmost) occurrence of a character
  • message.find_first_of('o')
  • message.find_last_of('e')
• Compare two strings (dictionary or lexicographic order)
  • msg1.compare(msg2)
  • Returns an integer
  • Negative if msg1 should appear before msg2
  • Zero if msg1 and msg2 are equal
  • Positive if msg1 should appear after msg2
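Since compare() only promises a negative, zero or positive result, code should test the sign rather than a specific number. The sketch below normalizes the result to -1, 0 or +1; order is a hypothetical helper name.

```cpp
#include <string>

// Return -1, 0 or +1 depending on whether a sorts before, equal to,
// or after b, based on the sign of std::string::compare.
int order(const std::string& a, const std::string& b) {
    int result = a.compare(b);
    return (result > 0) - (result < 0);
}
```

For the search functions, positions count from 0: in "Hello world", find_first_of('o') gives 4 and find_last_of('e') gives 1. When the character is absent, both return the special value std::string::npos.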
