CS 106 Computing Fundamentals II Chapter 22 “ Data Types”

Herbert G. Mayer, PSU CS Status 7/10/2013 Initial content copied verbatim from CS 106 material developed by CS professors: Cynthia Brown & Robert Martin CS 106Computing Fundamentals IIChapter 22“Data Types”

Syllabus • Define Data Type • Information Content in Bits • Storing Text • Integers • Floating Point Numbers • Logical Values • Date Type • Variables

Define Data Type • A computer data type is the representation of a particular range of values • Such values can be numeric, textual, or other • Values 0 .. 255 constitute an integer subrange type • Values from ‘A’ .. ‘Z’ constitutes a character subrange • We’ll look at the most common ones, such as character, integer, floating-point, character, boolean • For each one we learn: • What is the type typically used for in a program • The potential range of values we can store • What kinds of operations we can do on it

Information Content in Bits • One bit can have two values, 0 and 1. You cannot find a way to represent more than two different values with one bit • Two bits can represent 4 values: 00, 01, 10, 11 • Three bits gives 8 values: 0 followed by each of the 4 values for 2 bits, or 1 followed by each of the four values for 2 bits • In general, adding a bit doubles the number of values. With n bits you can represent 2n values

Storing Text • One kind of data we often want to store is text • Text is made up of characters. To let computers of different kinds interact with each other, standard character sets are used • The ASCII character set (the old standard) uses 8 bits per character: American Standard Code for Information Interchange • Another common one is EBCDIC: Extended Binary Coded Decimal Interchange Code; use by IBM • 8 bits can encode 28 = 256 different characters

Text • 256 characters is enough for upper and lower case English letters, punctuation, the 10 digits, space and new line, and some odd characters like letters with umlauts • But there are many other alphabets in the world, not to mention Chinese characters! • The new standard, called Unicode, uses 16 bits per character. With 216 possible characters, there are plenty to go around, or 65,536 different values • Yet CDC supercomputers used 6-bit character set, no lower- AND uppercase letters!

String Type • VBA uses data type String for strings of text, including single characters. Example: “hello” • Each type, including String, has appropriate operations associated with it • Note that there is no specification and thus limitation on the length of a string in VBA! Very convenient for user

Operations on Types (Strings) • With the string type, most operations use built-in functions. The & string operator concatenates two strings. For example: • varA = “Hello ” --note space at the end: ‘‘ • varB = “There” • varC = varA & varB • gives varC the value • “Hello There” • We’ll look at the string functions later

Numbers • Computer hardware is usually set up to handle two distinct types of numbers, integer and floating point • There is often separate hardware for doing arithmetic on each type, and they are used in different ways • Floating point number representation is defined via specific standards, the most common being IEEE 754

Integers • God invented integers, man invented floating point numbers!  • Integers are used for counting, e.g. indexing • Excel/VBA has two integer types, called Integer and Long. Integer is 2 bytes and Long is 4 bytes • Signed integer numbers range from -32768 to 32767. Since there can be more than 32767 rows in a worksheet, it’s best to always use Long unless you know for sure your numbers are smaller

Floating Point Numbers • This is the kind of number we use for general computations, which can have a fractional part • Internally they are stored in a form that uses a mantissa and an exponent, allowing representations similar to scientific notation (3.14159 E 23) • Excel/VBA has two floating point types, Single-Precision (4 bytes) and Double-Precision (8 bytes) • Double has way more precision and we shall use it in CS 106

Operations on Types (Numbers) • Numbers have the usual arithmetic operations, with the usual precedence. So • 2 + 3 * 4 means 2 + (3 * 4), not (2 + 3) * 4 • If you want the last version, use parentheses to change the precedence. You can always use parentheses to be sure the expression means what you want it to • Most arithmetic operations use the same notation when applied to integers or floating point numbers

Mixing Integers and Floats • If you write an expression that uses both integers and floating point numbers, the result will be a floating point number • Operations like exponentiation or division also yield a floating point result when applied to two integers • Types can be converted to related, other types • Such conversions can generate unexpected results, or else requires a user with intimate understanding of all conversion rules!

Logical (Boolean) Values • Another data type is for logical values, called Boolean in honor of logician George Boole • Booleans have one of two values: True or False • Booleans in VBA Excel are two bytes – efficient? • Why two bytes for information that could be encoded in one bit? The rationale is speed, since it would take longer for a computer to extract the value of a single bit from a group and the wasted bits are cheap

Operations on Boolean • Boolean values are manipulated via operations such as: • AND, OR, XOR, and NOT • We’ll look at these closely in our module on conditionals

Date Type • Dates are their own data type in VBA/Excel • They take up 8 bytes, and can range from January 1, 0100 to December 31, 9999 • The basic display format for dates is MM/DD/YYYY, though you can choose other formats • Internally, starting with Jan. 1, 1900, dates are represented by a number which is 1 on 1/1/1900 and goes up from there

Operations on Dates • You can add and subtract numbers from dates, as long as you are using dates after January 1, 1900 . E.g.: 2/12/12 + 7 is 2/12/19 • Excel is smart enough to go into the next month or year if need be

Other Data Types • VBA has plenty of other data types, but the ones we discussed we’ll mainly use • VBA includes a Range type, to define to a range of cells. These can be useful in programming; we’ll introduce them as we need them • There is also a Variant data type that is kind of a grab-bag of all types together. It uses a lot of extra space and we will avoid it; too advanced! • However, Excel cells do automatically have the type Variant, else you could not arbitrarily assign a cell all kinds of values

Declaring Variables • When we start to use variables, we will make it a practice to declare them: state up front what their data type is • The alternative, which might seem attractive, is to let Excel deduce their data type • The problem with this is that it can make a typo in a variable name into a serious, hard to find error as Excel just thinks it is a new variable • Very common source of problems in old programming languages, such as Fortran and PL/I • We will use Option Explicit as the first line in every VBA module we write to avoid this problem

The Dim Statement • To declare a variable we use a statement such as • Dim varA As Long • Dim is a keyword that tells VBA that we are getting ready to declare a variable • varA is the name of the variable – a user defined name • You must use the special keyword As, between the variable name and the name of the type • Use Const instead of Dim to declare a constant, as in • Const varA As Long = 1 • A constant can’t be changed by the code; must be initialized

Names for Variables • We’ll sometimes use fairly generic names (like varA) in our initial examples; when you’re writing a real macro, use meaningful names for your variables • You cannot use a key word as a variable name, as keywords are reserved • The name must start with a letter and is made up of letters, digits and underscores; examples: • My_value • min3 • It can be up to 255 characters in length

Names for Variables • In our class, we will start all variable names with a lower case letter • An example of a long descriptive name would be primaryInterestRate or primary_interest_rate • VBA does not distinguish by case, so varA and vara are the same name to VBA • Again: VBA variable names are not case sensitive!!

A Final Caution • Things are not always exactly the same in VBA as in Excel. Each has their own built-in functions, precedence rules • For example: Excel cells are named by alphabetic columns followed by numeric lines, e.g. B4 --meaning column 2, row 4 • VBA Cells() indices are both numeric and list the row number first, such as Cells( 4, 5 ) --meaning row 4, column 5

CS 106 Computing Fundamentals II Chapter 22 “ Data Types”