CS3101-2 Programming Languages – C++ Lecture 1. Matthew P. Johnson, Columbia University, Fall 2003.
Agenda • hw0 due last night • Arrays, mult-D arrays • Char arrays, String objects • Pointers & References • hw1 assigned tonight CS3101-2, Lecture 2
Array review • Arrays are complex (=non-primitive) data structures • Ordered, fixed-length seq of vars • All elms same type • Access by index: ar[4] • Indices begin at 0 CS3101-2, Lecture 2
Arrays • Review: • Composite: multiple members • Homogeneous: mems of same type • Contiguous: adjacent in memory • Etym: root is contig-, ~ "contact" • Example: • int nums[3]; nums[0] = nums[1] = nums[2] = 0; • nums[3]? Out of bounds: valid indices are 0 to 2
C-style strings • C has neither a built-in string type nor a string class • Just char[]s • For example: • char s[3]; s[0] = 'H'; s[1] = 'i'; s[2] = '\0'; • NB: '\0' != '0' • Shortcut syntax: • char s[] = "Hi"; // size inferred (3, counting the '\0') • For non-dynamic C-style strings • members (chars) are mutable • the char[] itself (the array) is immutable: it cannot be reassigned
C-style strings • Given: • char s1[100] = "hi"; • char s2[] = " there"; • char s3[100]; • So can't: • s1 = s2; • s1 += s2; • s3 = s1 + s2; • But can: • strcpy(s1, s2); → s1 == " there" • or, starting again from s1 == "hi": strcat(s1, s2); → s1 == "hi there" • strcat overwrites the '\0' of s1 • strcpy(s3, s1); → s3 == "hi there" • Unsafe, though: • No space checking • Also, can get length: • strlen(s3) == 8 • And compare: strcmp(s1, s2) • < 0 → s1 comes before s2 • > 0 → s1 comes after s2 • == 0 → s1 equals s2 • These functions live in <cstring> • NB: <lib.h> in C becomes <clib> in C++, e.g. <string.h> → <cstring>
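The strcpy/strcat/strlen/strcmp calls above can be put together in a small sketch. The helper name build_greeting and the explicit size check are illustrative additions, not part of the lecture; the check is there because the <cstring> functions themselves do no space checking.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: builds "hi there" in a caller-supplied buffer.
// Returns the resulting length, or 0 if the buffer is too small.
std::size_t build_greeting(char *buf, std::size_t size) {
    const char first[] = "hi";
    const char second[] = " there";
    // Room is needed for both strings plus one terminating '\0'.
    if (size < std::strlen(first) + std::strlen(second) + 1) {
        return 0;
    }
    std::strcpy(buf, first);   // buf holds "hi"
    std::strcat(buf, second);  // overwrites the '\0' of "hi"; buf holds "hi there"
    assert(std::strcmp(buf, "hi there") == 0);
    return std::strlen(buf);
}
```

Called with a char[100] buffer, this returns 8; called with a char[4] buffer, it refuses to write and returns 0.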
C-style strings – strlen • Nothing built-in to a string/array telling its length, like s.length() in Java • How to find out? • For arrays in general: remember/be told • For strings: look and see! • i.e., look for '\0' • int i; for (i = 0; s[i] != '\0'; i++) ; return i; // i is declared outside the loop so it is still in scope at the return • NB for C-style strings: • Size is fixed • Length (==distance from start to '\0') varies
Strings – strlen • char s[] = "hello"; • strlen(s) == 5 • s[3] = '\0'; • strlen(s) == … • 3 (s now reads as "hel"; the chars after s[3] are still in the array but are ignored)
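A minimal sketch of the "look for '\0'" idea, with the truncation case from the slide. The names length_of and length_after_cut are made up for this example; std::strlen does the same job in real code.

```cpp
#include <cassert>
#include <cstddef>

// Counts chars up to, but not including, the first '\0'.
std::size_t length_of(const char *s) {
    std::size_t i = 0;
    while (s[i] != '\0') {
        ++i;
    }
    return i;
}

// Size stays fixed while length changes.
std::size_t length_after_cut() {
    char s[] = "hello";      // size 6: five letters plus '\0'
    s[3] = '\0';             // s now reads as "hel"
    assert(sizeof(s) == 6);  // the array itself did not shrink
    return length_of(s);
}
```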
Mistakes with strings • char s[2]; • s = "hi"; // error: arrs are immutable • char s[2] = "hi"; // error: sizeof "hi" == 3 • char s[3] = "hi"; // okay • char t[6] = "there"; • s = s + t; // error: trying to add arrays • strcat(s, t); // error: writes too far • strncpy(s, t, sizeof(s)); // s == {'t','h','e'}, with no '\0' • s[2] = '\0'; // okay • strncpy(s, t, sizeof(s)-1); // okay if s[2] == '\0'
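One way to package the last two steps above (bounded copy, then force the terminator) is a small wrapper. This is a sketch only; bounded_copy is a hypothetical name, and it silently truncates rather than reporting an error.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: copies at most dst_size - 1 chars of src into dst
// and always leaves dst terminated with '\0'.
void bounded_copy(char *dst, std::size_t dst_size, const char *src) {
    assert(dst_size > 0);                  // need room for at least the '\0'
    std::strncpy(dst, src, dst_size - 1);  // may stop before src's '\0'
    dst[dst_size - 1] = '\0';              // so terminate explicitly
}
```

With char s[3], bounded_copy(s, sizeof(s), "there") leaves s holding "th".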
C-style strings • strcat copies t over null char of s • strlen(strcat(s,t)) == strlen(s) + strlen(t) • #bytes(strcat(s,t)) == strlen(s) + strlen(t) + 1 == #bytes(s) + #bytes(t) -1 • Not automatically initialized (like arrays in gen.) • char s[100]; s[10] == ? • But extra space in static init set to ‘\0’: • char s[100] = “hi”; {‘h’, ‘i’, 0, 0, …} • char s[100] = “”; {0, 0, 0, …} • char s[100] = {}; {0, 0, 0, …} CS3101-2, Lecture 2
String class • Like C, Java, C++ has no built-in String type • I.e., String is not part of the C++ language • Like C, char[]s can be used as (“C-style”) strings • Like Java, the std libraries have a string class • defined in <string> • Lives in std namespace CS3101-2, Lecture 2
String class e.g. #include <string> #include <iostream> using namespace std; int main() { //String instances declared like other vars: string school; //Can be set to string literals: school = “Columbia”; //Can be concatenated with + op: school = school + “ “ + “University”; //school += “ “ + “University”; cout << “We\’re at: “ << school << endl; } $ g++ school.cpp $ a.out We're at: Columbia University CS3101-2, Lecture 2
String operators • NB: + is overloaded to apply to string objects • += is overloaded too: • school += “ “; school += “University”; • same effect • What if school += “ “ + “University”? • Idea 1: Precedence problem? • A: No. Assign ops have very low prec. • Idea 2 str-lit + str-lit problem? • A: Yes! + is overloaded for string objects not for string literals/char[]s/char*s • g++ says: • school.cpp:16: invalid operands `const char[2]' and `const char[11]' to binary `operator +' CS3101-2, Lecture 2
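A sketch of one common fix for the literal-plus-literal problem: make one operand a std::string so the overloaded + applies. The function name columbia_university is invented for this example.

```cpp
#include <cassert>
#include <string>

// Builds the string from the slide using a single += statement.
std::string columbia_university() {
    std::string school = "Columbia";
    // " " + "University" would not compile: both operands are char arrays.
    // Wrapping the first literal in std::string selects the overloaded +.
    school += std::string(" ") + "University";
    assert(school == "Columbia University");
    return school;
}
```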
String objects • String objects != char arrays • char[]s often cannot be treated like string objs • But: string class designed so can often treat like char[]s • char[] chars accessed quickly/unsafely with []s: • char ca[] = "hi"; • ca[0] = 'H'; • ca[-1] = … trouble • string chars can be accessed the same way: • string school = "Columbia"; • school[0] = 'N'; • school[-1] = 'Y'; // still unsafe! • the [] operator is overloaded!
string class • should be sure/check that the index is valid • assert(i >= 0 && i < 8); • Lives in cassert • If expr passed is false, program quits, with message: • ca.cpp:13: failed assertion `-1 >= 0 && -1 < 8‘ Abort (core dumped) • string chars can also be read by at member ftn: • school.at(0); • school.at(-1); throws an exception • does automatic bounds checking • informs caller if index is not valid • if main ftn ignores, we Abort CS3101-2, Lecture 2
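The checked access described above can be sketched as follows. The wrapper char_at is hypothetical; the only library behavior relied on is that std::string::at throws std::out_of_range for an invalid index.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: stores s[i] in *out and returns true if i is valid;
// returns false (leaving *out untouched) if at() reports a bad index.
bool char_at(const std::string &s, std::size_t i, char *out) {
    assert(out != 0);
    try {
        *out = s.at(i);  // bounds-checked, unlike s[i]
        return true;
    } catch (const std::out_of_range &) {
        return false;    // the caller decides what to do instead of aborting
    }
}
```

An index of -1 converts to a very large unsigned value, so at() rejects it just like any other out-of-range index.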
string.length() • Remember: arrays do not "know" their length • where they're defined, length is part of type (int[10]) • passed as member-type ptrs (length is lost) • array itself is just sequence of members • char[]s do not know their length • convention: last char is '\0' • string objects do know their length • school.length() == 8 • can return part of self: • string.substr(start, length) • school.substr(2,3) → "lum" • second param is not last char (or next-after-last char, as in Java) • it's the desired length of substring • if omitted, gives rest
Comparing strings • string class has a compare function similar to strcmp: • string hi = "hi"; • string hi2 = "hi"; • char ca[] = "hi"; • string hi3 = ca; • string there = "there"; • hi.compare(hi2) == 0, just as strcmp(hi.c_str(), hi2.c_str()) == 0 • hi.compare(there) < 0, just as strcmp(hi.c_str(), there.c_str()) < 0 • The big improvement, though: overloaded operators • (hi == hi2) == true • (hi == there) == false • (hi < there) == true
Input with strings – io2.cpp • Can read in primitive types: • cout << "Enter two numbers: "; //sep-ed by whitespace cin >> x >> y; cout << "Their product is " << x*y << endl; • String input slightly more complicated • string s; cout << "Enter some text: "; //reads until whitespace cin >> s; //hi there cout << "You entered: " << s << endl; //hi • Each >> reads one whitespace-delimited word: • string s, t; cout << "Enter two words: "; cin >> s >> t; //hi there cout << "You entered: " << s << t << endl; //hithere • Alternative: • getline(cin, s); • cout << "You entered: " << s << endl; • takes whole line, up to the newline
String input – io3.cpp • >> treats all whitespace as delimiters • getline treats only '\n' as a delim (by default) • combining can be problematic • Consider: • int i; cout << "Enter i: "; cin >> i; cout << "i == " << i << endl; string s; cout << "enter s: "; getline(cin, s); cout << "s == " << s << endl; $ g++ io3.cpp $ a.out Enter i: 10 i == 10 enter s: s == Q: Why? A: cin >> i doesn't take the '\n', so getline does
String input – io3.cpp • Soln 1: read in the '\n' separately, after >> and before getline: • cin.ignore(); string s; cout << "enter s: "; • But only reads in a fixed number of chars (1 by default), so trailing spaces cause the same problem • Soln 2: call ignore with params: • cin.ignore(1000, '\n'); string s; cout << "enter s: "; • ignores up to 1000 chars, up to/including '\n' • Soln 3: call getline twice: • string s; getline(cin, s); cout << "enter s: "; getline(cin, s); … • NB for C++-style strings: • Length varies • Size varies automatically
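Soln 2 can be sketched in a form that is easy to try out. The function takes any std::istream, so a std::istringstream stands in for cin here; the names read_int_then_line and demo_input are invented, and numeric_limits replaces the slide's 1000 as an "ignore as many as needed" count.

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <sstream>
#include <string>

// Reads an int with >>, discards the rest of that line, then reads the
// next full line with getline.
std::string read_int_then_line(std::istream &in, int *i) {
    in >> *i;  // leaves the '\n' (and any trailing spaces) unread
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    std::string s;
    std::getline(in, s);
    return s;
}

// Simulates typing "10", Enter, "hi there", Enter.
std::string demo_input() {
    std::istringstream in("10\nhi there\n");
    int i = 0;
    std::string s = read_int_then_line(in, &i);
    assert(i == 10);
    return s;
}
```

Without the ignore call, getline would return the empty remainder of the first line, which is the behavior shown in io3.cpp.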
Converting bet. C-style, C++-style • C → C++: • Converted in assignment • char cs[] = "hi"; • string s = "hi"; • s = cs; • s = "hi"; • C++ → C: • Use string's member ftn c_str() • string s = "hi"; • const char *cs = s.c_str(); // gives a const char*, not an array; strcpy it into a char[] if a modifiable copy is needed • C-style strings can also be input with getline • Should pass max num chars • cin.getline(cs, sizeof(cs)); // here cs is a char[]
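Both directions of the conversion, as a hedged sketch. The helper names to_cpp_string and to_c_string are hypothetical, and the fit check is an addition because strcpy does no space checking.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// C -> C++: a string object can be constructed (or assigned) from a char array.
std::string to_cpp_string(const char *cs) {
    return std::string(cs);
}

// C++ -> C: c_str() yields a const char*; copy it to get a modifiable char[].
// Returns false, writing nothing, if the buffer cannot hold s plus '\0'.
bool to_c_string(const std::string &s, char *buf, std::size_t size) {
    assert(buf != 0);
    if (s.length() + 1 > size) {
        return false;
    }
    std::strcpy(buf, s.c_str());
    return true;
}
```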
Non-pointer data types • Used several data types: • ints ~ whole numbers • floating pts ~ "decimal" (non-whole) numbers • chars ~ letters • Will (next time) use generic complex data types: • Structs/classes ~ multiple-field records/objects • In general: • a data type ~ class of poss vals, • corresponding to real-life concept (letter, number, customer record) • Indy var means/refers to/denotes today's date, # pages, etc. • But not always
Non-pointer vars • Vars are distinct • Changing one does not affect another: • int x = 10; int y = x; x++; → y == 10 • Vars live in sep places in memory • I.e., mapping: {var-names} → {vars} is injective, "one-to-one": • Distinct var-names always refer to distinct vars • Draw picture with x, y, array
Pointers • Do not correspond to real-life concept • Indy var means … another var • Somewhat like quotes in natural language • “Snow is white,” v. “’Snow’ has four letters.” • Corresponds to computer hardware concept, memory address • Theoretically, very interesting • self-reference • Halting Problem: interp nums as TMs • Godel’s Incompleteness Theorem: interp nums as sentences • ~ “I am Lying,” “This sentence is unprovable.” • Here: interp nums as var addresses • See 3261/4236, Godel, Escher, Bach, Douglas Hofstadter CS3101-2, Lecture 2
Pointers • Q: What is a pointer? • A: K&R: “a var that contains the address of a var.” • For regular (“first-order”) vars, we have var name, and value: • char c = 10; • Remember: var declar/def/init does: • obtains mem for data • attaches the name c to that mem location • writes 10 there CS3101-2, Lecture 2
Var creation • After declar/def/init, can use that data with the identifier c. • When we say c, it remembers where to look • The data lives somewhere in memory. • Don’t have to know where the data actually is—just remember c. • Could we find out the location? (Why?) • Turns out: yes • Ptrs are vars that take on these locations as vals CS3101-2, Lecture 2
Pointers • p = &c • & op applied to a var (lvalue) evals to its address; • p is now set to the address of c • We interp val of p as a memory location • p now “points to” c • & applies only to lvalues • Not consts, literals, const exprs • Draw picture of p & c in mem CS3101-2, Lecture 2
& and * • Exists: inverse operator to &: * • the dereferencing or indirection op. • Applied to a pointer • evals to the pointer's referent • what it points to • int x = 1, y = 2, z[10]; /* init */ int *ip; /* declar */ ip = &x; /* ip points to x */ y = *ip; /* y set to val of ip's referent, 1 */ *ip = 0; /* ip's referent, x, set to 0 */ ip = &z[0]; /* ip now points to z[0] */ • After execution, x == 0, y == 1
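The same sequence, wrapped in a function so the final values can be checked. The wrapper name pointer_walkthrough, its output parameters, and the extra write through ip to z[0] are additions for illustration.

```cpp
#include <cassert>

// Runs the slide's statements and reports the final x and y.
void pointer_walkthrough(int *x_out, int *y_out) {
    int x = 1, y = 2, z[10] = {0};
    int *ip;
    ip = &x;     // ip points to x
    y = *ip;     // y takes the referent's current value, 1
    *ip = 0;     // the referent, x, becomes 0; y is unaffected
    ip = &z[0];  // ip now points to z[0]
    *ip = 7;     // writes z[0]; x is no longer reachable through ip
    assert(z[0] == 7);
    *x_out = x;
    *y_out = y;
}
```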
& and * • & and * (as ptr ops) are inverses • Cancel each other out • Examples • *&x ~ "the value at the address of x" → x • Who is buried in Grant's tomb? • &*xp ~ "the address of the value at the address xp" → xp • Where is the contents of Grant's tomb?
Pointer declars • To declar ptr-to-type var, similar to type declar, plus *: • int *ip; • Notice: primitives, arrays, ptrs, ftns, can all be declared together: • int x, *ip, y, z[10], myftn(); • Interp 1: ip is an int-pointer: (int*) ip • combined declar can be confusing: int* ip, x; ip is int*, x is int • Interp 2: dereferenced ip is an int: int (*ip) • more readable: int *ip, x; ip is int*, x is int CS3101-2, Lecture 2
Dereferenced ptrs • Dereferenced ptr gives the actual lvalue, not simply a passive value • Can be used just as the corresponding var could be • Both accessed and modified • *ip = *ip + 1; • *, &, other unaries have high precedence • *ip += 1; CS3101-2, Lecture 2
Uninit-ed ptrs • Do not dereference uninit-ed ptrs • if nec, init ptrs to NULL (==0) • int *ip = NULL; • int *ip = 0; • then check for null before deref: • if (ip != NULL) … • if (ip) … CS3101-2, Lecture 2
Dereferenced ptrs • What about ++? • Unaries associate R-to-L • ++*ip → ++(*ip) • But: • *ip++ → *(ip++) • To increment the referent with postfix ++, need parenths: (*ip)++ • ip++ is valid! • "pointer arithmetic"
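The three forms can be compared side by side on a two-element array. This is a small sketch; the function name increment_forms and the sample values are made up.

```cpp
#include <cassert>

// Applies ++*ip, *ip++ and (*ip)++ in turn and returns a[0] + a[1].
int increment_forms() {
    int a[2] = {10, 20};
    int *ip = a;

    ++*ip;          // ++(*ip): increments the referent, a[0] becomes 11
    assert(a[0] == 11);

    int v = *ip++;  // *(ip++): reads a[0], then moves ip to a[1]
    assert(v == 11 && ip == &a[1]);

    (*ip)++;        // parentheses needed: increments a[1], ip stays put
    assert(a[1] == 21 && ip == &a[1]);

    return a[0] + a[1];
}
```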
Modifying ptrs vals • Pointers can be modified, like other vars: • iq = ip; • Example: • int *ip, *iq, x; x = 5; ip = &x; iq = ip; *ip = 10; • x == *ip == *iq == 10 CS3101-2, Lecture 2
Uses of ptrs • Limitation mentioned before: ftns have single return value • Sometimes nice to have multiple return values • e.g., some value, plus error/success. • getchar() returns next char read, EOF (-1) if error • okay, -1 isn’t a char • consider a getint(): return next int read, or EOF (-1) if error • problem: EOF == -1 is an int! • Soln: let return value be error/success; write next int at location of a ptr param: • int getint(int *); CS3101-2, Lecture 2
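A sketch of the "status as return value, result through a pointer" pattern. It is not the lecture's getint: the name get_int, the bool return, and the std::istream parameter are assumptions chosen so the example can run on a string instead of the keyboard.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical variant of getint: returns success/failure and writes the
// int that was read to *out, so -1 is available as an ordinary data value.
bool get_int(std::istream &in, int *out) {
    assert(out != 0);
    int value;
    if (in >> value) {
        *out = value;
        return true;
    }
    return false;  // end of input or malformed data; *out is untouched
}

// Sums every int in the text, including any -1 values.
int sum_ints(const char *text) {
    std::istringstream in(text);
    int sum = 0, n = 0;
    while (get_int(in, &n)) {
        sum += n;
    }
    return sum;
}
```

Because the status travels separately from the data, sum_ints("3 -1 5") treats -1 as a number and returns 7.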
Passing by ref • Remember: args passed-by-value by def. • Local var created, inited to val of arg • Actual arg never changes • Alternative method is pass-by-reference (not avail in C) • Don’t pass the value, pass the location of the value • Don’t send the webpage, send the URL • Ptrs can simulate pass-by-ref CS3101-2, Lecture 2
Passing by ref: swap • Swap two ints • Naïve swap ftn: • void swap(int x, int y) { int temp = x; x = y; y = temp; } … • int a = 5, b = 10; swap(a,b); • Effect: nothing! • x inited to val of a == 5, y inited to val of b == 10 • x and y swapped • a and b never change • NB: couldn’t be otherwise • Consider: swap(5,10); • Can’t change literals CS3101-2, Lecture 2
Passing by ref: swap' • Soln 1: write a CPP macro, obviating real params • Soln 2: Don't pass a and b's vals, but ptrs to a and b • void swap(int *px, int *py) { int temp = *px; *px = *py; *py = temp; } … • int a = 5, b = 10; swap(&a,&b); • NB: px, py are passed by value! • These vals happen to be (interped as) references • Draw picture! • Soln 3: Arrays…
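The naive and pointer-based versions can be run one after the other to see the difference. The names swap_by_value and swap_by_pointer are chosen here to keep the two apart; the bodies follow the slides.

```cpp
#include <cassert>

// Naive version: x and y are local copies, so the caller sees no change.
void swap_by_value(int x, int y) {
    int temp = x;
    x = y;
    y = temp;
}

// Pointer version: px and py are themselves passed by value, but they hold
// the addresses of the caller's variables, which is what gets modified.
void swap_by_pointer(int *px, int *py) {
    assert(px != 0 && py != 0);
    int temp = *px;
    *px = *py;
    *py = temp;
}
```

With int a = 5, b = 10, swap_by_value(a, b) leaves them unchanged, while swap_by_pointer(&a, &b) gives a == 10 and b == 5.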
Pointers & Arrays • What do they have to do w/each other? • In Java, not much • arrays are special sorts of objs, with many mems • arrays, like all objs, are passed by reference • two distinct ideas • In C/C++: Ptrs and arrays turn out to be almost identical CS3101-2, Lecture 2
Pointers & Arrays • Consider int a[10]; • What happens? • The id a is attached to a block of mem with room for 10 ints • These ints accessed with a and a subscript: a[0], a[5] • Important: the 10 ints are contiguous, in a row: • a: • 0 1 2 3 4 5 6 7 8 9 • a[0] gives an int • &a[0] gives the address of that int CS3101-2, Lecture 2
Pointers & Arrays • "Ordinary" ptrs can point to arrays: • pa = &a[0]; • pa: a: 0 1 2 3 4 5 6 7 8 9 • Now can get other array elms: • pa+1 points to next • pa+2 • pa++ • pa-1 points to prev
Pointers & Arrays • Q: How does it know they're ints? • A: pa is an int* • why ptrs need to know referent type • Turns out: array name evals to ptr to 0th elm • a == &a[0] • & not applied to the array name
Pointers & Arrays • Also turns out: • a[i] (array index notation) evals to same as *(a+i) (ptr + offset notation) • array[index] is really short-cut notation • array[index] converted to *(array+index) in compilation • Also, array[index] notation applicable to regular pointers: • ip[20] → *(ip+20) → who knows? • As usual, be careful! • Q: What if 2[a]? • A: 2[a] → *(2+a) → *(a+2) (by symmetry of +) → a[2]
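These equivalences can be checked directly. A minimal sketch, with an invented function name and sample values:

```cpp
#include <cassert>

// Returns true if index notation and ptr + offset notation agree.
bool index_forms_agree() {
    int a[5] = {10, 11, 12, 13, 14};
    int *ip = a;  // array name evaluates to a pointer to a[0]
    assert(ip == &a[0]);
    return a[2] == *(a + 2)       // a[i] is *(a+i)
        && *(a + 2) == 2[a]       // 2[a] is *(2+a), the same element
        && ip[2] == *(ip + 2)     // [] also works on ordinary pointers
        && ip[2] == 12;
}
```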
Array/ptr example • Consider a string-length ftn: • int strlen(char *s) { int n; for (n = 0; *s != ‘\0’; s++) n++; return n; } • Now change to: • int strlen(char s[]) { … } Effects? • None! CS3101-2, Lecture 2
Arrs/ptrs as ftn args • Can call: • strlen(“hi”), strlen(a), strlen(pa) • Local var s set to pt to arg passed • Same behavior: • strlen(a+2), strlen(&a[2]) CS3101-2, Lecture 2
Passing by ref: swap'' • Soln 3: Pass an array containing a and b • void swap(int AandB[]) { int temp = AandB[0]; AandB[0] = AandB[1]; AandB[1] = temp; } … • int AandB[] = {5, 10}; swap(AandB); • NB: We pass one ptr (by value!), with the understanding that a and b are contiguous • Draw picture
Arrays v. pointers • One important difference: • Defined-array values are immutable • immutable = cannot be changed • ~ “cannot mutate” • Defined-array value = id used to define/create array • Illegal: • int a[10]; • a++, a = … • Restriction does not apply to: • Array members • Array params in a ftn (“formal params”) CS3101-2, Lecture 2
Pointer arithmetic • Addition/subtraction with integers: • Ptr +/- n → ptr to val distance n away • Subtraction of two pointers: • p – q → distance bet. p and q • p, q should point to mems of same array/data block or NULL • No other legal ops (apart from )
Pointer arithmetic e.g. • int strlen(char *s) { char *p = s; while (*p != ‘\0’) p++; return p-s; } • Loops until ‘\0’ • Consider “Hi” ~ {‘H’,’i’,’\0’} • Initly, p == s ‘H’ • ‘H’ != ‘\0’ p++ /* p-s == 1 */ • ‘i’ != ‘\0’ p++ /* p-s == 2 */ • ‘\0’ == ‘\0’ return 2 CS3101-2, Lecture 2
Pointer arithmetic • Here: • p-as-number – s-as-number == p-s == 2 • For other elm types: • p-as-number – s-as-number == (p-s)*sizeof(type) • sizeof gives # of bytes in var/type: • sizeof(int), sizeof(x) • No effect here since sizeof(char) == 1 • Show sizeof.c
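The relation between element distance and address distance can be sketched for int. Both function names are invented; casting to const char* is used here as a way to measure the gap in bytes, since sizeof(char) == 1.

```cpp
#include <cassert>
#include <cstddef>

// Distance in elements: what q - p means for int pointers into one array.
std::ptrdiff_t element_distance(const int *p, const int *q) {
    return q - p;
}

// Distance in bytes: the same gap measured through char pointers.
std::ptrdiff_t byte_distance(const int *p, const int *q) {
    std::ptrdiff_t bytes = reinterpret_cast<const char *>(q)
                         - reinterpret_cast<const char *>(p);
    // (p-as-number difference) == (element difference) * sizeof(type)
    assert(bytes == (q - p) * static_cast<std::ptrdiff_t>(sizeof(int)));
    return bytes;
}
```

For int a[10], element_distance(a, a + 3) is 3 while byte_distance(a, a + 3) is 3 * sizeof(int).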