Programming Interest Group http://www.comp.hkbu.edu.hk/~chxw/pig/index.htm Tutorial Two: Data Structures
Data Structures • Basic data types: • Integral: integer, character, boolean • Floating-point types: float, double, long double • Data structures are methods of organizing large amounts of data. • Array • List, Stack, Queue, Deque (double-ended queue) • Trees: binary tree, binary search tree, AVL tree • Priority Queues • Hash table • Set • Graph • COMP1200: Data Structures and Algorithms
Elementary Data Structures • A data type is a set of values and a collection of operations on those values • Basic data types in C and C++ • Integers (ints): short int, int, long int • Floating-point numbers (floats): float, double • Characters (chars): char • Structure in C and C++
Example 1: Basic Data Types
    #include <iostream>
    #include <cstdlib>
    #include <cmath>
    using namespace std;

    typedef int Number;

    /* return a pseudo-random integer in [0, RAND_MAX] */
    Number randNum() { return rand(); }

    int main(int argc, char *argv[])
    {
        if (argc < 2) { cerr << "usage: " << argv[0] << " N" << endl; return 1; }
        int N = atoi(argv[1]);
        float m1 = 0.0, m2 = 0.0;   /* running means of x and of x*x */
        for (int i = 0; i < N; i++) {
            Number x = randNum();
            m1 += ((float)x) / N;
            m2 += ((float)x * x) / N;
        }
        cout << "RAND_MAX.: " << RAND_MAX << endl;
        cout << "Avg.:" << m1 << endl;
        cout << "Std. dev.: " << sqrt(m2 - m1 * m1) << endl;
        return 0;
    }
This program computes the average and standard deviation of a sequence of integers generated by the library function rand(). Question: how can you modify the program to handle a sequence of random floating-point numbers in the range [0, 1]?
Example 2: Structure
    #include <iostream>
    #include <cstdlib>
    #include <cmath>
    using namespace std;

    struct mypoint { float x; float y; };

    float mydistance(mypoint, mypoint);
    void mypolar(mypoint, float *r, float *theta);

    int main(int argc, char *argv[])
    {
        struct mypoint a, b;
        a.x = 1.0; a.y = 1.0;
        b.x = 4.0; b.y = 5.0;
        cout << " Distance is " << mydistance(a, b) << endl;
        float r, theta;
        mypolar(a, &r, &theta);
        cout << "r : " << r << endl;
        cout << "theta: " << theta << endl;
        return 0;
    }

    /* return the distance between two points */
    float mydistance(mypoint a, mypoint b)
    {
        float dx = a.x - b.x;
        float dy = a.y - b.y;
        return sqrt(dx*dx + dy*dy);
    }

    /* convert from Cartesian to polar coordinates */
    void mypolar(mypoint p, float *r, float *theta)
    {
        *r = sqrt(p.x*p.x + p.y*p.y);
        *theta = atan2(p.y, p.x);
    }
Result:
    [chxw@csr40 cplus]$ ./a.out
     Distance is 5
    r : 1.41421
    theta: 0.785398
Arrays • Array is the most fundamental data structure • An array is a fixed collection of same-type data that are stored contiguously and are accessible by an index • It is the responsibility of the programmer to use indices that are nonnegative and smaller than the array size • Two ways to create an array • Static allocation: size known to and set by the programmer • Dynamic allocation: size unknown to the programmer and set by the user at the execution time
Example: Sieve of Eratosthenes
    #include <iostream>
    using namespace std;

    static const int N = 1000;

    int main()
    {
        int i, a[N];
        /* initialization: assume every i >= 2 is prime */
        for (i = 2; i < N; i++) a[i] = 1;
        for (i = 2; i < N; i++)
            if (a[i])
                /* sieve i's multiples up to N-1 */
                for (int j = i; j*i < N; j++) a[i*j] = 0;
        /* print the numbers that were never crossed out */
        for (i = 2; i < N; i++)
            if (a[i]) cout << " " << i;
        cout << endl;
    }
The Sieve of Eratosthenes is a classical method to calculate a table of prime numbers. Basic idea: set a[i] to 1 if i is prime, and to 0 if i is not prime.
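Dynamic Memory Allocation • C language • malloc( ) and free( ) • C++ language • use operator new and operator delete
    #include <iostream>
    #include <cstdlib>
    #include <new>
    using namespace std;

    int main(int argc, char *argv[])
    {
        int N = atoi(argv[1]);
        /* new (nothrow) returns a null pointer on failure;
           plain new throws std::bad_alloc instead */
        int *a = new (nothrow) int[N];
        if (a == 0) { cout << "out of memory" << endl; return 0; }
        …
        delete [] a;
    }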
Array of Structures
    #include <iostream>
    #include <cstdlib>
    #include <cmath>
    using namespace std;

    struct mypoint { float x; float y; };

    float mydistance(mypoint, mypoint);
    float randfloat();

    int main(int argc, char *argv[])
    {
        if (argc < 3) { cerr << "usage: " << argv[0] << " N d" << endl; return 1; }
        float d = atof(argv[2]);
        int i, cnt = 0, N = atoi(argv[1]);
        mypoint *a = new mypoint[N];
        /* generate N random points in the unit square */
        for (i = 0; i < N; i++) {
            a[i].x = randfloat(); a[i].y = randfloat();
        }
        /* count each unordered pair once */
        for (i = 0; i < N; i++)
            for (int j = i+1; j < N; j++)
                if (mydistance(a[i], a[j]) < d) cnt++;
        cout << cnt << " pairs within " << d << endl;
        delete [] a;
        return 0;
    }

    /* return the distance between two points */
    float mydistance(mypoint a, mypoint b)
    {
        float dx = a.x - b.x;
        float dy = a.y - b.y;
        return sqrt(dx*dx + dy*dy);
    }

    /* return a random number between 0 and 1 */
    float randfloat()
    {
        return 1.0 * rand() / RAND_MAX;
    }
This program counts the pairs of points whose distance is shorter than a threshold d.
List • A general list of elements: A1, A2, …, AN, associated with a set of operations: • Insert: add an element • Delete: remove an element • Find: find the position of an element (search) • FindKth: find the kth element • Each element has a fixed position • Two different implementations: • Array-based list • Linked list
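The four list operations above could look roughly like this in an array-based list. This is only a sketch: the names ArrayList, insertAt, removeAt, find and findKth, the 0-based positions, and the fixed capacity of 100 are assumptions made for the example, not part of the tutorial's sample code.

```cpp
#include <cassert>

// Hypothetical array-based list of ints with a fixed capacity.
struct ArrayList {
    static const int CAPACITY = 100;  // assumed maximum size
    int data[CAPACITY];
    int size;                         // number of elements in use

    ArrayList() : size(0) {}

    // Insert: x becomes the element at position pos (0-based).
    // Elements from pos onward shift right, so this is O(N) in the worst case.
    void insertAt(int pos, int x) {
        assert(size < CAPACITY && pos >= 0 && pos <= size);
        for (int i = size; i > pos; i--) data[i] = data[i - 1];
        data[pos] = x;
        size++;
    }

    // Delete: remove the element at position pos; later elements shift left.
    void removeAt(int pos) {
        assert(pos >= 0 && pos < size);
        for (int i = pos; i < size - 1; i++) data[i] = data[i + 1];
        size--;
    }

    // Find: return the position of the first x, or -1 if x is absent. O(N).
    int find(int x) const {
        for (int i = 0; i < size; i++)
            if (data[i] == x) return i;
        return -1;
    }

    // FindKth: return the element at position k. O(1).
    int findKth(int k) const {
        assert(k >= 0 && k < size);
        return data[k];
    }
};
```

The trade-off shows in the loops: FindKth is a single array access, while Insert and Delete may have to move every element.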
List • Three variants (figures): linked list, linked list with a header, doubly linked list
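The three layouts named on this slide can be sketched as node types in C++. The names Node, DNode, HeaderList, insertAfter and find are hypothetical and chosen only for this illustration; the tutorial's own list.h may be organized differently.

```cpp
#include <cassert>
#include <cstddef>

// Node of a singly linked list: each node points to its successor.
struct Node {
    int item;
    Node* next;
    Node(int x, Node* n) : item(x), next(n) {}
};

// Node of a doubly linked list: it also points to its predecessor.
struct DNode {
    int item;
    DNode* prev;
    DNode* next;
};

// A list with a header keeps a dummy first node whose item is unused,
// so inserting at the front needs no special case.
struct HeaderList {
    Node* header;

    HeaderList() : header(new Node(0, NULL)) {}

    // Free every node, including the header.
    ~HeaderList() {
        while (header != NULL) {
            Node* t = header;
            header = header->next;
            delete t;
        }
    }

    // Insert x right after node p (p may be the header itself).
    void insertAfter(Node* p, int x) {
        assert(p != NULL);
        p->next = new Node(x, p->next);
    }

    // Return the first node holding x, or NULL if there is none.
    Node* find(int x) const {
        for (Node* p = header->next; p != NULL; p = p->next)
            if (p->item == x) return p;
        return NULL;
    }
};
```

Copying a HeaderList is not handled in this sketch; a fuller version would also define a copy constructor and an assignment operator.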
Sample C Implementation of Linked List with a Header • Header files: • http://www.comp.hkbu.edu.hk/~chxw/pig/code/fatal.h • http://www.comp.hkbu.edu.hk/~chxw/pig/code/list.h • Source file: • http://www.comp.hkbu.edu.hk/~chxw/pig/code/list.h
Circular List Example • Josephus problem: N people decided to elect a leader as follows: • Arrange themselves in a circle • Eliminate every Mth person around the circle • The last remaining person will be the leader
Simulation of Josephus problem
    #include <iostream>
    #include <cstdlib>
    using namespace std;

    struct mynode {
        int item;
        mynode* next;
        /* constructor */
        mynode(int x, mynode* t) { item = x; next = t; }
    };
    typedef mynode *mylink;

    int main(int argc, char *argv[])
    {
        if (argc < 3) { cerr << "usage: " << argv[0] << " N M" << endl; return 1; }
        int i, N = atoi(argv[1]), M = atoi(argv[2]);
        /* create the first node; it points to itself, forming a circle */
        mylink t = new mynode(1, 0);
        t->next = t;
        mylink x = t;
        /* insert the next N-1 nodes; each new node points back to node 1 */
        for (i = 2; i <= N; i++)
            x = (x->next = new mynode(i, t));
        /* simulate the election process */
        while (x != x->next) {
            for (i = 1; i < M; i++) x = x->next;
            /* delete the next node */
            t = x->next;
            x->next = t->next;
            delete t;
        }
        cout << x->item << endl;
        delete x;
        return 0;
    }
Stacks • A stack is a list with the restriction that insertions and deletions can be performed only at one end of the list, called the top. • LIFO: last in, first out • Operations: • Push(x, s) • Pop(s) • MakeEmpty(s) • IsEmpty(s) • Top(s)
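A minimal array-based sketch of the five stack operations, for illustration only. The struct name ArrayStack, the member names, and the capacity of 100 are assumptions; the tutorial's stackar.c may differ.

```cpp
#include <cassert>

// Hypothetical fixed-capacity stack of ints stored in an array.
struct ArrayStack {
    static const int MAX_SIZE = 100;  // assumed capacity
    int data[MAX_SIZE];
    int topIndex;                     // index of the top element, -1 if empty

    ArrayStack() : topIndex(-1) {}

    // MakeEmpty: discard all elements.
    void makeEmpty() { topIndex = -1; }

    // IsEmpty: true when the stack holds no elements.
    bool isEmpty() const { return topIndex == -1; }

    // Push: place x on the top of the stack.
    void push(int x) {
        assert(topIndex + 1 < MAX_SIZE);
        data[++topIndex] = x;
    }

    // Top: return the most recently pushed element without removing it.
    int top() const {
        assert(!isEmpty());
        return data[topIndex];
    }

    // Pop: remove the most recently pushed element.
    void pop() {
        assert(!isEmpty());
        topIndex--;
    }
};
```

Because only the top index changes, every operation runs in constant time.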
Stack Implementations • Using a linked list • http://www.comp.hkbu.edu.hk/~chxw/pig/code/stackli.h • http://www.comp.hkbu.edu.hk/~chxw/pig/code/stackli.c • Using an array • http://www.comp.hkbu.edu.hk/~chxw/pig/code/stackar.h • http://www.comp.hkbu.edu.hk/~chxw/pig/code/stackar.c • Remark: you need to define the maximum stack size when creating the stack
Queues • A queue is a list with the restriction that insertion is done at one end, whereas deletion is done at the other end. • FIFO: first in, first out • Operations: • CreateQueue(x): create a queue with maximum size of x • Enqueue(x, q): insert an element x at the end of the list • Dequeue(q): return and remove the element at the start of the list • IsEmpty(q) and IsFull(q)
Queue Implementation • Implemented by a circular array • Need to specify the maximum size of the queue when creating the queue • One variable for the front of the queue, another one for the rear of the queue • Sample code • http://www.comp.hkbu.edu.hk/~chxw/pig/code/queue.h • http://www.comp.hkbu.edu.hk/~chxw/pig/code/queue.c
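The circular-array idea can be sketched as follows. The struct name CircularQueue and its members are hypothetical, and the extra element counter is one possible way to tell a full queue from an empty one; the tutorial's queue.c may use a different convention.

```cpp
#include <cassert>

// Hypothetical queue of ints stored in a circular array.
struct CircularQueue {
    int* data;
    int capacity;  // maximum size, fixed at creation (must be > 0)
    int front;     // index of the first element
    int rear;      // index of the last element
    int count;     // number of elements currently stored

    explicit CircularQueue(int maxSize)
        : data(new int[maxSize]), capacity(maxSize),
          front(0), rear(maxSize - 1), count(0) {}

    ~CircularQueue() { delete[] data; }

    bool isEmpty() const { return count == 0; }
    bool isFull() const { return count == capacity; }

    // Enqueue: advance rear with wrap-around, then store x there.
    void enqueue(int x) {
        assert(!isFull());
        rear = (rear + 1) % capacity;
        data[rear] = x;
        count++;
    }

    // Dequeue: return the front element and advance front with wrap-around.
    int dequeue() {
        assert(!isEmpty());
        int x = data[front];
        front = (front + 1) % capacity;
        count--;
        return x;
    }
};
```

The modulo step is what makes the array circular: once rear or front reaches the last slot, the next position is slot 0 again, so freed slots at the start are reused.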
Priority Queues • A priority queue is a data structure that allows the following operations: • Insert(x, p): insert item x into priority queue p • Maximum(p): return the item with the highest priority in priority queue p • ExtractMax(p): return and remove the item with the highest priority in p • Note: • Each element contains a key which represents its priority
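One possible way to realize these three operations is a binary max-heap kept in a vector, using the standard heap algorithms. This is a sketch under assumptions: the struct name MaxPriorityQueue is hypothetical, and the key itself serves as the item.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical priority queue of int keys; larger key = higher priority.
struct MaxPriorityQueue {
    std::vector<int> heap;  // always arranged as a max-heap

    // Insert: append the key, then restore the heap property. O(log N).
    void insert(int key) {
        heap.push_back(key);
        std::push_heap(heap.begin(), heap.end());
    }

    // Maximum: the largest key sits at the front of the heap. O(1).
    int maximum() const {
        assert(!heap.empty());
        return heap.front();
    }

    // ExtractMax: move the largest key to the back, then remove it. O(log N).
    int extractMax() {
        assert(!heap.empty());
        std::pop_heap(heap.begin(), heap.end());
        int key = heap.back();
        heap.pop_back();
        return key;
    }
};
```

Repeatedly calling extractMax() returns the keys in descending order.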
Sets • A set is a collection of unordered elements drawn from a given universal set U. • Operations: • Member(x, S): is an item x an element of set S? • Union(A, B) • Intersection(A, B) • Insert(x, S) • Delete(x, S)
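As an illustration, the set operations can be expressed with std::set and the standard set algorithms. The alias IntSet and the helper names member, setUnion and setIntersection are assumptions made for this sketch. Note that std::set stores its elements in sorted order internally, even though the abstract set is unordered.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

typedef std::set<int> IntSet;  // hypothetical alias for this sketch

// Member(x, S): is x an element of S?
bool member(int x, const IntSet& s) { return s.count(x) > 0; }

// Union(A, B): every element that is in A or in B.
IntSet setUnion(const IntSet& a, const IntSet& b) {
    IntSet result;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(result, result.begin()));
    return result;
}

// Intersection(A, B): every element that is in both A and B.
IntSet setIntersection(const IntSet& a, const IntSet& b) {
    IntSet result;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(result, result.begin()));
    // Sanity check: an intersection is never larger than either input.
    assert(result.size() <= a.size() && result.size() <= b.size());
    return result;
}
```

Insert(x, S) and Delete(x, S) correspond directly to the member functions s.insert(x) and s.erase(x).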
Dictionaries • Dictionaries permit content-based retrieval. • Operations: • Insert(x, d) • Delete(x, d) • Search(k, d): return an item with key k • Note • Dictionaries can be implemented with many techniques, such as linked lists, arrays, trees, and hashing.
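A small sketch of the dictionary operations on top of std::map, which is one of the tree-based options. The alias Dictionary, the int key and string item types, and the helper names insertItem, deleteItem, contains and search are assumptions made for this example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical dictionary mapping an int key to a string item.
typedef std::map<int, std::string> Dictionary;

// Insert: store item under key k, replacing any previous item with that key.
void insertItem(int k, const std::string& item, Dictionary& d) { d[k] = item; }

// Delete: remove the item with key k, if there is one.
void deleteItem(int k, Dictionary& d) { d.erase(k); }

// Helper: does the dictionary hold an item with key k?
bool contains(int k, const Dictionary& d) { return d.find(k) != d.end(); }

// Search: return the item with key k; the key must be present.
const std::string& search(int k, const Dictionary& d) {
    Dictionary::const_iterator it = d.find(k);
    assert(it != d.end());
    return it->second;
}
```

Callers are expected to test contains(k, d) before search(k, d) when the key may be missing.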
C++ Standard Template Library • The C++ STL provides implementations of many data structures • Reference: • http://www.sgi.com/tech/stl/ • http://www.cppreference.com/ • Data structures: (Containers in C++) • Sequence containers (see Workshop 7): Vectors, Lists, Double-ended Queues (deques) • Associative containers (see Workshop 7): Sets, Multisets, Maps, Multimaps • Container adaptors: Stacks, Queues, Priority Queues
List in C++ • List is implemented as a doubly linked list of elements • Each element in a list has its own segment of memory and refers to its predecessor and its successor • Disadvantage: Lists do not provide random access. General access to an arbitrary element takes linear time. • Hence lists don’t support the [ ] operator • Advantage: insertion or removal of an element is fast at any position • http://www.cplusplus.com/reference/stl/list/
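The access trade-off described above can be seen in a short helper. The function name insertAtIndex is hypothetical; std::list::insert and std::advance are the standard library calls doing the work.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Insert value before the element at index k (0-based); k == size appends.
// Walking to position k takes linear time because a list has no operator[].
// The insertion itself is constant time once the iterator is in place.
void insertAtIndex(std::list<int>& coll, int k, int value) {
    assert(k >= 0 && k <= static_cast<int>(coll.size()));
    std::list<int>::iterator pos = coll.begin();
    std::advance(pos, k);     // step forward k elements, one at a time
    coll.insert(pos, value);  // relinks neighbours; no elements are moved
}
```

With a vector the costs are reversed: reaching index k is immediate, but inserting there shifts every later element.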
List Example 1
    // list1.cpp
    #include <iostream>
    #include <list>
    using namespace std;

    int main()
    {
        list<char> coll;
        // append the letters 'a' to 'z'
        for (char c = 'a'; c <= 'z'; ++c)
            coll.push_back(c);
        // print and remove the first element until the list is empty
        while (!coll.empty()) {
            cout << coll.front() << ' ';
            coll.pop_front();
        }
        cout << endl;
        return 0;
    }
Result:
    $ g++ list1.cpp
    $ ./a.out
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    $
List Example 2
    // list2.cpp
    #include <iostream>
    #include <list>
    using namespace std;

    int main()
    {
        list<char> coll;
        for (char c = 'a'; c <= 'z'; ++c)
            coll.push_back(c);
        // a const_iterator gives read-only access to the elements
        list<char>::const_iterator pos;
        for (pos = coll.begin(); pos != coll.end(); ++pos)
            cout << *pos << ' ';
        cout << endl;
    }
Result:
    $ g++ list2.cpp
    $ ./a.out
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    $
List Example 3
    // list3.cpp
    #include <iostream>
    #include <list>
    #include <cctype>
    using namespace std;

    int main()
    {
        list<char> coll;
        for (char c = 'a'; c <= 'z'; ++c)
            coll.push_back(c);
        // a non-const iterator allows the elements to be modified in place
        list<char>::iterator pos;
        for (pos = coll.begin(); pos != coll.end(); ++pos) {
            *pos = toupper(*pos);
            cout << *pos << ' ';
        }
        cout << endl;
    }
Stack in C++
push(): insert an element at the top
pop(): remove the top element
top(): access the top element
size(): return the number of elements
empty(): check whether the container is empty
Remark: pop() removes the top element and returns nothing. So usually we call top() to get the top element, then call pop() to remove it.
    // stack.cpp
    #include <iostream>
    #include <stack>
    using namespace std;

    int main()
    {
        stack<int> s;
        for (int i = 1; i <= 10; ++i)
            s.push(i);
        // elements come out in reverse order of insertion: 10, 9, ..., 1
        while (!s.empty()) {
            cout << s.top() << endl;
            s.pop();
        }
        return 0;
    }
Queue in C++
push(): insert an element
pop(): remove the first element
front(): access the first element
back(): access the last element
size(): return the number of elements
empty(): check whether the container is empty
    // queue.cpp
    #include <iostream>
    #include <queue>
    using namespace std;

    int main()
    {
        queue<int> s;
        for (int i = 1; i <= 10; ++i)
            s.push(i);
        // elements come out in order of insertion: 1, 2, ..., 10
        while (!s.empty()) {
            cout << s.front() << endl;
            s.pop();
        }
        return 0;
    }
Queue Example II
    // queue2.cpp
    #include <iostream>
    #include <queue>
    #include <string>
    using namespace std;

    int main()
    {
        queue<string> q;
        q.push("These ");
        q.push("are ");
        q.push("more than ");
        cout << q.front(); q.pop();
        cout << q.front(); q.pop();
        q.push("four ");
        q.push("words!");
        // skip one element ("more than ")
        q.pop();
        cout << q.front(); q.pop();
        cout << q.front(); q.pop();
        cout << endl;
        cout << "number of elements in the queue: " << q.size() << endl;
        return 0;
    }
Priority Queue in C++
push(): insert an element
pop(): remove the element with the highest priority
top(): access the element with the highest priority
size(): return the number of elements
empty(): check whether the container is empty
By default, elements are compared with operator <, and the largest element has the highest priority, so elements are retrieved in descending order.
    // pqueue.cpp
    #include <iostream>
    #include <queue>
    using namespace std;

    int main()
    {
        priority_queue<int> s;
        s.push(5); s.push(4); s.push(8);
        s.push(9); s.push(2); s.push(7);
        s.push(6); s.push(3); s.push(10);
        // prints 10, 9, 8, ..., 2
        while (!s.empty()) {
            cout << s.top() << endl;
            s.pop();
        }
        return 0;
    }
Different Sorting Criterion
    // pqueue.cpp
    #include <iostream>
    #include <queue>
    #include <vector>
    #include <functional>
    using namespace std;

    int main()
    {
        priority_queue<int, vector<int>, greater<int> > s;
        s.push(5); s.push(4); s.push(8);
        s.push(9); s.push(2); s.push(7);
        s.push(6); s.push(3); s.push(10);
        // with greater<int>, the smallest element has the highest priority
        while (!s.empty()) {
            cout << s.top() << endl;
            s.pop();
        }
        return 0;
    }
Three parameters when defining a priority queue:
int: the type of the elements
vector<int>: the container that is used internally
greater<int>: the sorting criterion (by default, it is less<>)
Java java.util package • http://java.sun.com/products/jdk • http://java.sun.com/j2se/1.4.2/docs/api/java/util/package-summary.html • Stack: Stack • Queue: ArrayList, LinkedList • Dictionaries: HashMap, Hashtable • Priority Queue: TreeMap • Sets: HashSet
What to do now? • Choose your own weapon • C: write your own set of data structures • C++: learn the STL • Java: learn the java.util package • Try to solve at least one exercise • If you still have time, solve more exercises.
Practice • http://acm.uva.es/p/v100/10038.html • http://acm.uva.es/p/v100/10044.html • http://acm.uva.es/p/v100/10050.html • http://acm.uva.es/p/v101/10149.html • http://acm.uva.es/p/v102/10205.html • http://acm.uva.es/p/v102/10258.html • http://acm.uva.es/p/v103/10315.html • http://acm.uva.es/p/v8/843.html