Chapter 8: Data Abstractions. J. Glenn Brookshear. 蔡文能.
Chapter 8: Data Abstractions • 8.1 Data Structure Fundamentals • 8.2 Implementing Data Structures • 8.3 A Short Case Study • 8.4 Customized Data Types • 8.5 Classes and Objects • 8.6 Pointers in Machine Language • See the assembly code: tcc -S file.c, gcc -S file.c, g++ -S file.cpp
Basic Data Structures • Homogeneous array • Row major: Pascal, C/C++, Java, C# • Column major: FORTRAN • Heterogeneous array: struct/class [, union] • Sequence • List : ArrayList, LinkedList • Stack (in Java, a Stack is a Vector) • Queue • Tree Data Structures: How to arrange data in memory
Arrays vs. Struct (including class) • Homogeneous arrays • Row-major order versus column-major order • Address polynomial for each of them • Heterogeneous arrays (class, struct) • Components can be stored one after the other in a contiguous block • Components can be stored in separate locations identified by pointers • Exercise: give a formula for finding the entry in the i-th row and the j-th column of a two-dimensional array if it is stored in column-major order rather than row-major order.
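Array allocation: Row major vs. Column major • Column major (column by column) • FORTRAN: integer*4 m(3,6) • FORTRAN uses parentheses for subscripts. At what address is m(2,4)? • Row major (row by row) • Pascal, C/C++, Java • int m[3][6]; • Subscripts start from 0. At what address is m[1][3]? [Figure: the array in memory, with cells labelled "at 2010" and "at 2014", column subscripts 1 to 6, and row subscripts 1 to 3 (FORTRAN) or 0 to 2 (C)]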
Column-major order (FORTRAN) • The array is stored as a sequence of arrays consisting of columns (column by column) • INTEGER*4 M(2,7) [Figure: M(2,7) in memory, in the order M(1,1), M(2,1), M(1,2), ...]
You need to know how many columns each row has in order to work out where the next row starts. • Give a formula for finding the entry in the i-th row and the j-th column of a two-dimensional array if it is stored in column-major order rather than row-major order. Figure 8.6 A two-dimensional array with four rows and five columns stored in row major order
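Row-major order: Pascal, C/C++, Java, C# • The array is stored as a sequence of arrays consisting of rows (row by row) • int x[2][7]; // declares 2 rows x 7 columns [Figure: x in memory, in the order x[0][0], x[0][1], x[0][2], ..., with x[1][3] in the second row] • Offset of an element = (how many elements come before it) * (how many addresses each element occupies); on most modern computers each byte has its own address.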
Accessing elements in 2D arrays (1/2) • int x[m][n]; (m and n must be constants in most languages) • m = number of rows • n = number of columns • b = address of x[u][v], the first element (u and v are the lowest subscripts) • Address of x[i][j] in row-major order is b + ((i-u)*n + (j-v)) * (size of entry) • Usually u = 0 and v = 0 • Try it: write a small C/C++ program, compile it to assembly code with gcc -S file.c or g++ -S file.cpp, and study the result • See Brookshear, page 306
Accessing elements in 2D arrays (2/2) • The formula for finding the address i of an element is i = b + s*n, where b = base address, s = size of each element, and n = the number of elements between the target element and the base address • s*n = s*e*k + s*n' = s*(e*k + n'), where e = number of elements in a row or column, k = number of rows or columns to be skipped, and n' = the number of elements between the target element and the beginning of its row or column • In other words: (how many elements come before the target) * (how many addresses each element occupies); on most modern computers each byte has its own address.
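Example [Figure: an array with rows 0 to 2 and columns 0 to 3; the base address and the target element are marked] • Row-major order: e = 4, k = 1, n' = 2, so e*k + n' = 6; skip 6 elements (6 elements come before the target) • Column-major order: e = 3, k = 2, n' = 1, so e*k + n' = 7; skip 7 elements (7 elements come before the target)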
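The element counts in this example can be sketched in C. This is an illustrative sketch, not code from the slides: the function names are hypothetical, subscripts are assumed to start at 0, and the target is assumed to sit at row 1, column 2, which is what the values of k and n' imply.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements stored before entry (i, j), zero-based subscripts.
   Row-major: skip i whole rows of n_cols elements, then j more (e*k + n'). */
size_t row_major_offset(size_t i, size_t j, size_t n_cols) {
    assert(j < n_cols);
    return i * n_cols + j;
}

/* Column-major: skip j whole columns of n_rows elements, then i more. */
size_t col_major_offset(size_t i, size_t j, size_t n_rows) {
    assert(i < n_rows);
    return j * n_rows + i;
}

/* Address = base + (elements before the target) * (size of each element). */
size_t element_address(size_t base, size_t offset, size_t elem_size) {
    return base + offset * elem_size;
}

/* Compare the row-major formula with the compiler's own layout of int x[3][4]. */
int matches_c_layout(void) {
    int x[3][4];
    size_t bytes = (size_t)((char *)&x[1][2] - (char *)&x[0][0]);
    return bytes == row_major_offset(1, 2, 4) * sizeof(int);
}
```

For the 3-by-4 array above, row_major_offset(1, 2, 4) counts 6 elements and col_major_offset(1, 2, 3) counts 7, matching the two cases on the slide.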
Dynamic Array? • Most languages do NOT support dynamic arrays • Use a pointer in C/C++: int *p; p = new int[numberOfElementsNeeded]; // C++; in C, use malloc(numberOfElementsNeeded * sizeof(int)) • In Java, an array is an object; create it dynamically: int p[ ]; // p is only a reference p = new int[numberOfElementsNeeded]; • Q: How does the C++ STL vector "grow automatically"?
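One common way to let an array grow is to allocate a larger block and move the elements when the current block fills up. The sketch below does this in C with realloc. The type IntVec, its function names, the initial capacity of 4, and the doubling factor are all assumptions made for illustration; real vector implementations choose their own growth policy.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int *data;
    size_t size;      /* elements in use */
    size_t capacity;  /* elements allocated */
} IntVec;

void intvec_init(IntVec *v) {
    v->data = NULL;
    v->size = 0;
    v->capacity = 0;
}

/* Append x, growing the block when it is full.
   Returns 0 on success, -1 if memory runs out. */
int intvec_push(IntVec *v, int x) {
    if (v->size == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;  /* assumed policy */
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1;          /* the old block is still valid */
        v->data = p;
        v->capacity = new_cap;
    }
    assert(v->size < v->capacity);
    v->data[v->size++] = x;
    return 0;
}

void intvec_free(IntVec *v) {
    free(v->data);
    intvec_init(v);
}
```

Because the capacity doubles, the occasional copy is spread over many cheap appends, so appending stays constant time on average.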
Represent the data and Store the data • Primitive data (what most computers can handle directly in hardware): bit, byte, word, double word, quad word • C: char, short, int, long, float, double, [long double], and pointers to any data type • C++: bool, plus all the types in C • Java: boolean, byte, char (16 bits), short (16 bits), int (32 bits), long (64 bits), float (32 bits), double (64 bits) • Abstract data (user-defined data types): List, Deque, Stack, Queue, Tree, …
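Structures and types (how do we define our own data structure?) Solution: struct • Can we use only the primitive data types listed earlier? • User-defined data types? • Consider writing a program that processes the grades of a whole class: compute totals and averages, sort, then print one listing ordered by rank and one ordered by student ID • How do we sort? • When sorting, the records of two students must be swapped; how do we swap them? • Do we need many arrays? One array for student IDs, one for names, one for scores? … • Bubble sort, Insertion sort, Selection sort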
So why use struct in C? Why use class in C++? • A struct groups related data together: struct student x[99], tmp; /* … */ tmp = x[i]; x[i] = x[k]; x[k] = tmp; • Improves program readability • Makes the program easier to maintain • In fact, every program can be written without using struct at all! • How do we define and use a struct? • See the next slides • A user-defined data type is a template for a heterogeneous structure
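User-defined Data Type
Consider writing a program that processes students' grades. How should one student's data be represented? Think about it...
#include <stdio.h>
#include <string.h>   /* strcpy */
struct Student {      /* by convention, the first letter is capitalized */
    long sid;
    char name[9];     /* can hold four Big5 Chinese characters */
    float score[13];  /* each student takes at most 13 courses */
};  /* note: the semicolon after a struct or class definition cannot be omitted */
int main( ) {
    struct Student x;          /* in C++ the keyword struct may be omitted here; C needs it unless a typedef is used */
    x.sid = 123;               /* dot notation */
    strcpy(x.name, "張大千");   /* note: a C string cannot be copied with =; memcpy( ) also works */
    /* use a loop to read the scores into x.score[?] */
}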
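To illustrate why grouping a student's data in a struct makes sorting easy, here is a minimal bubble-sort sketch that swaps whole records. It assumes a simplified Student with a single total field instead of the score array; that field and the function name sort_by_total are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct Student {
    long sid;
    char name[9];
    float total;     /* assumed field: the sum of the student's scores */
};

/* Bubble sort by total score, highest first.
   One struct assignment moves sid, name, and total together. */
void sort_by_total(struct Student x[], size_t n) {
    assert(x != NULL || n == 0);
    for (size_t pass = 0; pass + 1 < n; pass++) {
        for (size_t i = 0; i + 1 < n - pass; i++) {
            if (x[i].total < x[i + 1].total) {
                struct Student tmp = x[i];
                x[i] = x[i + 1];
                x[i + 1] = tmp;
            }
        }
    }
}
```

With separate arrays for IDs, names, and scores, each swap would have to be repeated once per array and the arrays could drift out of step; with a struct there is only one swap.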
Dynamic Data Structures • Static data structures: the size and shape of the data structure do not change • Dynamic data structures: the size and shape of the data structure can change at run time (How?) • Pointer: an integer that is a memory address and so points to some data unit. For example: the program counter in the CPU (Intel CPUs use CS:IP) [Figure: the pointer points to memory cell x, which contains the value 6087]
Sequence • A group of elements in a specified order is called a sequence. • A tuple is a finite sequence. • Ordered pair (x, y), triple (x, y, z), quadruple, and quintuple • A k-tuple is a tuple of k elements. • List, Stack, Queue, Deque, and Vector are all sequence data structures whose size may vary.
Figure 8.1 Lists, stacks, and queues • Both the C++ STL and Java provide iterators that can walk through an entire data structure • An iterator is an object that enables a user to loop through a collection without accessing the collection's fields.
Lists • A list is a collection of entries that appear in sequential order. • Can be an ArrayList or a LinkedList. • ArrayList: list stored in a homogeneous array; also known as a contiguous list • LinkedList: list in which the entries are linked by pointers • Head pointer: pointer to the first entry in the list • NIL pointer: a "non-pointer" value used to indicate the end of the list (NULL or 0) • List in C++ STL vs. List in Java • In the C++ STL, list is a class template implemented as a linked list • In Java, java.util.List is an interface that extends the java.util.Collection interface; java.util.ArrayList and java.util.LinkedList both implement the List interface, and thus both are Lists.
ArrayList -- Contiguous Storage of a List • List entries are stored consecutively in memory: ArrayList • Advantage: simple, quick random access • Disadvantage: insertion and deletion are difficult • java.util.ArrayList is NOT a Queue because it does NOT implement the java.util.Queue interface; however, java.util.LinkedList is a Queue, and thus the following statement is correct: Queue<Integer> gg = new LinkedList<Integer>( );
Linked Lists • Each list entry contains a pointer to the next entry • List entries can be stored in any part of memory. • There is a special head pointer for the first entry and a Nil pointer in the last entry. java.util.LinkedList is a Queue and thus the following statement is correct: Queue<Integer> gg = new LinkedList<Integer> ( ); However, the following is WRONG ! Queue<Integer> gg = new ArrayList<Integer> ( ); // wrong In JDK 1.5, LinkedList implements Queue; In JDK 1.6, it implements Deque which extends Queue
Figure 8.9 The structure of a linked list (a singly linked list) • Java's LinkedList and the C++ STL list are both doubly linked lists
Figure 8.10 Deleting an entry from a linked list Singly LinkedList Figure 8.11 Inserting an entry into a linked list
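The pointer changes behind these two figures can be sketched in C for a singly linked list. The names Node, insert_after, and delete_after are hypothetical, and the sketch assumes the caller already holds a pointer to the entry that precedes the point of change.

```c
#include <assert.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Insert a new node holding value right after prev.
   Returns the new node, or NULL if memory runs out. */
struct Node *insert_after(struct Node *prev, int value) {
    assert(prev != NULL);
    struct Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->data = value;
    n->next = prev->next;   /* new node points to the old successor */
    prev->next = n;         /* predecessor now points to the new node */
    return n;
}

/* Unlink and free the node that follows prev, if there is one. */
void delete_after(struct Node *prev) {
    assert(prev != NULL);
    struct Node *victim = prev->next;
    if (victim == NULL)
        return;
    prev->next = victim->next;  /* bypass the deleted node */
    free(victim);
}
```

Neither operation moves any other entry, which is the advantage of a linked list over contiguous storage.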
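Inserting an entry into a doubly linked list [Figure: a list with head pointer and node p, marked "add before this node"; then the same list with the new node tmp linked in before p] • Four links in total must be changed! • What about deletion?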
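The four link updates, and the deletion case, can be sketched in C as follows. The names DNode, insert_before, and unlink_node are hypothetical; the variables p and tmp follow the labels in the figure, and the list is assumed to end with NULL at both ends.

```c
#include <assert.h>
#include <stddef.h>

struct DNode {
    int data;
    struct DNode *prev;
    struct DNode *next;
};

/* Link tmp in just before p. *head is updated when p was the first node. */
void insert_before(struct DNode **head, struct DNode *p, struct DNode *tmp) {
    assert(head != NULL && p != NULL && tmp != NULL);
    tmp->next = p;               /* link 1 */
    tmp->prev = p->prev;         /* link 2 */
    if (p->prev != NULL)
        p->prev->next = tmp;     /* link 3 */
    else
        *head = tmp;             /* p was first: the head pointer takes link 3 */
    p->prev = tmp;               /* link 4 */
}

/* Unlink p from the list: only two links in the neighbours change. */
void unlink_node(struct DNode **head, struct DNode *p) {
    assert(head != NULL && p != NULL);
    if (p->prev != NULL)
        p->prev->next = p->next;
    else
        *head = p->next;
    if (p->next != NULL)
        p->next->prev = p->prev;
    p->prev = p->next = NULL;
}
```

The order of the assignments matters: p->prev must be read for links 2 and 3 before link 4 overwrites it.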
Stacks • A stack is a list in which all deletions and insertions occur at the same end. • Alternative name: LIFO structure • Last In First Out • Top: place where insertions and deletions occur. • Base: opposite end to the top. • Insertion=push, deletion = pop. A stack in memory
Stacks and Procedure/Function Calls • When a procedure P is called, a pointer to the return location is pushed onto a stack. • If a second procedure Q is called from P, then a pointer to the return location for Q is pushed onto the same stack. • When a procedure finishes, the return location is popped from the stack. • Note: the assembly-language format produced by gcc/g++ differs from the format published by Intel and from Microsoft MASM! Besides slightly different instruction names, the left-to-right order of the operands in an instruction is reversed! • tcc -S filename.c, tcc -S filename.cpp, gcc -S filename.c, g++ -S filename.cpp
C++ STL stack contains a deque [, vector, list]
template <class T, class T38 = deque<T> >
class stack {
protected:
    T38 c;  // the actual data is here
public:
    typedef typename T38::value_type value_type;
    typedef typename T38::reference reference;
    typedef typename T38::const_reference const_reference;
    stack( ) : c( ) { }
    explicit stack(const T38& seq) : c(seq) { }
    bool empty( ) const { return c.empty( ); }
    int size( ) const { return c.size( ); }
    reference top( ) { return c.back( ); }
    const_reference top( ) const { return c.back( ); }
    void push(const value_type& x) { c.push_back(x); }
    void pop( ) { c.pop_back( ); }
};  // class stack
stack<int> gg;
stack<int, vector<int> > yy;
Stack in java.util.*;
package java.util;
public class Stack<E> extends Vector<E> {
    public Stack() { }
    public boolean empty() { return size( ) == 0; }
    public E push(E item) { addElement(item); return item; }
    public synchronized E pop( ) {
        E obj;
        int len = size( );
        obj = peek( );
        removeElementAt(len - 1);
        return obj;
    }
    public synchronized E peek( ) {
        int len = size( );
        if (len == 0) throw new EmptyStackException();
        return elementAt(len - 1);
    }
    public synchronized int search(Object o) {
        int i = lastIndexOf(o);
        if (i >= 0) { return size( ) - i; }
        return -1;
    }
    private static final long serialVersionUID = 1224463164541339165L;
}  // class Stack
Vector in java.util.*;
package java.util;
public class Vector<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable {
    protected Object[ ] elementData;   // a Java reference is similar to a C pointer
    protected int elementCount;
    protected int capacityIncrement;   // if 0, double the capacity (array length)
    private static final long serialVersionUID = -2767605614048989439L;
    public Vector(int initialCapacity, int capacityIncrement) {
        super( );   // invoke the superclass constructor
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
        this.elementData = new Object[initialCapacity];
        this.capacityIncrement = capacityIncrement;
    }
    public Vector(int initialCapacity) {   // default increment 0 will double the capacity
        this(initialCapacity, 0);
    }
    public Vector( ) { this(10); }   // default capacity is 10
    public synchronized int size( ) { return elementCount; }
    // . . .
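Stack applications, again • Use a stack to convert infix to postfix • The stack holds operators and left parentheses; pay attention to precedence • Operands are output directly • Use a stack to evaluate a postfix expression • The stack holds operand values (integers or reals) • On reaching an operator, pop the operands and compute • Push the computed value back onto the stack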
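The second application, evaluating a postfix expression, can be sketched in C as below. This is an illustration under stated assumptions: tokens are separated by spaces, operands are non-negative integers, only + - * / are supported, and the name eval_postfix and the limit MAX_DEPTH are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define MAX_DEPTH 64   /* assumed limit on the stack depth */

/* Evaluate a postfix expression such as "3 4 2 * +".
   Returns 0 and stores the value in *result, or -1 on a malformed input. */
int eval_postfix(const char *s, long *result) {
    long stack[MAX_DEPTH];
    int top = 0;                           /* number of values on the stack */
    assert(s != NULL && result != NULL);
    while (*s != '\0') {
        if (isspace((unsigned char)*s)) { s++; continue; }
        if (isdigit((unsigned char)*s)) {
            char *end;
            long v = strtol(s, &end, 10);
            if (top == MAX_DEPTH) return -1;   /* stack overflow */
            stack[top++] = v;                  /* operand: push its value */
            s = end;
            continue;
        }
        if (top < 2) return -1;                /* an operator needs two operands */
        long b = stack[--top];                 /* the right operand is on top */
        long a = stack[--top];
        switch (*s) {
        case '+': a += b; break;
        case '-': a -= b; break;
        case '*': a *= b; break;
        case '/': if (b == 0) return -1; a /= b; break;
        default:  return -1;                   /* unknown symbol */
        }
        stack[top++] = a;                      /* push the computed value back */
        s++;
    }
    if (top != 1) return -1;                   /* leftover operands */
    *result = stack[0];
    return 0;
}
```

For example, "3 4 2 * +" pushes 3, 4, and 2, replaces 4 and 2 by 8, then replaces 3 and 8 by 11.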
Handling Pointers [Figure: p points to a node with the fields long x and next]
struct Student { long x; struct Student * next; };
struct Student * p;
• Assign the value 4 to x:
p->x = 4;   // C, C++
p.x = 4;    // Java, C#
• Initialise a pointer q to point to the next item:
q = p->next;   // C, C++
q = p.next;    // Java
Print a Linked List in Reverse Order
Procedure reverseprint(L)
    p = head(L)
    While p <> Nil do
        push(p.data, stack)
        p = p.next
    EndWhile
    While stack not empty
        Print(pop(stack))
    EndWhile
EndProcedure
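A C version of this procedure might look like the sketch below. It assumes integer data and a fixed-size array used as the stack; MAX_ITEMS and the helper reverse_values, which collects the values so they can be checked without printing, are hypothetical additions.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_ITEMS 100   /* assumed upper bound on the list length */

struct Node {
    int data;
    struct Node *next;
};

/* Copy the list's values into out[] in reverse order using an explicit stack.
   Returns the number of values written (at most max). */
int reverse_values(const struct Node *head, int out[], int max) {
    int stack[MAX_ITEMS];
    int top = 0, n = 0;
    for (const struct Node *p = head; p != NULL; p = p->next) {
        assert(top < MAX_ITEMS);
        stack[top++] = p->data;      /* push */
    }
    while (top > 0 && n < max)
        out[n++] = stack[--top];     /* pop: last pushed comes out first */
    return n;
}

/* Print the list in reverse order, one value per line. */
void reverse_print(const struct Node *head) {
    int buf[MAX_ITEMS];
    int n = reverse_values(head, buf, MAX_ITEMS);
    for (int i = 0; i < n; i++)
        printf("%d\n", buf[i]);
}
```

The stack is what reverses the order: a singly linked list can only be walked forward, so the values are saved on the way and popped afterwards.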
Queue • A queue is a list in which insertions are performed at one end (the tail) and deletions are performed at the opposite end (the head). • FIFO: First In, First Out • Tail: where insertions occur; the tail pointer points to the next unused location • Head: where deletions occur; the head pointer points to the head item
Circular Queue: empty vs. full [Figure: two circular queues with slots [0] to [5]. Left: an empty queue, front = 0, rear = 0. Right: a queue holding J1, J2, J3, front = 0, rear = 3.] • What if the queue is full?
Leave one empty slot when the queue is full. Why? [Figure: two full circular queues with slots [0] to [5]. Left: J1 to J5, front = 0, rear = 5. Right: J5 to J9, front = 4, rear = 3.] • How to test whether the queue is empty? • How to test whether the queue is full?
Enqueue in a Circular Queue
void enqueue(int front, int *rear, element item) {
    /* add an item to the queue */
    int next = (*rear + 1) % MAX_QUEUE_SIZE;
    if (front == next) {
        /* queue is full: leave rear unchanged and report the error */
        return;
    }
    *rear = next;
    queue[*rear] = item;
}
Dequeue from a Circular Queue
element dequeue(int *front, int rear) {
    /* remove the front element from the queue and return it */
    if (*front == rear)
        return queue_empty( );   /* queue_empty returns an error key */
    *front = (*front + 1) % MAX_QUEUE_SIZE;
    return queue[*front];
}
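Trees [Figure: an example tree with root Earth; below it the continents Europe, N. America, Africa, Asia, and S. America; below them countries such as UK, France, USA, Chad, China, India, and Peru; and below those capital cities such as London, Paris, Beijing, and Lima]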
Definition of a Tree • A tree is a connected undirected graph in which there are no circuits. • A rooted tree is a tree in which one node is selected as the root. • In computing, trees are usually rooted.
Terminology • Node, root, leaf, terminal node, parent, child. • Ancestor: parent, parent of parent, etc. • Descendant: child, child of child, etc. • Siblings: nodes sharing a common parent • Subtree: a node together with all the nodes below it. • Depth: number of nodes on the longest path from the root to a leaf. • Binary tree: each node has at most two children.
Binary Trees • A binary tree is a finite set of nodes that is either empty or consists of a root and two disjoint binary trees called the left subtree and the right subtree. • Any tree can be transformed into a binary tree by the left child-right sibling representation. • The left subtree and the right subtree are distinguished.
Samples of Binary Trees [Figure: three sample binary trees built from nodes labelled A to I, with levels numbered 1 to 5, illustrating a skewed binary tree and a complete binary tree] • What is a full binary tree?
Figure 8.16 The conceptual and actual organization of a binary tree using a linked storage system
Figure 8.17: A tree stored without pointers • Suitable for trees in which most nodes are present: complete trees, full trees
Figure 8.18: A sparse, unbalanced tree shown in its conceptual form and as it would be stored without pointers • Most of the space is wasted!
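The pointer-free storage in these two figures relies on index arithmetic instead of links. A minimal sketch in C, assuming the usual scheme in which the root occupies position 1 and the children of the node at position n sit at positions 2n and 2n+1 (the function names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Positions are 1-based: the root is at 1. */
size_t left_child(size_t n)  { return 2 * n; }
size_t right_child(size_t n) { return 2 * n + 1; }

size_t parent(size_t n) {
    assert(n > 1);              /* the root has no parent */
    return n / 2;
}

/* Slots needed for a tree of the given depth (depth counted in nodes on the
   longest root-to-leaf path): 2^depth - 1, however many nodes exist. */
size_t slots_needed(unsigned depth) {
    assert(depth < 8 * sizeof(size_t));
    return ((size_t)1 << depth) - 1;
}
```

A full tree of depth 4 uses all 15 slots, but a skewed tree of depth 4 has only 4 nodes and still reserves 15 slots, which is the waste the second figure points out.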
Recursive Printing of a Binary Tree
Procedure printTree(tree)
    If (tree not empty)
        printTree(left subtree)
        print(root)
        printTree(right subtree)
    EndIf
EndProcedure
This is an in-order traversal. Pre-order and post-order traversals move print(root) before or after the two recursive calls.
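The three traversal orders can be sketched in C with the linked representation. To keep the result easy to check, each function here appends node labels to a character buffer instead of printing; the TreeNode type, the single-character labels, and the function names are assumptions for illustration, and the caller must supply a buffer that is large enough.

```c
#include <assert.h>
#include <stddef.h>

struct TreeNode {
    char label;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Each function appends the visited labels at out and returns the new end
   of the output, so the caller can add the terminating '\0'. */

char *in_order(const struct TreeNode *t, char *out) {
    assert(out != NULL);
    if (t == NULL) return out;
    out = in_order(t->left, out);     /* left subtree */
    *out++ = t->label;                /* root */
    return in_order(t->right, out);   /* right subtree */
}

char *pre_order(const struct TreeNode *t, char *out) {
    assert(out != NULL);
    if (t == NULL) return out;
    *out++ = t->label;                /* root first */
    out = pre_order(t->left, out);
    return pre_order(t->right, out);
}

char *post_order(const struct TreeNode *t, char *out) {
    assert(out != NULL);
    if (t == NULL) return out;
    out = post_order(t->left, out);
    out = post_order(t->right, out);
    *out++ = t->label;                /* root last */
    return out;
}
```

For a tree with root A, children B and C, and B's children D and E, the three orders visit the nodes as DBEAC, ABDEC, and DEBCA respectively.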