CHAPTER 4
ARRAYS, RECORDS AND POINTERS

Introduction
Data structures are classified as either linear or nonlinear. There are two ways of representing a linear structure in memory. One way is to have the linear relationship between the elements represented by means of sequential memory locations; such structures are called arrays. The other way is to have the linear relationship between the elements represented by means of pointers or links; such structures are called linked lists. Here we discuss arrays and the different operations performed on them.

Linear Arrays
A linear array is a list of a finite number n of homogeneous data elements. An array has a set of indices and a set of values: for each index there is a value associated with that index. The index is used to reference an element in its memory location; the values are the data elements stored in memory. One possible representation is to store the elements in consecutive memory locations.
The number n of elements is called the length or size of the array. Unless otherwise stated, we assume the index set consists of the integers 1, 2, ..., n. In general, the length (the number of data elements) of an array can be obtained from its index set by the formula
Length = UB - LB + 1
where UB is the upper bound (the largest index) and LB is the lower bound (the smallest index).
The elements of an array A may be denoted by the subscript notation A1, A2, ..., An or, as in Pascal, by the bracket notation A[1], A[2], ..., A[n]. We usually use the subscript notation.
The figure shows an array of numbers, from 0 to 19, which holds its data in sequential memory locations.

Figure: four data elements stored in memory.
Index:   1     2     3     4
Value:   247   56    300   500

Some programming languages (e.g. Fortran and Pascal) allocate memory space for arrays statically, during program compilation; hence the size of an array is fixed during program execution. On the other hand, some programming languages allow one to read an integer n and then declare an array with n elements; such languages are said to allocate memory dynamically.

Representation of Linear Arrays in Memory
Let LA be a linear array in the memory of the computer. The memory of the computer is simply a sequence of addressed locations, and we use the notation
LOC(LA[K]) = address of the element LA[K] of the array LA
Because the elements of LA are stored in successive memory cells, the computer only needs to keep track of the address of the first element of LA, denoted by Base(LA) and called the base address of LA. Using this address, the computer calculates the address of any element of LA by the formula
LOC(LA[K]) = Base(LA) + w(K - lower bound)
where w is the number of words per memory cell for the array LA.

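As a worked illustration of the length and address formulas, the sketch below evaluates both for one array. The function names and the sample values (bounds 1932 to 1984, base address 200, w = 4) are assumptions chosen only for illustration; they do not come from the slides.

```python
def array_length(lb, ub):
    """Length = UB - LB + 1."""
    return ub - lb + 1


def loc(base, w, k, lb=1):
    """LOC(LA[K]) = Base(LA) + w * (K - LB)."""
    return base + w * (k - lb)


# Hypothetical array: indices 1932..1984, base address 200, w = 4 words per cell.
print(array_length(1932, 1984))               # 53
print(loc(base=200, w=4, k=1965, lb=1932))    # 200 + 4 * 33 = 332
```

Note that the address of any element is found with one multiplication and one addition, independent of K; no other elements need to be visited.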
Traversing Linear Arrays
Let B be a collection of data elements stored in memory. Suppose we want to print the contents of each element of B, or to count the number of elements of B. This can be done by traversing B, that is, by accessing and processing each element of B exactly once. The following algorithm traverses a linear array LA.

(Traversing a Linear Array) Here LA is a linear array with lower bound LB and upper bound UB. This algorithm traverses LA, applying a process to each element of LA.
1. Set K := LB.
2. Repeat steps 3 and 4 while K ≤ UB.
3. Apply the process to LA[K].
4. Set K := K + 1.
[End of step 2 loop.]
5. Exit.

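The traversal steps can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function name traverse and the use of a dict keyed by index (to mimic an array whose indices start at LB) are assumptions; the sample values are those of the four-element figure.

```python
def traverse(la, lb, ub, process):
    """Apply `process` to each of LA[LB..UB] exactly once."""
    k = lb                       # Step 1: set K := LB
    while k <= ub:               # Step 2: repeat while K <= UB
        process(la[k])           # Step 3: apply the process to LA[K]
        k += 1                   # Step 4: set K := K + 1


# A dict keyed by index stands in for an array with lower bound 1.
la = {1: 247, 2: 56, 3: 300, 4: 500}
seen = []
traverse(la, 1, 4, seen.append)   # "process" here just records each element
print(seen)        # [247, 56, 300, 500]
print(len(seen))   # 4
```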
Inserting and Deleting
Inserting an element at the end of a linear array can be done easily, provided the memory space allocated for the array is large enough to accommodate the additional element. If we need to insert an element in the middle of the array, then on average half of the elements must be moved downward to new locations to accommodate the new element and keep the order of the other elements. Similarly, deleting the element at the end of an array is not difficult, but deleting an element in the middle of the array requires that each subsequent element be moved one location upward in order to fill up the array.

Algorithm for inserting into a linear array: INSERT(LA, N, K, ITEM)
1. [Initialize counter.] Set J := N.
2. Repeat steps 3 and 4 while J ≥ K.
3. [Move Jth element downward.] Set LA[J+1] := LA[J].
4. [Decrease counter.] Set J := J - 1.
[End of step 2 loop.]
5. [Insert element.] Set LA[K] := ITEM.
6. [Reset N.] Set N := N + 1.
7. Exit.

The algorithm INSERT inserts a data element ITEM into the Kth position in a linear array LA with N elements. The first four steps create space in LA by moving downward one location each element from the Kth position on. These elements are moved in reverse order: first LA[N], then LA[N-1], ..., and last LA[K]. To do this we first set J := N and then, using J as a counter, decrease J each time the loop is executed until J reaches K. Step 5 then inserts ITEM into the space just created. Before the exit, the number N of elements in LA is increased by 1 to account for the new element.

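The INSERT steps can be sketched in Python as below, together with a matching deletion routine. The slides give no deletion algorithm, so delete is only a sketch of the deletion described in words (each subsequent element moves one location upward). The function names, the unused slot 0 (so that positions match the 1-based indices), and the sample values are assumptions.

```python
def insert(la, n, k, item):
    """INSERT(LA, N, K, ITEM): put ITEM at position K (1-based); return new N.

    Assumes `la` has at least N + 2 slots; slot 0 is unused padding.
    """
    j = n                        # Step 1
    while j >= k:                # Step 2
        la[j + 1] = la[j]        # Step 3: move the J-th element down
        j -= 1                   # Step 4
    la[k] = item                 # Step 5
    return n + 1                 # Step 6


def delete(la, n, k):
    """Remove the element at position K; return (removed item, new N).

    Not given on the slides: a sketch of the deletion they describe.
    """
    item = la[k]
    for j in range(k, n):        # move LA[J+1] up into LA[J]
        la[j] = la[j + 1]
    la[n] = None                 # the last slot is now free
    return item, n - 1


la = [None, 10, 20, 40, 50, None, None]   # slot 0 unused, N = 4, room for 6
n = insert(la, 4, 3, 30)
print(la[1:n + 1])          # [10, 20, 30, 40, 50]
item, n = delete(la, n, 2)
print(item, la[1:n + 1])    # 20 [10, 30, 40, 50]
```

Both routines move about half of the elements on average, which is why arrays are a poor fit when insertions and deletions in the middle are frequent.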
Sorting; Bubble Sort
Let A be a list of N numbers. Sorting A refers to the operation of rearranging the elements of A so that they are in increasing order, i.e. so that
A[1] < A[2] < A[3] < ... < A[N]
For example, suppose A contains 8, 4, 19, 2, 7, 13, 5, 16. After sorting, A is the list 2, 4, 5, 7, 8, 13, 16, 19.
Sorting may seem to be a trivial task, but there are many algorithms for sorting the data in an array; here we discuss the bubble sort. Although the definition above refers to arranging numerical data in increasing order, sorting may also mean arranging numerical data in decreasing order or arranging nonnumeric data in alphabetical order.

Bubble Sort
Suppose the list of numbers A[1], A[2], ..., A[N] is in memory. The bubble sort algorithm works as follows.

Working of Bubble Sort
Step 1. Compare A[1] and A[2] and arrange them in the desired order, so that A[1] < A[2]. Then compare A[2] and A[3] and arrange them so that A[2] < A[3]. Then compare A[3] and A[4] and arrange them so that A[3] < A[4]. Continue until we compare A[N-1] with A[N] and arrange them so that A[N-1] < A[N]. Step 1 involves N-1 comparisons. During this step the largest element is "bubbled up" to the Nth position, so after Step 1 the largest element is in A[N].
Step 2. Repeat Step 1 with one comparison fewer, i.e. with N-2 comparisons: we stop after comparing and, if necessary, rearranging A[N-2] and A[N-1].
We continue in this way up to Step N-1.
Step N-1. Compare A[1] with A[2] and arrange them so that A[1] < A[2].
The process of sequentially traversing through all or part of a list is frequently called a "pass", so each of the above steps is called a pass.

Algorithm & its complexity
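A minimal Python sketch of the passes described for bubble sort, assuming a 0-based Python list in place of A[1..N]. Counting comparisons shows that pass K makes N-K of them, so the total is (N-1) + (N-2) + ... + 1 = N(N-1)/2, which grows in proportion to N². The function name bubble_sort and the comparison counter are illustrative additions.

```python
def bubble_sort(a):
    """Sort list `a` in place in increasing order; return the comparison count."""
    n = len(a)
    comparisons = 0
    for k in range(1, n):              # passes 1 .. N-1
        for i in range(n - k):         # pass K makes N-K comparisons
            comparisons += 1
            if a[i] > a[i + 1]:        # out of order: interchange the pair
                a[i], a[i + 1] = a[i + 1], a[i]
    return comparisons


a = [8, 4, 19, 2, 7, 13, 5, 16]        # the example list from the slides
count = bubble_sort(a)
print(a)       # [2, 4, 5, 7, 8, 13, 16, 19]
print(count)   # 28, i.e. 8 * 7 / 2
```

The comparison count depends only on N, not on the initial order of the data, because this version never stops early.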
Searching; Linear Search
Searching refers to the operation of finding the location LOC of ITEM in DATA, or printing some message that ITEM does not appear there. The search is said to be successful if ITEM is found. There are many searching algorithms; we discuss a simple one called linear search and then study binary search, which is a faster searching algorithm than linear search. The complexity of a searching algorithm is measured in terms of the number f(n) of comparisons required to find ITEM in DATA, where DATA contains n elements.

Linear Search
Suppose DATA is a linear array with N elements. The simplest way to find a given ITEM in DATA is to compare ITEM with each element of DATA one by one. That is, first we test whether DATA[1] = ITEM, then whether DATA[2] = ITEM, and so on, traversing DATA until the required item is found.
To simplify the search, we first assign ITEM to DATA[N+1], the position following the last element of DATA. Then the outcome LOC = N+1, where LOC denotes the location where ITEM first appears in DATA, signifies that the search is unsuccessful. The purpose of this assignment is to avoid repeatedly testing whether or not we have reached the end of the array DATA; this way, the search must eventually succeed.

Algorithm of Linear Search
(Linear Search) LINEAR(DATA, N, ITEM, LOC)
Here DATA is a linear array with N elements and ITEM is a given item of information. This algorithm finds the location LOC of ITEM in DATA, or sets LOC := 0 if the search is unsuccessful.
1. [Insert ITEM at the end of DATA.] Set DATA[N+1] := ITEM.
2. [Initialize counter.] Set LOC := 1.
3. [Search for ITEM.] Repeat while DATA[LOC] ≠ ITEM:
      Set LOC := LOC + 1.
   [End of loop.]
4. [Successful?] If LOC = N+1, then: Set LOC := 0.
5. Exit.

The complexity of this algorithm is measured by the number f(n) of comparisons required to find ITEM in DATA, where DATA contains n elements. Two important cases are the worst case and the average case.
Worst case: the search must go through the entire array, which happens when ITEM does not appear in DATA. Then f(n) = n + 1.
Average case: if ITEM does appear in DATA and is equally likely to be at any position, then
f(n) = (1 + 2 + ... + n) · (1/n) = [n(n+1)/2] · (1/n) = (n+1)/2

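The steps of LINEAR can be sketched in Python as follows. The function name linear_search, the unused slot 0 (so that positions are 1-based as on the slides), and the sample data are assumptions made for illustration.

```python
def linear_search(data, n, item):
    """LINEAR(DATA, N, ITEM, LOC): return LOC, or 0 if ITEM is absent.

    `data` uses slot 0 as padding and needs a spare slot at position N + 1.
    """
    data[n + 1] = item           # Step 1: place ITEM after the last element
    loc = 1                      # Step 2
    while data[loc] != item:     # Step 3: no end-of-array test is needed
        loc += 1
    if loc == n + 1:             # Step 4: only the added copy matched
        loc = 0
    return loc


data = [None, 8, 4, 19, 2, 7, None]    # N = 5, one spare slot at position 6
print(linear_search(data, 5, 19))   # 3
print(linear_search(data, 5, 6))    # 0
```

The unsuccessful search for 6 makes N + 1 = 6 comparisons, matching the worst case f(n) = n + 1.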
Binary Search & complexity
Suppose DATA is an array which is sorted in increasing numerical order or, equivalently, alphabetically. Then there is an efficient searching algorithm, called binary search, which can be used to find the location LOC of a given ITEM of information in DATA.
The binary search algorithm applied to our array DATA works as follows. During each stage of the algorithm, the search for ITEM is reduced to a segment of elements of DATA:
DATA[BEG], DATA[BEG+1], ..., DATA[END]
The variables BEG and END denote, respectively, the beginning and end locations of the segment under consideration. The algorithm compares ITEM with the middle element DATA[MID] of the segment, where MID is obtained by
MID = INT((BEG + END)/2)
If DATA[MID] = ITEM, then the search is successful and we set LOC := MID. Otherwise a new segment of DATA is obtained as follows.
(a) If ITEM < DATA[MID], then ITEM can appear only in the left half of the segment:
DATA[BEG], DATA[BEG+1], ..., DATA[MID-1]
So we reset END := MID - 1 and begin searching again.

(b) If ITEM > DATA[MID], then ITEM can appear only in the right half of the segment:
DATA[MID+1], DATA[MID+2], ..., DATA[END]
So we reset BEG := MID + 1 and begin searching again.
Initially we begin with the entire array DATA, i.e. with BEG = 1 and END = n or, more generally, with BEG = LB and END = UB. If ITEM is not in DATA, then eventually we obtain END < BEG. This condition signals that the search is unsuccessful, and in such a case we assign LOC := NULL. Here NULL is a value that lies outside the set of indices of DATA.

Complexity of the Binary Search Algorithm
The complexity is measured by the number f(n) of comparisons needed to locate ITEM in DATA, where DATA contains n elements. Observe that each comparison reduces the sample size by half. Hence we require at most f(n) comparisons to locate ITEM, where
f(n) = ⌊log2 n⌋ + 1
That is, the running time for the worst case is approximately equal to log2 n. The running time for the average case is approximately equal to the running time for the worst case.

Limitation of Binary Search
The binary search algorithm is very efficient (e.g. it requires only about 20 comparisons with an initial list of 1,000,000 elements).

Why would one want to use any other search algorithm? The reason is that (1) the list must be sorted and (2) one must have direct access to the middle element in any sublist. This means that one must essentially use a sorted array to hold the data. But keeping an array sorted is expensive when many insertions and deletions take place; for this reason one may use a different data structure, such as a linked list or a binary search tree.

(Binary Search) BINARY(DATA, LB, UB, ITEM, LOC)
Here DATA is a sorted array with lower bound LB and upper bound UB, and ITEM is a given item of information. The variables BEG, END and MID denote, respectively, the beginning, end and middle locations of a segment of elements of DATA. This algorithm finds the location LOC of ITEM in DATA or sets LOC := NULL.
1. [Initialize segment variables.] Set BEG := LB, END := UB and MID := INT((BEG+END)/2).
2. Repeat steps 3 and 4 while BEG ≤ END and DATA[MID] ≠ ITEM.
3. If ITEM < DATA[MID], then:
      Set END := MID - 1.
   Else:
      Set BEG := MID + 1.
   [End of If structure.]
4. Set MID := INT((BEG+END)/2).
[End of step 2 loop.]
5. If DATA[MID] = ITEM, then:
      Set LOC := MID.
   Else:
      Set LOC := NULL.
   [End of If structure.]
6. Exit.

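A hedged Python sketch of BINARY follows. The loop is arranged slightly differently from steps 2 to 5: the equality test is made inside the loop, so DATA[MID] is never read once the segment is empty, but the halving of the segment is the same. The function name binary_search, the use of None for NULL, and the unused slot 0 are assumptions; the data is the sorted example list from the sorting slide.

```python
def binary_search(data, lb, ub, item):
    """BINARY(DATA, LB, UB, ITEM, LOC): return LOC, or None if ITEM is absent.

    `data` must be sorted in increasing order on positions LB..UB.
    """
    beg, end = lb, ub
    while beg <= end:
        mid = (beg + end) // 2           # INT((BEG + END) / 2)
        if data[mid] == item:
            return mid
        if item < data[mid]:
            end = mid - 1                # continue in the left half
        else:
            beg = mid + 1                # continue in the right half
    return None                          # END < BEG: search unsuccessful


data = [None, 2, 4, 5, 7, 8, 13, 16, 19]   # slot 0 unused
print(binary_search(data, 1, 8, 13))   # 6
print(binary_search(data, 1, 8, 6))    # None

# floor(log2 n) + 1 equals the number of binary digits of n.
print((1_000_000).bit_length())        # 20
```

The last line checks the figure quoted for a list of 1,000,000 elements: ⌊log2 n⌋ + 1 = 20.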
Pointers; Pointer Arrays
Let DATA be an array. A variable P is called a pointer if P "points" to an element in DATA, i.e. if P contains the address of an element in DATA. An array PTR is called a pointer array if each element of PTR is a pointer. Pointers and pointer arrays are used to facilitate the processing of the information in DATA. This useful tool is best discussed by means of an example.
Arrays whose rows or columns begin with different numbers of data elements and end with unused storage locations are said to be jagged. A list made up of several groups can be stored in such an array, one group per row, at the cost of the unused locations.
Another way to store the list in memory is to place it in a linear array, one group after another. This method is space-efficient, and the entire list can easily be processed element by element. But there is no way to access any particular group; e.g. there is no way to find and print only the names in the third group.
In a modified version of this method, the names are again listed in a linear array, group by group, except that some marker, such as three dollar signs ($$$), is used to indicate the end of each group. This requires a little extra memory, but one can now access a particular group.

The drawback is that the list must still be traversed from the beginning in order to recognize a given group.
[Figure: a linear array holding Group 1, Group 2, Group 3 and Group 4, one after another.]

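[Figure: a jagged array with one group per row; * marks a data element and 0 an unused storage location.]
( * * * * 0 0 0 0 0 )
( * * * * * * * * * )
( * * 0 0 0 0 0 0 0 )
( * * * * * * * 0 0 )
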
Pointer Array
The two space-efficient data structures (the linear array of groups, with or without end-of-group markers) can easily be modified so that the individual groups can be indexed. This is accomplished by using a pointer array, which contains the locations of the different groups or, more specifically, the location of the first element of each group.

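A small Python sketch of this idea, using list positions in place of memory addresses. The names in data, the variable names, and the extra final entry of ptr (marking the end of the last group) are all hypothetical choices made for illustration.

```python
# Hypothetical data: four groups of names stored one after another.
data = ["Evans", "Harris", "Lewis", "Shaw",      # group 1
        "Conrad", "Felt",                        # group 2
        "Davis", "Segal", "Baker",               # group 3
        "Ford"]                                  # group 4

# ptr[g - 1] is the position in `data` of the first element of group g;
# the extra last entry (an assumption) marks the end of the final group.
ptr = [0, 4, 6, 9, 10]


def group(data, ptr, g):
    """Return the names in group g (1-based) without scanning earlier groups."""
    return data[ptr[g - 1]:ptr[g]]


print(group(data, ptr, 3))   # ['Davis', 'Segal', 'Baker']
```

Unlike the end-of-group markers, the pointer array lets the third group be reached directly, at the cost of one extra small array.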
Records; Record Structure
Collections of data are frequently organized into a hierarchy of fields, records and files. A record is a collection of related data items, each of which is called a field or attribute, and a file is a collection of similar records. Each data item may itself be a group item composed of subitems; those items which are indecomposable are called elementary items or scalars. The names given to the various data items are called identifiers.
Although a record is a collection of data items, it differs from a linear array in the following ways.
• A record may be a collection of nonhomogeneous data.
• The data items in a record are indexed by attribute names, so there may not be a natural ordering of its elements.

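One way to picture a record in code is a Python dataclass. This is only an illustrative sketch: the Student and Name types and their fields are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Name:                 # a group item composed of elementary items
    first: str
    last: str


@dataclass
class Student:              # a record: fields of different (nonhomogeneous) types
    name: Name
    age: int
    gpa: float


s = Student(Name("Ada", "Khan"), 20, 3.5)
print(s.name.last)          # Khan
print(s.age)                # 20
```

The fields are reached by attribute name (s.age, s.name.last) rather than by a numeric index, which is the second difference from a linear array noted above.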
Multidimensional Arrays
Linear arrays are also called one-dimensional arrays, since each element in the array is referenced by a single subscript. Most programming languages allow two-dimensional and three-dimensional arrays, i.e. arrays whose elements are referenced, respectively, by two and three subscripts.

Two-Dimensional Arrays
A two-dimensional m x n array A is a collection of m · n data elements such that each element is specified by a pair of integers (such as J, K), called subscripts. The element of A with first subscript J and second subscript K is denoted by A[J, K]. Two-dimensional arrays are called matrices in mathematics and tables in business applications.
There is a standard way of drawing a two-dimensional m x n array:

        1        2        3        4
1    A[1,1]   A[1,2]   A[1,3]   A[1,4]
2    A[2,1]   A[2,2]   A[2,3]   A[2,4]
3    A[3,1]   A[3,2]   A[3,3]   A[3,4]

Two-dimensional 3 x 4 array A

Representation of Two-Dimensional Arrays in Memory
Let A be a two-dimensional m x n array. Although A is pictured as a rectangular array of elements with m rows and n columns, the array is represented in memory by a block of m · n sequential memory locations. The programming language stores the array A either (1) column by column, in what is called column-major order, or (2) row by row, in row-major order. The figure shows these two ways of representing the 3 x 4 array A.

Column-major order:
Column 1: subscripts (1,1) to (3,1)
Column 2: subscripts (1,2) to (3,2)
Column 3: subscripts (1,3) to (3,3)
Column 4: subscripts (1,4) to (3,4)

Row-major order:
Row 1: subscripts (1,1) to (1,4)
Row 2: subscripts (2,1) to (2,4)
Row 3: subscripts (3,1) to (3,4)

Recall that, for a linear array A, the computer does not keep track of the address LOC(A[K]) of every element A[K], but does keep track of Base(A), the address of the first element of A. With lower bound 1, the address of A[K] is computed by the formula
LOC(A[K]) = Base(A) + w(K - 1)
A similar situation holds for a two-dimensional m x n array A. That is, the computer keeps track of Base(A), the address of the first element A[1,1] of A, and computes the address LOC(A[J,K]) of A[J,K] by one of the following formulas, where M is the number of rows and N the number of columns.
Column-major order:
LOC(A[J,K]) = Base(A) + w[M(K - 1) + (J - 1)]
Row-major order:
LOC(A[J,K]) = Base(A) + w[N(J - 1) + (K - 1)]

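As a worked illustration of the two formulas, the sketch below evaluates both for one element of a 3 x 4 array. The function names and the sample values Base(A) = 200 and w = 4 are assumptions chosen for illustration.

```python
def loc_column_major(base, w, m, j, k):
    """LOC(A[J,K]) = Base(A) + w * [M*(K-1) + (J-1)], with M rows."""
    return base + w * (m * (k - 1) + (j - 1))


def loc_row_major(base, w, n, j, k):
    """LOC(A[J,K]) = Base(A) + w * [N*(J-1) + (K-1)], with N columns."""
    return base + w * (n * (j - 1) + (k - 1))


# Hypothetical 3 x 4 array with Base(A) = 200 and w = 4; locate A[2,3].
print(loc_column_major(200, 4, m=3, j=2, k=3))   # 200 + 4 * (6 + 1) = 228
print(loc_row_major(200, 4, n=4, j=2, k=3))      # 200 + 4 * (4 + 2) = 224
```

In both orders A[1,1] sits at Base(A) and the last element A[3,4] sits at Base(A) + w(m · n - 1); only the elements in between are arranged differently.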