Data Types
Primitives
• Integer
• Float
• Character
• Boolean
• Pointers
Aggregates
• Strings
• Records
• Enumerated
• Arrays
• Objects
Strings
• Fixed length: B e t s y b b b (b marks a blank used as padding)
• Null terminated: B e t s y 0
• Length field: 5 B e t s y
• Heap allocated: B e t s y
String Allocation
• Static length
  • Blank fill as in Fortran, Pascal, etc.
• Limited dynamic length
  • Grows to a limit
• Dynamic length
  • No length restriction
  • Reallocates from the heap

Limited dynamic length in C:
  char x[4];
  strcpy(x, "abc");   // OK: three characters plus the terminating null fit
  strcat(x, "def");   // NO: writes past the end of x

Dynamic length:
  x = "abc";
  x = "abcdefghij";
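The overflow in the strcat call can be avoided by checking the remaining room first. The sketch below is one hedged way to do that in C; the function name append_checked and its return convention are assumptions made for illustration, not part of the slides or of the C library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to dst only if the result, including the
   terminating '\0', fits in cap bytes. Returns 1 on success, 0 if refused. */
int append_checked(char *dst, size_t cap, const char *src)
{
    size_t used = strlen(dst);
    size_t need = strlen(src);

    assert(used < cap);                  /* dst must already be a valid string */
    if (need >= cap - used)              /* no room for src plus the terminator */
        return 0;
    memcpy(dst + used, src, need + 1);   /* the +1 copies the terminator too */
    return 1;
}
```

With char x[4] holding "abc", appending "def" is refused and x is left unchanged, which is the "limited dynamic length" behavior: the string may grow, but only up to its declared limit.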
Implementation
• Strings can be viewed as a primitive type
  • Some machine languages support string operations directly, so strings are treated as primitives even though the operations are slower
• Sometimes requires both
  • compile-time descriptors
  • run-time descriptors
  • know the difference
Enumerated types
• Usually implemented as integers
• Implied size limitation, which is not a problem in practice
  • (red, green, blue): red is 0, green is 1, etc.
• Strong typing sometimes creates ambiguity
  • We want the types to be distinguished, but the literals overlap:
  • weekday = (Mon, Tue, Wed, Thur, Fri);
  • classday = (Mon, Wed, Fri);
  • Assignment is OK in one direction, but not the other
• I/O of enumeration values is allowed in some languages, not in others
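In C the integer implementation is directly visible. The sketch below shows the numbering from 0 in declaration order; the helper color_name is a hypothetical addition that illustrates why I/O of enumeration values usually has to be written by hand in C.

```c
#include <assert.h>

/* Enumeration constants are numbered from 0 in declaration order. */
enum color { RED, GREEN, BLUE };

/* Hypothetical helper: C does not print enumeration names itself,
   so a lookup table maps each value to a string. */
const char *color_name(enum color c)
{
    static const char *const names[] = { "red", "green", "blue" };

    assert(c >= RED && c <= BLUE);   /* reject values outside the enumeration */
    return names[c];
}
```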
Subrange
• A contiguous sequence of an ordinal type
  • Mon..Fri
• Used for tighter restriction of values than primitive types provide
  • subtype age is integer range 0..150;
• Sometimes compatible with the parent type, sometimes not
• EXAMPLE: is age compatible with integer?
  • type age is new integer range 1..150; NO (a new, distinct type)
  • subtype age is integer range 1..150; YES (a constrained view of integer)
Array Operations
• Operations on whole arrays are infrequent, except in APL
• Examples
  • elements (common)
  • entire array (as parameters/pointers)
  • slice (a row, column, or series of rows/columns)
• APL
  • matrix multiplication
  • vector dot product
  • add a scalar to each element
Allocation strategies
• Static array
• Fixed stack-dynamic
  • int x[20]; the size is decided at compile time
• Stack dynamic
  • int x[n]; once allocated, the size can't change, but it is determined by n
• Heap dynamic
  • the array can grow dynamically and change its subscript range
  • ever been frustrated by the MAX size of an array?
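A heap-dynamic array can be approximated in C with realloc. The sketch below is a minimal illustration under assumed names (struct int_vector, vector_push) and an assumed doubling growth policy; none of these come from the slides.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of int. A zero-initialized struct is empty. */
struct int_vector {
    int   *data;       /* heap storage, NULL while empty */
    size_t length;     /* elements in use */
    size_t capacity;   /* elements allocated */
};

/* Append value, growing the storage when it is full.
   Returns 1 on success, 0 if the allocation failed. */
int vector_push(struct int_vector *v, int value)
{
    assert(v != NULL);
    if (v->length == v->capacity) {
        size_t new_capacity = v->capacity ? v->capacity * 2 : 4;
        int *grown = realloc(v->data, new_capacity * sizeof *grown);
        if (grown == NULL)
            return 0;                /* old storage is still valid */
        v->data = grown;
        v->capacity = new_capacity;
    }
    v->data[v->length++] = value;
    return 1;
}
```

The caller never picks a MAX size; the cost is an occasional copy when realloc has to move the storage, and the caller must eventually free(v.data).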
Subscript/subrange errors
• Subscript bounds problems for arrays are one of our biggest programming nuisances
• Checking for them at run time is expensive
• Even if a subscript is within range, there is no assurance it is correct
• Some languages, such as C, do NO checking
• The consequences in a program are difficult or impossible to trace
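Since C does no checking, any check has to be written explicitly. The sketch below shows what one run-time subscript check costs: a comparison and a branch on every access. The function name checked_get and its interface are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical checked read: returns 1 and stores the element in *out when
   index is in range; otherwise returns 0 and leaves *out untouched. */
int checked_get(const int *a, size_t length, size_t index, int *out)
{
    assert(a != NULL && out != NULL);
    if (index >= length)     /* the run-time bounds check */
        return 0;
    *out = a[index];
    return 1;
}
```

The check only proves the subscript is legal, not that it is the subscript the programmer intended.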
Addressing
• Storage is in row-major or column-major order

int A[3,2];   (3 rows, 2 columns)

Logical layout:
  (1,1) (1,2)
  (2,1) (2,2)
  (3,1) (3,2)

Row-major order in memory:    (1,1) (1,2) (2,1) (2,2) (3,1) (3,2)
Column-major order in memory: (1,1) (2,1) (3,1) (1,2) (2,2) (3,2)
Determining location

Location(a[I]) = base address(a) + (I - lowerbound) * element size

integer a[6];
Assume each element is 4 bytes and the array starts at address 100:

  Address  Element
  100      [1]
  104      [2]
  108      [3]
  112      [4]
  116      [5]
  120      [6]

Loc(a[3]) = 100 + (3-1)*4 = 108

Most of this is compile-time!
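The location formula translates directly into code. The sketch below is only an illustration; the function name element_address and the use of long for addresses are assumptions.

```c
#include <assert.h>

/* Address of a[i] for a one-dimensional array whose first subscript
   is lower_bound and whose elements are element_size bytes each. */
long element_address(long base, long lower_bound, long element_size, long i)
{
    assert(i >= lower_bound);    /* subscripts below the lower bound are invalid */
    return base + (i - lower_bound) * element_size;
}
```

For the array on the slide, element_address(100, 1, 4, 3) gives 108. Because base, lower_bound and element_size are known at compile time, a compiler can fold base - lower_bound * element_size into a single constant and leave only i * element_size for run time.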
2-d arrays (column major)

Loc(a[I,J]) = base address(a) + (I-lb1) * size of element + (J-lb2) * size of column
size of column = number of rows allocated * size of element

  Address  Element
  100      (1,1)
  104      (2,1)
  108      (3,1)
  112      (1,2)
  116      (2,2)
  120      (3,2)

Loc(a[1,2]) = 100 + (1-1)*4 + (2-1)*3*4 = 100 + 0 + 12 = 112
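The column-major formula, and its row-major counterpart, can be sketched as two small functions. The names and parameter order are assumptions made for this illustration.

```c
#include <assert.h>

/* Column-major: elements of one column are adjacent in memory. */
long column_major_address(long base, long lb1, long lb2, long rows,
                          long element_size, long i, long j)
{
    long column_size = rows * element_size;   /* bytes in one whole column */

    assert(i >= lb1 && j >= lb2);
    return base + (i - lb1) * element_size + (j - lb2) * column_size;
}

/* Row-major: elements of one row are adjacent in memory. */
long row_major_address(long base, long lb1, long lb2, long cols,
                       long element_size, long i, long j)
{
    long row_size = cols * element_size;      /* bytes in one whole row */

    assert(i >= lb1 && j >= lb2);
    return base + (i - lb1) * row_size + (j - lb2) * element_size;
}
```

For the 3-row, 2-column array of 4-byte elements starting at 100, element (1,2) is at 112 in column-major order but at 104 in row-major order. Note that the column-major formula needs the number of rows and the row-major formula needs the number of columns.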
Passing 2-d arrays as parameters
• The receiving procedure needs to have DIMENSION information
• Some languages are tightly bound and force that
  • Pascal, by requiring the parameter to be a declared type
• Others have strange rules
  • Fortran (column major)

Caller:
  INTEGER A(10,20)
  CALL PROCESS(A,10)

Called:
  SUBROUTINE PROCESS(A,N)
  INTEGER A(N,1)
Associative arrays
• Not common; built into Perl, where they are called hashes
• Uses a hash function
• Stores a key and a value

Hash %salaries:

  Key      Value
  mary     55750
  cedric   75000
  gary     47850
  perry    57000

In math class: hash(key) = value, or hash("gary") = 47850
In Perl: $salaries{"gary"} -> 47850
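C has no built-in associative array, so the idea has to be built by hand. The sketch below is a minimal fixed-size hash table with linear probing; the names (struct table, table_put, table_get), the capacity of 8 and the particular hash function are all assumptions for illustration, and the sketch supports neither deletion nor resizing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 8   /* assumed capacity; this sketch never resizes */

struct entry {
    const char *key;   /* NULL marks an empty slot */
    long value;
};

struct table {
    struct entry slots[TABLE_SIZE];
};

/* Simple multiplicative string hash, chosen only for illustration. */
static unsigned long hash(const char *key)
{
    unsigned long h = 5381;
    while (*key != '\0')
        h = h * 33 + (unsigned char)*key++;
    return h;
}

/* Find the slot holding key, or the empty slot where key would be stored.
   Returns NULL only when the table is full and key is absent. */
static struct entry *find_slot(struct table *t, const char *key)
{
    unsigned long start = hash(key) % TABLE_SIZE;
    for (unsigned long n = 0; n < TABLE_SIZE; n++) {
        struct entry *e = &t->slots[(start + n) % TABLE_SIZE];
        if (e->key == NULL || strcmp(e->key, key) == 0)
            return e;
    }
    return NULL;
}

/* Store value under key, overwriting any earlier value for the same key.
   Only the key pointer is stored, so the caller must keep the string alive. */
void table_put(struct table *t, const char *key, long value)
{
    struct entry *e = find_slot(t, key);
    assert(e != NULL);   /* sketch only: a full table is treated as a bug */
    e->key = key;
    e->value = value;
}

/* Look up key. Returns 1 and sets *value if found, otherwise returns 0. */
int table_get(struct table *t, const char *key, long *value)
{
    struct entry *e = find_slot(t, key);
    if (e == NULL || e->key == NULL)
        return 0;
    *value = e->value;
    return 1;
}
```

A zero-initialized struct table is empty. After table_put(&t, "gary", 47850), a call to table_get(&t, "gary", &v) stores 47850 in v, which mirrors the Perl lookup.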
Arrays as pointers in C
• Using an array name in C is the same as using a pointer to its first element
• Incrementing a pointer advances it by the true memory size of the element type
  • assume integers are 4 bytes
  • int *j;
  • j++;   // increments j by 4, assuming byte-addressable memory
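The scaling of pointer arithmetic can be observed by measuring the distance between p and p + 1 in bytes. The helper name step_in_bytes is an assumption for this sketch; the result is sizeof(int), which is 4 on many current machines but is not guaranteed by the language.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes between p and p + 1. The pointer p must point at an
   element of an array (or a single int), so that p + 1 is a valid address. */
size_t step_in_bytes(const int *p)
{
    assert(p != NULL);
    return (size_t)((const char *)(p + 1) - (const char *)p);
}
```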
Example code in C

Using a pointer:
  int c[10], *j;
  for (j = c;          // assign j the address of c[0]
       j < &c[10];     // continue as long as the address held in j is within the bounds of c
       j++) {          // increment j by the size of an integer
      *j = 0;          // set the element to 0
  }

Using a subscript:
  for (int j = 0; j < 10; j++) {
      c[j] = 0;
  }
Records
• Record operations
  • assignment
  • comparison
  • block operations without respect to fields
• Strange syntax in C
• Unions
Record pointers in C

  struct person {
      int weight;
      int age;
      char name[20];
  };
  struct person teacher;

In the declaring routine, use the dot:
  teacher.age = 35;

When the record is passed to a function by pointer, use the arrow inside the function (here teacher is a struct person *):
  teacher->age = 35;
Unions
• Free unions
  • two names for the same place
  • it's up to you to keep them straight
  • no support for checking
• Discriminated unions
  • a value in the record indicates how to interpret the associated data
  • not always easy to check, and sometimes not done
Ada example (p.231)
• Discriminant: form
• Fields present in every variant: color, filled
• Variant fields
  • rectangle: side1, side2
  • circle: diameter
  • triangle: leftside, rightside, angle
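C only provides free unions, so a discriminated union has to be assembled from a struct with a tag field, and the checking is left to the programmer. The sketch below borrows the circle and rectangle variants from the Ada example and omits the triangle for brevity; the type names, the function shape_extent and the choice of double for the fields are assumptions.

```c
#include <assert.h>

enum shape_form { SHAPE_CIRCLE, SHAPE_RECTANGLE };

struct shape {
    enum shape_form form;                       /* the discriminant */
    union {                                     /* variants share storage */
        struct { double diameter; } circle;
        struct { double side1, side2; } rectangle;
    } u;
};

/* Hypothetical operation: the largest straight-line dimension stored in
   the record. The switch on form is the hand-written discriminant check. */
double shape_extent(const struct shape *s)
{
    switch (s->form) {
    case SHAPE_CIRCLE:
        return s->u.circle.diameter;
    case SHAPE_RECTANGLE:
        return s->u.rectangle.side1 > s->u.rectangle.side2
             ? s->u.rectangle.side1
             : s->u.rectangle.side2;
    }
    assert(0 && "unknown discriminant");
    return 0.0;
}
```

Nothing stops code from setting form to SHAPE_CIRCLE and then reading u.rectangle, which is exactly the gap that a checked discriminated union closes.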
Sets
• Implemented as bit fields, one bit per possible member (below)
  • fast implementation
  • set operations are easy binary operations
  • try set union
  • the limit on the size of a set is related to the binary operations

  type colors = (red, blue, green, yellow, orange, white, black);
       colorset = set of colors;
  var set1 : colorset;
  set1 := [red, orange, blue];

implemented as ( 1 1 0 0 1 0 0 )
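The same representation can be sketched in C with one bit per color. The names below are assumptions, and this sketch puts the first color in the least significant bit, so the bit pattern reads right to left compared with the picture on the slide.

```c
#include <assert.h>

/* Same declaration order as the Pascal type: red is bit 0, blue is bit 1, ... */
enum colors { COLOR_RED, COLOR_BLUE, COLOR_GREEN, COLOR_YELLOW,
              COLOR_ORANGE, COLOR_WHITE, COLOR_BLACK };

typedef unsigned int colorset;   /* one bit per possible member */

/* The set containing only c. */
colorset set_of(enum colors c)
{
    assert(c >= COLOR_RED && c <= COLOR_BLACK);
    return 1u << c;
}

/* Union and intersection are single bitwise operations. */
colorset set_union(colorset a, colorset b)        { return a | b; }
colorset set_intersection(colorset a, colorset b) { return a & b; }

/* Membership test: 1 if c is in s, otherwise 0. */
int set_contains(colorset s, enum colors c)
{
    return (int)((s >> c) & 1u);
}
```

The set [red, orange, blue] becomes bits 0, 1 and 4, the value 19. The size limit mentioned on the slide shows up here as the number of bits in an unsigned int.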
Pointers
• Lots of flexibility
• Data comes from the heap
• Difficult to manage what you are pointing at
• Many languages strongly manage the types to which pointers point
  • C doesn't care
  • C++ does
• The real problems come from how the programmer manages pointers
Pointer problems

Dangling reference:
  int *p1, *p2;
  p1 = new int;
  p2 = p1;
  delete p1;      // p2 still points at the deallocated cell

Lost heap-dynamic variable:
  int *p1, *p2;
  p1 = new int;
  p1 = p2;        // the cell allocated above can no longer be reached (lost)
Handling Pointer Problems
• Tombstones
  • the tombstone always stays, even after the memory is deallocated
  • a variable never points directly at deallocated data
  • before deallocation the tombstone points at the cell; afterwards the tombstone is set to null
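The tombstone idea can be sketched in C by making every access go through an extra level of indirection. The struct and function names below are assumptions, and a real implementation would hide the tombstone inside the language run time rather than expose it to the programmer.

```c
#include <assert.h>
#include <stdlib.h>

/* The tombstone is the only thing that points at the heap cell.
   Program variables point at the tombstone instead. */
struct tombstone {
    int *cell;   /* NULL once the cell has been deallocated */
};

/* Allocate a cell holding value. Returns 1 on success, 0 on failure. */
int ts_allocate(struct tombstone *t, int value)
{
    assert(t != NULL);
    t->cell = malloc(sizeof *t->cell);
    if (t->cell == NULL)
        return 0;
    *t->cell = value;
    return 1;
}

/* Deallocate the cell; the tombstone itself stays behind, set to null. */
void ts_release(struct tombstone *t)
{
    assert(t != NULL);
    free(t->cell);
    t->cell = NULL;
}

/* Read through the tombstone. Returns 0 instead of touching freed memory. */
int ts_read(const struct tombstone *t, int *out)
{
    assert(t != NULL && out != NULL);
    if (t->cell == NULL)
        return 0;
    *out = *t->cell;
    return 1;
}
```

If p1 and p2 both refer to the same tombstone, releasing through p1 makes a later read through p2 fail cleanly rather than follow a dangling reference. The price is one extra indirection per access and tombstones that are never reclaimed.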
Handling Pointer Problems
• Reference counters
  • 3 pointers at the same cell: the cell's count is 3
  • 2 pointers at the same cell: the cell's count is 2
  • delete the cell when the reference count is 0
  • other than efficiency, the tricky case is circular lists
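A minimal reference-counting sketch in C follows. The names cell_new, cell_retain and cell_release are assumptions, and the counting is manual here, whereas a language with built-in reference counting would adjust the count on every pointer assignment.

```c
#include <assert.h>
#include <stdlib.h>

struct cell {
    int refcount;   /* number of pointers currently referring to this cell */
    int value;
};

/* Allocate a cell with one reference. Returns NULL if allocation fails. */
struct cell *cell_new(int value)
{
    struct cell *c = malloc(sizeof *c);
    if (c != NULL) {
        c->refcount = 1;
        c->value = value;
    }
    return c;
}

/* Record one more pointer at the cell and return that pointer. */
struct cell *cell_retain(struct cell *c)
{
    assert(c != NULL);
    c->refcount++;
    return c;
}

/* Drop one pointer. Frees the cell and returns 1 when the count reaches 0,
   otherwise returns 0 and the cell stays alive. */
int cell_release(struct cell *c)
{
    assert(c != NULL && c->refcount > 0);
    if (--c->refcount == 0) {
        free(c);
        return 1;
    }
    return 0;
}
```

Two cells that point at each other keep each other's count at 1 even when nothing else refers to them, so neither is ever freed. That is the circular-list problem.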
Handling Pointer Problems
• Garbage collection
  • initial scenario: cells carry leftover marks of 0 or 1
  • mark all cells with 0
  • mark all cells that are pointed at with 1
  • cells still marked 0 can be reclaimed
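The two marking steps can be sketched in C as follows. The node layout (at most two outgoing links), the function names and the recursive traversal are assumptions for illustration; production collectors typically avoid unbounded recursion.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LINKS 2   /* assumed: each cell holds at most two pointers */

struct node {
    int marked;                      /* 0 = not reached, 1 = reached */
    struct node *links[MAX_LINKS];   /* NULL for an unused link */
};

/* Step 1: mark every cell in the heap with 0. */
void clear_marks(struct node *heap, size_t count)
{
    for (size_t i = 0; i < count; i++)
        heap[i].marked = 0;
}

/* Step 2: mark with 1 every cell reachable from n. Stopping at cells that
   are already marked makes the traversal terminate on circular structures. */
void mark_from(struct node *n)
{
    if (n == NULL || n->marked)
        return;
    n->marked = 1;
    for (size_t k = 0; k < MAX_LINKS; k++)
        mark_from(n->links[k]);
}
```

After clear_marks on the whole heap and mark_from on each program variable that points into it, any cell still marked 0 is unreachable. Unlike reference counting, this handles circular lists.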