HKOI 2004 Team Training Data Processing 1 & String Functions (Intermediate Level)
Overview • Introduction • Data types • String functions • Data processing technique I
Introduction • Data Processing means processing some data with some “general” techniques. • Sorting and searching • Simple calculations • Manipulating data in a particular way • In HKOI, we call it DaP to avoid confusion with Dynamic Programming (DyP) or other DPs.
Introduction • Unlike other topics that will be taught in HKOI, there is NO general method to solve a DaP problem. • That is, DaP is not an algorithm. • We classify problems that ask you to deal with some data in a fairly straightforward way as DaP problems.
Introduction • DaP is the foundation of OI programming. (or even computer?) • Trains your ability to: • Analyse and solve problems • Choose the best algorithm & data structure • Write programs that implement your idea • Practice makes perfect. Practising DaP problems is important if you want to write future OI programs efficiently.
Data Types • It is important to select the best data type for writing every program. • Data range • Time needed for each operation • Memory usage
Data Types (Overview) • Categories of data types • Ordinal types: integer, char, boolean, ... • Real types: real, extended, comp, ... • String • Array • Record • Others like pointer, set, enum, object, ... • Pascal and C/C++ types are similar but may not be equivalent, and the same goes for the examples given in these notes.
Data Types (Pascal) • Data types taught in the CE syllabus • integer : 16-bit signed integer • [-2^15, 2^15-1] or [-32768, 32767] • real : 48-bit real number • ± 2.9e-39 – 1.7e38 • char : 8-bit character • boolean : 8-bit true or false • string : 256-byte string • 1 byte length and 255 bytes characters
Data Types (Pascal) • Some more useful ordinal types • integer : 16-bit, signed • [-2^15, 2^15-1] or [-32768, 32767] • byte : 8-bit, unsigned • [0, 2^8-1] or [0, 255] • longint : 32-bit, signed • [-2^31, 2^31-1] or [-2147483648, 2147483647] • shortint : 8-bit signed • word : 16-bit unsigned
Data Types (Pascal) • Some more useful real types • real : 48-bit / 6-byte real number • ± 2.9e-39 – 1.7e38 • double : 64-bit / 8-byte real number • ± 5.0e-324 – 1.7e308 • extended : 80-bit / 10-byte real number • ± 3.4e-4932 – 1.1e4932 • comp : 64-bit signed integer • -9,223,372,036,854,775,808 (-2^63) – 9,223,372,036,854,775,807 (2^63-1)
Data Types (GCC IA32/x86) • C/C++ types are compiler dependent. • Examples of GCC IA32 ordinal types (Pascal type → C/C++ type): • boolean → bool (8-bit) • byte → char (8-bit) • integer → short / short int (16-bit) • longint → int / long / long int (32-bit) • comp → long long (64-bit) • long long is an ordinal type in GCC/G++. • Add “signed / unsigned” before an n-bit ordinal type to get the range [-2^(n-1), 2^(n-1)-1] or [0, 2^n-1].
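Because the sizes above depend on the compiler, it can help to check them instead of memorising them. The following is a minimal sketch; the helper names bits_of and signed_max are illustrative and do not come from the slides.

```cpp
#include <climits>
#include <limits>

// Number of bits occupied by type T on the current compiler.
template <typename T>
constexpr int bits_of() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Largest value of an n-bit signed integer, i.e. 2^(n-1) - 1.
// Assumes 1 <= n <= 64.
constexpr long long signed_max(int n) {
    return static_cast<long long>((1ULL << (n - 1)) - 1ULL);
}
```

For example, signed_max(16) gives 32767, the upper bound of Pascal's integer, and bits_of<int>() reports how wide int really is on your machine.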
Data Types (GCC IA32/x86) • Examples of GCC real types: • float : 32-bit / 4-byte real number • ± 1.2e-38 – 3.4e38 • double : 64-bit / 8-byte real number • ± 2.2e-308 – 1.7e308 • double is enough for most cases. • long double : 96-bit / 12-byte real number • Very precise! It is the 80-bit x87 extended format padded to 12 bytes, so its range matches Pascal's extended: ± 3.4e-4932 – 1.1e4932.
Data Types (How to choose?) • Use ordinal types if possible. • More accurate • Faster for most operations • If you must use a real type, use the most accurate one available • Rounding error is unavoidable • More bits means less accumulated error • Use ordinal types to replace real types • Avoids rounding errors, useful for money calculations • Multiply the “real” number by 10, 100, ... and store the result as an integer
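As a rough illustration of replacing real types with ordinal types, money can be stored as a whole number of cents (the amount multiplied by 100). This sketch assumes non-negative amounts; total_cents and format_cents are hypothetical helper names.

```cpp
#include <string>
#include <vector>

// Add up prices stored as whole cents; integer addition is exact.
long long total_cents(const std::vector<long long>& prices_in_cents) {
    long long total = 0;
    for (long long p : prices_in_cents) total += p;
    return total;
}

// Format a non-negative amount of cents as "dollars.cents", e.g. 1205 -> "12.05".
std::string format_cents(long long cents) {
    long long whole = cents / 100;
    long long part = cents % 100;
    return std::to_string(whole) + "." + (part < 10 ? "0" : "") + std::to_string(part);
}
```

Only the final formatting step deals with the decimal point, so no rounding error can accumulate while summing.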
Data Types (How to choose?) • Be careful of overflow • CHECK the extreme values when you read a question. Do some calculations yourself. • Hint: To avoid careless overflows, use 32-bit integer (longint/int) for most programs, unless memory usage is highly restricted. • Turbo Pascal programs • Restricted memory usage in competitions • Personal observation: • Real types are seldom used in NOI/IOI.
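One way to check extreme values is to compute the worst case in a wider type and compare it with the limits of the type you plan to use. A small sketch, with illustrative names wide_product and fits_in_int32:

```cpp
#include <cstdint>
#include <limits>

// Multiply two 32-bit values in 64-bit arithmetic so the product cannot overflow.
long long wide_product(std::int32_t a, std::int32_t b) {
    return static_cast<long long>(a) * b;
}

// True if v can be stored in a 32-bit signed integer (Pascal longint / GCC IA32 int).
bool fits_in_int32(long long v) {
    return v >= std::numeric_limits<std::int32_t>::min() &&
           v <= std::numeric_limits<std::int32_t>::max();
}
```

With inputs up to 40000 the product 1,600,000,000 still fits in 32 bits, but with inputs up to 50000 the product 2,500,000,000 does not, so a 64-bit type would be needed.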
Data Types (Array) • Useful for storing and processing data. • Arrays and loops usually come together to make programming easy. • Arrays can be multi-dimensional. • var a : array[1..100000] of longint; • var b : array[1..10,0..20,-10..10] of string; • Array b occupies 10 × 21 × 21 × 256 = 1,128,960 bytes of memory. • C/C++ arrays must be 0-based. • How can you represent a and b in C/C++?
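One possible answer to the question above is to keep the Pascal index ranges in mind and shift each index so that it starts at 0. This is only a sketch; the accessor names a_at and b_at are made up for illustration.

```cpp
#include <string>

// Pascal: a : array[1..100000] of longint
static int a[100000];

// Pascal: b : array[1..10, 0..20, -10..10] of string  (10 x 21 x 21 elements)
static std::string b[10][21][21];

// Access a[i] using the Pascal index i in 1..100000.
int& a_at(int i) { return a[i - 1]; }

// Access b[i, j, k] using the Pascal indices i in 1..10, j in 0..20, k in -10..10.
std::string& b_at(int i, int j, int k) { return b[i - 1][j][k + 10]; }
```

Another common trick is to declare the array one element larger (for example int a[100001]) and simply leave index 0 unused.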
Data Types (Record) • A record bundles several pieces of information together into one block of memory.
Pascal:
    var r : record
              i, j : integer;
              k : real;
              a : array[1..3] of string;
            end;
    begin
      r.a[2] := 'Hello!';  { second element, 1-based }
    end.
C++:
    #include <string>
    using namespace std;
    struct RecName {
      int i, j;
      double k;
      string a[3];
    } r;
    int main() {
      r.a[1] = "Hello!";   // second element, 0-based
      return 0;
    }
Data Types (Parallel Arrays) • Apart from using records, we may also use multiple arrays of different types. Elements at the same index in these arrays hold different pieces of information about the same object. • name : array[1..50] of string; • marks : array[1..50] of integer; • Similar C/C++ implementations are: • char name[50][256]; • int marks[50];
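To show how the same index ties parallel arrays together, here is a small sketch that finds the student with the highest mark. The arrays follow the declarations on the slide; the function best_index is an illustrative addition.

```cpp
#include <cstring>

const int N = 50;
char name[N][256];   // name[i] and marks[i] describe the same student
int marks[N];

// Index of the highest mark among the first n students (assumes 1 <= n <= N).
// If several students share the top mark, the earliest one is returned.
int best_index(int n) {
    int best = 0;
    for (int i = 1; i < n; ++i)
        if (marks[i] > marks[best]) best = i;
    return best;
}
```

After the call, name[best_index(n)] is the name that belongs to the best mark, because both arrays are indexed by the same student number.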
String • String is implemented differently in Pascal, C and C++ • Pascal • 1 byte length + 255 bytes array of characters. • Total size is 256 bytes by default. If a length n is specified, the size is n+1 bytes. • var s : string; { size = 256 bytes } • var t : string[20]; { size = 21 bytes } • C • Null-terminated array of characters. • char s[256], t[21]; // 1 more byte for '\0'
String • C++ • Provided by the Standard Template Library (STL) • An object with both data and functions. • The memory storing the string itself behaves like a vector of characters. • Vector is another advanced data structure implemented using OOP code in STL! So forget it ... • C++ string is a bit slower, but you may treat C++ string as being as efficient as C string in terms of run time complexity. • You’ll learn what “run time complexity” is later. • #include <string> • string s,t; // Cannot fix max length
String Operations • Definitions for all examples: • Pascal • var s, t, u, p : string; • C • #include <string.h> or #include <cstring> • char s[256], t[256], u[256], p[256]; • C++ • #include <string> • string s, t, u, p; • s="abcde"; t="12"; u="c"; p="ab12#ab12";
String Operations (Assignment) • Assignment: • Pascal: assign as normal • s := 'abcde'; t := p; { assign a string } • s[3] := '9'; { assign a character } • C: use strcpy() function • strcpy( s, "abcde" ); // assign string s • strcpy( t, p ); // assign string t • s[2] = '9'; // assign a character • C++: assign as normal • s = "abcde"; t = p; // assign a string • s[2] = '9'; // assign a character
String Operations (Get Sub-string) • Get part of a string: • Pascal: use copy() function • s := copy( p, 3, 5 ); { s = '12#ab' } • C: use strncpy() function • strncpy( s, &p[2], 5 ); s[5] = '\0'; // s == "12#ab" • strncpy() does not add the terminating '\0' when the count is reached, so add it yourself. • You have to use some pointer stuff in C! • C++: use string.substr() function • Note the 2 usages of string.substr() • s = p.substr( 2, 5 ); // s == "12#ab" • s = p.substr( 3 ); // s == "2#ab12"
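For C-style strings, the copy and the terminating '\0' can be wrapped into one helper. This is a sketch under the assumption that dst has room for len + 1 bytes; the name substring is illustrative and not a standard function.

```cpp
#include <cstring>

// Copy at most len characters of src, starting at 0-based index start, into dst.
// Always null-terminates dst. dst must have room for len + 1 bytes.
void substring(char* dst, const char* src, std::size_t start, std::size_t len) {
    std::size_t n = std::strlen(src);
    if (start > n) start = n;              // starting past the end gives ""
    if (len > n - start) len = n - start;  // clip to what is actually available
    std::memcpy(dst, src + start, len);
    dst[len] = '\0';
}
```

With p as defined on the slides, substring(s, p, 2, 5) leaves "12#ab" in s.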
String Operations (Get Length) • Get length of string: • Pascal: use length() function • k := length(s); { k = 5 } • c := s[0]; { c = chr(5) } • s[0] gives the length in char data type. • C: use strlen() function • k = strlen(s); // k == 5 • C++: use string.length() function • k = s.length(); // k == 5
String Operations (Concatenation) • Concatenate strings together: • Pascal: use concat() function or + operator • s := concat( s, t, u ); { s = 'abcde12c' } • s := s+t+u; { concat by “+” } • They are totally equivalent. • C: use strcat() function • strcat(s,t); strcat(s,u); // concat one by one • C++: use + operator • s = s+t+u; // normal way • s += t+u; // another way
String Operations (Comparison) • Compare strings: • Compare the characters one by one until not equal or end of string. • Pascal: use >, <, >=, <=, =, <> • s > p { true } • C: use strcmp() function • =0:equal, >0:greater, <0:smaller • k = strcmp(s,p); // k > 0 • C++: use >, <, >=, <=, ==, != • s > p // true
String Operations (Insertion) • Insert another string into a string: • Pascal: use insert() procedure • insert( t, s, 3 ); { s = 'ab12cde' } • The string to insert comes first, the target string second. • C: no single function does this • The standard C library has no string insertion function. • Shift the tail with memmove() (safe for overlapping memory), then copy the new text in with strncpy() or memcpy(). • C++: use string.insert() • s.insert( 2, t ); // s == "ab12cde" • More overloaded .insert() for you to discover!
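Since C has no ready-made insertion function, one has to be built from memory-copy calls. The sketch below is one possible version; insert_at is a hypothetical name, and it assumes that s has enough room for the result and that t does not overlap s.

```cpp
#include <cstring>

// Insert t into s before 0-based position pos.
// s must have room for strlen(s) + strlen(t) + 1 bytes.
void insert_at(char* s, const char* t, std::size_t pos) {
    std::size_t ls = std::strlen(s);
    std::size_t lt = std::strlen(t);
    if (pos > ls) pos = ls;                             // clamp: append at the end
    std::memmove(s + pos + lt, s + pos, ls - pos + 1);  // shift the tail, including '\0'
    std::memcpy(s + pos, t, lt);                        // drop t into the gap
}
```

memmove() is used for the shift because the source and destination regions overlap, which strcpy() and memcpy() do not allow.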
String Operations (Deletion) • Delete a part from a string: • Pascal: use delete() procedure • delete( s, 2, 3 ); { s = 'ae' } • C: no single function does this • The standard C library has no string deletion function. • Move the tail forward over the deleted part with memmove() (safe for overlapping memory). • C++: use string.erase() • s.erase( 1, 3 ); // s == "ae" • More overloaded .erase() for you to discover!
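Deletion in C can be sketched in the same spirit: move the tail of the string forward over the part being removed. The helper name erase_at is illustrative.

```cpp
#include <cstring>

// Remove up to len characters from s, starting at 0-based position pos.
void erase_at(char* s, std::size_t pos, std::size_t len) {
    std::size_t ls = std::strlen(s);
    if (pos >= ls) return;                 // nothing to delete
    if (len > ls - pos) len = ls - pos;    // clip to the end of the string
    // Move the tail (including the terminating '\0') over the deleted part.
    std::memmove(s + pos, s + pos + len, ls - pos - len + 1);
}
```

With s holding "abcde", erase_at(s, 1, 3) leaves "ae", matching the Pascal and C++ examples.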
String Operations (Search Sub-string) • Find a string within a string: • Pascal: use pos() function • k := pos( '12', p ); { k = 3 } • k := pos( 'x', p ); { k = 0 for not found } • C: use strstr() and pointer operations • r = strstr( p, "12" ); // returns &p[2] • k = r - p; // k == 2 • r = strstr( p, "x" ); // returns NULL, r == 0 • C++: use string.find() function • k = p.find( "12" ); // k == 2 • p.find( "x" ) == string::npos // true
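string.find() also accepts a starting position, which makes it easy to collect every occurrence rather than just the first. A minimal sketch, where find_all is an illustrative helper and overlapping matches are counted:

```cpp
#include <string>
#include <vector>

// Return the 0-based positions of every occurrence of pattern in text.
std::vector<std::size_t> find_all(const std::string& text, const std::string& pattern) {
    std::vector<std::size_t> hits;
    if (pattern.empty()) return hits;  // avoid matching the empty string everywhere
    for (std::size_t pos = text.find(pattern);
         pos != std::string::npos;
         pos = text.find(pattern, pos + 1)) {
        hits.push_back(pos);
    }
    return hits;
}
```

For p = "ab12#ab12" and the pattern "12", the positions found are 2 and 7.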
String Operations (Data Conversion) • Data conversion: • Pascal: use val() and str() procedures • val( t, k, e ); { e is an integer for error } • str( k, t ); { error-free to convert k to t } • C: use atoi(), atof(), sscanf(), sprintf(), ... • k = atoi( t ); // k == 12, atof() for double • Remember to #include <stdlib.h> or <cstdlib> • sscanf( t, "%d", &k ); // similar to scanf() • sprintf( s, "%d", k ); // similar to printf() • Remember to #include <stdio.h> or <cstdio> • C++: use string.c_str() or <sstream>
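The <sstream> route mentioned for C++ could look like the following sketch. The function names to_int and to_text are made up for illustration; to_int reports failure with a bool, loosely mirroring the error code of Pascal's val().

```cpp
#include <sstream>
#include <string>

// Convert text to an int. Returns false if text is not exactly one integer
// (surrounding whitespace is tolerated).
bool to_int(const std::string& text, int& out) {
    std::istringstream in(text);
    if (!(in >> out)) return false;  // no number could be read
    char extra;
    return !(in >> extra);           // fail if anything but whitespace follows
}

// Convert an int to its decimal text form.
std::string to_text(int value) {
    std::ostringstream out;
    out << value;
    return out.str();
}
```

With t = "12" from the slides, to_int(t, k) succeeds and stores 12 in k, while a string such as "12x" is rejected.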
String Operations (Misc) • Misc: • Pascal • copy( s, 1, 1 ) <> s[1] { although both are 'a' } • copy( s, 1, 1 ) is a string, s[1] is a char. • The result is truncated if the maximum length is exceeded. • PChar is a pointer to a null-terminated string in heap memory in Turbo Pascal. • AnsiString is a null-terminated string of unlimited length to which all the original string functions can be applied. • var t : ansistring; • Only available in Free Pascal.
String Operations (Misc) • Misc: • C • Be careful not to overwrite the '\0'! • C++ • s.c_str() gives the C string equivalent of s. • Find more fancy stuff yourself!
DaP Techniques (Part I) • Read the question CAREFULLY. • DaP questions are often very complicated and sometimes misleading. Have a clear mind! • Think how you can process the input to give the output. • Algorithm • Decide how the input should be stored. • Data structure
DaP Techniques (Part I) • Annoying things in DaP • Lengthy problem description • Annoying and frustrating • Dirty input • Mixing strings and numbers • Unknown length of data • Complicated processing part • Dirty output • Strange format
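As an illustration of handling dirty input that mixes strings and numbers and has no fixed length, the sketch below splits one line into words and integers. The names Tokens, looks_like_integer and split_tokens are hypothetical, and the numbers are assumed to fit in a long.

```cpp
#include <cctype>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

struct Tokens {
    std::vector<std::string> words;
    std::vector<long> numbers;
};

// True if tok is an optional '-' followed by one or more decimal digits.
bool looks_like_integer(const std::string& tok) {
    std::size_t i = (!tok.empty() && tok[0] == '-') ? 1 : 0;
    if (i == tok.size()) return false;  // empty, or just "-"
    for (; i < tok.size(); ++i)
        if (!std::isdigit(static_cast<unsigned char>(tok[i]))) return false;
    return true;
}

// Split a line on whitespace and sort each token into words or numbers.
Tokens split_tokens(const std::string& line) {
    Tokens result;
    std::istringstream in(line);
    std::string tok;
    while (in >> tok) {  // works for any number of tokens
        if (looks_like_integer(tok))
            result.numbers.push_back(std::strtol(tok.c_str(), nullptr, 10));
        else
            result.words.push_back(tok);
    }
    return result;
}
```

The vectors grow as needed, so the length of the data does not have to be known in advance.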
DaP Techniques (Part I) • Algorithm and data structure cooperate with each other. • Think about both at the same time. • Consider the complexity: • Run time complexity • Memory complexity • Big-O notation is usually used. • Again, will be taught in the future. • Trying to consider the no. of operations in the worst case is acceptable for today.
DaP Techniques (Part I) • Think like a computer! • Do NOT be afraid of using lots of loops, if-then-else’s, multi-dimensional arrays, ... • Get used to writing programs with many levels of nesting NOW. • Important because you WILL have to write such programs throughout your OI training (and all other programming tasks).
DaP Techniques (Part I) • Typical structure of the main body of a program: • Input • Processing • Output • Following the IPO model can make your program more structured. • However, this is not a must.
DaP Techniques (Part I) • Processing: • Input is usually stored for processing later. • But sometimes the values are used only once. • Then we may save time and memory by processing the data while inputting. • In the extreme case, input, processing and output may be mixed together.
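When each value is used only once, it can be processed as it is read instead of being stored in an array. The sketch below keeps only a running count, sum and maximum; Summary and summarize are illustrative names, and the input is assumed to be whitespace-separated integers.

```cpp
#include <istream>
#include <sstream>

struct Summary {
    long long count;
    long long sum;
    long long max;  // meaningful only when count > 0
};

// Read integers until the input ends, updating the summary on the fly.
Summary summarize(std::istream& in) {
    Summary s = {0, 0, 0};
    long long x;
    while (in >> x) {
        if (s.count == 0 || x > s.max) s.max = x;
        s.sum += x;
        ++s.count;
    }
    return s;
}
```

Calling summarize(std::cin) handles input of any length while using a constant amount of memory.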
DaP Techniques (Part I) • OI style is often used in OI programs. • Short variable names. • No need to have meaning. • Usually consists of only 1 character. • No need to write comments. • Write everything in the main program, especially no procedures/functions for things that will be executed only once. • Will be better taught by Unu in the future.
DaP Techniques (Part I) • BUT OI style doesn’t mean unstructured. • Remember, DaP programs are often very complicated. You will get lost if your program is unstructured. • Your program must be maintainable by yourself, at least during the competition! • Appropriate use of procedures and functions. • Good indentation and suitable comments. • Consistent variable/function naming scheme. • May even use OOP techniques!
Exercises (AM) • Let’s do easy things before lunch break! • 30189 – Minesweeper • Example for the class. • 1005 – Napster Cheating • 1006 – Octopus • 30300 – Ecological Premium • 1015 – Parentheses Balance • Will be explained in Data Structure class. • But you may try!
Exercises (PM) • Let’s see some harder examples! • 2031 – Narrow Range Broadband • 2030 – Be Jeweled! • Related to DFS/BFS. Will be explained in future. • But you may try! • 20413 – Up and Down Sequences • 10000 – Somebody Save the Prince • Your DaP ultimate challenge!