
Data Processing Techniques in HKOI: Overview and Methods

Learn about data types, string functions, and processing techniques in HKOI programming, including sorting, calculations, and data manipulation. Understand the importance of selecting the right data type for efficient programming.



  1. HKOI 2004 Team Training Data Processing 1 & String Functions (Intermediate Level)

  2. Overview • Introduction • Data types • String functions • Data processing technique I

  3. Introduction • Data Processing means processing data with some “general” techniques. • Sorting and searching • Simple calculations • Manipulating data in a particular way • In HKOI, we call it DaP to avoid confusion with Dynamic Programming (DyP) or other DPs.

  4. Introduction • Unlike other topics that will be taught in HKOI, there is NO general method for solving a DaP problem. • That is, DaP is not an algorithm. • We classify problems that ask you to deal with some data in a fairly straightforward way as DaP problems.

  5. Introduction • DaP is the foundation of OI programming. (or even computer?) • Trains your ability to: • Analyse and solve problems • Choose the best algorithm & data structure • Write programs that implement your idea • Practice makes perfect. Practising DaP problems is important if you are to write future OI programs efficiently.

  6. Data Types • It is important to select the best data type for writing every program. • Data range • Time needed for each operation • Memory usage

  7. Data Types (Overview) • Categories of data types • Ordinal types: integer, char, boolean, ... • Real types: real, extended, comp, ... • String • Array • Record • Others like pointer, set, enum, object, ... • Pascal and C/C++ types are similar but may not be equivalent, and the same goes for the examples given in these notes.

  8. Data Types (Pascal) • Data types taught in the CE syllabus • integer : 16-bit signed integer • [-2^15, 2^15-1] or [-32768,32767] • real : 48-bit real number • ± 2.9e-39 – 1.7e38 • char : 8-bit character • boolean : 8-bit true or false • string : 256-byte string • 1 byte length and 255 bytes characters

  9. Data Types (Pascal) • Some more useful ordinal types • integer : 16-bit, signed • [-2^15, 2^15-1] or [-32768,32767] • byte : 8-bit, unsigned • [0, 2^8-1] or [0,255] • longint : 32-bit, signed • [-2^31, 2^31-1] or [-2147483648,2147483647] • shortint : 8-bit signed • word : 16-bit unsigned

  10. Data Types (Pascal) • Some more useful real types • real : 48-bit / 6-byte real number • ± 2.9e-39 – 1.7e38 • double : 64-bit / 8-byte real number • ± 5.0e-324 – 1.7e308 • extended : 80-bit / 10-byte real number • ± 3.4e-4932 – 1.1e4932 • comp : 64-bit signed integer • -9,223,372,036,854,775,808 (-2^63) – 9,223,372,036,854,775,807 (2^63-1)

  11. Data Types (GCC IA32/x86) • C/C++ types are compiler dependent. • Examples of GCC IA32 ordinal types: • boolean → bool (8-bit) • byte → char (8-bit) • integer → short / short int (16-bit) • longint → int / long / long int (32-bit) • comp → long long (64-bit) • long long is an ordinal type in GCC/G++. • Add “signed / unsigned” before an n-bit ordinal type to represent [-2^(n-1), 2^(n-1)-1] or [0, 2^n-1].

  12. Data Types (GCC IA32/x86) • Examples of GCC real types: • float : 32-bit / 4-byte real number • ± 1.2e-38 – 3.4e38 • double : 64-bit / 8-byte real number • ± 2.2e-308 – 1.8e308 • double is enough for most cases. • long double : 96-bit / 12-byte real number • Very precise! It is stored as the 80-bit x87 extended format plus padding, so its range matches Pascal's extended: ± 3.4e-4932 – 1.1e4932.

  13. Data Types (How to choose?) • Use ordinal types if possible. • More accurate • Faster for most operations • Use the most accurate real type if possible • Rounding error is unavoidable • More bits means less accumulated error • Use ordinal types to replace real types • Avoids errors, useful for money calculations • Multiply the “real” number by 10, 100, ...
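
The money tip above can be sketched as follows. This is only an illustration: `toCents` and `totalCents` are hypothetical helper names, and the input is assumed to be a non-negative amount written with exactly two digits after the decimal point.

```cpp
#include <string>

// Hypothetical helper: turn a price such as "12.34" into 1234 cents.
// Assumes a non-negative amount with exactly two digits after the point.
long long toCents(const std::string& price) {
    std::size_t dot = price.find('.');
    long long dollars = std::stoll(price.substr(0, dot));
    long long cents = std::stoll(price.substr(dot + 1, 2));
    return dollars * 100 + cents;
}

// Total for n items of the same price, computed entirely in integers,
// so no rounding error can accumulate.
long long totalCents(const std::string& price, int n) {
    return toCents(price) * n;
}
```

For example, `totalCents("0.10", 3)` gives exactly 30 cents, whereas adding 0.1 three times in a real type is not guaranteed to give exactly 0.3.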

  14. Data Types (How to choose?) • Be careful of overflow • CHECK the extreme values when you read a question. Do some calculations yourself. • Hint: To avoid careless overflows, use 32-bit integer (longint/int) for most programs, unless memory usage is highly restricted. • Turbo Pascal programs • Restricted memory usage in competitions • Personal observation: • Real types are seldom used in NOI/IOI.
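
As a rough illustration of checking extreme values, suppose a problem (hypothetical limits) allows a grid of up to 100000 x 100000 cells, or up to 100000 values of up to 2,000,000,000 each. Both results exceed the 32-bit range, so a sketch like the one below widens to 64 bits before the arithmetic happens. The function names are made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Area of a w x h grid. The cast widens w BEFORE the multiplication;
// multiplying two ints first would overflow for large inputs.
long long area(int w, int h) {
    return static_cast<long long>(w) * h;
}

// Sum of 32-bit values kept in a 64-bit accumulator.
long long sumAll(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}
```

The design choice is simply to pick the accumulator type from the worst-case result, not from the type of the inputs.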

  15. Data Types (Array) • Useful for storing and processing data. • Arrays and loops usually come together to make programming easy. • Arrays can be multi-dimensional. • var a : array[1..100000] of longint; • var b : array[1..10,0..20,-10..10] of string; • This array occupies 1,128,960 bytes of memory. • C/C++ arrays must be 0-based. • How can you represent a and b in C/C++?
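
One possible answer to the question on this slide, as a sketch rather than the only way: waste index 0 for the 1-based dimensions and shift the dimension that starts at -10. The accessor `bAt` and the constant `LO` are names invented for this example, and `std::string` stands in for the Pascal string.

```cpp
#include <string>

const int LO = -10;             // lowest Pascal index of b's third dimension

int a[100000 + 1];              // a[1..100000]; a[0] is simply left unused
std::string b[10 + 1][21][21];  // b[1..10][0..20][0..20 after shifting]

// Hypothetical accessor that accepts the original Pascal-style indices
// and shifts the third one from -10..10 into 0..20.
std::string& bAt(int i, int j, int k) {
    return b[i][j][k - LO];
}
```

Wasting one element per 1-based dimension costs a little memory but keeps the indices identical to the problem statement, which tends to prevent off-by-one mistakes.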

  16. Data Types (Record) • A record bundles several pieces of information together into one block of memory.
      Pascal:
        var r : record
          i, j : integer;
          k : real;
          a : array[1..3] of string;
        end;
        begin
          r.a[2] := 'Hello!';
        end.
      C++ (indices are 0-based, so a[1] is the second element):
        #include <string>
        using namespace std;
        struct RecName { int i, j; double k; string a[3]; } r;
        int main() { r.a[1] = "Hello!"; return 0; }

  17. Data Types (Parallel Arrays) • Apart from using records, we may also use multiple arrays of different types. For the same index in these arrays, they represent different information of the same object. • name : array[1..50] of string; • marks : array[1..50] of integer; • Similar C/C++ implementations are: • char name[50][256]; • int marks[50];

  18. String • String is implemented differently in Pascal, C and C++ • Pascal • 1 byte length + 255 bytes array of characters. • Total size is 256 bytes by default. If length n is specified, size is n+1 bytes. • var s : string; { size = 256 bytes } • var t : string[20]; { size = 21 bytes } • C • Null-terminated array of characters. • char s[256], t[21]; // 1 more byte for '\0'

  19. String • C++ • Provided by Standard Template Library (STL) • An object with both data and functions. • The memory storing the string itself is a vector of characters. • Vector is another advanced data structure implemented using OOP code in STL! So forget it ... • C++ string is a bit slower, but you may treat a C++ string as being as efficient as a C string in terms of run time complexity. • You’ll learn what “run time complexity” is later. • #include <string> • string s,t; // Cannot fix max length

  20. String Operations • Definitions for all examples: • Pascal • var s, t, u, p : string; • C • #include <string.h> or #include <cstring> • char s[256], t[256], u[256], p[256]; • C++ • #include <string> • string s, t, u, p; • s="abcde"; t="12"; u="c"; p="ab12#ab12";

  21. String Operations (Assignment) • Assignment: • Pascal: assign as normal • s := 'abcde'; t := p; { assign a string } • s[3] := '9'; { assign a character } • C: use strcpy() function • strcpy( s, "abcde" ); // assign string s • strcpy( t, p ); // assign string t • s[2] = '9'; // assign a character • C++: assign as normal • s = "abcde"; t = p; // assign a string • s[2] = '9'; // assign a character

  22. String Operations (Get Sub-string) • Get part of a string: • Pascal: use copy() function • s := copy( p, 3, 5 ); { s = '12#ab' } • C: use strncpy() function • strncpy( s, &p[2], 5 ); s[5] = '\0'; // s == "12#ab" • strncpy() does not add the '\0' when it copies the full 5 characters, so terminate the result yourself. • You have to use some pointer stuff in C! • C++: use string.substr() function • Note the 2 usages of string.substr() • s = p.substr( 2, 5 ); // s == "12#ab" • s = p.substr( 3 ); // s == "2#ab12"
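
The C-style sub-string extraction can be wrapped in a small helper. This is a minimal sketch: `subString` is a hypothetical name, and the caller is assumed to guarantee that the requested range lies inside `src` and that `dst` is large enough.

```cpp
#include <cstring>

// Hypothetical helper: copy len characters of src, starting at 0-based
// index start, into dst and terminate the result explicitly.
// Assumes start + len <= strlen(src) and dst has room for len + 1 bytes.
void subString(char* dst, const char* src, std::size_t start, std::size_t len) {
    std::strncpy(dst, src + start, len);
    dst[len] = '\0';  // strncpy adds no terminator when it copies len chars
}
```

With `p` holding "ab12#ab12", calling `subString(s, p, 2, 5)` leaves "12#ab" in `s` regardless of what `s` contained before.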

  23. String Operations (Get Length) • Get length of string: • Pascal: use length() function • k := length(s); { k = 5 } • c := s[0]; { c = chr(5) } • s[0] gives the length in char data type. • C: use strlen() function • k = strlen(s); // k == 5 • C++: use string.length() function • k = s.length(); // k == 5

  24. String Operations (Concatenation) • Concatenate strings together: • Pascal: use concat() function or + operator • s := concat( s, t, u ); { s = 'abcde12c' } • s := s+t+u; { concat by “+” } • They are totally equivalent. • C: use strcat() function • strcat(s,t); strcat(s,u); // concat one by one • C++: use + operator • s = s+t+u; // normal way • s += t+u; // another way

  25. String Operations (Comparison) • Compare strings: • Compare the characters one by one until not equal or end of string. • Pascal: use >, <, >=, <=, =, <> • s > p { true } • C: use strcmp() function • =0:equal, >0:greater, <0:smaller • k = strcmp(s,p); // k > 0 • C++: use >, <, >=, <=, ==, != • s > p // true

  26. String Operations (Insertion) • Insert another string into a string: • Pascal: use insert() procedure • insert( t, s, 3 ); { inserts t into s at position 3: s = 'ab12cde' } • C: use strcpy() and strncpy() functions • The standard C library has no such function. • May use a few strcpy() and strncpy() instead. • C++: use string.insert() • s.insert( 2, t ); // s == "ab12cde" • More overloaded .insert() for you to discover!
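
Since the standard C library offers no insertion function, here is one way it might be written by hand. The sketch uses memmove() and memcpy() instead of the strcpy()/strncpy() combination mentioned on the slide, because memmove() is safe for the overlapping shift; `insertAt` is a hypothetical name.

```cpp
#include <cstring>

// Hypothetical helper: insert t into s before 0-based index pos.
// Assumes pos <= strlen(s), that s has room for
// strlen(s) + strlen(t) + 1 bytes, and that t does not point into s.
void insertAt(char* s, const char* t, std::size_t pos) {
    std::size_t ls = std::strlen(s);
    std::size_t lt = std::strlen(t);
    // Shift the tail (including its '\0') right by lt bytes.
    std::memmove(s + pos + lt, s + pos, ls - pos + 1);
    // Drop t into the gap; no terminator is copied here.
    std::memcpy(s + pos, t, lt);
}
```

With `s` holding "abcde" in a 256-byte buffer, `insertAt(s, "12", 2)` produces "ab12cde", matching the Pascal and C++ results on the slide.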

  27. String Operations (Deletion) • Delete a part from a string: • Pascal: use delete() procedure • delete( s, 2, 3 ); { s = 'ae' } • C: use strcpy() and strncpy() functions • The standard C library has no such function. • May use a few strcpy() and strncpy() instead. • C++: use string.erase() • s.erase( 1, 3 ); // s == "ae" • More overloaded .erase() for you to discover!
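
Deletion from a C string can be hand-written in a similar spirit. Again this is only a sketch with a hypothetical name (`eraseAt`), using memmove() because the source and destination ranges overlap.

```cpp
#include <cstring>

// Hypothetical helper: remove len characters from s, starting at
// 0-based index pos. Assumes pos + len <= strlen(s).
void eraseAt(char* s, std::size_t pos, std::size_t len) {
    std::size_t ls = std::strlen(s);
    // Pull the tail (including its '\0') left over the removed part.
    std::memmove(s + pos, s + pos + len, ls - pos - len + 1);
}
```

With `s` holding "abcde", `eraseAt(s, 1, 3)` leaves "ae", the same result as `s.erase( 1, 3 )` in C++.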

  28. String Operations (Search Sub-string) • Find a string from a string: • Pascal: use pos() function • k := pos( '12', p ); { k = 3 } • k := pos( 'x', p ); { k = 0 for not found } • C: use strstr() and pointer operations • r = strstr( p, "12" ); // returns &p[2] • k = r - p; // k == 2 • r = strstr( p, "x" ); // returns NULL, r == 0 • C++: use string.find() function • k = p.find( "12" ); // k == 2 • p.find( "x" ) == string::npos // true
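
The search functions above return only the first match. A common follow-up in DaP problems is to list every match, which can be sketched with the two-argument form of string.find(). `findAll` is a hypothetical name, and overlapping matches are counted in this version.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: 0-based positions of every occurrence of pat
// in text, overlapping occurrences included.
std::vector<std::size_t> findAll(const std::string& text, const std::string& pat) {
    std::vector<std::size_t> hits;
    if (pat.empty()) return hits;  // avoid matching at every position
    std::size_t k = text.find(pat);
    while (k != std::string::npos) {
        hits.push_back(k);
        k = text.find(pat, k + 1);  // resume just after the last hit
    }
    return hits;
}
```

For `p` = "ab12#ab12" and the pattern "12" this yields positions 2 and 7.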

  29. String Operations (Data Conversion) • Data conversion: • Pascal: use val() and str() procedures • val( t, k, e ); { e is an integer error code; 0 means success } • str( k, t ); { error-free to convert k to t } • C: use atoi(), atof(), sscanf(), sprintf(), ... • k = atoi( t ); // k == 12, atof() for double • Remember to #include <stdlib.h> or <cstdlib> • sscanf( t, "%d", &k ); // similar to scanf() • sprintf( s, "%d", k ); // similar to printf() • Remember to #include <stdio.h> or <cstdio> • C++: use string.c_str() or <sstream>
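
The slide names <sstream> for C++ without showing it, so here is a minimal sketch of both directions. `toInt` and `toString` are hypothetical helper names; `toInt` reports failure through its return value, loosely playing the role of the error code in Pascal's val().

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: parse an int from the start of t.
// Returns false if t does not begin with a number.
bool toInt(const std::string& t, int& k) {
    std::istringstream in(t);
    in >> k;
    return !in.fail();
}

// Hypothetical helper: format an int as a string.
std::string toString(int k) {
    std::ostringstream out;
    out << k;
    return out.str();
}
```

Going the other way round with the C functions is also possible, for example `atoi( t.c_str() )` on a C++ string `t`.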

  30. String Operations (Misc) • Misc: • Pascal • copy( s, 1, 1 ) <> s[1] { although both are 'a' } • copy( s, 1, 1 ) is a string, s[1] is a char. • Result truncated if length exceeded. • PChar is a pointer to a null-terminated string in heap memory in Turbo Pascal. • AnsiString is a null-terminated string of unlimited length to which all the original string functions can be applied. • var t : ansistring; • Only available in Free Pascal.

  31. String Operations (Misc) • Misc: • C • Be careful not to overwrite the '\0'! • C++ • s.c_str() gives the C string equivalent of s. • Find more fancy stuff yourself!

  32. DaP Techniques (Part I) • Read the question CAREFULLY. • DaP questions are often very complicated and sometimes misleading. Have a clear mind! • Think about how to process the input to produce the output. • Algorithm • Decide how the input should be stored. • Data structure

  33. DaP Techniques (Part I) • Annoying things in DaP • Lengthy problem description • Annoying and frustrating • Dirty input • Mixing strings and numbers • Unknown length of data • Complicated processing part • Dirty output • Strange format
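
To make "dirty input" concrete, here is a sketch for a made-up format: an unknown number of lines, each holding a name followed by an unknown count of integers. Reading line by line with getline() and then re-parsing each line with a string stream is one common way to cope. The function name `sumNumbers` and the input format are assumptions for this example only.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical task: every line is "<name> <n1> <n2> ...".
// Returns the sum of all numbers on all lines, reading until end of input.
long long sumNumbers(std::istream& in) {
    long long total = 0;
    std::string line;
    while (std::getline(in, line)) {      // unknown number of lines
        std::istringstream fields(line);
        std::string name;
        fields >> name;                   // the leading word (a string)
        long long x;
        while (fields >> x) {             // unknown count of numbers
            total += x;
        }
    }
    return total;
}
```

Passing `std::cin` reads the real input; a `std::istringstream` holding sample data is handy for trying the function out.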

  34. DaP Techniques (Part I) • Algorithm and data structure cooperate with each other. • Think about both at the same time. • Consider the complexity: • Run time complexity • Memory complexity • Big-O notation is usually used. • Again, will be taught in the future. • Estimating the number of operations in the worst case is good enough for today.

  35. DaP Techniques (Part I) • Think like a computer! • Do NOT be afraid of using lots of loops, if-then-else’s, multi-dimensional arrays, ... • Get used to writing programs with many levels of nesting NOW. • Important because you WILL have to write such programs throughout your OI training (and all other programming tasks).

  36. DaP Techniques (Part I) • Typical structure of the main body of a program: • Input • Processing • Output • Following the IPO model can make your program more structured. • However, this is not a must.

  37. DaP Techniques (Part I) • Processing: • Input is usually stored for processing later. • But sometimes the values are used only once. • Then we may save time and memory by processing the data while inputting. • In the extreme case, input, processing and output may be mixed together.
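
Processing while inputting can be sketched as below: the count, sum and maximum of a sequence are updated as each value arrives, so no array is needed. `Stats` and `scan` are names invented for this example, and the input is assumed to be whitespace-separated integers.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying scan() on a string

// Running summary of the values seen so far.
struct Stats {
    int count;
    long long sum;
    int maxValue;  // meaningful only when count > 0
};

// Reads integers until the input ends and updates the summary on the
// fly; each value is used once and never stored.
Stats scan(std::istream& in) {
    Stats s = {0, 0, 0};
    int x;
    while (in >> x) {
        if (s.count == 0 || x > s.maxValue) s.maxValue = x;
        s.sum += x;
        ++s.count;
    }
    return s;
}
```

This only works when each value is needed once; tasks such as sorting still require the whole input to be stored first.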

  38. DaP Techniques (Part I) • OI style is often used in OI programs. • Short variable names. • No need to be meaningful. • Usually consist of only 1 character. • No need to write comments. • Write everything in the main program, especially no procedures/functions for things that will be executed only once. • Will be better taught by Unu in the future.

  39. DaP Techniques (Part I) • BUT OI style doesn’t mean unstructured. • Remember, DaP programs are often very complicated. You will get lost if your program is unstructured. • Your program must be maintainable by yourself, at least during the competition! • Appropriate use of procedures and functions. • Good indentation and suitable comments. • Consistent variable/function naming scheme. • May even use OOP techniques!

  40. Exercises (AM) • Let’s do easy things before lunch break! • 30189 – Minesweeper • Example for the class. • 1005 – Napster Cheating • 1006 – Octopus • 30300 – Ecological Premium • 1015 – Parentheses Balance • Will be explained in Data Structure class. • But you may try!

  41. Exercises (PM) • Let’s see some harder examples! • 2031 – Narrow Range Broadband • 2030 – Be Jeweled! • Related to DFS/BFS. Will be explained in future. • But you may try! • 20413 – Up and Down Sequences • 10000 – Somebody Save the Prince • Your DaP ultimate challenge!
