Templates WTF?
Available online at: http://www.cs.washington.edu/people/acm/tutorials
Prepared for Bart Niswonger’s CSE 326 (Data Structures) class, Su ’01, by Albert Wong (awong@cs)
The immortal question: Why?

What exactly are templates for, and why learn them?

• Limited generic programming (polymorphism)
  Some functions have the same semantic meaning for some (if not all) data types. For instance, a function print() should display a sensible representation of anything passed in. Ideally, it shouldn’t need to be rewritten for each possible type.
• Less repetitive code
  Code that differs only in the data type it handles does not have to be rewritten for each and every data type you want to handle. It’s easier to read and maintain, since one piece of code is used for everything.
• To be a 1337 CXX H@x0R
  Come on. You know you want to be one… and templates are just the sort of obscure C++ thing that C++ H@x0rs must know about. (That and user-defined casts, mutable data members, etc.)
Example: a swap function

Problem: Oftentimes, it is nice to be able to swap the values of two variables. The function’s behavior is the same for all data types. Templated functions let you write it once, and in most cases the calling syntax does not change at all.

Stupid method: write an overloaded function for each type.

Swap for integers:

    void swap(int &a, int &b) { int c = a; a = b; b = c; }

Swap for an arbitrary type T (one more hand-written copy for every T you need):

    void swap(T &a, T &b) { T c = a; a = b; b = c; }

Template method: write one templated function.

    template <typename T>
    void swap(T &a, T &b)
    {
        T c = a;
        a = b;
        b = c;
    }

This function can be used with any type that can be copied and assigned and that can be passed in as a non-const reference.

MSVC lets you call an unqualified swap without writing it yourself. Don’t rely on that! g++ doesn’t provide it. The portable version is the standard library’s std::swap (declared in <algorithm>).
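A minimal sketch of the templated version in use. The names swapValues and swapDemo are assumptions made for this example (swapValues is chosen so it cannot collide with std::swap); they are not from the slides.

```cpp
#include <cassert>
#include <string>

// One definition serves every type that can be copied and assigned.
// (swapValues is a hypothetical name, picked to avoid clashing with std::swap.)
template <typename T>
void swapValues(T &a, T &b)
{
    T c = a;  // copy-construct a temporary
    a = b;
    b = c;
}

// Hypothetical helper: exercises the template with two unrelated types.
void swapDemo()
{
    int i1 = 1, i2 = 2;
    swapValues(i1, i2);                 // generates swapValues<int>
    assert(i1 == 2 && i2 == 1);

    std::string s1 = "left", s2 = "right";
    swapValues(s1, s2);                 // generates swapValues<std::string>
    assert(s1 == "right" && s2 == "left");
}
```

Calling swapDemo() from main runs both swaps; the same source line works for int and for std::string without an overload per type.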
Template Syntax: swap dissected

    template <typename T>
    void swap(T &a, T &b)
    {
        T c = a;
        a = b;
        b = c;
    }

• The template<…> line states that everything in the following declaration or definition is subject to the template. (In this case, that is the definition of the function swap.)
• Inside the angle brackets goes a list of “placeholder variables.” In almost all cases, they are specified with either the typename or the class keyword. In this position the two keywords are equivalent.
• “Placeholder variables” have one value within each template declaration. Think of them as being replaced by whatever type you specify the template to be.

Template behavior: Like most things in C++, templates attempt to behave as if they were a natural part of the language. They fail. There are two ways to use templates, implicit and explicit specialization. Explicit specialization always works. Implicit sometimes works. Why ever use implicit specialization? It’s much cleaner. What are they? That is explained on the next slide.

(A note on terms: the C++ standard calls these two mechanisms “explicit template arguments” and “template argument deduction,” and uses “explicit specialization” for something else. These slides keep the informal terms.)
Template Syntax: Using it

Using a template

    template <typename T>
    void swap(T &a, T &b)
    {
        T c = a;
        a = b;
        b = c;
    }

To use a template, one has to specialize it. This is why it isn’t quite a generic function. It does static polymorphism: it morphs itself to the right type during “preprocess” time, that is, while the program is being built rather than while it runs (explained later!).

Syntax

To explicitly specialize a template, write its name with the arguments for the placeholder variables in angle brackets. This method always works.

Example:

    double d1 = 4.5, d2 = 6.7;
    swap<double>(d1, d2);

A template can, however, auto-sense its placeholder values if everything about what the placeholders represent can be inferred from context (the arguments and, for member functions, the associated class instance). This is called implicit specialization. In the previous case, the compiler is smart enough to figure out that T is double even without the explicit <double>, since the arguments are doubles. Thus this shorthand works:

Example:

    swap(d1, d2);
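The “implicit sometimes works” caveat can be sketched as follows. The template largerOf and the helper callingStyles are hypothetical names invented for this example; the point is only that the compiler cannot infer T when the arguments disagree about it.

```cpp
#include <cassert>

// Hypothetical example template: returns the larger of two values of one type T.
template <typename T>
T largerOf(T a, T b)
{
    return (a < b) ? b : a;
}

// Hypothetical helper showing both calling styles.
double callingStyles()
{
    int i = largerOf(3, 7);               // implicit: both arguments are int, so T = int
    double d = largerOf<double>(3, 4.5);  // explicit: 3 is converted to 3.0

    // largerOf(3, 4.5);  // would not compile: T cannot be both int and double

    assert(i == 7);
    return d;
}
```

Writing the <double> out is what resolves the ambiguity, which is why the explicit form is the one that always works.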
How they Work: Compilation

[Diagram: the build pipeline]
Source code (text: .c, .h) -> Preprocessor -> C/C++ code (text) -> Compiler -> Object code (bin) -> Linker (plus Libraries) -> Native executable (bin)

• Preprocessor: resolves #define, #include, comments, templates.
• Compiler: translates code to machine language. It outputs an “object file.” This code is not executable.
• Linker: takes object files and resolves references to functions and variables in different files.

In MSVC, all these steps are hidden behind a button with a red exclamation mark. This is bad.

A build has at least two stages, and in C++ it has three. When you build an executable from a C++ source file, the preprocessor removes all the things listed under “Preprocessor.” The result is pure C++ code (no comments, templates, #includes, etc.). This code is then compiled and linked. All good programmers understand this process well.
How they Work: Compiler, Linker

What does it mean to compile?
• The term “compile” is somewhat ambiguous. Often, when people say “compile” they mean “build.” In the formal sense, it means translating one language into another. With C++, this generally means turning C++ source code into machine code.
• Each C++ source file is usually compiled into an object file that contains the code of all the functions it defines. At this point, if you call a function from a library or another file, the object code (the stuff in the object file) only says “this function exists somewhere and I want to use it.”
• This is formally what the compiler does. When you get syntax errors, this is usually the compiler talking to you (as opposed to the linker or preprocessor).

What does it mean to link?
• After compilation, all the object files need to be linked together into a final executable. Every “I want this function” stub in the object files has to be resolved to an actual block of machine code, and the resulting executable has to be formatted in a way the operating system can understand.
• If you write a prototype for a function but forget to define it, your code will compile but it won’t link. Link errors are usually harder to track down, as the linker can’t always give you line numbers (the linker only looks at the object files and knows nothing about the original source).
How they Work: Preprocessor

What is the preprocessor?
• The preprocessor, formally, deals with the directives that start with a # sign (like #include and #define). Here, however, the term will be used to mean everything that happens before the compiler gets the stuff to turn into machine code.
• The relevant (in relation to templates) things the preprocessor does are:
  • Replaces all #include statements with the contents of the files they refer to.
  • Removes all comments.
  • Replaces all #define macros with their values.
  • Generates actual code from templates.

Templates do not exist! (there is no spoon)
• Templates are a preprocessor construct, in the loose sense defined above. They are cookie-cutters with which real C++ code is generated. When a template is used (that is, specialized, implicitly or explicitly), it gets instantiated.
• The instantiation tells the preprocessor to create a version of the template where each placeholder is replaced by its specialization. At this point, the specific version of the template comes into existence and can be compiled. It does not exist otherwise!
• In a very real way, a template just does a search and replace for each type you specialize the template for. In essence, you are doing the same as writing a bunch of overloaded functions. It’s just done for you, behind your back.

(Strictly speaking, in a real toolchain the compiler proper does the instantiating, after the #-directive pass has finished; the formal preprocessor knows nothing about templates. The mental model above still holds.)
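One way to see that each specialization really is its own generated function is to give the template a function-local static. The sketch below assumes nothing from the slides beyond the idea of instantiation; callCount and countersAreSeparate are hypothetical names.

```cpp
#include <cassert>

// Hypothetical template: each instantiation gets its own function-local static,
// which shows that callCount<int> and callCount<double> are separate functions.
template <typename T>
int &callCount()
{
    static int count = 0;
    return count;
}

// Hypothetical helper: returns true if the two counters live at different addresses.
bool countersAreSeparate()
{
    int before = callCount<double>();
    callCount<int>() += 2;                   // touches only the int version's counter
    assert(callCount<double>() == before);   // the double version is unaffected
    return &callCount<int>() != &callCount<double>();
}
```

If the template were a single function shared by all types, both calls would return a reference to the same counter. They do not, just as two hand-written overloads would not.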
How they Work: Consequences

Problem: Templates are resolved in the preprocessing stage, so they don’t exist to the compiler until they get instantiated. This is the balance between trying to make templates work transparently and trying to make things efficient. As usual, C++ fails and just makes things inconsistent.

Effects:
• Template code will not get compiled until it is used (and thus instantiated). Thus, the compiler will not catch errors inside a template (in particular, anything that depends on the placeholder type) until the template is used.
• A specialization (a place where you actually use the template) instantiates all relevant templates that appear before it. If a template appears after a specialization, it doesn’t get instantiated, and thus it is not compiled.

Conclusion: To make things work, all relevant template definitions must appear before at least one specialization. Otherwise, parts of the template will not get instantiated and compiled.

Fixes: There are ways to deal with this. They will be introduced later. But first, let us journey to the pits of hell… err… I mean templated classes.
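A short sketch of the first effect: an error that depends on the placeholder type stays invisible until some code asks for that particular version. The names lengthOf and useWithString are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical template: only valid for types that have a size() member.
// Nothing here is checked against a concrete type until the template is used.
template <typename T>
std::size_t lengthOf(const T &x)
{
    return x.size();
}

// Hypothetical helper: the only version requested is lengthOf<std::string>.
std::size_t useWithString()
{
    std::string s = "abc";
    assert(!s.empty());

    // lengthOf(42);   // uncommenting this requests lengthOf<int>, and only
    //                 // then does the compiler complain that int has no size()

    return lengthOf(s);   // fine: std::string has size()
}
```

As written the file builds cleanly, even though the template is wrong for most types. The mistake surfaces only at the line that uses it with such a type.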
Class Templates: Class Definition

Syntax: Templated classes basically follow the same syntax as templated functions. However, the rules by which templated classes can infer their specialization (see Template Syntax) are a bit more convoluted.

Before moving on, a bit of review on templated functions: are the following two templates equivalent?

    template <typename T>
    void swap(T &a, T &b)
    {
        T c = a;
        a = b;
        b = c;
    }

    template <class C>
    void swap(C &a, C &b)
    {
        C c = a;
        a = b;
        b = c;
    }

Answer: Yes, they are equivalent. This may be relevant when writing class templates, as a situation may arise where two definitions are written for the same thing. If this happens, the program will not build, since there are two equivalent function definitions. The name of the placeholder doesn’t matter, and “typename” and “class” can be used interchangeably. Just something to remember.
Class Templates: Example

Example: A templated, dynamic, 2-dimensional array (Matrix)*

File: Matrix.h

    #ifndef MATRIX_H
    #define MATRIX_H

    template <typename T>
    class Matrix
    {
    public:
        Matrix(int rows, int cols);
        Matrix(const Matrix &other);
        virtual ~Matrix();

        Matrix& operator=(const Matrix &rhs);
        T* operator[](int i);

        int getRows() const;
        int getCols() const;

    protected:
        void copy(const Matrix &other);

    private:
        Matrix();

        int m_rows;
        int m_cols;
        T *m_linArray;
    };

    #endif /* MATRIX_H */

Notice that the only addition to the class definition is the line:

    template <typename T>

Within the definition block, the placeholder can be used as a data type. When the template is specialized, it takes on the value of the specialization.

The header is pretty pedestrian. Let’s have some fun. On to the fiery pits of class implementation.

*A commented version of this code is provided separately. It wouldn’t fit on the slide.
Class Templates: Example cont’d

File: Matrix.cc

    #include "Matrix.h"

    // Private default constructor: start out empty so the destructor is safe.
    template <typename T>
    Matrix<T>::Matrix() : m_rows(0), m_cols(0), m_linArray(0) {}

    template <typename T>
    Matrix<T>::Matrix(int rows, int cols)
    {
        m_rows = rows;
        m_cols = cols;
        m_linArray = new T[m_rows * m_cols];
    }

    template <typename T>
    Matrix<T>::Matrix(const Matrix &other)
    {
        copy(other);
    }

    template <typename T>
    Matrix<T>::~Matrix()
    {
        delete[] m_linArray;
    }

    template <typename T>
    Matrix<T>& Matrix<T>::operator=(const Matrix &other)
    {
        if( this != &other ) {
            delete[] m_linArray;   // release the old storage first
            copy(other);
        }
        return *this;
    }

    // Returns a pointer to the start of row i, so m[i][j] works.
    template <typename T>
    T* Matrix<T>::operator[](int i)
    {
        return m_linArray + (i*m_cols);
    }

    // Allocates fresh storage and copies other's elements into it.
    template <typename T>
    void Matrix<T>::copy(const Matrix &other)
    {
        m_rows = other.m_rows;
        m_cols = other.m_cols;
        int size = m_rows * m_cols;
        m_linArray = new T[size];
        for( int i=0; i < size; i++ ) {
            m_linArray[i] = other.m_linArray[i];
        }
    }

    template <typename T>
    int Matrix<T>::getRows() const
    {
        return m_rows;
    }

    template <typename T>
    int Matrix<T>::getCols() const
    {
        return m_cols;
    }

The next slide explains all this. It wouldn’t fit on this slide.
Class Templates: Member Functions Dissected

Again, a templated class name by itself has no meaning (e.g., Matrix by itself means nothing). It only gets meaning through specialization, explicit or implicit. Thus, when referring to an instance of a templated class (a specific specialization), the class name must be explicitly specialized.

    template <typename T>
    Matrix<T>& Matrix<T>::operator=(const Matrix &other)
    {
        if( this != &other ) {
            delete[] m_linArray;
            copy(other);
        }
        return *this;
    }

• Matrix<T>& (the return type): Notice that the specialization region does not include the return type. Thus the return type needs explicit specialization.
• Matrix<T>:: : everything after this is in the specialization region of Matrix<T>.
• const Matrix &other: Here, the template has been implicitly specialized by its context. It is within the specialization region of the class scope, so it does not need the template arguments.
• For a class definition, the specialization region is the class block.
• This may be obvious, but remember that though constructors and destructors have the same name as the class template, they are functions and do not need to be specialized.
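The same rules can be tried out on something much smaller than Matrix. The class Box and the helper assignDemo below are hypothetical, made up for this sketch; the comments mark where <T> has to be written and where it can be left off.

```cpp
#include <cassert>

// Hypothetical class template used only to show where <T> must be written.
template <typename T>
class Box
{
public:
    explicit Box(const T &value);
    Box &operator=(const Box &rhs);   // inside the class block, plain Box means Box<T>
    const T &get() const;

private:
    T m_value;
};

// The constructor name is a function name, so it is written Box, not Box<T>.
template <typename T>
Box<T>::Box(const T &value) : m_value(value) {}

// The return type comes before Box<T>::, so it must be spelled Box<T>&.
// The parameter comes after Box<T>::, so plain Box is enough there.
template <typename T>
Box<T> &Box<T>::operator=(const Box &rhs)
{
    if (this != &rhs) {
        m_value = rhs.m_value;
    }
    return *this;
}

template <typename T>
const T &Box<T>::get() const
{
    return m_value;
}

// Hypothetical helper exercising the assignment operator.
int assignDemo()
{
    Box<int> a(1), b(2);
    a = b;
    assert(a.get() == b.get());
    return a.get();
}
```

Dropping the <T> from the return type of operator= is the quickest way to see the rule in action: the compiler no longer knows which Box is meant.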
Class Templates: Dark Arts (usage)

Syntax
• Templated classes must be explicitly specialized. Thus, to create a 2-dimensional Matrix of doubles using the last example, the syntax would be:
      Matrix<double> m(3,3);
• This specialization during declaration in reality creates a new type, namely Matrix<double>. This should be thought of as its own type, separate from any other specialization of Matrix (so it is different from Matrix<int>, Matrix<Foo>, etc.). At this point, the instance behaves like any other instantiated type, at least for compilation.

Now that you have the basics…

So, young Jedi, you want to become a template master. You think you know all the tricks. Little do you know, your programming life now hangs in the balance. For what comes next is the real lesson. The coming information is what makes people shiver in fear when the word template is mentioned. Listen closely to what is presented next, lest the subtleties stay shrouded in shadows. Now is the time to learn of convention, for once you break from convention, you enter the dark side of template hell, and forever shall it dominate your path… Mu ha ha ha
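That each specialization is its own type can be checked directly. This sketch uses a hypothetical stand-in called Grid instead of Matrix, and relies on std::is_same from <type_traits>, which assumes a C++11 or later compiler (newer than the ones these slides target).

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for Matrix, kept tiny on purpose.
template <typename T>
struct Grid
{
    T cell;
};

// Hypothetical helper: true when A and B name the same type.
// std::is_same comes from <type_traits> and needs C++11 or later.
template <typename A, typename B>
bool sameType()
{
    return std::is_same<A, B>::value;
}

// Hypothetical helper: Grid<int> and Grid<double> are unrelated types.
bool gridDemo()
{
    Grid<int> gi = { 3 };
    Grid<double> gd = { 3.0 };
    // gi = gd;   // would not compile: no conversion between the two types
    assert(gi.cell == 3 && gd.cell == 3.0);
    return sameType<Grid<int>, Grid<int> >() && !sameType<Grid<int>, Grid<double> >();
}
```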
Shotgun Safety: Danger Awareness (Identifying the danger)

Problem: Templates do not exist until you use them. They must be instantiated. Unless this is done explicitly, instantiation occurs at the first usage, for all template definitions known at that point. Thus, consider this example.

    /* main.cc */
    #include <iostream>
    using namespace std;
    #include "Matrix.h"

    int main(void)
    {
        Matrix<int> m1(3,4);
        cout << m1.getRows() << endl;
    }

Compile with:

    g++ -Wall -ansi main.cc Matrix.cc

Looks innocent, but it won’t link.

Quiz: What won’t link and why?
• The link error happens with m1.getRows().
• Nothing from a template gets instantiated until it is either used or explicitly instantiated.
• Matrix<int>::getRows() const does not get created until it is used at the line with m1.getRows().
• The definition of the function is in Matrix.cc and is never used there.
• Thus the definition never gets created and compiled to object code.

Note: The compile line is actually wrong! The file Matrix.cc contains only template code. Since that code is never used there, it never generates object code and shouldn’t be compiled.
Shotgun Safety: 3 conventions (know the routines)

There are three conventions for avoiding the link problem:
• Write all the code inline in the .h files.
• Do the same as above, but kind of fake it: write an implementation file with your implementation and #include the implementation file in your header file.
• Write the template as you would a normal class (using a header and an implementation file). Then create a new source file and #include the template implementation file there. This new file is the one you compile, not the template implementation. (See the next slide for an example.)

The first two methods have the problem that any time the implementation of a function changes, all code that uses it must be recompiled (not just relinked). This is very slow on large builds. Also, the build process will instantiate the template many more times than necessary, which is a waste of time and space. The third method is free from such problems. It also avoids some other hurdles, since it forces the instantiation of everything at one point.
Shotgun Safety: An example (faithfully practiced, prevents accidental loss of feet)

The proper procedure:
• Write the template, separated into a header and an implementation file.
• Create an instantiation file for the template which includes the implementation file.
• Compile the instantiation file and not the template implementation file.
• The instantiation file generates the object code for the template.

    /* main.cc */
    #include <iostream>
    using namespace std;
    #include "Matrix.h"

    int main(void)
    {
        Matrix<int> m1(3,4);
        cout << m1.getRows() << endl;
    }

Example: To make the previously unlinking piece of code link properly, it is necessary to instantiate an integer version of the Matrix template. The file would simply look like this. Notice that the implementation (not header) file is included.

    /* MatrixInst.cc */
    #include "Matrix.cc"

    template class Matrix<int>;

The last line forces the instantiation of the Matrix class template, as well as all its member functions, for the specialization int. Other specializations require their own lines.

Compile line:

    g++ -Wall -ansi main.cc MatrixInst.cc
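The three-file layout can be sketched in a single listing, with comments marking where each part would live. Counter, its file names, and countToSix are all hypothetical, invented for this example; only the `template class …;` line is the technique itself.

```cpp
#include <cassert>

// ---- what would live in the header (hypothetical Counter.h) ----
template <typename T>
class Counter
{
public:
    Counter();
    void add(const T &amount);
    T total() const;

private:
    T m_total;
};

// ---- what would live in the implementation file (hypothetical Counter.cc) ----
template <typename T>
Counter<T>::Counter() : m_total() {}

template <typename T>
void Counter<T>::add(const T &amount) { m_total += amount; }

template <typename T>
T Counter<T>::total() const { return m_total; }

// ---- what would live in the instantiation file (hypothetical CounterInst.cc) ----
// Explicit instantiation: generates every member of Counter<int> right here,
// whether or not anything in this file calls them.
template class Counter<int>;

// ---- a user of the template (would see only the header) ----
int countToSix()
{
    Counter<int> c;
    c.add(1);
    c.add(2);
    c.add(3);
    assert(c.total() == 6);
    return c.total();
}
```

Because the explicit instantiation compiles every member function for int, a mistake in a member that nothing calls yet is reported immediately instead of at first use.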
Poison detection: Pop Quiz (be aware, so you know before it’s too late)

Will this compile?

    /* Foo.h */
    #ifndef FOO_H
    #define FOO_H

    template <typename T>
    class Foo
    {
        Foo() { b = "Hello Mom!"; }
    };

    #endif /* FOO_H */

• The unfortunate answer is yes, it will compile. Even though b is undeclared, it will “compile” because nothing actually instantiates the template, so the compiler never sees the template code.
• This is why not explicitly forcing the instantiation of class templates is dangerous. You won’t know about an error until you use the template.

(This holds for the compilers these slides target, MSVC 6.0 and g++ 2.x. A compiler that implements the standard’s two-phase name lookup rejects a name like b, which does not depend on T, as soon as it parses the template. Errors that do depend on T still wait for instantiation.)

Because of this, some people believe that it is better to write a non-templated version of a class first, and then make a template from that. Some beliefs, however, are just wrong. That is one of them. If templates are done properly with forced instantiation (see previous slide), then this scenario will not occur.
Poison detection: C++ Sanity (avoid mercury and bad C++ code)

At this point, most of the important concepts of templates have been covered, barring static data members and inheritance. Those are fairly simple extensions of the ideas presented and are left up to the reader. The rest of these slides will focus on C++ quirks and pitfalls, specifically those of Microsoft Visual Studio 6.0 and g++.

C++ and general programming sanity:
• Be anal. With all class design, especially templates, strict C++ style will help reduce debugging time. It will save you time later.
• Use const, especially with references! Don’t pass classes by value; this pulls in the overhead of a copy constructor. Pass a const reference instead. The two are equally safe for the caller. Also, straight literals (like the number 1) can only be passed by const reference or by value.
• Only expose as much implementation as you have to. If you use an array as the internal data structure for a queue, there should be no function that returns a raw pointer to the array. There should also be no member function that gives random access to the elements. Probably only an enqueue and a dequeue function should be there.
• Don’t mix input, output, and interpretation if at all possible.
• KISS: Keep it simple, stupid.
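A small sketch of the const-reference advice. The functions sumOf, twice, and constRefDemo are hypothetical names for this example.

```cpp
#include <cassert>
#include <vector>

// Takes the vector by const reference: no copy is made, and the function
// cannot modify the caller's data.
int sumOf(const std::vector<int> &values)
{
    int total = 0;
    for (std::vector<int>::size_type i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}

// A const reference can bind to a literal; a non-const reference cannot.
int twice(const int &x) { return 2 * x; }
// int bump(int &x);   // a call like bump(1) would not compile

// Hypothetical helper tying the two together.
int constRefDemo()
{
    std::vector<int> v(3, 5);   // three elements, each equal to 5
    assert(twice(1) == 2);      // literal passed by const reference
    return sumOf(v);
}
```

Passing the vector by value would give the same answer but copy all of its elements on every call.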
Poison detection: MSVC and g++ quirks (be wary of the quirky one…)

Microsoft Visual Studio and g++ are both slightly non-standard in their own fashions. MSVC tends to be a whole lot more broken in general, however. If you need to develop code for one, do it on that platform from start to end. Unless you know exactly how to compensate for the quirks, the bugs in your program will be very hard to find when you port. Do not port. CSE 326 students are notorious for writing broken code in MSVC and then not finishing their projects because, when they port to g++, they find errors everywhere in their code. Remember, at this level, it is rarely the compiler’s fault (except for MSVC being too lax).

Basic MSVC quirks:
• The for-loop scope is broken in MSVC. Make sure you get it right. If you write for(int i=0; i < 10; i++) { }, then on any decent compiler the ‘i’ ceases to exist after the closing brace of the for loop. In MSVC it lives on in the enclosing scope.
• The STL implementation in MSVC, and the version documented in MSDN help, are very wrong. Find some other reference.
• There are many functions in MSVC and MSDN that don’t exist in standard C++. Make sure MSDN says the function is ANSI compliant.
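The for-loop quirk can be sidestepped entirely. In this sketch, sumTwice uses a pattern that does not care where a loop-declared variable stops existing, while sumTwiceStandard relies on the standard rule; a compiler that leaks ‘i’ out of the first loop would see the second declaration as a redefinition. Both function names are hypothetical.

```cpp
#include <cassert>

// Portable under both scoping rules: the loop variable is declared once,
// outside the loops, so nothing depends on where a loop-declared i ends.
int sumTwice(int n)
{
    assert(n >= 0);
    int total = 0;
    int i;
    for (i = 0; i < n; i++) { total += i; }
    for (i = 0; i < n; i++) { total += i; }
    return total;
}

// Standard scoping: each i lives only inside its own loop, so declaring it
// again in the second loop is fine on a conforming compiler.
int sumTwiceStandard(int n)
{
    assert(n >= 0);
    int total = 0;
    for (int i = 0; i < n; i++) { total += i; }
    for (int i = 0; i < n; i++) { total += i; }   // a second, unrelated i
    return total;
}
```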
Poison detection: g++ (a good detector helps)

g++ quirk:
• The fully compliant C++ library is still in development, so currently there is a hybrid version. In g++, the standard C++ libraries are not wrapped in namespace std. This means that leaving out the using directive (“using namespace std”) makes no difference. However, still write it, since this will change in the near future.

g++ hints:
• Use the -Wall flag. This enables all warnings so that g++ catches more errors. Remember, the harsher the compiler, the more likely the resulting program will run okay. (Read Dune, by Frank Herbert.)
• Use the -ansi flag. This ensures ANSI compatibility. Use it because it will disable almost all g++ extensions to C++ and make your code more portable.
• Pipe the result into less. If using bash, the compile command line should look something like:
      g++ -Wall -ansi *.cc 2>&1 | less
  This will let you scroll up and down through all the error messages. With templates, you’ll probably need it.
Compilation woes: g++

g++ phrases its errors differently from MSVC. Here are some hints on how to decipher the ones regarding templates.

Things that look like this are compile errors. This one is generated by the line cout << f; where f is an instance of Foo.

    test.cc:7: no match for `_IO_ostream_withassign & << Foo &'
    /usr/include/g++-2/iostream.h:77: candidates are: ostream::operator <<(char)
    /usr/include/g++-2/iostream.h:78: ostream::operator <<(unsigned char)
    /usr/include/g++-2/iostream.h:79: ostream::operator <<(signed char)
    …

g++ tries to be helpful by listing possible functions that may match what you wanted. This generally just clutters up the screen.

Things that look like this are link errors. This one means the linker couldn’t find the object code for the function.

    /tmp/ccYZzGqm.o: In function `main':
    /tmp/ccYZzGqm.o(.text+0x3e): undefined reference to `void swap<int>(int &, int &)'
    collect2: ld returned 1 exit status

Notice that this is a templated function, and that it has a specialization. The compiler and linker will only ever deal with specialized templates.
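An “undefined reference” to a specialized function template is cured the same way as for a class: force the instantiation somewhere that gets compiled. The sketch below shows the function-template form of that line. The name exchange (used instead of swap to avoid any clash with the standard library) and the helper exchangeDemo are hypothetical, and the file boundaries are only marked by comments.

```cpp
#include <cassert>

// ---- what callers see (hypothetical header): a declaration only ----
template <typename T>
void exchange(T &a, T &b);

// ---- the definition (hypothetical implementation file) ----
template <typename T>
void exchange(T &a, T &b)
{
    T c = a;
    a = b;
    b = c;
}

// Explicit instantiation of the function template for int. In a multi-file
// build, this line is what puts the code for exchange<int> into an object file.
template void exchange<int>(int &, int &);

// ---- a caller (would include only the header) ----
int exchangeDemo()
{
    int x = 1, y = 2;
    exchange(x, y);   // refers to exchange<int>; the linker must find its code
    assert(x == 2 && y == 1);
    return x;
}
```

In a real project the caller sees just the declaration, so without the explicit instantiation line the call compiles and then fails at link time with a message like the one above.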
Compilation woes: template errors

Templates generate very long errors. This error was generated by misusing the STL vector class. It is a simple mistake, but without some experience, deciphering the error is hard. Read the error forwards for a ways and then backwards. Most of the middle you can ignore. What you want is the line number, the error, and roughly what the function is called. The rest of the template specialization stuff is generally useless. (The important portions are bolded on the original slide.)

    test.cc: In function `int main()':
    test.cc:17: no matching function for call to `vector<basic_string<char,string_char_traits<char>,__default_alloc_template<true,0> >,__default_alloc_template<true,0> >::push_back (int)'
    /usr/include/g++-2/stl_vector.h:144: candidates are: vector<basic_string<char,string_char_traits<char>,__default_alloc_template<true,0> >,__default_alloc_template<true,0> >::push_back<string, alloc>(const string &)

In this case, I created a vector of strings (vector<string>) and then tried to call the push_back function with an integer. Hence it says, for line 17 of test.cc: “no matching function for call to vector<…>::push_back(int); candidates are vector<…>::push_back(const string &)”.

*Note: MSVC complains if your template name gets beyond 255 characters. Their internal libraries sometimes do this. Why? Because it’s stupid.
Compilation woes: template errors cont’d

Templates have to be instantiated. They don’t get checked for errors until they are instantiated. When an error is found in a template, the line where it is instantiated is given first. Then the line of the error itself is given. You usually don’t care too much about the first line.

    template <typename T>
    class Foo {
    public:
        Foo() {
            b = "Hello Mom!";
        }
    };

    int main(void) {
        Foo<int> f;
        return 0;
    }

    test.cc: In method `Foo<int>::Foo<int>()':
    test.cc:10: instantiated from here
    test.cc:5: `b' undeclared (first use this function)
    test.cc:5: (Each undeclared identifier is reported only once
    test.cc:5: for each function it appears in.)

• The “instantiated from here” line states where the instantiation happened. It does not state what the error is.
• The “`b' undeclared” line is the actual error.

Thus ends the crash course through templates. Templates are very powerful, and once one gets used to them, they are very nice. Hopefully this helps clear some things up.