
A couple of slides on containers…

A couple of slides on containers… Federico Carminati, Offline Week 10-05. Generalities. The purpose of a container is to hold several instances of similar information. Elements in a container are accessed via an index, an iterator or both. Three kinds of containers will be considered.


Presentation Transcript


  1. A couple of slides on containers… Federico Carminati Offline Week 10-05

  2. Generalities
     • The purpose of a container is to hold several instances of similar information
     • Elements in a container are accessed via an index, an iterator or both
     • Three kinds of containers will be considered
       • “C-style” arrays
       • ROOT containers
       • STL containers

  3. C-style containers

     #include <stdio.h>

     void cont1 ()
     {
       struct point {
         Float_t x;
         Float_t y;
       };
       point P[100];
       printf("Sizeof P = %ld\n",sizeof(P));
     }

     root [5] .x cont1.C++
     Sizeof P = 800

  4. C-style containers
     • Advantages
       • Minimum size overhead
       • Fast access (direct and sequential)
       • Very clear semantics
     • Drawbacks
       • Lack of “encapsulation”
       • Minimal functionality (I/O, browsing…)
       • Fixed dimension
       • No safety against out-of-bounds addressing
     • Where to use
       • Data structures within algorithms
     • Where to avoid
       • For dynamic data structures
       • When I/O and inspection are required
       • For publicly accessible data (i.e. outside a single method)
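The "no safety against out-of-bounds addressing" drawback can be contained as long as the array stays private to one algorithm. The following is a minimal sketch in plain C++, not ROOT code: `Point`, `kNPoints`, `checkedAt`, `at` and `storageBytes` are hypothetical names introduced here, and `float` stands in for `Float_t`.

```cpp
#include <cassert>
#include <cstddef>

// Plain struct in the spirit of the slide's `point` (float stands in for Float_t).
struct Point {
    float x;
    float y;
};

const std::size_t kNPoints = 100;

// Returns a pointer to element i, or nullptr when i is out of range.
// A raw array performs no such check on its own.
Point* checkedAt(Point (&arr)[kNPoints], std::size_t i) {
    return (i < kNPoints) ? &arr[i] : nullptr;
}

// Debug-build guard: the assert stops the program on a bad index
// instead of silently touching memory outside the array.
Point& at(Point (&arr)[kNPoints], std::size_t i) {
    assert(i < kNPoints);
    return arr[i];
}

// Total storage of the array: exactly kNPoints elements, nothing else.
std::size_t storageBytes() {
    return sizeof(Point[kNPoints]);
}
```

Both helpers take the array by reference, so its dimension is still known to the compiler; once the array decays to a bare pointer that information is lost, which is one reason to keep such arrays inside a single method.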

  5. Array of classes

     #include <stdio.h>
     #include <TObject.h>

     void cont1 ()
     {
       class Cpoint : public TObject {
       public:
         Cpoint() {}
         ~Cpoint() {}
         Float_t X() const {return fX;}
         Float_t Y() const {return fY;}
       private:
         Float_t fX;
         Float_t fY;
       };
       Cpoint CP[100];
       printf("Sizeof CP = %ld\n",sizeof(CP));
     }

     root [7] .x cont1.C++
     Sizeof CP = 2000

  6. Array of classes
     • Advantages
       • Fast access (direct and sequential)
       • Very clear semantics
       • Full “ROOT” functionality (I/O, browsing)
       • C++ “encapsulation”
     • Drawbacks
       • 12 bytes overhead per object
       • Fixed dimension
       • No safety against out-of-bounds addressing
     • Where to use
       • Data structures within algorithms
       • Class members with fixed dimensions
     • Where to avoid
       • For dynamic data structures
       • When objects are small
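The per-object overhead follows from the two sizes printed on the slides: 2000 bytes for 100 `Cpoint` objects against 800 bytes for 100 plain structs, i.e. 12 bytes per object. The exact figure depends on the platform (pointer size, padding). A rough sketch of how one might measure it without ROOT, assuming a hypothetical `BaseLike` class as a stand-in for a `TObject`-like base (a virtual destructor plus two 32-bit words); it is not the real `TObject` layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a TObject-like base class:
// a vtable pointer plus two 32-bit data words.
class BaseLike {
public:
    virtual ~BaseLike() {}
private:
    std::uint32_t fUniqueID = 0;
    std::uint32_t fBits = 0;
};

// The payload alone, as in the C-style example.
struct PlainPoint {
    float x;
    float y;
};

// The same payload wrapped in a class deriving from the base.
class ObjPoint : public BaseLike {
public:
    float X() const { return fX; }
    float Y() const { return fY; }
private:
    float fX = 0;
    float fY = 0;
};

// Extra bytes paid by every single element of an array of ObjPoint.
std::size_t perObjectOverhead() {
    assert(sizeof(ObjPoint) >= sizeof(PlainPoint));
    return sizeof(ObjPoint) - sizeof(PlainPoint);
}

// Extra bytes for an array of n such objects: grows linearly with n.
std::size_t arrayOverhead(std::size_t n) {
    return n * perObjectOverhead();
}
```

Because the overhead is paid once per element, it weighs most when the payload is only a few bytes, which is the "when objects are small" case listed under "Where to avoid".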

  7. Classes of arrays

     #include <stdio.h>
     #include <TObject.h>

     void cont1 ()
     {
       class CApoint : public TObject {
       public:
         CApoint() {}
         ~CApoint() {}
         Float_t X(Int_t i) const {return fX[i];}
         Float_t Y(Int_t i) const {return fY[i];}
       private:
         Float_t fX[100];
         Float_t fY[100];
       };
       CApoint CAP;
       printf("Sizeof CAP = %ld\n",sizeof(CAP));
     }

     root [20] .x cont1.C++
     Sizeof CAP = 812

  8. Classes of arrays

     #include <stdio.h>
     #include <TObject.h>

     void cont1 ()
     {
       class Cpoint : public TObject {
       public:
         Cpoint() {}
         ~Cpoint() {}
         Float_t X() const {return fX;}
         Float_t Y() const {return fY;}
         void Set(Float_t x, Float_t y) {fX=x; fY=y;}
       private:
         Float_t fX;
         Float_t fY;
       };
       class CBpoint : public TObject {
       public:
         CBpoint() {}
         ~CBpoint() {}
         void GetPoint(Cpoint &p, Int_t i) const {p.Set(fX[i], fY[i]);}
       private:
         Float_t fX[100];
         Float_t fY[100];
       };
       CBpoint CBP;
       printf("Sizeof CBP = %ld\n",sizeof(CBP));
     }

  9. Classes of arrays
     • Advantages
       • Fast access (direct and sequential)
       • Very clear semantics
       • Full “ROOT” functionality (I/O, browsing)
       • C++ “encapsulation”
       • Possibility to add your own memory management
       • Low overhead (12 bytes for the whole array!)
     • Drawbacks
       • “Roll-your-own” management of dynamic dimensions
       • No safety against out-of-bounds addressing
     • Where to use
       • Class members with fixed dimensions
     • Where to avoid
       • For highly dynamic data structures
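With a class of arrays the base-class overhead is paid once for the whole collection, and the accessor is a natural place to add the bounds check that raw arrays lack. A minimal sketch under the same assumption as before: `BaseLike` is a hypothetical stand-in for a `TObject`-like base, and `PointArrays`, `PlainPoint`, `kN` and `containerOverhead` are illustrative names, not ROOT classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a TObject-like base class.
class BaseLike {
public:
    virtual ~BaseLike() {}
private:
    std::uint32_t fUniqueID = 0;
    std::uint32_t fBits = 0;
};

struct PlainPoint {
    float x;
    float y;
};

const std::size_t kN = 100;

// One object holding all coordinates, in the spirit of the slide's CBpoint.
class PointArrays : public BaseLike {
public:
    // Copies point i into p; returns false (and leaves p alone) if i is out of range.
    bool GetPoint(PlainPoint& p, std::size_t i) const {
        if (i >= kN) return false;
        p.x = fX[i];
        p.y = fY[i];
        return true;
    }
    // Stores point i; the index is checked in debug builds.
    void SetPoint(std::size_t i, float x, float y) {
        assert(i < kN);
        fX[i] = x;
        fY[i] = y;
    }
private:
    float fX[kN] = {};
    float fY[kN] = {};
};

// Bytes beyond the raw payload, paid once for the whole container.
std::size_t containerOverhead() {
    return sizeof(PointArrays) - kN * sizeof(PlainPoint);
}
```

Returning a status from `GetPoint` is one possible design choice; the slide's version returns nothing and performs no check.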

  10. ROOT containers

  11. ROOT containers I - TObjArray

      #include <stdio.h>
      #include <TObject.h>
      #include <TObjArray.h>

      void cont2 ()
      {
        class Cpoint : public TObject {
        public:
          Cpoint() {}
          ~Cpoint() {}
          Float_t X() const {return fX;}
          Float_t Y() const {return fY;}
          void Set(Float_t x, Float_t y) {fX=x; fY=y;}
        private:
          Float_t fX;
          Float_t fY;
        };
        TObjArray CP(100);
        for (Int_t i=0; i<100; ++i) CP[i]=new Cpoint();
      }

  12. ROOT containers I - TObjArray
      • Advantages
        • Fast direct access, sequential may be slower
        • Polymorphic container
        • Full “ROOT” functionality (I/O, browsing)
        • C++ “encapsulation”
        • Fully automated dynamic management
        • Overhead is 40+<n>*4 bytes
      • Drawbacks
        • Have to use TObjects, with their own overhead
        • Object creation is expensive
        • Object ownership has to be handled carefully to avoid leaks
      • Where to use
        • Dynamic data structures with direct access
        • Need for polymorphism
      • Where to avoid
        • Where the above conditions are not met
        • When you need to recreate objects frequently
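The two points "polymorphic container" and "ownership has to be handled carefully" can be illustrated outside ROOT with a container of owning pointers. This is only an analogy in standard C++, not a description of how `TObjArray` works: `Shape`, `Square`, `Rect` and `totalArea` are hypothetical names, and the `live` counter exists only to make leaks visible.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Common base class; the static counter tracks how many objects are alive.
struct Shape {
    static int live;
    Shape() { ++live; }
    virtual ~Shape() { --live; }
    virtual double Area() const = 0;
};
int Shape::live = 0;

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double Area() const override { return side * side; }
};

struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double Area() const override { return w * h; }
};

// Fills a container with n squares and n rectangles, sums their areas,
// and lets the container release every object when it goes out of scope.
double totalArea(std::size_t n) {
    std::vector<std::unique_ptr<Shape>> shapes;   // the container owns its elements
    shapes.reserve(2 * n);
    for (std::size_t i = 0; i < n; ++i) {
        shapes.push_back(std::unique_ptr<Shape>(new Square(1.0)));
        shapes.push_back(std::unique_ptr<Shape>(new Rect(1.0, 2.0)));
    }
    assert(Shape::live == static_cast<int>(shapes.size()));
    double sum = 0.0;
    for (const auto& s : shapes) sum += s->Area();  // virtual dispatch per element
    return sum;
}
```

Each element still costs one heap allocation, which is the "object creation is expensive" drawback in miniature; what the owning pointer removes is the need to remember a matching delete.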

  13. ROOT containers II - TClonesArray

      #include <stdio.h>
      #include <TObjArray.h>
      #include <TClonesArray.h>
      #include <TStopwatch.h>

      void cont3 (Int_t nrep)
      {
        class Cpoint : public TObject {
        public:
          Cpoint() {}
          ~Cpoint() {}
          Float_t X() const {return fX;}
          Float_t Y() const {return fY;}
          void Set(Float_t x, Float_t y) {fX=x; fY=y;}
        private:
          Float_t fX;
          Float_t fY;
        };
        const Int_t size=20000;
        TStopwatch t;

        t.Start();
        TObjArray a(size);
        for(Int_t i=0; i<nrep; ++i) {
          for(Int_t j=0; j<size; ++j) a[j]=new Cpoint();
          a.Delete();
        }
        t.Print();

        t.Reset();
        t.Start();
        TClonesArray b("Cpoint", size);
        for(Int_t i=0; i<nrep; ++i) {
          for(Int_t j=0; j<size; ++j) new(b[j]) Cpoint();
          b.Clear();
        }
        t.Print();
      }

      root [23] .x cont3.C++(1000)
      Real time 0:01:11, CP time 62.600
      Real time 0:00:22, CP time 20.230

  14. ROOT containers II - TClonesArray
      • Advantages
        • Fast direct and sequential access
        • Polymorphic container
        • Full “ROOT” functionality (I/O, browsing)
        • C++ “encapsulation”
        • Fully automated dynamic management
        • Overhead is 48+<n>*8 bytes
        • Very cheap object creation
      • Drawbacks
        • Have to use TObjects, with their overhead
        • Array owns the objects
      • Where to use
        • Dynamic data structures with direct access which are recreated several times
        • Need for polymorphism
      • Where to avoid
        • Where the above conditions are not met
        • When you do not need to recreate objects frequently
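The "very cheap object creation" comes from the placement-new idiom seen in `new(b[j]) Cpoint()`: memory is obtained once and objects are rebuilt in it. The sketch below shows that idiom in standard C++ only; `PointPool` and its methods are hypothetical and say nothing about how `TClonesArray` is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Point {
    float x;
    float y;
    Point(float x_, float y_) : x(x_), y(y_) {}
};

// Fixed-capacity pool: one allocation up front, objects built in place afterwards.
class PointPool {
public:
    explicit PointPool(std::size_t capacity)
        : fRaw(::operator new(capacity * sizeof(Point))),
          fCapacity(capacity), fUsed(0) {}
    ~PointPool() { Clear(); ::operator delete(fRaw); }
    PointPool(const PointPool&) = delete;
    PointPool& operator=(const PointPool&) = delete;

    // Constructs a Point in the next free slot; no heap allocation happens here.
    Point* Emplace(float x, float y) {
        assert(fUsed < fCapacity);
        return new (Slot(fUsed++)) Point(x, y);
    }
    // Destroys the objects but keeps the memory for the next round.
    void Clear() {
        for (std::size_t i = 0; i < fUsed; ++i) Slot(i)->~Point();
        fUsed = 0;
    }
    std::size_t Size() const { return fUsed; }
    const void* Buffer() const { return fRaw; }

private:
    Point* Slot(std::size_t i) { return static_cast<Point*>(fRaw) + i; }

    void*       fRaw;       // raw storage obtained once in the constructor
    std::size_t fCapacity;  // number of slots available
    std::size_t fUsed;      // number of slots currently holding an object
};
```

After `Clear()`, the next `Emplace` reuses the first slot of the same buffer, so a fill/clear loop like the one on the previous slide never goes back to the allocator.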

  15. Trees
      • Trees are not containers
      • Trees simulate containers for collections of similar objects written on a file
      • When the collection is small, it is convenient to read it all in memory
      • When it is large, Trees give you the “look and feel” of a container in memory with a sophisticated “behind your back” management of I/O
      • Trees have a very nice “player” interface that you do not have for normal containers
        • Unless you implement it!

  16. Maps

      #include <cstring>
      #include <iostream>
      #include <map>
      using namespace std;

      // Orders C strings by their content rather than by pointer value
      struct ltstr {
        bool operator()(const char* s1, const char* s2) const
          { return strcmp(s1, s2) < 0; }
      };

      void cont4()
      {
        map<const char*, int, ltstr> months;
        const char *mname[12]={"january", "february", "march", "april",
                               "may", "june", "july", "august",
                               "september", "october", "november", "december"};
        int days[12]={31,28,31,30,31,30,31,31,30,31,30,31};
        for (int i=0; i<12; i++) months[mname[i]]=days[i];
        cout << "june -> " << months["june"] << endl;
        map<const char*, int, ltstr>::iterator cur = months.find("june");
        map<const char*, int, ltstr>::iterator prev = cur;
        map<const char*, int, ltstr>::iterator next = cur;
        ++next;
        --prev;
        cout << "Previous (in alphabetical order) is " << (*prev).first << endl;
        cout << "Next (in alphabetical order) is " << (*next).first << endl;
      }

      june -> 30
      Previous (in alphabetical order) is july
      Next (in alphabetical order) is march

  17. Maps
      • Map is a Sorted Associative Container that associates objects of type Key with objects of type Data
      • Map is a Pair Associative Container, meaning that its value type is pair<const Key, Data>
      • It is also a Unique Associative Container, meaning that no two elements have the same key
      • Map has the important property that inserting a new element into a map does not invalidate iterators that point to existing elements
      • Erasing an element from a map also does not invalidate any iterators, except, of course, for iterators that actually point to the element that is being erased
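The iterator-stability and unique-key properties listed above can be checked directly. A small sketch using `std::map` with `std::string` keys; the function names `iteratorSurvives` and `duplicateRejected` are made up for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Returns true if an iterator to "june" still designates the same element
// after other elements have been inserted and erased.
bool iteratorSurvives() {
    std::map<std::string, int> days;
    days["june"] = 30;
    days["march"] = 31;

    std::map<std::string, int>::iterator it = days.find("june");
    assert(it != days.end());
    const int* address = &it->second;   // where the value lives right now

    days["april"] = 30;                 // insertions before and after "june"
    days["july"] = 31;
    days.erase("march");                // erasing a different element

    return it->first == "june" && it->second == 30 && &it->second == address;
}

// Returns true if a second insert with an existing key is rejected
// and the original value is left untouched.
bool duplicateRejected() {
    std::map<std::string, int> m;
    m.insert(std::make_pair(std::string("may"), 31));
    bool inserted = m.insert(std::make_pair(std::string("may"), 99)).second;
    return !inserted && m["may"] == 31 && m.size() == 1;
}
```

Note that `operator[]` behaves differently from `insert`: `m["may"] = 99` would overwrite the stored value rather than be rejected.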

  18. Maps
      • Advantages
        • Fast direct access and sequential access (but no indexing)
        • Supported by ROOT
        • Fully automated dynamic management (see before)
      • Drawbacks
        • Large overhead (I could not calculate it EXACTLY; note that a map is a sorted container, typically a balanced tree with one separately allocated node per element, not a hash table)
        • Using “AliRoot-forbidden STL’s”
      • Where to use
        • Need for quick access to data with non-integer keys
      • Where to avoid
        • Where you do NOT desperately need the above
        • Where you can use TMap
        • For integer keys the overhead of building and searching the map is massive and unjustified -- you are using a bazooka to kill a fly!
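For small, dense integer keys, plain indexing does the same job as a map with no per-element bookkeeping. A minimal sketch of the comparison, reusing the days-per-month data from the earlier example; `kDays`, `daysByIndex` and `daysByMap` are hypothetical names.

```cpp
#include <cassert>
#include <map>

const int kDays[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

// Dense integer key (month 1..12): a direct array access.
int daysByIndex(int month) {
    assert(month >= 1 && month <= 12);
    return kDays[month - 1];
}

// The same lookup through a map: one node per key and a search on every call.
// Returns -1 for a key that is not present.
int daysByMap(int month) {
    static const std::map<int, int> table = [] {
        std::map<int, int> m;
        for (int i = 0; i < 12; ++i) m[i + 1] = kDays[i];
        return m;
    }();
    std::map<int, int>::const_iterator it = table.find(month);
    return it == table.end() ? -1 : it->second;
}
```

The map version earns its keep only when the keys are sparse or not integers at all, as in the month-name example.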

  19. … and if I had more time …
      • I would have told you about all the rest
      • … but

  20. Conclusion
      • It might be tempting to use the “most functional” container to do the job
      • Functionality comes at a cost
      • AliRoot is already too slow and too big to afford this
      • So please use a judicious blend of brain and the simplest collection that does the job
      • Don’t delude yourself with 10-line benchmarks: they can be tuned to provide any result with a bit of skill
