Object-Oriented Reengineering Patterns and Techniques

Wahyu Andhyka Kusuma, S.Kom (kusuma.wahyu.a@gmail.com, 081233148591). Module 5: Problem Detection. Topics: metrics, object-oriented metrics in practice, duplicated code.


Presentation Transcript


  1. Object-Oriented Reengineering Patterns and Techniques • Wahyu Andhyka Kusuma, S.Kom • kusuma.wahyu.a@gmail.com • 081233148591 • Module 5: Problem Detection

  2. Topics • Metrics • Object-Oriented Metrics in Practice • Duplicated Code

  3. Topics • Metrics • Software Quality • Trend Analysis • Object-Oriented Metrics in Practice • Duplicated Code

  4. Why Use Metrics in OO Reengineering? • Assessing software quality • Which components have poor quality? (and could therefore be reengineered) • Which components have good quality? (and could therefore be reverse engineered) → Metrics as a reengineering tool • Controlling the reengineering process • Trend analysis: • Which components can be changed? • Which refactorings can be applied? → Metrics as a reverse engineering tool!

  5. ISO 9126 Quantitative Quality Model • Software Quality • ISO 9126 Factors: Functionality, Reliability, Efficiency, Usability, Maintainability, Portability • Characteristics: Error tolerance, Accuracy, Consistency, Simplicity, Modularity • Metrics: defect density = #defects / size; correction time; correction impact = #components changed

  6. Product & Process Attributes • Process Attribute • Definition: measures an aspect of the process that produces the product • Examples: time to repair a defect, number of components changed per repair • Product Attribute • Definition: measures an aspect of the artifacts delivered to the customer • Examples: number of system defects, time to learn the system

  7. External & Internal Attributes • Internal Attribute • Definition: measured purely in terms of the product itself, separate from its behaviour in context • Examples: class coupling and cohesion, method size • External Attribute • Definition: measures how the product or process behaves in its environment • Examples: mean time between failures, #components changed

  8. External vs. Internal Product Attributes

  9. Metrics and Measurement • Weyuker [1988] defines nine properties that a software metric should satisfy • For OO, only 6 of these properties really matter [Chidamber94, Fenton & Pfleeger] • Noncoarseness: • Given a class P and a metric m, another class Q can always be found such that m(P) ≠ m(Q) • Not every class has the same value for the metric • Nonuniqueness: • There can be distinct classes P and Q such that m(P) = m(Q) • Two classes can have the same metric value • Monotonicity: • m(P) ≤ m(P+Q) and m(Q) ≤ m(P+Q), where P+Q is the "combination" of classes P and Q

  10. Metrics and Measurement • Design Details are Important • The specifics of a class must influence the metric value. Even when classes perform the same actions, their details must affect the metric value. • Nonequivalence of Interaction • m(P) = m(Q) ⇏ m(P+R) = m(Q+R), where R is an interaction with the class • Interaction Increases Complexity • m(P) + m(Q) < m(P+Q) • When two classes are combined, the interaction between them also adds to the metric value • Conclusion: not every measurement is a metric
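
The properties above can be checked mechanically for a candidate metric. The following is a minimal sketch, assuming a class is modelled as a set of method names, the metric m is the number of methods, and the "combination" P+Q is the set union; all three are illustrative assumptions, not definitions from the slides.

```python
def nom(cls):
    """Toy metric m: number of methods of a class modelled as a set of names."""
    return len(cls)


def combine(p, q):
    """Hypothetical P+Q: the union of both method sets (an assumption)."""
    return p | q


P = frozenset({"open", "read", "close"})
Q = frozenset({"open", "write"})
PQ = combine(P, Q)

# Noncoarseness: for P there is a Q with a different metric value.
noncoarse = nom(P) != nom(Q)
# Monotonicity: the combination is never smaller than its parts.
monotonic = nom(P) <= nom(PQ) and nom(Q) <= nom(PQ)
# Interaction increases complexity: m(P) + m(Q) < m(P+Q).
interaction = nom(P) + nom(Q) < nom(PQ)

print(noncoarse, monotonic, interaction)
```

Under these assumptions the method count satisfies noncoarseness and monotonicity but not "interaction increases complexity", which illustrates the slide's conclusion that not every measurement is a metric in Weyuker's sense.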

  11. Choosing Metrics • Fast • Scalable: we cannot afford log(n²) when n ≥ 1 million LOC (lines of code) • Precise • (e.g., #methods: do we count all methods, only public ones, also inherited ones?) • Code-based • Scalable: we want to be able to collect the metrics repeatedly • Simple • Complex metrics are hard to interpret

  12. Assessing Maintainability • Size of the system and of its entities • Class size, method size, inheritance • The size of an entity affects its maintainability • Cohesion of the entities • Class internals • Changes should stay local to the class • Coupling between entities • Within inheritance: coupling between class and subclass • Outside inheritance • Strong coupling reduces the locality of changes

  13. Sample Size and Inheritance Metrics • [Diagram: entities Class, Method, Attribute; relations Inherit, BelongTo, Invoke, Access] • Class Size Metrics • # methods (NOM) • # instance attributes (NIA, NCA) • sum of method size (WMC) • Inheritance Metrics • hierarchy nesting level (HNL) • # immediate children (NOC) • # inherited methods, unmodified (NMI) • # overridden methods (NMO) • Method Size Metrics • # invocations (NOI) • # statements (NOS) • # lines of code (LOC)

  14. Sample Class Size Metrics • (NIV) [Lore94] Number of Instance Variables • (NCV) [Lore94] Number of Class Variables (static) • (NOM) [Lore94] Number of Methods (public, private, protected) • (LOC) Lines of Code • (NSC) Number of Semicolons [Li93] → number of statements • (WMC) [Chid94] Weighted Method Count • WMC = ∑ c_i • where c_i is the complexity of method i (number of exits or McCabe Cyclomatic Complexity Metric)
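
To make WMC = ∑ c_i concrete, here is a minimal sketch that computes it for a Python class with the standard `ast` module. The complexity c_i is approximated as 1 plus the number of branching nodes, which is a simplification of McCabe's metric; the `Account` class and the choice of branching nodes are assumptions made for illustration.

```python
import ast

# Node types counted as one decision point each (a simplifying assumption).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic(func):
    """Approximate McCabe complexity: 1 + number of decision points."""
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(func))


def wmc(source, class_name):
    """Weighted Method Count: sum of the complexities of a class's methods."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return sum(cyclomatic(m) for m in node.body
                       if isinstance(m, ast.FunctionDef))
    raise ValueError(f"class {class_name!r} not found")


SAMPLE = '''
class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            return False
        for _ in range(1):
            self.balance -= amount
        return True
'''

print(wmc(SAMPLE, "Account"))  # 1 + 2 + 3 = 6
```

Setting every c_i to 1 would reduce WMC to NOM, which is why the choice of complexity measure matters when comparing WMC values across tools.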

  15. Hierarchy Layout • (HNL) [Chid94] Hierarchy Nesting Level, (DIT) [Li93] Depth of Inheritance Tree • HNL, DIT = max hierarchy level • (NOC) [Chid94] Number of Children • (WNOC) Total Number of Children • (NMO, NMA, NMI, NME) [Lore94] Number of Methods Overridden, Added, Inherited, Extended (super call) • (SIX) [Lore94] • SIX(C) = NMO * HNL / NOM • Weighted percentage of overridden methods
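
The SIX formula can be tried out on small Python classes through introspection. This is a sketch under stated assumptions: single inheritance, HNL measured as the depth below `object`, and NOM counting only the methods defined in the class itself; the `Shape` hierarchy is hypothetical.

```python
def own_methods(cls):
    """Names of non-dunder methods defined directly in cls."""
    return {name for name, value in vars(cls).items()
            if callable(value) and not name.startswith("__")}


def hnl(cls):
    """Hierarchy nesting level: depth below object (single inheritance)."""
    return len(cls.__mro__) - 2


def nmo(cls):
    """Number of methods in cls that override an inherited method."""
    inherited = set()
    for base in cls.__mro__[1:-1]:
        inherited |= own_methods(base)
    return len(own_methods(cls) & inherited)


def six(cls):
    """SIX(C) = NMO * HNL / NOM, with NOM = methods defined in cls."""
    nom = len(own_methods(cls))
    return nmo(cls) * hnl(cls) / nom if nom else 0.0


class Shape:
    def area(self): return 0.0
    def describe(self): return "shape"


class Circle(Shape):
    def area(self): return 3.14
    def perimeter(self): return 6.28


class UnitCircle(Circle):
    def area(self): return 3.14159


print(six(Shape), six(Circle), six(UnitCircle))  # 0.0 0.5 2.0
```

The deeper a class sits in the hierarchy and the larger the share of its methods that are overrides, the higher its SIX value.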

  16. Method Size • (MSG) Number of Message Sends • (LOC) Lines of Code • (MCX) Method Complexity • Total complexity / total number of methods • API calls = 5, assignment = 0.5, arithmetic op = 2, messages with params = 3, ...

  17. Sample Metrics: Class Cohesion • (LCOM) Lack of Cohesion in Methods • [Chidamber 94] for definition • [Hitz 95] for critique • I_i = set of instance variables used by method M_i • let P = { (I_i, I_j) | I_i ∩ I_j = ∅ } and Q = { (I_i, I_j) | I_i ∩ I_j ≠ ∅ } • if all the sets I_i are empty, P is empty • LCOM = |P| - |Q| if |P| > |Q|, 0 otherwise • Tight Class Cohesion (TCC) • Loose Class Cohesion (LCC) • [Bieman 95] for definition • Measure method cohesion across invocations
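
The LCOM definition above translates almost directly into code. The sketch below assumes the per-method sets of used instance variables have already been extracted; the method and variable names are hypothetical.

```python
from itertools import combinations


def lcom(usage):
    """LCOM = |P| - |Q| if |P| > |Q|, else 0.

    usage maps each method name to the set of instance variables it uses.
    P counts method pairs sharing no variable, Q counts pairs sharing one.
    """
    if not any(usage.values()):
        return 0  # all sets empty: P is defined to be empty
    p = q = 0
    for vars_a, vars_b in combinations(usage.values(), 2):
        if vars_a & vars_b:
            q += 1
        else:
            p += 1
    return p - q if p > q else 0


account = {
    "deposit": {"balance"},
    "withdraw": {"balance"},
    "rename": {"owner"},
    "audit": {"log"},
}
print(lcom(account))  # 6 pairs: 1 sharing, 5 disjoint, so 5 - 1 = 4
```

A value of 0 does not by itself prove cohesion, because the formula clamps every case with |P| <= |Q| to the same result.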

  18. Sample Metrics: Class Coupling (i) • Coupling Between Objects (CBO) • [Chidamber 94a] for definition • [Hitz 95a] for a discussion • Number of other classes to which it is coupled • Data Abstraction Coupling (DAC) • [Li 93] for definition • Number of ADTs defined in a class • Change Dependency Between Classes (CDBC) • [Hitz 96a] for definition • Impact of changes from a server class (SC) to a client class (CC)

  19. Sample Metrics: Class Coupling (ii) • Locality of Data (LD) • [Hitz 96] for definition • LD = ∑ |L_i| / ∑ |T_i| • L_i = non-public instance variables + inherited protected variables of the superclass + static variables of the class • T_i = all variables used in M_i, except non-static local variables • M_i = methods, excluding accessors

  20. The Trouble with Coupling and Cohesion • Coupling and Cohesion are intuitive notions • Cf. “computability” • E.g., is a library of mathematical functions “cohesive”? • E.g., is a package of classes that subclass framework classes cohesive? Is it strongly coupled to the framework package?

  21. Conclusion: Metrics for Quality Assessment • Can internal product metrics reveal which components have good/poor quality? • Yes, but... • Not reliable • false positives: “bad” measurements, yet good quality • false negatives: “good” measurements, yet poor quality • Heavyweight Approach • Requires team to develop (customize?) a quantitative quality model • Requires definition of thresholds (trial and error) • Difficult to interpret • Requires complex combinations of simple metrics • However... • Cheap once you have the quality model and the thresholds • Good focus (± 20% of components are selected for further inspection) • Note: focus on the most complex components first!

  22. Topics • Metrics • Object-Oriented Metrics in Practice • Detection strategies, filters and composition • Sample detection strategies: God Class … • Duplicated Code

  23. Detection strategy • A detection strategy is a metrics-based predicate to identify candidate software artifacts that conform to (or violate) a particular design rule

  24. Filters and composition • A data filter is a predicate used to focus attention on a subset of interest of a larger data set • Statistical filters • I.e., top and bottom 25% are considered outliers • Other relative thresholds • I.e., other percentages to identify outliers (e.g., top 10%) • Absolute thresholds • I.e., fixed criteria, independent of the data set • A useful detection strategy can often be expressed as a composition of data filters
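
A detection strategy built as a composition of data filters could be sketched as follows. The class data, the choice of WMC and NOM as inputs, and the thresholds (top 25%, at least 20 methods) are hypothetical values chosen for illustration, not a strategy taken from the slides.

```python
import math


def top_fraction(fraction, key):
    """Relative filter: keep the given top fraction of items ranked by key."""
    def apply(items):
        keep = math.ceil(len(items) * fraction)
        return sorted(items, key=key, reverse=True)[:keep]
    return apply


def at_least(threshold, key):
    """Absolute filter: keep items whose key value reaches a fixed threshold."""
    def apply(items):
        return [item for item in items if key(item) >= threshold]
    return apply


def detect(items, *filters):
    """Detection strategy: apply the data filters one after another."""
    for data_filter in filters:
        items = data_filter(items)
    return items


classes = [
    {"name": "Parser", "wmc": 120, "nom": 45},
    {"name": "Util", "wmc": 95, "nom": 12},
    {"name": "Lexer", "wmc": 40, "nom": 15},
    {"name": "Cache", "wmc": 30, "nom": 11},
    {"name": "Config", "wmc": 12, "nom": 9},
    {"name": "Node", "wmc": 10, "nom": 5},
    {"name": "Token", "wmc": 8, "nom": 4},
    {"name": "Main", "wmc": 5, "nom": 2},
]

suspects = detect(
    classes,
    top_fraction(0.25, key=lambda c: c["wmc"]),  # statistical filter
    at_least(20, key=lambda c: c["nom"]),        # absolute threshold
)
print([c["name"] for c in suspects])  # ['Parser']
```

Relative filters depend on the data set they receive, so in this sketch the order of composition matters: the statistical filter runs first, on the full set of classes.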

  25. God Class • A God Class centralizes intelligence in the system • Impacts understandability • Increases system fragility

  26. Feature Envy • Methods that are more interested in data of other classes than their own [Fowler et al. 99]

  27. Data Class • A Data Class provides data to other classes but little or no functionality of its own

  28. Data Class (2)

  29. Shotgun Surgery • A change in an operation implies many (small) changes to a lot of different operations and classes

  30. Topics • Metrics • Object-Oriented Metrics in Practice • Duplicated Code • Detection techniques • Visualizing duplicated code

  31. Code is Copied • Example from the Mozilla Distribution (Milestone 9) • Taken from /dom/src/base/nsLocation.cpp

  32. How Much Code is Duplicated? • Usual estimates: 8 to 12% of the code

  33. What is Duplicated Code? • Duplicated code = a fragment of program code that also occurs elsewhere in the same system • In different files • In the same file but in different methods • In the same method • The fragments must share logic or structure that can be abstracted

  34. Problems with Duplication • Generally has a negative effect • Code bloat • Negative effect on system maintenance • Copying spreads defects to additional places in the code • Software aging, “hardening of the arteries” • “Software entropy” increases: even small design changes become very difficult to effect

  35. Detecting Duplicated Code • Nontrivial problem: • No a priori knowledge about which code has been copied • How to find all clone pairs among all possible pairs of segments?

  36. General Schema of Detection Process

  37. Recall and Precision
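
As a reminder of the standard definitions: precision is the fraction of reported clone pairs that are real clones, and recall is the fraction of real clone pairs that were reported. A minimal sketch, with hypothetical clone-pair identifiers:

```python
def precision_recall(reported, actual):
    """Return (precision, recall) for a set of reported clone pairs."""
    reported, actual = set(reported), set(actual)
    hits = reported & actual
    precision = len(hits) / len(reported) if reported else 0.0
    recall = len(hits) / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical ground truth and detector output.
actual = {
    ("a.c:10", "b.c:40"),
    ("a.c:90", "a.c:200"),
    ("c.c:5", "d.c:5"),
    ("e.c:1", "f.c:1"),
}
reported = {
    ("a.c:10", "b.c:40"),
    ("a.c:90", "a.c:200"),
    ("x.c:1", "y.c:1"),  # false positive
}

precision, recall = precision_recall(reported, actual)
print(precision, recall)  # 2 of 3 reported are real; 2 of 4 real were found
```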

  38. Simple Detection Approach (i) • Assumption: • Code segments are just copied and changed at a few places • Noise elimination transformation • remove white space, comments • remove lines that contain uninteresting code elements • (e.g., just ‘else’ or ‘}’) • Before the transformation:
      …
      //assign same fastid as container
      fastid = NULL;
      const char* fidptr = get_fastid();
      if(fidptr != NULL) {
        int l = strlen(fidptr);
        fastid = new char[ l + 1 ];
      …
      After the transformation:
      …
      fastid=NULL;
      constchar*fidptr=get_fastid();
      if(fidptr!=NULL)
      intl=strlen(fidptr);
      fastid=newchar[l+1];
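
The noise elimination step might look like the sketch below. It is an illustration rather than the script used in the slides: it handles only `//` comments and single-line `/* ... */` comments, does not protect comment markers inside string literals, and its list of "uninteresting" lines is an assumption.

```python
import re

# Lines that carry no information once condensed (an assumed noise list).
NOISE = {"", "else", "{", "}", "};"}


def normalize(lines):
    """Return (original line number, condensed text) for informative lines."""
    result = []
    for number, line in enumerate(lines, start=1):
        line = re.sub(r"/\*.*?\*/", "", line)  # single-line block comments
        line = re.sub(r"//.*", "", line)       # line comments
        line = re.sub(r"\s+", "", line)        # all white space
        if line not in NOISE:
            result.append((number, line))
    return result


source = [
    "//assign same fastid as container",
    "fastid = NULL;",
    "const char* fidptr = get_fastid();",
    "if(fidptr != NULL) {",
    "  int l = strlen(fidptr);",
    "  fastid = new char[ l + 1 ];",
    "}",
]
for number, text in normalize(source):
    print(number, text)
```

Keeping the original line number next to the condensed text lets later steps report clone locations in terms of the unmodified file.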

  39. Simple Detection Approach (ii) • Code Comparison Step • Line based comparison (Assumption: Layout did not change during copying) • Compare each line with each other line. • Reduce search space by hashing: • Preprocessing: Compute the hash value for each line • Actual Comparison: Compare all lines in the same hash bucket • Evaluation of the Approach • Advantages: Simple, language independent • Disadvantages: Difficult interpretation
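
The hashing idea can be sketched in a few lines: lines are grouped into buckets by hash value, and only lines within the same bucket are compared. The input format (a list of location and condensed-text pairs) is an assumption for this example.

```python
from collections import defaultdict
from itertools import combinations


def matching_line_pairs(lines):
    """Return pairs of locations whose condensed text is identical.

    lines is a list of (location, text) tuples.
    """
    # Preprocessing: put every line into the bucket of its hash value.
    buckets = defaultdict(list)
    for location, text in lines:
        buckets[hash(text)].append((location, text))

    # Actual comparison: only lines in the same bucket are compared.
    pairs = []
    for entries in buckets.values():
        for (loc_a, text_a), (loc_b, text_b) in combinations(entries, 2):
            if text_a == text_b:  # guards against hash collisions
                pairs.append((loc_a, loc_b))
    return pairs


lines = [(1, "a=1;"), (2, "b=2;"), (3, "a=1;"), (4, "c=3;"), (5, "b=2;")]
print(sorted(matching_line_pairs(lines)))  # [(1, 3), (2, 5)]
```

Compared with checking every line against every other line, the work shrinks to the pairs inside each bucket, which is small as long as most lines are distinct.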

  40. A Perl script for C++ (i)

  41. A Perl script for C++ (ii) • Handles multiple files • Removes comments and white space • Controls noise (if, {,) • Granularity (number of lines) • Possible to remove keywords

  42. Output Sample
      Lines:
        create_property(pd,pnImplObjects,stReference,false,*iImplObjects);
        create_property(pd,pnElttype,stReference,true,*iEltType);
        create_property(pd,pnMinelt,stInteger,true,*iMinelt);
        create_property(pd,pnMaxelt,stInteger,true,*iMaxelt);
        create_property(pd,pnOwnership,stBool,true,*iOwnership);
      Locations:
        </face/typesystem/SCTypesystem.C>6178/6179/6180/6181/6182
        </face/typesystem/SCTypesystem.C>6198/6199/6200/6201/6202
      Lines:
        create_property(pd,pnSupertype,stReference,true,*iSupertype);
        create_property(pd,pnImplObjects,stReference,false,*iImplObjects);
        create_property(pd,pnElttype,stReference,true,*iEltType);
        create_property(pd,pMinelt,stInteger,true,*iMinelt);
        create_property(pd,pnMaxelt,stInteger,true,*iMaxelt);
      Locations:
        </face/typesystem/SCTypesystem.C>6177/6178
        </face/typesystem/SCTypesystem.C>6229/6230
      Lines = duplicated lines • Locations = file names and line numbers

  43. Enhanced Simple Detection Approach • Code Comparison Step • As before, but now • Collect consecutive matching lines into match sequences • Allow holes in the match sequence • Evaluation of the Approach • Advantages • Identifies more real duplication, language independent • Disadvantages • Less simple • Misses copies with (small) changes on every line
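
Collecting matching lines into match sequences with holes can be sketched as below. The input is assumed to be a list of (line in copy A, line in copy B) pairs of matching lines. The sketch only tolerates holes of equal size in both copies (modified lines); it does not handle inserted or deleted lines, and `max_hole` is a hypothetical parameter name.

```python
from collections import defaultdict


def match_sequences(pairs, max_hole=1):
    """Group matching line pairs into sequences along a common diagonal.

    Returns a sorted list of ((start_a, start_b), (end_a, end_b)) tuples.
    Up to max_hole consecutive non-matching lines are tolerated.
    """
    # Pairs of one copied block share the same offset between both copies.
    diagonals = defaultdict(list)
    for line_a, line_b in pairs:
        diagonals[line_b - line_a].append(line_a)

    sequences = []
    for offset, rows in diagonals.items():
        rows.sort()
        start = previous = rows[0]
        for row in rows[1:]:
            if row - previous > max_hole + 1:  # hole too large: close sequence
                sequences.append(((start, start + offset),
                                  (previous, previous + offset)))
                start = row
            previous = row
        sequences.append(((start, start + offset),
                          (previous, previous + offset)))
    return sorted(sequences)


pairs = [(10, 50), (11, 51), (13, 53), (14, 54), (20, 60), (30, 31)]
print(match_sequences(pairs, max_hole=1))
```

With `max_hole=1` the lines 10 to 14 form one sequence despite the non-matching line 12; with `max_hole=0` the same input splits into two shorter sequences.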

  44. Abstraction • Abstracting selected syntactic elements can increase recall, at the possible cost of precision

  45. Metrics-based detection strategy • Duplication is significant if: • It is the largest possible duplication chain uniting all exact clones that are close enough to each other. • The duplication is large enough.

  46. Automated detection in practice • Wettel [MSc thesis, 2004] uses three thresholds: • Minimum clone length: the minimum number of lines present in a clone (e.g., 7) • Maximum line bias: the maximum number of lines in between two exact chunks (e.g., 2) • Minimum chunk size: the minimum number of lines of an exact chunk (e.g., 3) • Mihai Balint, Tudor Gîrba and Radu Marinescu, “How Developers Copy,” ICPC 2006
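
The three thresholds can be combined into a single significance check. In this sketch a duplication chain is assumed to be a list of exact chunks given as (start line, length) in one copy, and the clone length is assumed to be the span from the first to the last line of the chain, holes included; both are interpretations made for illustration, with the example values from the slide as defaults.

```python
def is_significant(chunks, min_clone_length=7, max_line_bias=2,
                   min_chunk_size=3):
    """Decide whether a chain of exact chunks is a significant clone.

    chunks is a list of (start_line, length) tuples ordered by start_line.
    """
    if not chunks:
        return False
    # Every exact chunk must be large enough.
    if any(length < min_chunk_size for _, length in chunks):
        return False
    # Neighbouring chunks must be close enough to each other.
    for (start, length), (next_start, _) in zip(chunks, chunks[1:]):
        if next_start - (start + length) > max_line_bias:
            return False
    # The whole chain, holes included, must be long enough.
    first_start = chunks[0][0]
    last_start, last_length = chunks[-1]
    return last_start + last_length - first_start >= min_clone_length


print(is_significant([(100, 3), (105, 4)]))  # True: span 9, gap 2
print(is_significant([(100, 3), (110, 4)]))  # False: gap of 7 lines
```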

  47. Visualization of Duplicated Code • Visualization provides insights into the duplication situation • A simple version can be implemented in three days • Scalability issue • Dotplots — Technique from DNA Analysis • Code is put on vertical as well as horizontal axis • A match between two elements is a dot in the matrix
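
A dotplot is simple to prototype: each cell (i, j) is marked when element i of one sequence equals element j of the other. The text rendering below is an assumption made for this sketch; the single-letter inputs stand in for condensed source lines.

```python
def dotplot(lines_a, lines_b):
    """Boolean matrix: cell [i][j] is True when lines_a[i] == lines_b[j]."""
    return [[a == b for b in lines_b] for a in lines_a]


def render(matrix):
    """Draw the matrix as text, '#' for a match and '.' otherwise."""
    return "\n".join("".join("#" if cell else "." for cell in row)
                     for row in matrix)


# Comparing a sequence with itself: the main diagonal is always filled,
# and a repeated block shows up as an additional diagonal.
lines = ["a", "b", "c", "x", "a", "b", "c"]
print(render(dotplot(lines, lines)))
```

The matrix has one cell per pair of lines, which is the scalability issue the slide mentions for large systems.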

  48. Visualization of Copied Code Sequences • Detected problem: File A contains two copies of a piece of code; File B contains another copy of this code • Possible solution: Extract Method • All examples are made using Duploc from an industrial case study (1 million LOC C++ system)

  49. Visualization of Repetitive Structures • Detected problem: 4 object factory clones; a switch statement over a type variable is used to call individual construction code • Possible solution: Strategy Method

  50. Visualization of Cloned Classes • [Dotplot axes: Class A, Class B] • Detected problem: Class A is an edited copy of Class B (editing and insertion) • Possible solution: Subclassing …
