Intro Doing Away With Backfiring: Converting Function Points To Lines of Code Mike Kimel and Lee Fischman, Galorath Incorporated Carl A. Dalton - Director Carl A.
The Problem • Having many small, disparate data sets makes it difficult to perform estimation • Merging data sets can improve estimates, leading to more robust models • But different data sets use different metrics of project size • So, before data sets can be combined, a mapping between their size metrics is needed.
Existing Approaches – Capers Jones Capers Jones is the most widely used reference. His method: • Counting Function Points and Source Code: Actual counts of Function Points and source code statements were performed. Sample counts of Function Points and source code statements were done on Ada, several BASIC dialects, COBOL, PASCAL, and PL/I. • Counting Source Code: Source code statements were counted, then compared to the size of the same program in languages of known levels. Assembly, APL, C, OBJECTIVE C, FORTH, FORTRAN, LISP, PILOT, and PROLOG are languages that produce the same source code count as COBOL, so code sizes were compared to the known quantity of COBOL source code. • Inspecting Source Code: Source code for common applications was inspected, and the volume of code for the application in a measured language was then hypothesized. ACTOR, CLARION, and TRUE BASIC are examples of languages that were inspected and whose levels were hypothesized by subjective means. • Researching Languages: Descriptions and genealogies of languages were read, and an educated guess was made as to their levels. KL, CLOS, TWAICE, and FASBOL are examples of languages that were assigned tentative levels merely from descriptions of the language, rather than from actual counts.
Existing Approaches – SEER • Currently used in SEER-SEM. This method of ‘complexity by analogy’ allows: • Less-understood languages to be more readily supported • Rating of function points by a factor indicative of potential effort
Comparison of Capers Jones’ and SEER Methods To get a certain piece of functionality... Jones believes that lower-level languages (1GLs…3GLs) require more equivalent statements than we do. He believes that higher-level languages (4GLs and above) require fewer equivalent lines of code than we do. (As language level increases, fewer statements to lines of code are required…)
Data Sets • IFPUG • metric: function points • 154 observations • 83 variables • “Defense Contractor” • metric: Delivered Source Lines of Code (SLOC) • 211 observations • 44 variables
Methodology • Estimate an effort function for each data set: • Effort = f(Size, Other Variables) • This gives • (1) Effort = f(SLOC, X1, X2, … , XN) • (2) Effort = f(Function Points, Z1, Z2, … , ZM) • Then set (1) equal to (2) and solve for SLOC: • (3) SLOC = f(Function Points, X1, X2, … , XN, Z1, Z2, … , ZM)
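The step from (1) and (2) to (3) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that both fitted effort functions are linear; the slides do not state the functional form. The function name, the dictionary layout, and every coefficient below are hypothetical and are not results from the presentation.

```python
# Hypothetical linear forms (an assumption, not the authors' fitted models):
#   (1) Effort = a0 + a1*SLOC + sum(a_i * X_i)
#   (2) Effort = b0 + b1*FP   + sum(b_j * Z_j)
# Made-up coefficients, chosen only to keep the arithmetic easy to follow.
SLOC_MODEL = {"intercept": 10.0, "size": 0.25, "other": [5.0]}
FP_MODEL = {"intercept": 4.0, "size": 1.5, "other": [2.0]}


def sloc_from_function_points(fp, z, x, sloc_model, fp_model):
    """Set (1) equal to (2) and solve for SLOC, giving equation (3)."""
    a0, a1, a = sloc_model["intercept"], sloc_model["size"], sloc_model["other"]
    b0, b1, b = fp_model["intercept"], fp_model["size"], fp_model["other"]
    if a1 == 0:
        raise ValueError("SLOC coefficient is zero; (1) cannot be solved for SLOC")
    # Effort predicted by the function point model (2).
    effort_fp = b0 + b1 * fp + sum(bj * zj for bj, zj in zip(b, z))
    # Part of model (1) that does not involve SLOC.
    non_size_effort = a0 + sum(ai * xi for ai, xi in zip(a, x))
    # Whatever effort remains must be explained by the SLOC term.
    return (effort_fp - non_size_effort) / a1


# 100 function points, one Z variable equal to 3, one X variable equal to 2.
print(sloc_from_function_points(100, [3.0], [2.0], SLOC_MODEL, FP_MODEL))  # → 560.0
```

With a nonlinear form (for example a log-log model) the same idea applies, but the algebra for isolating SLOC changes accordingly.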
Estimate Results – Function Points Estimating (2) using a Stepwise Regression (Px = 0.05)
Estimate Results – SLOC Estimating (1) using a Stepwise Regression (Px = 0.05)
Blending the Metrics • As noted earlier, set (1) = (2) and solve for SLOC to obtain a function (3), which can be used to develop SLOC values for each point in the IFPUG data set • (3) can be rewritten as SLOC = A + B, where • A: the terms built from variables available in the IFPUG data set • B: the terms built from variables not available in the IFPUG data set, which need to be proxied (e.g., using average values from the SLOC data set)
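The A + B split can be sketched in Python as follows. This is a rough illustration that again assumes linear effort models, which the slides do not confirm, and that proxies the missing X variables by their column means in the SLOC data set, following the "average values" suggestion above. All names and numbers are hypothetical.

```python
from statistics import mean


def blend_sloc(fp, z, sloc_rows_x, sloc_model, fp_model):
    """Return (A, B) so that the imputed SLOC for one IFPUG project is A + B.

    Assumed (hypothetical) linear models:
      (1) Effort = a0 + a1*SLOC + sum(a_i * X_i)
      (2) Effort = b0 + b1*FP   + sum(b_j * Z_j)
    sloc_rows_x holds the X values of each project in the SLOC data set.
    """
    a0, a1, a = sloc_model["intercept"], sloc_model["size"], sloc_model["other"]
    b0, b1, b = fp_model["intercept"], fp_model["size"], fp_model["other"]
    if a1 == 0:
        raise ValueError("SLOC coefficient is zero; cannot solve for SLOC")
    # Part A: constants plus terms observable for the IFPUG project.
    part_a = (b0 - a0 + b1 * fp + sum(bj * zj for bj, zj in zip(b, z))) / a1
    # Part B: X variables are missing in IFPUG, so proxy each one by its
    # average over the SLOC data set (one mean per column).
    x_proxy = [mean(column) for column in zip(*sloc_rows_x)]
    part_b = -sum(ai * xi for ai, xi in zip(a, x_proxy)) / a1
    return part_a, part_b


# Made-up coefficients and data, for illustration only.
sloc_model = {"intercept": 10.0, "size": 0.25, "other": [5.0]}
fp_model = {"intercept": 4.0, "size": 1.5, "other": [2.0]}
part_a, part_b = blend_sloc(100, [3.0], [[1.0], [3.0]], sloc_model, fp_model)
print(part_a, part_b, part_a + part_b)  # → 600.0 -40.0 560.0
```

Because B is the same constant for every IFPUG project under this proxy, it shifts all imputed SLOC values equally; a better proxy for B would vary by project.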
Blending the Metrics – Potential Improvements • Improve functional forms • Improve estimating methods • Improve method of proxying SLOC data in part B of equation (3) • Map to multiple SLOC data sets (e.g., use DACS data set) and use an “average” mapping