The “55M End-User Programmers” Estimate Revisited
Christopher Scaffidi
Table of Contents
• Introduction to a Popular Estimate
• The Estimation Method
• 55M End-User Programmers in 2005
• Extending the Method
• 90M End-Users in 2012
• A Survey of End-User Abstraction Practices
• Conclusion
Introduction to a Popular Estimate
• 55 Million End-User Programmers by 2005
• “End-User” =
  • “The ultimate consumer of a product, especially the one for whom the product has been designed.” (Dictionary)
  • “People who are not employed as programmers” (citation on next slide)
• “Programmers” =
  • People who act “to create an application that serves some function” (Nardi, A Small Matter of Programming)
  • Researchers often use the term to include creators of spreadsheets.
Example #1 of Estimate’s Usage
• Context:
  • The authors of this conference paper added more abstraction capabilities to Excel, to boost Excel’s utility
• Usage:
  • “The number of end-user programmers in the U.S. alone is expected to reach 55 million by 2005, as compared to only 2.75 million professional programmers”
• Appeared in:
  • S. Jones, A. Blackwell, and M. Burnett. A User-Centered Approach to Functions in Excel. Proceedings of the 8th ACM SIGPLAN International Conference on Functional Programming, ACM Press, 2003, pp. 165-176.
Example #2 of Estimate’s Usage
• Context:
  • The magazine author discusses a grant awarded by NSF for research on improving the reliability of spreadsheets
• Usage:
  • “Experts estimate the number of so-called ‘end-user programmers’ to reach 55 million by 2005,” said NSF spokesperson David Hart… “Nearly half of the programs created by these end-users have nontrivial bugs.”
• Appeared in:
  • Mike Martin. New Program Exterminates End-User Bugs. CIO Today, NewsFactor Network, June 9, 2004.
Introduction to a Popular Estimate
• Used in many places
  • Journal articles
  • Conference papers
  • Workshop papers
  • Grant applications?
  • Trade magazines
  • Web sites
• Used to make an important point
  • There are a lot of end-user programmers (in fact, many more than professional programmers).
  • Therefore they are a significant group of programmers.
  • Therefore we should not neglect their needs.
The Estimation Method
• First appeared in COCOMO 2.0
  • COCOMO is a cost estimation model from Boehm et al.
  • Extended into COCOMO 2.0 (late 1990s) to cover modern practices
  • COCOMO 2.0 is for professionals (not end-users)
• How many people would/wouldn’t benefit from COCOMO 2.0?
  • To answer this, Boehm projected the…
    • # of professional programmers (2.75M by 2005)
    • # of end-user programmers (55M by 2005)
• B. Boehm et al. Cost Models for Future Software Life Cycle Processes: COCOMO 2.0. Annals of Software Engineering Special Volume on Software Process and Product Measurement (J. Arthur and S. Henry, eds), J.C. Baltzer AG, Science Publishers, Amsterdam, The Netherlands, 1995.
• Also widely disseminated through a book by Boehm in 2000, as well as IEEE Software.
The Estimation Method
• Steps to generate the estimate
  • Get the Bureau of Labor Statistics (BLS) occupation projections for 2005
  • Get the BLS computer usage rates by occupation for 1989 (which were actual data from a survey, not a projection)
  • Multiply occupation projections by computer usage rates and total up
• Bottom line = 55M end-user programmers in 2005
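The multiply-and-total step can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the occupation names, worker counts, and usage rates below are made-up placeholders, not the BLS figures behind the 55M estimate, and `estimate_end_users` is a hypothetical helper name.

```python
# Hypothetical illustration of the multiply-and-sum step.
# Occupation names and all numbers are invented for the example;
# they are NOT the BLS figures used in the original estimate.

projected_workers_2005 = {  # millions of workers (hypothetical)
    "managerial": 10.0,
    "clerical": 20.0,
    "sales": 15.0,
}

usage_rate_1989 = {  # fraction of workers using a computer (hypothetical)
    "managerial": 0.50,
    "clerical": 0.60,
    "sales": 0.20,
}


def estimate_end_users(workers, usage_rates):
    """Sum over occupations of (projected workers x computer usage rate)."""
    return sum(workers[occ] * usage_rates[occ] for occ in workers)


total = estimate_end_users(projected_workers_2005, usage_rate_1989)
print(f"Estimated end-users: {total:.1f}M")
```

Note that the result counts computer users in each occupation; treating every one of them as a programmer is a separate assumption of the method.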
Extending the Method
• Main inherent approximations
  • Computer usage rates by occupation will remain constant from 1989 through 2005
  • All end-users are programmers
• Address these by
  • Using additional data to estimate how usage rates have grown
  • Developing a classification of end-users to capture their continuum of programming-like activities
Approximation #1: Constant Usage Rates
• New computer usage rate data became available
  • Boehm based his estimate on usage rates measured in 1989
  • BLS also measured those rates in 1984, 1993, and 1997
• A valid approximation?
  • Not very
  • Usage rates have grown substantially for each of the occupational categories studied by BLS
  • In fact, in 1997, there were already around 64M end-users
Approximation #1: Constant Usage Rates
• Interesting curve shape
  • Most of these curves (especially the lower ones) seem to have an S-shape trending to a horizontal asymptote
Approximation #1: Constant Usage Rates
• Innovation diffusion theory to the rescue
  • Researchers have realized that innovations diffuse through populations like diseases.
  • They have studied various functional forms for describing this.
  • The simplest form (and most generally applicable) is S-shaped
• J. Teng, V. Grover, and W. Güttler. Information Technology Innovations: General Diffusion Patterns and Its Relationships to Innovation Characteristics. IEEE Transactions on Engineering Management, Vol. 49, No. 1, February 2002, pp. 13-27.
Approximation #1: Constant Usage Rates
• Projecting the computer usage rates
  • The S-shaped functional form has 3 free parameters (K, m, b)
  • We have 4 measurements from BLS (1984, 1989, 1993, 1997)
  • So we can fit the functional form for each occupation category
  • (Note that with so few points, “goodness of fit” means little.)
• A somewhat better estimate
  • Get the BLS’s latest occupation projection (which happens to be for the year 2012)
  • Plug in t=2012 to forecast future computer usage rates
  • Multiply and sum as Boehm did
  • Result: 90M end-users in 2012
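A rough sketch of the fit-and-forecast step for a single occupation category follows. The slides do not spell out the functional form, so this assumes a logistic curve, rate(t) = K / (1 + exp(-(m·t + b))), with K read as the ceiling on the usage rate; that reading of (K, m, b) is an assumption. The four usage rates are invented placeholders, not BLS data, and the fitting procedure (scan K, solve a linear regression for m and b) is just one simple way to do it, not necessarily the one used for the 90M figure.

```python
import math


def logistic(t, K, m, b):
    """Assumed S-shaped form: usage rate in year t, with ceiling K."""
    return K / (1.0 + math.exp(-(m * t + b)))


def fit_logistic(years, rates, k_steps=1000):
    """Fit (K, m, b) by scanning K and regressing for m and b.

    For a fixed K, ln(y / (K - y)) = m*t + b is linear in t, so m and b
    come from ordinary least squares. We keep the K (between the largest
    observed rate and 1.0) with the smallest squared error on the
    original, untransformed rates.
    """
    n = len(years)
    t_mean = sum(years) / n
    sxx = sum((t - t_mean) ** 2 for t in years)
    k_min = max(rates) + 1e-6  # ceiling must exceed every observed rate
    best = None
    for i in range(k_steps + 1):
        K = k_min + (1.0 - k_min) * i / k_steps
        z = [math.log(y / (K - y)) for y in rates]
        z_mean = sum(z) / n
        sxz = sum((t - t_mean) * (zi - z_mean) for t, zi in zip(years, z))
        m = sxz / sxx
        b = z_mean - m * t_mean
        err = sum((logistic(t, K, m, b) - y) ** 2 for t, y in zip(years, rates))
        if best is None or err < best[0]:
            best = (err, K, m, b)
    return best[1:]


# Survey years from the slides; the rates are hypothetical placeholders.
years = [1984, 1989, 1993, 1997]
rates = [0.19, 0.36, 0.52, 0.64]

K, m, b = fit_logistic(years, rates)
forecast = logistic(2012, K, m, b)
print(f"K={K:.3f}, m={m:.3f}, b={b:.1f}, forecast for 2012: {forecast:.3f}")
```

Repeating this per occupation category and multiplying each forecast rate by the projected 2012 workforce gives the total. With only four points per curve, the fitted parameters should be read as rough guides.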
90M End-Users in 2012
• This uses a different approximation than Boehm’s
  • He assumed 2005 usage rates would equal 1989 usage rates.
  • We assume 2012 usage rates are predictable using a simple fit to the innovation diffusion function.
• Implication of using our assumption
  • Fairly questionable assumption! Ongoing improvements in computers will probably drive adoption still higher.
  • Therefore, 90M is probably something of a lower bound.
What Does “Programmer” Mean?
“You keep using that word. I do not think it means what you think it means.”
--Inigo Montoya, The Princess Bride
Approximation #2: All End-Users Program
• Usefulness of a big scalar number
  • 55M or 90M is a number with no structure
  • Thus, it can only be used to argue, “This sure is big.”
• Usefulness of a collection of numbers
  • Can we break down the estimate into smaller groups?
  • Doing this right could help guide research and development.
• Possible categorizations
  • By industry (e.g.: shipping, manufacturing, transportation, …)
  • By occupation (e.g.: secretary, accountant, manager, …)
  • By education (e.g.: K-12, college, professional, …)
  • By technology skills (e.g.: Java, Oracle, HTML forms, …)
  • By enduring programming skills (e.g.: abstraction mastery, …)
Approximation #2: All End-Users Program
• In building tools, researchers focus on abstractions
  • Helping end-users represent abstractions as functions: S. Jones, A. Blackwell, and M. Burnett. A User-Centered Approach to Functions in Excel. Proceedings of the 8th ACM SIGPLAN International Conference on Functional Programming, ACM Press, 2003, pp. 165-176.
  • Helping end-users map domain models to web app models: K. Kim, J. Carroll, M. Rosson. An Empirical Study of Web Personalization Assistants Supporting End-Users in Web Information Systems. IEEE 2002 Symposia on Human Centric Computing Languages and Environments, September 2002, pp. 60-62.
  • Helping end-users identify abstractions from examples: M. Balaban, E. Barzilay, M. Elhadad. Abstraction as a Means for End-User Computing in Creative Applications. IEEE Transactions on Systems, Man and Cybernetics, Part A, Vol. 32, No. 6, November 2002, pp. 640-653.
  • Helping end-users model abstractions in general: F. Paternò. From Model-based to Natural Development. Proceedings HCI International 2003, Universal Access in HCI, pp. 592-596.
Approximation #2: All End-Users Program
• What abstraction issues are important?
  • We now have an improved estimate of how many end-users.
  • Actually, we also have surveys of what software they use.
  • We don’t have any survey of what abstractions they are using.
  • So what abstractions are important for new tools to address?
• Study users’ needs and practices before building
  • That’s part of what I argued (in a business context) during my practicum talk last fall.
  • Why not apply it to research, too?
Anticipated Work for 2005
• Phase 1: Informal survey of abstraction practices
  • About to go live (~ Feb 7)
  • On-line aspects handled by partner, Information Week
  • Ask about usage of abstraction-oriented programming features
    • Referencing data vs making copies (e.g.: using variables)
    • Encapsulating reusable algorithms (e.g.: using functions)
    • Representing common structures (e.g.: using data structures)
  • Ask about usage of other good programming practices
    • Documentation
    • Back-ups
    • Testing
  • Ask about usage of the web
    • Source/destination of documentation
    • Source/destination of data
    • Source/destination of other artifacts
  • Ask about background (for use as explanatory variables)
Anticipated Work for 2005
• Phase 2: Direct the survey at a controlled sample
  • We have IRB approval through June 1
  • Tentative sample is marketing professionals
    • They program with numbers, text, and rich text… very diverse.
    • They likely program more than most end-users (hence an upper bound).
    • Other options suggested by researchers: accounting & operations.
  • We’ll tweak the survey based on Information Week feedback.
• Phase 3: Target subgroups with interviews
  • Tentative dates: Fall 2005.
  • Just because people use a programming feature doesn’t mean that they actually understand the abstraction behind it.
  • Just because people don’t use a feature doesn’t mean they wouldn’t value it if it were implemented better.
  • Interviews let us “get under the hatch” into these issues.
Conclusion
• “55M End-User Programmers” is a popular estimate
  • It makes the point that end-user programming is an important area of research!
• The estimate embodies two main approximations
  • Constant computer usage rates
  • All end-users are programmers
• We can begin to remove these approximations
  • Model adoption rates using innovation diffusion theory
    • New estimate: 90M end-users in 2012
  • Study end-users according to a classification scheme
    • Use surveys and interviews to get guidance on research
Any Questions?
• The most powerful productivity strategy is to equip line workers with generalized programs and then to turn them loose.
• The same strategy, with generalized mathematical, statistical and programming capabilities will work for scientists.
--Paraphrased from “No Silver Bullet: Essence and Accidents of Software Engineering”, Frederick Brooks, Computer Magazine, April 1987
Example #3 of Estimate’s Usage
• Context:
  • The author of this workshop paper describes why existing model-driven development approaches do not work well for end-user programmers
• Usage:
  • “Studies report that by 2005 there will be 55 million end-users, compared to 2.75 million professional users”
• Appeared in:
  • F. Paternò. From Model-based to Natural Development. Proceedings HCI International 2003, Universal Access in HCI, pp. 592-596.
The Estimation Method
• Screenshot taken from B. Boehm et al. Cost Models for Future Software Life Cycle Processes: COCOMO 2.0. Annals of Software Engineering Special Volume on Software Process and Product Measurement (J. Arthur and S. Henry, eds), J.C. Baltzer AG, Science Publishers, Amsterdam, The Netherlands, 1995.
• Also widely disseminated through a book by Boehm in 2000, as well as IEEE Software.