This study explores an approach for modeling the reusability of end-user programmers' (EUPs') code in a repository, so that search results can be sorted by estimated reusability. Features are defined and computed for each macro, reduced through factor analysis, and combined by linear regression into a model of reusability. The estimated reusability is then used to sort search results.
Code You Can Use: Searching for web automation scripts based on reusability • James Admire, Abbas Al Zawwad, Abdulwahab Almorebah, Sanchit Karve, Christopher Scaffidi • Oregon State University
Online repositories of reusable EUP code offer many ways to find relevant code • Keyword-based search • Type keywords, receive a search result list of existing code available to reuse • Browsing by category • E.g., based on thematic categories or tags • “Related” code • E.g., by listing other code derived from a given piece of code
Finding high-quality EUP code to reuse is hard • Download counters and similar auto-generated popularity counts • But hardly any code is ever downloaded more than a trivial number of times • Explicit user-generated ratings of quality • But most code is never rated, certainly not by more than a few people • Curated collections of “featured” code • But scalability and sustainability are perennial challenges for curators
CoScripter web macro repository as a microcosm • Was one of the biggest repositories of web macros • Web macro = EUP script for automating browser interactions with web sites • > 6000 web macros when I last saw this repository • Prior studies showed hardly any macros were reused much • 9% run by 3 or more people • 7% run at least 6 times per user • 5% customized by any other user • 4% copied by any other user • Ultimately: Discontinued by IBM • Sustainability is a challenge!!!
Prior work has shown it is possible to predict which web macros would be reused • Suppose a repository could predict from the moment of a macro’s creation whether it would be reused, so the search engine could emphasize or downplay the macro accordingly • Prior work • Collected 35 features of macros that seemed plausibly related to the understandability and modifiability of the macros, plus measures of reuse • Trained machine learning models to predict which macros would be reused (one model per measure of reuse) • Result: True positive rates of up to 90% at false positive rates in the 10%-40% range • Similar results when replicated with two other repositories of EUPs’ code
Key limitations of that prior work • Predicted reuse, not reusability: Users might reuse enticing but low-quality code and then regret it. Sometimes, reuse != reusability. • Predicted binary measures: We would need to estimate level of reusability for sorting, not merely whether it will or will not be reused. • Relied only on data available at macro creation: Data such as user-generated ratings might help inform reusability estimates. • Provided no search engine: A proof of concept implementation would help to clarify any remaining technical hurdles.
Goal: An approach for modeling reusability of EUPs’ code, for use in sorting search results • Start with an existing repository that EUPs have used for a while • Define and compute features for EUPs’ macros in the repository • Reduce the feature set with factor analysis • Construct a model of reusability by linear regression of an expert user’s estimate of macro reusability versus the computed features • Sort macros by estimated reusability (at least in part) in search engine • Evaluate reusability estimates with another panel of experts as they use the search engine, and iterate the model in the search engine
Step 1: Getting a repository of EUPs’ code • CoScripter • Already had been in operation for approximately 5 years (since early 2008) • Already well-familiar with the repository due to our prior work • Already had a well-developed list of candidate features due to prior work • Already had permission to scrape macros and other data from the repository
Step 2: Defining features for macros • Selected 8 features from the 35 investigated in prior work • Statistically associated with reuse in both prior studies • Could be computed directly and automatically from available data • E.g., # comments, # parameters, 1 or 0 indicating if macro has a title • Created 21 features as refinements of the 35 from prior work • Macro age, and 20 different counts of code length • Created 8 new features based on new data suggesting user interest • Not previously considered, as these data accrue after macro creation • E.g., # times run, # users who ran it, # revisions, # comments about it
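A minimal sketch of what computing a few of these features could look like. The `Macro` record layout, the `#` comment prefix, and the `your "` parameter marker are all assumptions made for illustration; the slides do not say how the scraped macros were stored or parsed.

```python
from dataclasses import dataclass
from typing import Dict, List

COMMENT_PREFIX = "#"     # assumption: how a comment line is marked
PARAM_MARKER = 'your "'  # assumption: how a user-supplied parameter is marked


@dataclass
class Macro:
    # Hypothetical record layout for one scraped macro.
    title: str
    lines: List[str]
    run_count: int = 0         # accrues after creation
    distinct_runners: int = 0  # accrues after creation


def compute_features(macro: Macro) -> Dict[str, int]:
    """Compute a handful of illustrative features for one macro."""
    stripped = [line.strip() for line in macro.lines]
    comments = [s for s in stripped if s.startswith(COMMENT_PREFIX)]
    code = [s for s in stripped if s and not s.startswith(COMMENT_PREFIX)]
    return {
        "num_comments": len(comments),
        "num_parameters": sum(s.count(PARAM_MARKER) for s in code),
        "has_title": 1 if macro.title.strip() else 0,
        "num_code_lines": len(code),
        "times_run": macro.run_count,
        "users_who_ran": macro.distinct_runners,
    }


example = Macro(
    title="Search for houses",
    lines=[
        "# Look up homes for sale",
        'go to "example.com"',
        'enter your "zip code" into the "Location" textbox',
        'click the "Search" button',
        "",
    ],
    run_count=12,
    distinct_runners=3,
)
features = compute_features(example)
```

The first three keys mirror the examples named on the slide (# comments, # parameters, title indicator); the last two stand in for the usage-based features that accrue after creation.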
Step 3: Reducing features data with factor analysis • Factor = linear combination of features that are mutually correlated • Procedure • Randomly selected 100 macros • Computed our 37 features for each macro • Performed factor analysis • Discarded all but the most salient factors (optimal coordinates method) • Result: 8 factors containing 17 features • Most of these retained features were related to code size, comments, and numbers of runs (e.g., total count or normalized by number of users)
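To make "factor = linear combination of features" concrete, here is a small sketch that turns feature values into one factor score per macro. It does not perform the factor analysis itself: the weights are hypothetical placeholders, not the loadings found in the study, and standardizing each feature to z-scores first is an assumption.

```python
from statistics import mean, pstdev
from typing import Dict, List


def standardize(column: List[float]) -> List[float]:
    """Convert raw feature values to z-scores (0.0 if the column is constant)."""
    m, s = mean(column), pstdev(column)
    return [(x - m) / s if s else 0.0 for x in column]


def factor_scores(rows: List[Dict[str, float]],
                  weights: Dict[str, float]) -> List[float]:
    """Score each macro on one factor as a weighted sum of standardized features."""
    names = list(weights)
    z = {name: standardize([row[name] for row in rows]) for name in names}
    return [sum(weights[name] * z[name][i] for name in names)
            for i in range(len(rows))]


# Three toy macros with two correlated size-related features.
rows = [
    {"num_code_lines": 10, "num_comments": 0},
    {"num_code_lines": 20, "num_comments": 1},
    {"num_code_lines": 30, "num_comments": 2},
]
# Hypothetical weights for a "code size" factor.
size_weights = {"num_code_lines": 0.5, "num_comments": 0.5}
scores = factor_scores(rows, size_weights)
```

In the study, this step would yield 8 such scores per macro, one for each retained factor.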
Step 4: Constructing a model of reusability • Linear regression of reusability estimates versus factors • From Step 3, we could compute 8 factor scores as linear combinations of features • But just because factors exist doesn’t mean they are actually related to reusability! • So: Linear regression w/ depvar = reusability estimate, 8 indepvars = factor scores • Procedure • One team member (who did not help with defining or computing features) gave reusability estimate (range 1-4) to each of the 100 web macros • Result: Linear model that estimates reusability based on the features • Linear regression was highly significant (P=0.003) • 7 out of 8 factors had non-zero coefficients
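The regression step can be sketched as ordinary least squares with the reusability estimate as the dependent variable and the factor scores as independent variables. This is an illustrative pure-Python solver using the normal equations, not the statistical package the authors used; it uses 2 factors instead of 8 and made-up data, and it does not compute significance values.

```python
from typing import List, Sequence


def fit_linear_model(X: Sequence[Sequence[float]],
                     y: Sequence[float]) -> List[float]:
    """Ordinary least squares; returns [intercept, b1, ..., bk]."""
    rows = [[1.0] + [float(v) for v in x] for x in X]  # prepend intercept term
    k = len(rows[0])
    # Augmented matrix for the normal equations (A^T A) b = A^T y.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(k):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][k] for i in range(k)]


def estimate_reusability(coeffs: Sequence[float],
                         factors: Sequence[float]) -> float:
    """Apply the fitted model to one macro's factor scores."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], factors))


# Made-up training data: factor scores and an expert's ratings.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [2.0, 3.0, 1.5, 2.5, 3.5]
coeffs = fit_linear_model(X, y)
```

Once fitted, `estimate_reusability(coeffs, scores)` gives the value that the search engine can sort on.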
Step 5: Searching for code based on reusability • Code You Can Use (CYCU) (pronounced “cuckoo”) • Compute reusability estimates offline • When user enters query, forward query to CoScripter repository, get back a list of macros, look up reusability estimates, and sort by estimated reusability • [Diagram labels: Keywords, Search results, (offline)]
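The lookup-and-sort part of this flow can be sketched in a few lines. The macro IDs, the estimates table, and the default score for macros without an estimate are assumptions for illustration; forwarding the query to the repository is left out.

```python
from typing import Dict, List


def sort_by_reusability(result_ids: List[str],
                        estimates: Dict[str, float],
                        default: float = 0.0) -> List[str]:
    """Order repository results by precomputed reusability, highest first.

    Macros with no stored estimate get `default`. Python's sort is stable,
    so ties keep the repository's original ordering.
    """
    return sorted(result_ids,
                  key=lambda macro_id: estimates.get(macro_id, default),
                  reverse=True)


# Estimates computed offline (hypothetical values).
estimates = {"m1": 2.1, "m2": 3.7, "m3": 1.4}
# IDs in the order the repository returned them for a query.
repository_results = ["m1", "m2", "m3", "m4"]
ranked = sort_by_reusability(repository_results, estimates)
```

Because the estimates are computed offline, the per-query cost is only a dictionary lookup and a sort over the returned list.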
Step 6: Evaluating reusability estimates with another panel of expert users • Using a different set of users than the one who gave initial estimates • Needed users who were pretty good at programming but who could approach CoScripter as an EUP tool rather than as a professional programming tool • 4 CS students, only one of whom had any experience as a professional programmer (<2 years), but all of whom were seniors or master’s students • Using a different set of macros than those used to create the model • Manually reviewed CoScripter repository to see what was popular lately • Identified two themes: searching for houses and checking for flight information • Each of the 4 participants rated 20 of the 40 test macros • i.e., 2 participants rated each test macro
We collected 2 user-assessed reusability measures and 1 user-assessed relevance measure • Randomly ordered the macros and asked participants to rate (on a 4-point Likert scale)… • How helpful is this code in learning CoScripter? • How easy is it to understand the code? • How relevant is the code to the search term ‘search for houses’ [or ‘check airlines’]? • We expected that our reusability estimates… • Would significantly correlate with learnability and understandability ratings • Would not significantly correlate with relevance ratings
Result: Significant correlations appeared on all three measures • Regression of each measure for each macro (averaged over participants) against reusability estimate • Note: Analysis utilized data for only 39 macros… one participant chose to skip a macro.
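A sketch of the analysis shape described here: average each macro's ratings over participants, then relate the averages to the reusability estimates. It computes a Pearson correlation coefficient rather than a full regression with significance tests, the data are made up, and dropping any macro with fewer than two ratings is an assumed rule for handling the skipped macro.

```python
from math import sqrt
from statistics import mean
from typing import Dict, List, Sequence


def average_ratings(ratings: Dict[str, List[int]],
                    min_raters: int = 2) -> Dict[str, float]:
    """Average each macro's ratings; drop macros with too few ratings."""
    return {macro_id: mean(values)
            for macro_id, values in ratings.items()
            if len(values) >= min_raters}


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up understandability ratings on the 4-point scale.
ratings = {"a": [1, 2], "b": [3, 3], "c": [4, 4], "d": [3]}
# Made-up model estimates for the same macros.
model_estimates = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 2.5}

averaged = average_ratings(ratings)
macro_ids = sorted(averaged)
r = pearson_r([model_estimates[m] for m in macro_ids],
              [averaged[m] for m in macro_ids])
```

The same computation would be repeated once per measure (learnability, understandability, relevance).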
Further work could address threats to validity and limitations of this study • Different kinds of macros require different models of reusability • Indeed, our prior work showed different kinds of scripts require somewhat different features. • But the overall approach (compute features, combine features, validate) should methodologically generalize at least across textual scripting languages. • More sophisticated methods might be better for sorting search results based on integrating relevance with reusability estimates • Tool-builders might find this approach more onerous than we did • We built on hundreds of hours of our own prior work • Crucial work remains on overcoming barriers to tech transfer of EUP research
Exciting opportunities now exist for moving quality-based code search toward practice • Key contributions • New approach for modeling reusability of EUPs’ scripts • Demonstration of how such a model can be used in a search engine • Next steps • Elucidating and countering risks of users “gaming” the system by artificially boosting the apparent reusability of their code • Begin integrating reusability models into other, more sophisticated browsing and search methods (e.g., collaborative filtering or other search tools) • Investigating the impacts of applying this approach on day-to-day practice with a larger repository (e.g., impacts on learning by Scratch users) • Working with industry partners to apply this approach in their own repositories
Thank you • To you for your attention, interest, and ideas • To the VL/HCC reviewers for your compliments and suggestions • To IBM for permission to scrape the CoScripter repository • To the National Science Foundation for funding