Copy to Create: Patterns of copying and reuse in Web 2.0 • M. Cameron Jones, mjones2@uiuc.edu, http://cameronjones.com/ • Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
How It’s Made • How are information technologies created? • By professionals • By novice programmers • By end-users • What aspects of the technology landscape enable or facilitate technology creation? • How can we better design systems and tools to support IT production?
The “Copying Meme” • People often create things by sharing with, and copying from, each other • People learn how to create things by copying (Mackay, 1990; Nardi & Miller, 1991) • Copying and sharing code helps bridge the “expertise gap” (Nardi & Miller, 1990) • Professional programmers also copy source code when programming (Kim et al., 2005)
What is “Web 2.0”? Markus Angermeier retrieved from http://kosmar.de/wp-content/web20map.png
How does Web 2.0 function? • Who is Web 2.0? • How do people… • contribute? • create content? • program mashups? • consume and recycle material? • What is unique about these interactions? • What is familiar?
Outline • Web Mashups • Recent research on mashup programming • Yahoo! Pipes (authorship in Pipes; social networks of cloning) • Mapping Mashups (Google Maps; Yahoo! Maps)
Web Mashups • Websites which combine data and services from across the web • What is interesting about web mashups? • Creativity • Innovation • Exploration • Functional “solutions” • Store (and share) knowledge/expertise
HousingMaps.com • HousingMaps.com = Google Maps + Craigslist
Student Projects Campus Schedule Planner
Student Projects Champaign-Urbana Bus Route Planner
… but Mashups are hard! • The examples presented are the exception, not the norm (e.g., Jones & Twidale, 2006; Jones, Twidale, & Urban, 2007) • High threshold (lots of diverse knowledge and skills needed) • Cryptic APIs (each one is different, and often changes) • But there may be hope!
Yahoo! Pipes Data • Fields shown for each pipe: pipe creator, pipe name, date published, number of times copied • [Screenshot taken June 18, 2007 from: http://pipes.yahoo.com/pipes/pipes.popular]
Authorship in Pipes • 18,680 pipes (collected June 6, 2007) • 11,868 authors
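The authorship counts above can be turned into a pipes-per-author distribution, which is the shape the long-tail comparison relies on. The following is a minimal sketch, assuming each collected pipe is a record with an `author` field; the record layout, the sample data, and the function names are hypothetical and not taken from the study. The slide's own totals imply a mean of about 1.57 pipes per author.

```javascript
// Hypothetical sample records; the author names are illustrative only.
const samplePipes = [
  { id: "p1", author: "alice" },
  { id: "p2", author: "alice" },
  { id: "p3", author: "alice" },
  { id: "p4", author: "bob" },
  { id: "p5", author: "carol" },
];

// Count how many pipes each author has published.
function pipesPerAuthor(pipes) {
  const counts = new Map();
  for (const { author } of pipes) {
    counts.set(author, (counts.get(author) || 0) + 1);
  }
  return counts;
}

// Turn per-author counts into [pipeCount, numberOfAuthors] pairs,
// sorted by pipeCount, i.e. the frequency distribution of authorship.
function authorshipDistribution(pipes) {
  const dist = new Map();
  for (const n of pipesPerAuthor(pipes).values()) {
    dist.set(n, (dist.get(n) || 0) + 1);
  }
  return [...dist.entries()].sort((a, b) => a[0] - b[0]);
}

console.log(authorshipDistribution(samplePipes));

// Mean pipes per author implied by the totals on the slide.
console.log((18680 / 11868).toFixed(2));
```

In a long-tailed data set, most authors would fall in the first pair (one pipe each) and only a handful would appear at the high counts.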
Long-tail Distributions • Book sales • City population sizes • Web page hits • Authorship in scholarly journals • Authorship in the Wikipedia • Citation counts in scholarly writing • Project membership in Open-Source
Yahoo! Pipes Data Copies identified by the name
Social Networks of cloning • 1,856 clones identified by names of the form “Copy of …” or “… copy” • Who was cloning from whom was identified by trying to determine the author of the original pipe being cloned (an inexact measure) • 1,579 pipe authors as nodes in the network • 1,483 edges representing the “cloned a pipe from” relationship
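The name-based clone identification described above might be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the record fields (`name`, `author`), the function names, and the rule of linking a clone only when exactly one other author owns a pipe with the stripped name are all hypothetical choices. The sample pipe titles are borrowed from cluster labels elsewhere in the deck, and the author names are made up.

```javascript
// Return the presumed original name for a clone, or null if the
// name does not follow the "Copy of ..." or "... copy" pattern.
function originalName(name) {
  const prefix = /^copy of\s+(.+)$/i.exec(name);
  if (prefix) return prefix[1].trim();
  const suffix = /^(.+?)\s+copy$/i.exec(name);
  if (suffix) return suffix[1].trim();
  return null;
}

// Build "cloned a pipe from" edges between authors.
// Assumption: a clone is linked only when exactly one other author
// owns a non-clone pipe with the stripped name; ambiguous cases are skipped.
function buildCloneEdges(pipes) {
  const authorsByName = new Map();
  for (const p of pipes) {
    if (originalName(p.name) !== null) continue; // skip clones
    const key = p.name.trim().toLowerCase();
    if (!authorsByName.has(key)) authorsByName.set(key, new Set());
    authorsByName.get(key).add(p.author);
  }
  const edges = [];
  for (const p of pipes) {
    const orig = originalName(p.name);
    if (orig === null) continue;
    const candidates = authorsByName.get(orig.toLowerCase());
    if (!candidates) continue; // original not found in the collection
    const others = [...candidates].filter((a) => a !== p.author);
    if (others.length === 1) edges.push({ from: p.author, to: others[0] });
  }
  return edges;
}

// Hypothetical sample data.
const clonePipes = [
  { name: "Aggregated News Alerts", author: "ann" },
  { name: "Copy of Aggregated News Alerts", author: "bob" },
  { name: "eBay Price Watch", author: "dee" },
  { name: "eBay Price Watch copy", author: "cy" },
  { name: "Copy of Missing Pipe", author: "eve" },
];

console.log(buildCloneEdges(clonePipes));
```

Skipping unmatched or ambiguous names is one possible reason an edge count would come out lower than the clone count, although the slides do not say how such cases were handled.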
Social network of Pipe Cloning • [Network figure; labeled authors: absoluttodd24, DanielRaffel, Pasha Sadri, Edward H, franticindustries]
What pipes are being cloned? Examples (DanielRaffel, Edward H, Pasha Sadri) Cloned pipes
Clusters of clones Aggregated News Alerts Example: pipes Apartment NearSomething & del.icio.us Web Search eBay Price Watch
Factors determining cloning franticindustries Activating social ties
Factors determining cloning Cumulative Advantage Distributions (Simon, 1957; Price, 1976)
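As a rough illustration of the cumulative-advantage mechanism, the sketch below simulates a process in which each cloning event either introduces a new pipe (with probability `pNew`) or copies an existing pipe with probability proportional to how often it has already been copied. This is a toy model for intuition only and is not the analysis behind the slide; the function names, parameters, and the small seeded random generator are assumptions made for the example.

```javascript
// Small seeded pseudo-random generator (mulberry32-style) so runs repeat.
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Simulate `steps` events. counts[i] is the number of events credited to pipe i.
function simulateCloning(steps, pNew, seed) {
  const rand = seededRandom(seed);
  const counts = [1]; // start with a single pipe
  const tokens = [0]; // one entry per event; uniform picks are count-proportional
  for (let s = 1; s < steps; s++) {
    if (rand() < pNew) {
      counts.push(1);
      tokens.push(counts.length - 1);
    } else {
      const pick = tokens[Math.floor(rand() * tokens.length)];
      counts[pick] += 1;
      tokens.push(pick);
    }
  }
  return counts;
}

const simulated = simulateCloning(1000, 0.3, 42);
const largest = Math.max(...simulated);
console.log(`pipes: ${simulated.length}, most-copied pipe: ${largest} events`);
```

Under this kind of process, early or already-popular pipes tend to accumulate a disproportionate share of copies, which is the qualitative pattern a cumulative-advantage distribution describes.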
Further topics to be explored • What modules are most frequently used? • What modules are most frequently cloned? • How are the pipes modified or changed when they are cloned? Are they modified? • Are portions (subsets of modules) of pipes copied into new pipes? What subsets?
Copying Code in Programming • What about mashup programming more generally, where there isn’t a simple “clone” function? • HTML: “View Source” • Rosson and Carroll (1993): a qualitative study of professional Smalltalk programmers found that code was often copied from documentation examples.
Map-based Web Mashups • ~58% of mashups on Programmable Web are mapping mashups (1,178/2,038) • Three main API providers: • Google Maps (50% of all mashups) • Yahoo Maps (4% of all mashups) • Microsoft Virtual Earth (4% of all mashups) • How are people coding and constructing map mashups?
Data Collection - Mashups • Downloaded JavaScript source code for all mapping mashups listed • Problems: dead links and inaccurate URLs • Google Maps: 494 unique mashups • Yahoo Maps: 94 unique mashups • Microsoft Virtual Earth: 17 unique mashups
Data Collection - Snippets • Downloaded JavaScript example snippets from API provider documentation • Google Maps: 32 (11+21) example snippets • Yahoo Maps: 16 example snippets • Microsoft Virtual Earth: 65 example snippets • Microsoft Virtual Earth excluded from further analysis (not enough data)
Data Analysis • Clone analysis on source code • Identify “code clones” in the mashup code • Clone pair: “a pair of source code segments that are structurally or syntactically similar” (Kapser & Godfrey, 2003) • Use source code cloning to identify what code is being copied and how mashups are related
Software Clones

    function load() {
      if (GBrowserIsCompatible()) {
        var map = new GMap2(document.getElementById("map"));
        map.addControl(new GSmallMapControl());
        map.addControl(new GMapTypeControl());
        map.setCenter(new GLatLng(37.4419, -122.1419), 13);
        // Create our "tiny" marker icon
        var icon = new GIcon();
        icon.image = "/ridefinder/images/mm_20_red.png";
        icon.shadow = "/ridefinder/images/mm_20_shadow.png";
        icon.iconSize = new GSize(12, 20);
        icon.shadowSize = new GSize(22, 20);
        icon.iconAnchor = new GPoint(6, 20);
        icon.infoWindowAnchor = new GPoint(5, 1);
        // Add 10 markers to the map at random locations
        var bounds = map.getBounds();
        var southWest = bounds.getSouthWest();
        var northEast = bounds.getNorthEast();
        var lngSpan = northEast.lng() - southWest.lng();

    function load() {
      if (GBrowserIsCompatible()) {
        var map = new GMap2(document.getElementById("map"));
        map.addControl(new GSmallMapControl());
        map.addControl(new GMapTypeControl());
        map.setCenter(new GLatLng(37.4419, -122.1419), 13);
      }
    }

From: gmap.doc.3.mash From: gmap.doc.15.mash
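To make the clone-pair idea concrete, here is a deliberately simplified sketch that reports runs of identical consecutive lines shared by two sources after whitespace is removed. It is a stand-in for illustration only: it matches exact lines rather than structural or syntactic similarity, it is not the clone-detection tool used in the study, and the function names, the `minLines` threshold, and the sample strings are all assumptions.

```javascript
// Split a source into lines with all whitespace removed; drop empty lines.
function normalizeLines(source) {
  return source
    .split("\n")
    .map((line) => line.replace(/\s+/g, ""))
    .filter((line) => line.length > 0);
}

// Report maximal runs of at least `minLines` identical consecutive lines.
// Positions are 0-based indexes into the normalized line lists.
function findClonePairs(srcA, srcB, minLines = 3) {
  const a = normalizeLines(srcA);
  const b = normalizeLines(srcB);
  // run[i][j] = length of the matching run ending at a[i-1] and b[j-1]
  const run = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      if (a[i - 1] === b[j - 1]) run[i][j] = run[i - 1][j - 1] + 1;
    }
  }
  const clones = [];
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const len = run[i][j];
      const continues = i < a.length && j < b.length && run[i + 1][j + 1] > 0;
      if (len >= minLines && !continues) {
        clones.push({ startA: i - len, startB: j - len, length: len });
      }
    }
  }
  return clones;
}

// Hypothetical inputs: a documentation-style snippet and a mashup that reuses it.
const docSnippet = `
function load() {
  if (GBrowserIsCompatible()) {
    var map = new GMap2(document.getElementById("map"));
    map.setCenter(new GLatLng(37.4419, -122.1419), 13);
  }
}`;

const mashupSource = `
var title = "My mashup";
function load() {
    if (GBrowserIsCompatible()) {
        var map = new GMap2(document.getElementById("map"));
        map.setCenter(new GLatLng(40.1106, -88.2073), 12);
    }
}`;

console.log(findClonePairs(docSnippet, mashupSource, 3));
```

Here the first three lines of the snippet are reported as a clone, while the `setCenter` line breaks the run because its coordinates were changed. A detector based on token or syntax-tree similarity would be needed to treat that edited line as part of the same clone.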
Data Analysis • Filter applications w/o clones • Google Maps: 505 applications • Yahoo Maps: 101 applications • Filter intra-application clones • Google Maps: 5,731 clones • Yahoo Maps: 2,718 clones • Dichotomous application-by-clone occurrence matrix • Hamming %-difference distance measure • Classic, metric multi-dimensional scaling
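The matrix and distance steps listed above might look like the following sketch. It assumes each application is described by the set of clone identifiers found in it, builds a 0/1 application-by-clone matrix, and computes the Hamming distance as the fraction of clone columns on which two applications differ. The data layout, identifiers, and function names are hypothetical, and the multi-dimensional scaling step is not implemented here; the resulting distance matrix would be handed to an MDS routine.

```javascript
// Build a dichotomous (0/1) application-by-clone occurrence matrix.
// appClones: { appName: [cloneId, ...] }, cloneIds: column order.
function occurrenceMatrix(appClones, cloneIds) {
  const apps = Object.keys(appClones);
  const rows = apps.map((app) => {
    const present = new Set(appClones[app]);
    return cloneIds.map((id) => (present.has(id) ? 1 : 0));
  });
  return { apps, rows };
}

// Hamming %-difference: share of positions where two 0/1 rows disagree.
function hammingPercent(rowA, rowB) {
  if (rowA.length !== rowB.length) throw new Error("rows must have equal length");
  let differing = 0;
  for (let k = 0; k < rowA.length; k++) {
    if (rowA[k] !== rowB[k]) differing += 1;
  }
  return differing / rowA.length;
}

// Pairwise distance matrix over all applications.
function distanceMatrix(rows) {
  return rows.map((a) => rows.map((b) => hammingPercent(a, b)));
}

// Hypothetical sample: three applications and four clone classes.
const sampleApps = {
  appA: ["c1", "c2"],
  appB: ["c1", "c2"],
  appC: ["c3"],
};
const sampleCloneIds = ["c1", "c2", "c3", "c4"];

const { apps, rows } = occurrenceMatrix(sampleApps, sampleCloneIds);
const distances = distanceMatrix(rows);
console.log(apps, distances);
```

Applications that share the same clones, such as `appA` and `appB` here, end up at distance 0 and would be plotted on top of each other after scaling, while applications with disjoint clone sets land further apart.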
Yahoo Maps Clones • [Figure; labeled clusters: Yahoo! Widgets, Microsoft Virtual Earth, MochiKit]
Conclusions • Copying and sharing are essential components of technology production • Examples and documentation are heavily used in the processes of creating and learning to create • What is familiar? • Patterns of production appear to be similar to other media and contexts • What is unique about copying on the web? • The scale of web systems: many of the statistical tests and measures do not adequately cope.
Future Research • Collect code snippets from other sources • Forums • Mailing lists • Coding websites • Collect code from other sources • PHP-language Open-Source projects • Analyze and classify the clones • What code is being copied? Why? • Other domains and contexts
Copying Across the web • MySpace layouts • RSS aggregation • SpamBlogs • Video and music mashups • CyWorld “scraping” • Online learning (Inquiry Page)
Scrapbooking in CyWorld Source of “scrap”