Global Distribution of Developers for Open Source Projects: Insights and Analysis

Explore the geographical distribution of developers of open-source projects hosted on SourceForge. Learn about the data gathering methods, results, and conclusions, which shed light on the global landscape of software development.

francisg

Presentation Transcript


  1. Geographical Locations of Developers at SourceForge • Gregorio Robles, Jesus M. Gonzalez-Barahona • Presented by Brian Chan, Cisc 864

  2. Overview • Background • Motivation • Data Gathering Methods • Results • Conclusions

  3. Background Information • Developers on a project are distributed across the world • e.g. libre software projects • Hard to account for all the developers, and harder still to control these resources

  4. Motivation • To accurately account for all personnel in the world for a given project • Interesting for academic and economic reasons

  5. Data Gathering Methods • Two Primary Sources of Information: • Private email address • Time Zone of the User • Acquired from special database for research purposes

  6. Data Gathering Methods • Useful information in email and time zone: • ccTLD (country code top-level domain) • e.g. gsyc.escet.urjc.es, where .es indicates Spain • Figure 1.0 • Figure 2.0
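
The ccTLD idea on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' tooling: the function names `tld_of` and `country_hint` and the mailbox name `user` are hypothetical, and the sketch assumes that any two-letter top-level label is a country code.

```python
def tld_of(email):
    """Return the last label of the email's domain, lowercased."""
    domain = email.rsplit("@", 1)[-1].strip().lower().rstrip(".")
    return domain.rsplit(".", 1)[-1]


def country_hint(email):
    """Return the two-letter ccTLD if the address has one, else None.

    Assumption: two-letter top-level labels are country codes, and
    anything longer (com, org, net, ...) is treated as a gTLD that
    carries no location information.
    """
    tld = tld_of(email)
    if len(tld) == 2 and tld.isalpha():
        return tld
    return None


# The domain is the one from the slide; the mailbox name is made up.
print(country_hint("user@gsyc.escet.urjc.es"))
print(country_hint("user@example.com"))
```

A ccTLD is only a hint: some country codes are widely used outside their country, so a real analysis would need to treat those separately.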

  7. Data Gathering Methods • Hard to pinpoint location when data consists of: • gTLD (generic top-level domain) • Time zones set to GMT (Greenwich Mean Time) • Figure 3.0

  8. Data Gathering Methods • Use a distributed method to estimate where unlocated users should go • e.g. 22 users in one domain, 10 of whom are unaccounted for because their time zone is GMT only • Figure 4.0
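
One way to read the 22-user example is as a proportional redistribution: the 12 users with an informative time zone define the shares, and the 10 GMT-only users are spread across regions according to those shares. The sketch below follows that reading; it is an assumption about the method rather than the formula from the paper, and the function name `redistribute`, the region labels, and the 9/3 split of the 12 located users are all hypothetical.

```python
def redistribute(known_counts, unknown):
    """Spread `unknown` users over regions in proportion to known counts.

    known_counts: mapping of region -> users whose region is known.
    unknown: number of users with no usable location (e.g. GMT only).
    Returns a mapping of region -> estimated users (may be fractional).
    """
    total_known = sum(known_counts.values())
    if total_known == 0:
        # No information to base the shares on; leave counts unchanged.
        return dict(known_counts)
    return {
        region: count + unknown * count / total_known
        for region, count in known_counts.items()
    }


# 22 users in one domain: 12 located (hypothetical 9/3 split), 10 GMT only.
located = {"region_a": 9, "region_b": 3}
estimate = redistribute(located, unknown=10)
print(estimate)
print(sum(estimate.values()))
```

The estimates are aggregate and fractional by design: the method says how many developers a region probably has, not where any single user is.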

  9. Data Gathering Methods • What if the user was actually in the GMT time zone? • Need to rebalance the equation to account for data that was ignored • Figure 5.0

  10. Data Gathering Methods • Weight results by that factor • Assumes the ratio of own time zone to GMT is the same for GMT countries as for non-GMT regions • Figure 6.0

  11. Data Gathering Methods • Different types of data sets • Figure 7.0

  12. Results • Top 50 countries account for 96.5% of developers in SourceForge • Top 20 countries account for 83.9% of developers in SourceForge • Figure 8.0

  13. Results • Most developers are from Europe and North America, in an almost 50-50 ratio • Penetration of libre software is higher in North America because Europe has a higher population • Figure 9.0
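
The penetration argument is a per-capita comparison: with roughly equal developer counts, the region with the smaller population has more developers per inhabitant. A minimal sketch follows. The function name `developers_per_million` is hypothetical, and the numbers are placeholders chosen for illustration, not figures from the paper.

```python
def developers_per_million(developers, population):
    """Return developers per one million inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return developers * 1_000_000 / population


# Placeholder values only: equal developer counts, different populations.
regions = {
    "smaller_region": (1000, 500_000_000),
    "larger_region": (1000, 700_000_000),
}

for name, (developers, population) in regions.items():
    print(name, developers_per_million(developers, population))
```

With equal developer counts, the smaller region always comes out ahead, which is the reasoning the slide applies to North America and Europe.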

  14. Conclusion • Method for redistributing developers to their place of origin • Does not identify individual users with a single geographical location, but gives aggregate numbers of developers of a certain national origin • Can be used to look for correlations with GDP, GDP per capita, or other economic patterns

  15. Personal Thoughts • Good Points • Interesting results: North America accounts for almost half of total activity • Interesting method for redistributing data of unknown origin to specific regions

  16. Personal Thoughts • Points for Improvement • Questionable: is it really that hard to ascertain a developer's nationality or geographical location from their profile entry (even though it is private information)? • SourceForge may be one of the most common open source systems, but is it indicative of all open source systems?

  17. Questions • Comments
