This article explores the challenges faced by online communities due to networking overload and the importance of using XML vocabularies for social network analysis and optimization. It proposes measurements for social network health and suggests requirements for an XML vocabulary that supports high levels of social network health. The article also discusses the history of social software and current efforts in the field.
XML Vocabularies for Online Communities: Past, Present, and Future By William Barnhill
The Problem • Online communities growing at ever-increasing rates • Users experiencing networking overload • XML vocabularies used by communities must aid social network analysis and optimization • Otherwise social networks will keep decreasing in value as they increase in size
The Goals • A reasonable working definition of Social Network health • A list of suggested requirements for an XML vocabulary, or combination of vocabularies, that aid in creating and maintaining high levels of Social Network health • Selection of one such vocabulary combination • Description of the benefits such a combination would bring to enable an as yet fictional ideal community
Proposed measurements for Social Network (SN) Health • We propose Social Network health is proportional to... • The benefit the average SN member derives • Inter-connectedness (Reach) among members • The number of connected, healthy Social Networks [modified Reed's Law] • Speed at which good ideas are propagated • The percentage of the network’s members who are Mavens[1] (experts who love to teach) • The percentage of the network’s members who are Connectors[1] (who know members of many networks) • [1] See “The Tipping Point”, Malcolm Gladwell
Proposed measurements for Social Network (SN) Health, Pt. 2 • We propose Social Network health is inversely proportional to: • Social Network member turnover rate • # of connected Social Networks with low health • Allen distance = Abs(#members – 50) • Average Dunbar distance (D) of members • D(m) = abs(# of m's stable relationships - 150) • Speed at which bad ideas are propagated • Percentage of the network’s members who are Leeches (love to learn, but only take information/energy, not contributing it back to the group)
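The two distance measures above are simple enough to compute directly. The following is a minimal sketch that assumes member counts and per-member counts of stable relationships are already known; the function names and the sample figures are illustrative and are not part of the presentation.

```python
def allen_distance(member_count: int) -> int:
    """Allen distance as given on the slide: abs(#members - 50)."""
    return abs(member_count - 50)


def dunbar_distance(stable_relationships: int) -> int:
    """Dunbar distance D(m) = abs(# of m's stable relationships - 150)."""
    return abs(stable_relationships - 150)


def average_dunbar_distance(relationship_counts: list[int]) -> float:
    """Mean of D(m) over all members; 0.0 for an empty network (an assumption)."""
    if not relationship_counts:
        return 0.0
    total = sum(dunbar_distance(n) for n in relationship_counts)
    return total / len(relationship_counts)


# Hypothetical network: 80 members, three of whom are sampled here.
print(allen_distance(80))                        # 30
print(average_dunbar_distance([150, 90, 210]))   # 40.0
```

Both values count against health under the proposal, so a lower result is better in each case.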
What is Social Software? • Wikipedia: • applications which facilitate virtual connection and collaboration between people on a network. It is sometimes described succinctly as "connection comes before content." • K. Eric Drexler, promoted by Clay Shirky: • software that supports group communications • Tom Coates • Social Software can be loosely defined as software which supports, extends, or derives added value from, human social behaviour • Another working definition • Social software is software that aids in one or more of the formation, maintenance, and/or securing of one or more social networks.
History (1970s-2000s) • 1970s • EIES • PLATO • Email: cc line • 1980s • Groupware • CSCW • 1990s • Groupware • Social Software slow growth • 2000s • Social Software takes off • Source: http://www.lifewithalacrity.com/2004/10/tracing_the_evo.html
Current Efforts • Blogging becoming mainstream • Communities • IDCommons • PlanetWork • Ryze • LinkedIn • Tribes.Net • LiveJournal • Non-text blogging (voice, photo, and others) • Purchase circles • Folksonomies • Social Network Analysis tools - BlogDex
Vocabulary Challenges • Tool and content authors in the domain are often not graph theory experts, and want to read/write raw data • No well-defined set of required information to describe a person • Vocabulary will likely be extended within a short time for specific types of networks • For many networks, keeping data within communities of trust is a necessity • The amount of data within a Social Network can be very large, requiring efficient parsing
Vocabulary features – A Starting Point • Start with DyNetML “Requirements for Data Interchange”, http://casos.isri.cmu.edu/dynetml/index.html • contained in easily parsed & human-readable text files • allow an entire dataset, w/computed measurements, in one file • provide maximum expressive power (see next slide) • allow developers to extend it in a fashion that will not break existing software • flexible enough to be used as both input and output of analysis tools.
Features for expressiveness • Typed nodes (types may include "person", "resource", "organization", "knowledge", etc) • Multiple sets of nodes of the same type (to express multiple units within the company, etc) • Multiple typed attributes per node • Typed edges • Multiple typed attributes per edge • Multiple graphs (sets of edges), and dynamic network data, expressed within the same file
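To make the expressiveness features concrete, here is one possible in-memory model of such a dataset. It is only a sketch: the class and field names (`Node`, `Edge`, `Dataset`, `node_sets`, `graphs`) are assumptions made for illustration and do not come from DyNetML or any other vocabulary discussed here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    node_type: str                                   # e.g. "person", "organization"
    attributes: dict = field(default_factory=dict)   # multiple attributes per node


@dataclass
class Edge:
    source: str
    target: str
    edge_type: str                                   # typed edge, e.g. "reports_to"
    attributes: dict = field(default_factory=dict)   # multiple attributes per edge


@dataclass
class Dataset:
    # Several named node sets may hold nodes of the same type (e.g. two units).
    node_sets: dict = field(default_factory=dict)    # set name -> list of Node
    # Several named graphs (sets of edges) live in the same dataset.
    graphs: dict = field(default_factory=dict)       # graph name -> list of Edge

    def add_node(self, set_name: str, node: Node) -> None:
        self.node_sets.setdefault(set_name, []).append(node)

    def add_edge(self, graph_name: str, edge: Edge) -> None:
        self.graphs.setdefault(graph_name, []).append(edge)

    def nodes_of_type(self, node_type: str) -> list:
        return [n for nodes in self.node_sets.values()
                for n in nodes if n.node_type == node_type]


# Hypothetical data: two units of one company, one friendship graph.
dataset = Dataset()
dataset.add_node("sales", Node("alice", "person", {"age": 34}))
dataset.add_node("engineering", Node("bob", "person"))
dataset.add_node("engineering", Node("wiki", "resource"))
dataset.add_edge("friendship", Edge("alice", "bob", "friend", {"since": 2003}))

print([n.node_id for n in dataset.nodes_of_type("person")])  # ['alice', 'bob']
```

Dynamic network data could be handled in this model by keeping one named graph per time slice, though that choice is also an assumption.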
Proposed additional features • Alternative forms of data, XML and otherwise, should be cleanly embeddable within the data interchange format to enable extension and adoption • The number of elements and attributes should be minimal to ease understanding, ideally just resource identifiers, resource data, and related resources (i.e. metadata) • The general meaning of a data set should not require a background in graph theory • Built-in data interchange access control
Some XML vocabularies for possible use by Social Software • DyNetML • XFN • Feed variants (Attention, RSS, Atom) • OWL/RDF • XDI
DyNetML • Produced at CMU by Maksim Tsvetovat, Jeff Reminga, Kathleen M. Carley • Advantages: • Very rich representation of social network as graph • Very helpful for Social network analysis, easy conversion to other SNA formats (UCINET DL) • Disadvantages: • Primarily for Social Network Analysis, unwanted overhead for Social Network • Syntax is geared to time capture of SN metrics • Very close to XML representation of a generic graph, requiring some graph theory background
XFN • A lightweight method of annotating links to indicate a personal relationship with the person responsible for the linked resource • Ex: <a href="http://jeff.example.org" rel="friend met"> • Advantages: • Easily understood; it uses the existing rel attribute • Easily extended • Easy to embed within existing web pages • Disadvantages: • Relationship data only • XLink with fewer features and so less complexity • What about non-HTML?
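Because XFN lives entirely in the rel attribute of ordinary links, reading it needs nothing more than an HTML parser. The sketch below uses Python's standard `html.parser` module and reuses the example link from the slide; the class and function names are illustrative, and the code does not check the rel values against the XFN value list.

```python
from html.parser import HTMLParser


class XFNLinkParser(HTMLParser):
    """Collects (href, [rel values]) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.relationships = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        rel = attr_map.get("rel")
        if href and rel:
            # rel holds space-separated values, e.g. "friend met".
            self.relationships.append((href, rel.split()))


def extract_xfn(html_text):
    parser = XFNLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.relationships


sample = '<p><a href="http://jeff.example.org" rel="friend met">Jeff</a></p>'
print(extract_xfn(sample))  # [('http://jeff.example.org', ['friend', 'met'])]
```

The result is relationship data only, which matches the first disadvantage listed above: nothing about the person or the resource beyond the link itself is captured.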
Reviewing feed variants • The variants: RSS, Attention, Atom • Advantages: • Widespread adoption and dual support for RSS/Atom • Format easily understood and parsed • Clear winner in tracking/publishing streams of discrete items, ex: blogs • Disadvantages: • Designed for data syndication; work would be needed to use them in all aspects of an online community • Fractured formats, varying tool compliance • <link> element is weak for expressing rich relationships
Reviewing RDF • Resource Description Framework (not necessarily XML) • Advantages: • Rich metadata capture • Only four underlying concepts (Resource, Property, Statement, XML statement expression) • Large body of work (FOAF, etc) • Disadvantages: • Implementation of RDF is complex, not understood well by many • Syntax is verbose, and can be difficult for humans to read • Too many ways of encoding the same network/graph
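The statement model behind RDF can be illustrated without any RDF tooling: each statement is a (resource, property, value) triple. The sketch below uses plain Python tuples rather than an RDF library, so it shows the idea only and not a conforming implementation. The FOAF namespace URI is the real one; the example.org people and the helper names are made up.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Each statement is a (subject, predicate, object) triple.
triples = [
    ("http://example.org/alice", FOAF + "name", "Alice"),
    ("http://example.org/alice", FOAF + "knows", "http://example.org/bob"),
    ("http://example.org/bob", FOAF + "name", "Bob"),
    ("http://example.org/bob", FOAF + "knows", "http://example.org/carol"),
]


def objects(statements, subject, predicate):
    """All objects of statements matching the given subject and predicate."""
    return [o for s, p, o in statements if s == subject and p == predicate]


def knows_graph(statements):
    """Adjacency map built from foaf:knows statements only."""
    graph = {}
    for s, p, o in statements:
        if p == FOAF + "knows":
            graph.setdefault(s, set()).add(o)
    return graph


print(objects(triples, "http://example.org/alice", FOAF + "name"))  # ['Alice']
print(knows_graph(triples)["http://example.org/bob"])  # {'http://example.org/carol'}
```

The same four statements could be serialized in several different RDF syntaxes, which is the "too many ways of encoding the same network/graph" concern noted above.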
Reviewing XDI • XDI: OASIS XDI TC effort to create a secure distributed data access protocol using Extensible Resource Identifiers (XRIs). The XRI specification is architected by the OASIS XRI TC. • Advantages: • Rich access control on traversal of links between resources (link contracts) • Simple core concepts and implementation: XRIs, linked resources, and data, so the format is easily understood • Relationships also can be expressed as triples • Meets all our requirements • Disadvantages: • New, not much to build on yet
What we didn't choose and why • Note: They are all useful within an online community, but our goal was to pick the one or two that are best for general use within our online community. The reasons for this are many, primarily to keep the design clean and to prevent duplication of effort, since there is significant overlap in purpose. • DyNetML: Didn't meet our additional requirements, and too formal for easy understanding • XFN: Links only, HTML specific • Feed variants: Definitely needed in our online community, but for syndication, not general use • RDF: Too complex for general use by itself in our use case, but ideal for semantically describing data.
What we did choose • XDI + RDF • FOAF RDF for... • data about community members • simple interpersonal relationships • XDI for... • data about the community • Access control contracts • linking FOAF data by embedding within XDI • Sharing personal data that would require non-standard FOAF properties • Extensibility through data sharing dictionaries
How XDI/RDF can help community membership • A key point in XDI is that data is consumed from its natural home. • You can view community participation as a set of data interchanges: participating in a chat, sharing your real name and an email address, sharing the information necessary to deduct payments, etc. • Joining an XDI-enabled community then becomes a matter of negotiating data sharing control data (links) • Groups can act as individuals in these negotiations, allowing two groups to establish data sharing contracts
How XDI/RDF can help form communities of trust • Because XDI has fine-grained control over data sharing, you could specify different privileges to shared data on an individual or a group basis • XDI is based on XRI, the OASIS specification for uniform abstract identifiers. Persistent XRIs enable long-term trust relationships, including reputation mechanisms that can operate within and across group and community boundaries. • Control of data is never relinquished by the authority on that data
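The idea of granting different privileges to individuals and to groups can be sketched as a small data model. This is not XDI syntax and not a link contract format from the XDI TC; `SharingContract`, `read_profile`, and the identifiers `=alice`, `=bob`, and `@book-club` are all hypothetical, chosen only to resemble XRI-style names.

```python
from dataclasses import dataclass, field


@dataclass
class SharingContract:
    """Hypothetical stand-in for a data sharing contract held by the data's owner."""
    owner: str
    # grantee identifier (person or group) -> set of readable field names
    grants: dict = field(default_factory=dict)

    def grant(self, grantee, *field_names):
        self.grants.setdefault(grantee, set()).update(field_names)

    def readable_fields(self, requester, groups=()):
        # A requester gets their own grants plus those of every group they belong to.
        allowed = set(self.grants.get(requester, set()))
        for group in groups:
            allowed |= self.grants.get(group, set())
        return allowed


def read_profile(profile, contract, requester, groups=()):
    """Return only the fields the contract lets this requester see."""
    allowed = contract.readable_fields(requester, groups)
    return {key: value for key, value in profile.items() if key in allowed}


profile = {"nickname": "al", "real_name": "Alice Example", "email": "alice@example.org"}
contract = SharingContract(owner="=alice")
contract.grant("@book-club", "nickname")            # group-level privilege
contract.grant("=bob", "real_name", "email")        # individual privilege

print(read_profile(profile, contract, "=carol", groups=["@book-club"]))
# {'nickname': 'al'}
```

In this sketch the profile never leaves the owner's side unfiltered, which mirrors the point that control of data stays with the authority on that data.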
How XDI/RDF can help build Social Bridges • Using XRIs and secure data sharing, online communities can, if members allow, automatically share information with other trusted groups, spontaneously forming communities of interest from users' FOAF descriptions of themselves • Endorsements and reputation information can be propagated through trusted groups, creating community-chosen expert individuals and expert groups • Since groups can act as individuals with their own identity, a group can create trusted data sharing links with other groups as easily as individuals create these links between themselves
Where to go for more information on XDI • OASIS TC – The XDI standards committee site: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xdi • OASIS TC – The XRI standards committee site: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xri • “The DataWeb: An Introduction to XDI”: http://www.oasis-open.org/committees/download.php/6434/wd-xdi-intro-white-paper-2004-04-12.pdf • “The Social Web: Creating An Open Social Network with XDI”: http://journal.planetwork.net/article.php?lab=reed0704
See XDI in Action • The XDI TC will be holding a general presentation and demonstration for other OASIS members and staff at 1 pm (the second hour of lunch) on Wednesday.