Folksonomies in Publishing. Just Good Enough for Lots of Things June 19, 2008 Steve Carton. Agenda. What’s the Problem? Common Finding Aids – Indexes, Taxonomies, Thesauri. Folksonomies. A Specific Example How to get there. Who Are We?. Retrieval Systems Corporation
Folksonomies in Publishing: Just Good Enough for Lots of Things • June 19, 2008 • Steve Carton
Agenda • What’s the Problem? • Common Finding Aids – Indexes, Taxonomies, Thesauri. • Folksonomies. • A Specific Example • How to get there.
Who Are We? • Retrieval Systems Corporation • Content Management Systems, • Search Systems • Managing and delivering complex information. • Working with publishers to stay ahead in a tough market.
Problem Statement • The quantity of information has grown exponentially. • Publishing now includes anyone putting information out on the internet. • Content is a thing to be managed with buzzwords like single-sourcing, XML, multimedia, syndication, blogs, wikis, and includes books, documentation, audio, video and thought streams. • Finding the right bit of information can be a nightmare. • Search engines provide keyword access to text-based content. • Users want more accuracy. • Publishers are cutting costs. • Finding aids help differentiate publishers in similar markets. • Classifications, taxonomies and indexing come with arcane rules and procedures. • Maintenance is a big effort, sometimes larger than the content management effort.
Traditional Finding Aids • Taxonomies, Indexes, Thesauri • Specialized Training • Costly • Often ineffective for diverse user populations.
Traditional Finding Aids: Indexes, Taxonomies, Thesauri, Oh My!
Folksonomies to the Rescue • Folksonomies offer a new approach to an old problem -- how to capture the "aboutness" of content.
Folksonomies (Huh?) • “Thomas Vander Wal” • Web 2.0 Construct • Social Networking • Del.icio.us, Blogs, Tag Clouds, etc. • Why not publishing? • Folksonomies are a “Tom Sawyer” approach to classification of content!
How Do They Work? • Content is exposed to a large group of users. • Typically through a browser • Tagging plugins (Del.icio.us, others) • Users tag content.
So What Do We Get? • Content is indexed. • Costs are reduced. • The larger user community gets an index-based finding aid.
Is There a Downside? • Needs a lot of indexers/users. • Reduced precision. • Lack of control. • Or, at least, lack of perceived control!
The Whim of the Users? • Seems like a bad idea. • But… • We do retain control. • Studies show a trending pattern of tags. • And we get more perspective. • People of different backgrounds and experience see content in different ways.
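One way to picture "we do retain control" is a simple consensus filter: a tag is accepted into the index only once enough of the taggers agree on it. The sketch below is an illustration, not something from the talk. The function name `consensus_tags` and the 30% threshold are assumptions, and the counts are borrowed from the tag example later in this deck.

```python
def consensus_tags(tag_counts, min_share=0.3):
    """Keep only tags whose share of all taggings reaches min_share.

    min_share is an assumed editorial threshold, not a value from the talk.
    """
    total = sum(tag_counts.values())
    if total == 0:
        return []
    # Sort the survivors alphabetically so the output is stable.
    return sorted(t for t, n in tag_counts.items() if n / total >= min_share)


# Counts taken from the "Collect Tags" slide later in the deck.
counts = {"AntiEnvironment": 2, "ProEnvironment": 2, "WhoCares": 1}
accepted = consensus_tags(counts)
print(accepted)
```

With these counts the one-off "WhoCares" tag falls below the threshold, while both environment tags survive, so the publisher still sees the difference in perspective without indexing every stray tag.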
So, Let’s Create An Example • Voting Records: • Roll call from the Clerk of the House • Bill text from the Library of Congress • All available in XML. • Congress-person’s stand on an issue. • To get this, we need to index the bills by issue.
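The join this slide describes, votes on one side and an issue index on the other, can be sketched roughly as follows. Everything in the data is hypothetical: the bill ids `BILL-A` to `BILL-C`, the votes, and the issue labels are placeholders, not real roll-call records, and the helper `stand` is an assumed name. Real input would come from the Clerk's roll-call XML and the tagged bills.

```python
from collections import Counter

# Hypothetical issue index: bill id -> issue label chosen by taggers.
BILL_ISSUE = {
    "BILL-A": "ProEnvironment",
    "BILL-B": "AntiEnvironment",
    "BILL-C": "ProEnvironment",
}

# Hypothetical roll-call votes for one member: bill id -> "Y" or "N".
VOTES = {"BILL-A": "Y", "BILL-B": "Y", "BILL-C": "N", "BILL-D": "Y"}


def stand(votes, bill_issue):
    """Tally how a member voted on bills carrying each issue label.

    A "Y" counts as voting for a bill with that label, anything else
    as voting against it. Bills nobody has tagged yet are skipped.
    """
    tally = Counter()
    for bill, vote in votes.items():
        issue = bill_issue.get(bill)
        if issue is None:
            continue  # not indexed by issue yet
        tally[(issue, "for" if vote == "Y" else "against")] += 1
    return dict(tally)


result = stand(VOTES, BILL_ISSUE)
for (issue, side), n in sorted(result.items()):
    print(f"{issue} {side}: {n}")
```

The tally is only as good as the issue index behind it, which is why the bills need to be indexed by issue first.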
Is this “ProEnvironment” or “AntiEnvironment”? • [Figure: Y and N votes by Steny Hoyer (D, Md. 5th District) on Roll Calls 82, 83 and 84 for H.R.5351, Renewable Energy and Energy Conservation Tax Act of 2008]
Tagging • Use browser plugin to tag • Collect tags and results • Manage Tagging
Collect Tags
<tags>
  <tag count="2" tag="AntiEnvironment"/>
  <tag count="2" tag="ProEnvironment"/>
  <tag count="1" tag="WhoCares"/>
</tags>
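A minimal sketch of reading the tag summary above into counts, using Python's standard `xml.etree.ElementTree`. The function name `tag_counts` is an assumption for illustration. The XML is copied from the slide.

```python
import xml.etree.ElementTree as ET

# XML copied from the slide above.
TAGS_XML = """<tags>
  <tag count="2" tag="AntiEnvironment"/>
  <tag count="2" tag="ProEnvironment"/>
  <tag count="1" tag="WhoCares"/>
</tags>"""


def tag_counts(xml_text):
    """Return {tag name: count} from a <tags> document."""
    root = ET.fromstring(xml_text)
    return {t.get("tag"): int(t.get("count")) for t in root.iter("tag")}


counts = tag_counts(TAGS_XML)
# Most-used tags first; ties are broken alphabetically for stable output.
ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
for name, n in ranked:
    print(f"{name}: {n}")
```

Ranking by count is the raw material for a tag cloud, or for spotting which tags the user community is converging on.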
Retrieve Tagged Content
<posts update="2008-06-19T07:42:30Z" user="steve.carton">
  <post href="http://thomas.loc.gov/cgi-bin/bdquery/z?d110:h.r.01834:" description="H.R.1834" extended="To authorize the national ocean exploration program and the national undersea research program within the National Oceanic and Atmospheric Administration." hash="94493a3003396950f8d4775acc4d42d5" tag="WhoCares ProEnvironment" time="2008-06-19T07:41:17Z"/>
  <post href="http://thomas.loc.gov/cgi-bin/bdquery/z?d110:h.r.00816:" description="H.R.816" extended="To provide for the release of certain land from the Sunrise Mountain Instant Study Area in the State of Nevada and to grant a right-of-way across the released land for the construction and maintenance of a flood control project." hash="61e41b94ac36555eb117f47bbd27a70d" tag="AntiEnvironment" time="2008-06-19T07:35:01Z"/>
  <post href="http://thomas.loc.gov/cgi-bin/bdquery/z?d110:h.r.05351:" description="H.R.5351" extended="To amend the Internal Revenue Code of 1986 to provide tax incentives for the production of renewable energy and energy conservation." hash="831f8e9d15c29a5e90d83aae4558efbc" tag="ProEnvironment AntiEnvironment" time="2008-06-16T21:42:24Z"/>
</posts>
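Turning the tagged posts above into a finding aid amounts to inverting them: from "bill, with its tags" to "tag, with its bills". The sketch below assumes the response has already been fetched and uses an abbreviated copy of it (the `extended` and `hash` attributes are left out for brevity). The function name `index_by_tag` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Abbreviated copy of the <posts> response above.
POSTS_XML = """<posts update="2008-06-19T07:42:30Z" user="steve.carton">
  <post href="http://thomas.loc.gov/cgi-bin/bdquery/z?d110:h.r.01834:"
        description="H.R.1834" tag="WhoCares ProEnvironment"/>
  <post href="http://thomas.loc.gov/cgi-bin/bdquery/z?d110:h.r.00816:"
        description="H.R.816" tag="AntiEnvironment"/>
  <post href="http://thomas.loc.gov/cgi-bin/bdquery/z?d110:h.r.05351:"
        description="H.R.5351" tag="ProEnvironment AntiEnvironment"/>
</posts>"""


def index_by_tag(xml_text):
    """Build an inverted index: tag -> list of bill descriptions."""
    index = defaultdict(list)
    for post in ET.fromstring(xml_text).iter("post"):
        # The tag attribute holds space-separated tags.
        for tag in post.get("tag", "").split():
            index[tag].append(post.get("description"))
    return dict(index)


index = index_by_tag(POSTS_XML)
# Bills tagged both ways are the ones users disagree about.
contested = set(index["ProEnvironment"]) & set(index["AntiEnvironment"])
print(index)
print(contested)
```

H.R.5351 carries both labels in the data above, so it turns up as contested. That is the lack of precision the talk warns about, and also the extra perspective it promises.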
Ways to be Tom Sawyer • Use Del.icio.us or another tool. • Expose content to users, perhaps with incentives for indexing. • Work within the publishing house, especially if it’s large. • Amazon’s Mechanical Turk.
In Conclusion • Users are demanding more accurate content location. • Publishers need better finding aids to separate themselves in the market. • The costs of traditional approaches are increasing, and so are the demands on their accuracy. • Folksonomies offer a new and different approach. • Exchange control and precision for volume and diversity. • The trick is getting a volume of users.
Thanks To • The Library of Congress Thomas: http://thomas.loc.gov/ • Office of the Clerk, U.S. House of Representatives: http://clerk.house.gov/ • Del.icio.us: http://del.icio.us