Bits about Bits: Bitzi and Open, Cooperative Metadata. Gordon Mohr, Bitzi Corporation, Founder
2. Bits about Bits: Bitzi and the Business of Metadata Gordon Mohr
Bitzi Corporation
Founder & Chief Technology Officer
September 17, 2001
3. Bits about Bits: Bitzi and Open, Cooperative Metadata Gordon Mohr
Bitzi Corporation
Founder & Chief Technology Officer
November 7, 2001
4. Overview P2P File Sharing: “a cornucopia without confidence”
Four Missing Ingredients
The Bitzi Approach
Demos
Future Directions
Could metadata be a big business?
5. Everything is now “Bits”… Anything can be encoded, stored, shifted, shared
The “cloud” is coming to include everything
Tech and social trends are against strict control
6. No Confidence or Context You can get anything imaginable, BUT…
Is it complete? Where did it originate?
Has it been damaged or altered?
Is this the best or current instance?
What’s related? Is it legitimate?
What should I seek next?
Current ad hoc & P2P sharing/distribution nets inherently blur these issues
Filename-centric
Mr. Short-Term Memory
7. What’s Missing? We’re craving four things:
Reliable Names
Nothing can masquerade as something else
Easy to ask for exactly the right thing
Rich Metadata
Beyond just “filename” and “length”
Easy Access
Everywhere the files are, and then some
A Consensus View
Eliminate frivolous skew of understanding
8. We Want: Reliable Names Does a file have a “True Name”?
Yes, via Cryptographic Hashes
Essentially, these are “digital fingerprints”
Any-sized input (any digital file) to fixed-sized output (hash value)
Deterministic but “unpredictable”
Infeasible to create an input with a specific desired hash value
Infeasible to find two inputs with same hash value
Examples:
MD5 (but maybe not as reliably as once thought)
SHA1 (and now SHA256, SHA512)
Tiger
RIPEMD160
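The "digital fingerprint" idea on this slide can be illustrated with a minimal Python sketch. It uses only the standard `hashlib` module; the helper name `fingerprint` and the sample inputs are illustrative assumptions, not part of the presentation.

```python
import hashlib
import io


def fingerprint(stream, algorithm="sha1", chunk_size=64 * 1024):
    """Return the hex digest of everything readable from a binary stream.

    `fingerprint` is a hypothetical helper name used only for this sketch.
    """
    digest = hashlib.new(algorithm)
    # Read in chunks so an input of any size is hashed in constant memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    original = b"abc"
    altered = b"abd"  # a one-byte change to the input
    # Same input always yields the same fixed-size value (deterministic).
    print(fingerprint(io.BytesIO(original)))
    # A tiny change yields an unrelated-looking value ("unpredictable").
    print(fingerprint(io.BytesIO(altered)))
    # Other algorithms produce longer fixed-size outputs.
    print(fingerprint(io.BytesIO(original), "sha256"))
```

For a real file, the same function could be called with a handle opened in binary mode, for example `open(path, "rb")`.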
9. We Want: Rich Metadata Metadata is “Data about other Data”
“Filename” and “Length” are a trivial start
Intrinsic or extrinsic to file itself
Examples
Generic: Origin, Free-form description, Comments, Community Ratings
Format-specific: Encoding parameters, Resolution, Playback length
Growing body of useful standards and conventions
XML, RDF, Dublin Core, domain-specific proposals
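As a rough illustration of going beyond "filename" and "length", the sketch below serializes a few generic fields as XML using Dublin Core element names. The Dublin Core namespace URI is the standard one; the `metadata` wrapper element, the `describe` helper, and the sample values are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the Dublin Core element set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def describe(fields):
    """Serialize a dict of Dublin Core element names to an XML string.

    The <metadata> wrapper element is a hypothetical container chosen
    for this sketch, not a format defined in the presentation.
    """
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("metadata")
    for name, value in fields.items():
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    print(describe({
        "title": "Example recording",
        "format": "audio/mpeg",
        "description": "Free-form description supplied by a contributor",
    }))
```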
10. We Want: Easy Access Ubiquity
Anywhere the files are – and where they’re not
Simplicity
Familiar interfaces
Reliability
Canonical location
Redundant Mirrors
Multiple paths – same paths as files
11. We Want: A Consensus View Avoid redundant efforts
Achieve convergence on simple issues
Trivial disagreements and mistakes should be quickly and permanently resolved
Robustness against casual mischief
Capture and highlight enduring disagreements
Even arbitrary commonality is valuable
Naming systems
A central “reference point” is the easy solution
12. The File Trust Utility
13. The Bitzi Approach A metadata aggregator, consisting of…
Website
Community of contributors
Editorial/rating policies
Canonical datastore
Web service
Free access and reuse
Just give us attribution
Other restrictions only get in the way
Our long-term role: stewardship
We live or die by the usefulness of the dataset
14. Sources of Inspiration Open Directory Project
AKA NewHoo, GnuHoo, DMoz(illa)
Volunteer-built Yahoo-like categorical web index
CD/Music projects
CDDB (before dataset lockdown)
FreeDB & MusicBrainz (since)
Oxford English Dictionary
“The Professor and the Madman”
Napster et al.
De facto quality filtering
Usenet (esp. FAQs), Epinions, Amazon reviews, eBay, Zagat’s
15. How Bitzi Works: Bitprints & Tickets
16. How Bitzi Works: Tickets Out
17. How Bitzi Works: Tech Details Our “Bitprint”
Master key into our catalog
Concatenation of two nonproprietary hashes
SHA1: safe, standard
TigerTree: different basis, range benefits
Robustness against research breakthroughs
Our data model & terminology
Bitprints may be “tagged”
Tags are arbitrary XML blobs
Growing set of types
Usually coercible into a database row or RDF
Tags “compete” with each other as necessary
“Tickets” are created from the best tags
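The data model above (tags attached to a bitprint, competing tags, and a ticket built from the best ones) might be sketched as follows. This is only one possible reading: the `Tag` fields, the numeric `score`, and the "highest score wins" rule are assumptions for illustration, since the slides do not say how competition is resolved.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """One claim about a file (hypothetical structure for this sketch)."""
    bitprint: str  # which file the claim is about
    name: str      # what is being claimed, e.g. "title"
    value: str     # the claimed value
    score: int     # assumed net community rating of this claim


def build_ticket(bitprint, tags):
    """Pick the best-scored value per name for one bitprint."""
    candidates = defaultdict(list)
    for tag in tags:
        if tag.bitprint == bitprint:
            candidates[tag.name].append(tag)
    ticket = {"bitprint": bitprint}
    for name, rivals in candidates.items():
        # Competing tags: the highest score wins; ties keep the earliest.
        ticket[name] = max(rivals, key=lambda t: t.score).value
    return ticket


if __name__ == "__main__":
    # Placeholder bitprints and values, invented for illustration.
    tags = [
        Tag("BP1", "title", "Song A", 5),
        Tag("BP1", "title", "Sng A", -2),
        Tag("BP1", "format", "audio/mpeg", 1),
        Tag("BP2", "title", "Other file", 9),
    ]
    print(build_ticket("BP1", tags))
```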
18. How Bitzi Works: Current tools
Data collection
Downloadable “Bitcollider” utility
Windows & Linux
Free source code
Calculates bitprint, extracts some intrinsic tags
Web forms
Viewing/rating/searching
All at our website
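What a bitprinting tool does (hash the file two ways, join the results, and record some intrinsic facts) can be sketched roughly as below. This is not the Bitcollider code. Python's standard library has no Tiger implementation, so SHA-256 stands in for the tree hash; the 1024-byte leaf size, the leaf and node prefix bytes, the Base32 encoding, and the "." separator are all assumptions made for this sketch.

```python
import base64
import hashlib

BLOCK_SIZE = 1024  # assumed leaf size for the tree hash


def _tree_root(data, hash_name):
    """Root of a hash tree over fixed-size blocks of `data`."""
    # Leaves and internal nodes get different prefixes so that a leaf
    # can never be confused with an internal node (assumed convention).
    leaves = [
        hashlib.new(hash_name, b"\x00" + data[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]
    if not leaves:
        leaves = [hashlib.new(hash_name, b"\x00").digest()]
    level = leaves
    while len(level) > 1:
        paired = []
        for i in range(0, len(level) - 1, 2):
            combined = b"\x01" + level[i] + level[i + 1]
            paired.append(hashlib.new(hash_name, combined).digest())
        if len(level) % 2:
            paired.append(level[-1])  # an unpaired node is promoted as-is
        level = paired
    return level[0]


def _b32(raw):
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def bitprint(data, tree_hash="sha256"):
    """SHA-1 joined with a tree-hash root (SHA-256 standing in for Tiger)."""
    return f"{_b32(hashlib.sha1(data).digest())}.{_b32(_tree_root(data, tree_hash))}"


def intrinsic_tags(data):
    """Facts derivable from the bytes alone (hypothetical tag names)."""
    return {"bitprint": bitprint(data), "length": len(data)}


if __name__ == "__main__":
    print(intrinsic_tags(b"example file contents" * 500))
```

Because the second half is a tree rather than a flat hash, sub-ranges of a file can in principle be checked against intermediate nodes, which is one reading of the "range benefits" mentioned for TigerTree.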
19. How Bitzi Works: Open Code & Data
Bitcollider & bitprinting code available
Public Domain
C & Java
Free dataset access: “OpenBits”
Draft OpenBits License based on Open Directory Project license
Preliminary RDF dump available
http://preview.openbits.org
Eventually, at the Ticket granularity
20. Using Bitzi On your desktop:
Identify anything you’ve got – including possible problems, newer versions, etc.
At our website:
Find interesting potential new things to get – in context, presented alongside other options
In other applications, devices, websites:
Identify what’s playing
Choose between offered options
Organize/correct your collection
Much more… ?
21. Demos Bitzi Bitcollider
Desktop utility
LimeWire
Evaluate search results before downloading
WinAmp
See more about “what’s playing”
Bitzi Website
Search for new items of interest
22. Future: Greater Integration
Standard, generic “get” facility
We expect: single-click from Ticket asks multiple applications to locate matching file
Ticket info inside applications
Get Ticket direct from Bitzi, or elsewhere
Verify Ticket validity (cryptographically signed)
Display as locally appropriate
23. Future: Website and Community
Enhanced search
Improved rating and peer-review processes
Browsing/Categorization
Automatic and manual
Dataset mining
Variety of rankings
24. Is this a Business? Not all Tickets are (or should be) equal
Fuzzy vs. guaranteed trust
Community vs. promotional info
Attention is always scarce
Some special inserts will cost
Someone always needs to be found & trusted
Users benefit:
Fees subsidize verification procedures
Prices self-select for appropriateness
Has anyone succeeded with “free lookups, paid inserts”? (Yes; examples should be obvious)
25. The End Gordon Mohr
Founder & Chief Technology Officer
Bitzi Corporation
Email gojomo@bitzi.com
Bitizen Page http://bitzi.com/bitizen/gojomo
O’Reilly Weblog http://www.oreillynet.com/weblogs/gojomo