libwww - The W3C Protocol Library
„Großes Schwerpunktseminar WI“
University of Applied Sciences Gießen-Friedberg
Stefan Sabatzki
Libwww, the W3C protocol library 29.06.2004
Contents
• Introduction
• Structure libwww
• Programming with libwww
• Conclusion
Contents
• Introduction
  • What is libwww?
  • Why libwww?
• Structure libwww
• Programming with libwww
• Conclusion
What is libwww?
• Generic framework for building web applications
• Written in C
• Pluggable modularity
• Provides the most common Internet access methods
• Transmits data in many different media formats
• Data flow to and from the server
What is libwww? (2)
• First version implemented in 1992 by Tim Berners-Lee
• Development at CERN
• 1994: libwww moved from CERN to W3C
• 1998: released as open source
• September 2003: W3C stopped work on libwww
• January 2004: libwww officially belongs to the "open source community"
Why libwww?
• Experimenting and prototyping
• Performance, modularity and extensibility
• Free and open source code
• Mailing lists and active community
Contents
• Introduction
• Structure libwww
  • Design Model
  • Request/Response Paradigm
  • Data Flow
  • Threads, Event Loops and Filters
  • Modules as State Machines
• Programming with libwww
• Conclusion
Design Model
• Layering as design model
Design Model (2)
• A more illustrative view
Request/Response Paradigm
• Application issues a request
• libwww fulfills the request
• Result is presented to the application on arrival
• Simultaneous requests are handled by the library core
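The request/response paradigm can be sketched in a few lines of plain C. This is a minimal illustration, not the libwww API: `Core`, `Request`, `core_issue`, `core_run` and `count_response` are hypothetical names, and the "response" is a stub string instead of real network I/O. It only shows the division of labour: the application issues requests and returns immediately, and the core later presents each result through a callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the libwww API. */
typedef void (*ResponseHandler)(const char *url, const char *body, void *ctx);

typedef struct {
    const char     *url;      /* what the application asked for */
    ResponseHandler on_done;  /* called when the response "arrives" */
    void           *ctx;      /* opaque application data */
} Request;

#define MAX_PENDING 8

typedef struct {
    Request pending[MAX_PENDING];
    size_t  count;
} Core;

/* Application side: issue a request and return immediately.
   Returns 0 on success, -1 if the queue is full. */
static int core_issue(Core *core, const char *url, ResponseHandler h, void *ctx)
{
    if (core->count == MAX_PENDING) return -1;
    core->pending[core->count++] = (Request){ url, h, ctx };
    return 0;
}

/* Library side: fulfil every pending request and present each result
   to the application through its handler. Returns the number handled. */
static size_t core_run(Core *core)
{
    size_t handled = 0;
    for (size_t i = 0; i < core->count; i++) {
        Request *r = &core->pending[i];
        r->on_done(r->url, "stub response", r->ctx);  /* no real network I/O */
        handled++;
    }
    core->count = 0;
    return handled;
}

/* Example handler: counts responses in an int supplied via ctx. */
static void count_response(const char *url, const char *body, void *ctx)
{
    (void)url;
    (void)body;
    ++*(int *)ctx;
}
```

An application would zero-initialise a `Core`, call `core_issue` once per URL, and then call `core_run`; several requests can be pending at the same time, which is the point of the last bullet above.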
Data Flow
• Streams are used to transport data
• Derived from generic stream
  • Protocol streams
  • Converters
  • Presenters
  • I/O streams
  • Basic streams
Data Flow (2)
• Structured streams
  • Derived from generic stream
  • Accept a structured document
  • Ordered, tree-structured arrangement of data
  • Each instance is associated with an SGML parser
  • Each instance is associated with the corresponding DTD
Data Flow (3)
• Cascaded streams
• Stream chains
• Setup before data arrives
Data Flow (4)
• Setup after data arrives
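A stream chain can be sketched as follows, assuming a simplified generic stream with a single `put` entry point. The `Stream` struct, `upper_put`, `sink_put` and `chain_init` are hypothetical names chosen for this illustration and do not reproduce the libwww stream interface. The chain here (converter, then sink) is set up before any data arrives; setting it up after data arrives would mean choosing the downstream stage only once the first block has been inspected.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic stream: every stage exposes the same put() entry
   point and forwards to an optional downstream stage. Not the libwww API. */
typedef struct Stream Stream;
struct Stream {
    void  (*put)(Stream *self, const char *data, size_t len);
    Stream *next;      /* downstream stage, or NULL for a sink */
    char    buf[64];   /* used only by the sink stage */
    size_t  used;
};

/* Converter stage: upper-cases each block (at most 64 bytes of it)
   and passes the result downstream. */
static void upper_put(Stream *self, const char *data, size_t len)
{
    char tmp[64];
    if (len > sizeof tmp) len = sizeof tmp;
    for (size_t i = 0; i < len; i++)
        tmp[i] = (char)toupper((unsigned char)data[i]);
    if (self->next) self->next->put(self->next, tmp, len);
}

/* Sink stage: appends to a fixed buffer, truncating on overflow. */
static void sink_put(Stream *self, const char *data, size_t len)
{
    size_t room = sizeof self->buf - 1 - self->used;
    if (len > room) len = room;
    memcpy(self->buf + self->used, data, len);
    self->used += len;
    self->buf[self->used] = '\0';
}

/* Build the chain converter -> sink before any data is pushed. */
static void chain_init(Stream *conv, Stream *sink)
{
    memset(sink, 0, sizeof *sink);
    memset(conv, 0, sizeof *conv);
    sink->put = sink_put;
    conv->put = upper_put;
    conv->next = sink;
}
```

The caller only ever talks to the head of the chain (`conv.put(&conv, data, len)`); each stage decides on its own what to pass on, so further converters can be cascaded without changing the producer.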
Threads, Event Loops and Filters
• Not thread-safe
• Implements a pseudo-thread model
  • Uses non-blocking sockets
  • Based on callback functions
• Before/after filters
  • Global and local filters
  • Registered at runtime
Threads, Event Loops and Filters (2)
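The before/after filter idea can be illustrated with a small registry of callbacks. Everything below is an assumption made for the sketch: the `Filter` signature, the `FilterSet` type, the helper names, and in particular the execution order (global before, local before, request, local after, global after) are illustrative and not taken from libwww.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filter registry; names and signatures are illustrative. */
typedef int (*Filter)(const char *url, void *ctx);  /* 0 = continue */

#define MAX_FILTERS 4

typedef struct {
    Filter before[MAX_FILTERS];
    size_t n_before;
    Filter after[MAX_FILTERS];
    size_t n_after;
} FilterSet;

/* Register a filter at runtime; returns -1 when the list is full. */
static int filter_add(Filter *list, size_t *n, Filter f)
{
    if (*n == MAX_FILTERS) return -1;
    list[(*n)++] = f;
    return 0;
}

/* Run filters in registration order; stop at the first non-zero result. */
static int run_list(const Filter *list, size_t n, const char *url, void *ctx)
{
    for (size_t i = 0; i < n; i++) {
        int rc = list[i](url, ctx);
        if (rc != 0) return rc;
    }
    return 0;
}

/* Assumed order: global before, local before, request, local after,
   global after. A failing before filter cancels the request. */
static int run_request(const FilterSet *global, const FilterSet *local,
                       const char *url, void *ctx)
{
    int rc;
    if ((rc = run_list(global->before, global->n_before, url, ctx)) != 0) return rc;
    if ((rc = run_list(local->before, local->n_before, url, ctx)) != 0) return rc;
    /* ... the actual request would be carried out here ... */
    if ((rc = run_list(local->after, local->n_after, url, ctx)) != 0) return rc;
    return run_list(global->after, global->n_after, url, ctx);
}

/* Example filters: append one tag character to a trace string passed
   via ctx (the caller must provide a large enough buffer). */
static void trace(void *ctx, char tag)
{
    char *t = ctx;
    size_t n = strlen(t);
    t[n] = tag;
    t[n + 1] = '\0';
}
static int global_before(const char *url, void *ctx) { (void)url; trace(ctx, 'G'); return 0; }
static int local_before(const char *url, void *ctx)  { (void)url; trace(ctx, 'l'); return 0; }
static int local_after(const char *url, void *ctx)   { (void)url; trace(ctx, 'a'); return 0; }
static int reject(const char *url, void *ctx)        { (void)url; (void)ctx; return -1; }
```

Typical uses for such hooks are logging, redirection or authentication checks: a before filter such as `reject` can veto a request without the request code knowing about it.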
Modules as State Machines
• Since libwww 3.0
• Protocol modules are implemented as state machines
• Part of the thread model
• Keep track of the current state of the communication interface
Modules as State Machines (2)
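Why a state machine fits a pseudo-thread model can be seen in a short sketch: the module does one step per event, stores where it is, and returns control to the event loop instead of blocking. The states, events and the `machine_step` function below are invented for this illustration and do not correspond to the states of any real libwww protocol module.

```c
#include <assert.h>

/* Hypothetical states and events for a request; not libwww's own. */
typedef enum { ST_BEGIN, ST_CONNECTED, ST_SENT, ST_DONE, ST_ERROR } State;
typedef enum { EV_CONNECT_OK, EV_WRITE_OK, EV_READ_EOF, EV_FAIL } Event;

typedef struct {
    State state;  /* where the module currently is */
    int   steps;  /* how many events it has been given */
} Machine;

/* Handle exactly one event and return the new state. The function never
   blocks: between calls, all progress is remembered in *m. */
static State machine_step(Machine *m, Event ev)
{
    m->steps++;
    if (m->state == ST_DONE || m->state == ST_ERROR)
        return m->state;                 /* terminal states are sticky */
    if (ev == EV_FAIL) {
        m->state = ST_ERROR;
        return m->state;
    }
    switch (m->state) {
    case ST_BEGIN:     if (ev == EV_CONNECT_OK) m->state = ST_CONNECTED; break;
    case ST_CONNECTED: if (ev == EV_WRITE_OK)   m->state = ST_SENT;      break;
    case ST_SENT:      if (ev == EV_READ_EOF)   m->state = ST_DONE;      break;
    default:           break;
    }
    return m->state;
}
```

An event loop would call `machine_step` whenever a non-blocking socket reports progress; events that do not apply to the current state are simply ignored in this sketch.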
Contents
• Introduction
• Structure libwww
• Programming with libwww
  • C++ Simulation
  • APIs and Library Interfaces
  • Simple Example
  • More Complex Example
• Conclusion
C++ Simulation
• Construction/destruction
  • *_new / *_delete (HTRequest_new / HTRequest_delete)
• Data hiding
• Inheritance
• Explicit pointer casting
• PRIVATE, PUBLIC macros
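The techniques listed above can be combined in one small sketch of object-style C. The types `Anchor` and `ChildAnchor` and all function names are hypothetical and only imitate the `*_new` / `*_delete` naming pattern; the two macro definitions are an assumption about how such visibility markers are usually written (empty for public, `static` for private).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed definitions for this sketch. */
#define PUBLIC            /* visible outside this file */
#define PRIVATE static    /* file-local */

/* "Base class". For real data hiding this definition would live only in
   the .c file; the header would expose just `typedef struct Anchor Anchor;`. */
typedef struct Anchor {
    char *address;
} Anchor;

/* "Derived class": the base is the first member, so a pointer to the
   derived object can be cast explicitly to a pointer to the base. */
typedef struct ChildAnchor {
    Anchor base;
    char  *tag;
} ChildAnchor;

PRIVATE char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy) memcpy(copy, s, n);
    return copy;
}

/* "Destructor": safe to call with NULL or a partially built object. */
PUBLIC void ChildAnchor_delete(ChildAnchor *me)
{
    if (!me) return;
    free(me->base.address);
    free(me->tag);
    free(me);
}

/* "Constructor": returns NULL if any allocation fails. */
PUBLIC ChildAnchor *ChildAnchor_new(const char *address, const char *tag)
{
    ChildAnchor *me = calloc(1, sizeof *me);
    if (!me) return NULL;
    me->base.address = dup_string(address);
    me->tag = dup_string(tag);
    if (!me->base.address || !me->tag) {
        ChildAnchor_delete(me);
        return NULL;
    }
    return me;
}

/* "Base-class method": works on any object whose first member is an Anchor. */
PUBLIC const char *Anchor_address(const Anchor *me)
{
    return me->address;
}
```

Calling `Anchor_address((Anchor *)child)` on a `ChildAnchor` is the explicit pointer cast the slide refers to; the compiler does not check it, which is one reason the approach is more fragile than real C++ inheritance.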
APIs and Library Interfaces
• Set of APIs called packages
  • Win32: DLLs
  • Unix: separate static libraries
• Package interface exported via a single include file: WWW*.h
• Some important packages
  • Basic Utility Packages
  • Core Packages
  • Initialization Packages
  • Transport Packages
  • Protocol Packages
  • Parser Packages
Simple Example
• Displays all links in a document
• Applicable to text, HTML/XML tags, etc.
// snippet
...
/* Register the callback that is invoked for every link found */
HText_registerLinkCallback(foundLink);
...
/* Enter the event loop; the callback fires while the document is parsed */
HTEventList_loop(request);
...
foundLink (...) {
    HTAnchor * dest = HTAnchor_followMainLink(...);
    char * address = HTAnchor_address(dest);   /* allocated string */
    HTPrint("Found link `%s\'\n", address);
    HT_FREE(address);                          /* release it again */
}
More Complex Example
• Rudimentary command-line browser
• See project www.dsw
Contents
• Introduction
• Structure libwww
• Programming with libwww
• Conclusion
  • What's missing?
  • Facts about libwww
  • Personal Opinion
What's missing?
• Not thread-safe
• No cookie jar, only cookie parsing/generation
• Consistent usage of RegEx
• C++ representation
Facts about libwww
• Who uses libwww? No one?
• Sample applications on project homepage
• No reviews, benchmarks, comparisons
• Not 'bug free'
• 'Competitors' (mostly UNIX)
  • WinInet
  • Libghttp
  • Libcurl
  • Libhttp
  • Neon
Personal Opinion
• Typical open source project
• Tricky installation
• 'Feels' old <-> IS old
• Desperate attempt to reach OOP
• Non-trivial usage, but very flexible and powerful
Thank you for your attention