Subproject 4: HTML-WML Transcoding System

Jia-Shung Wang, Computer Science Department, National Tsing Hua University. March 27, 2001. Outline: Motivation and Issues; Examples of Transcoding; System Overview and Translation Flow; Some HTML to WML Conversion Strategies.

Presentation Transcript


  1. Subproject 4: HTML-WML Transcoding System • Jia-Shung Wang • Computer Science Department, National Tsing Hua University • March 27, 2001

  2. Outline • Motivation and Issues • Examples of Transcoding • System Overview and Translation Flow • Some HTML to WML Conversion Strategies

  3. Information Appliances • Design constraints differ with intended use, which enhances ease of use • Desktop PC • Mobile PC • Desktop “Smart” Phone • Mobile Telephone • Personal Digital Assistant • Set-top Box • Digital VCR • … • Implications: • Shift from computer design to consumer design • Heterogeneous “standards,” hybrid networking • Interactive networking, access on demand, QoS

  4. Motivation • Rapidly growing diversity of wireless communication devices • Rapid growth in the number of HTML web pages available on the Internet • Mobile devices with WML browsers need a way to access the existing HTML or WML pages on the Internet.

  5. Issues • Device-enabled service for WML mobile devices with different types of screen • Bandwidth-driven transmission for rapid response and fast delivery • Use of browsing behavior • Resizing of images and icons • Compression of the resulting WML data

  6. Demos of Transcoding • Contents from: cnYES (鉅亨網), USAtoday, CS-NTHU, NTHU, VOD

  7. Discussions • cnYES provides two versions, regular HTML and WAP, to serve PC users and mobile device users separately. • USAtoday also provides a simplified version of its content for Palm users. • NTHU and CS-NTHU homepages: if we keep the original figure to preserve the link information, the page layout becomes odd (viewed with the Browse-It HTML browser). • VOD homepage, one-column text: no significant difference after transcoding.

  8. Usage of Browsing Behavior • Automatic translation is complicated by the diversity of content posted on an HTML page. • A universal conversion strategy that translates every HTML page into sequences of WML decks effectively is unlikely to exist. • However, it seems a good idea to first classify the HTML page to be translated according to browsing behavior.

  9. Usage of Browsing Behavior (cont’d) • After classification we can tell what the client requires. We can then apply a corresponding conversion that extracts the required content step by step and translates it into predictable, small-sized WML documents. • We believe that adequate conversions exist for some kinds of web pages after classification.

  10. Related Works: Transcoding Proxy of IBM alphaWorks • Its goal is to manage different versions of content with different fidelities and modalities in order to adapt the delivery to different client devices.

  11. Related Works: Intel Quick Web Technology • New software capability that helps Internet providers and digital distribution companies increase the delivery speed of Web pages containing photos, drawings and other graphics. • It uses two key techniques: compression and caching.

  12. Related Works: Spyglass Prism • Spyglass Prism dynamically adapts Web content to match various non-PC devices. • It functions as a proxy server, caches the converted content, and dynamically converts standard HTML to WML.

  13. Related Works: Proxy Architecture for Efficient Web Browsing over Cellular Networks • Decreases the access time of browsing the WWW in a narrow-band wireless environment. • It adopts persistent connections and pipelining, based on a proxy architecture, to improve the HTTP exchange between the client and the proxy server.

  14. Comparisons between HTML and WML • Both make use of tags and attributes. • Similar character set, syntax and data types. • Two special elements of WML structure: Deck and Card • Different design goals • HTML: to publish hypertext on the World Wide Web • WML: for narrow-bandwidth devices with small displays, limited memory and fewer computational resources.

  15. Examples of HTML and WML • WML: <wml> <deck> <card> <p> <do type="accept"> <go href="#card2"/> </do> This is the first card... </p> </card> <card id="card2"> <p> This is the second card. </p> </card> </deck> </wml> • HTML: <html> <head> <title> Example page. </title> </head> <body> <h1> This is a headline. </h1> <p> This is a paragraph. </p> </body> </html>
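
As a rough illustration of the deck-and-card structure shown on the slide, the following sketch builds a two-card deck with Python's standard `xml.etree.ElementTree`. The helper name `build_deck` and its tuple format are assumptions made for this example. One difference from the slide: in standard WML 1.x the `<wml>` root element is itself the deck, so the sketch omits the `<deck>` wrapper, which appears to be illustrative.

```python
import xml.etree.ElementTree as ET


def build_deck(cards):
    """Build a WML deck from (card_id, text, next_card_id) tuples.

    build_deck is a hypothetical helper for illustration only.
    """
    wml = ET.Element("wml")  # in WML 1.x the <wml> root is the deck
    for card_id, text, next_id in cards:
        card = ET.SubElement(wml, "card", id=card_id)
        if next_id is not None:
            # "accept" action that jumps to another card in the same deck
            do = ET.SubElement(card, "do", type="accept")
            ET.SubElement(do, "go", href="#" + next_id)
        paragraph = ET.SubElement(card, "p")
        paragraph.text = text
    return ET.tostring(wml, encoding="unicode")


deck = build_deck([
    ("card1", "This is the first card...", "card2"),
    ("card2", "This is the second card.", None),
])
print(deck)
```

A real deck would also need the XML declaration and the WML DOCTYPE before the root element; they are left out here to keep the sketch short.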

  16. System Overview [Figure: a Client (WML Browser) exchanges WAP and HTTP requests and WML responses with the Translation Server (HTML Parser, HTML-WML Translator, WML Generator), which retrieves HTML and WML documents, CGI scripts, multimedia content, etc. from the Web Server over HTTP.]

  17. Features • An HTML-WML Translator on the Translation Server • Both HTTP and WAP requests are acceptable. • Java Servlet API compatible • Server- and platform-independent

  18. Translation Server: Components and Flow [Figure: requests and responses pass through the Network Protocol and Proxy components; the other components shown are HTML Parser, Document Analyzer, Filter, Decks & Cards, Link Builder, and WML Generator.]

  19. Components • Gateway • Accept requests from clients • Return appropriate responses • Proxy Servlet • Get the requested remote documents • Decide whether to pass a document through or convert it • Cache the converted results
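
The Proxy Servlet's three duties (fetch, pass or convert, cache) can be sketched as follows. This is a minimal sketch, not the project's servlet: `ProxySketch`, `fake_fetch`, `fake_convert`, and the example URLs are all hypothetical, and the pass-through test simply checks for the WML MIME type `text/vnd.wap.wml`.

```python
WML_MIME = "text/vnd.wap.wml"  # MIME type used for WML decks


class ProxySketch:
    """Hypothetical stand-in for the Proxy Servlet: fetch, pass or convert, cache."""

    def __init__(self, fetch, convert):
        self.fetch = fetch      # callable: url -> (content_type, body)
        self.convert = convert  # callable: html text -> wml text
        self.cache = {}         # url -> converted WML

    def handle(self, url):
        if url in self.cache:              # reuse an earlier conversion
            return self.cache[url]
        content_type, body = self.fetch(url)
        if content_type.startswith(WML_MIME):
            return body                    # already WML: pass through unchanged
        wml = self.convert(body)           # HTML: convert, then cache the result
        self.cache[url] = wml
        return wml


calls = []


def fake_fetch(url):
    """Pretend origin server used only for this demonstration."""
    calls.append(url)
    if url.endswith(".wml"):
        return WML_MIME, "<wml><card><p>native</p></card></wml>"
    return "text/html", "<html><body><p>hello</p></body></html>"


def fake_convert(html):
    """Placeholder converter; a real one would run the parser and generator."""
    return "<wml><card><p>converted</p></card></wml>"


proxy = ProxySketch(fake_fetch, fake_convert)
first = proxy.handle("http://example.test/index.html")
second = proxy.handle("http://example.test/index.html")  # served from cache
native = proxy.handle("http://example.test/page.wml")    # passed through
```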

  20. Components (cont’d) • HTML Parser • Parse the HTML document into a parse tree • Document Analyzer • Analyze the parse tree • Filter • Filter out objects that are unnecessary or not supported by the client device • Image/icon resizing
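
To make the parse-then-filter step concrete, here is a small sketch built on Python's standard `html.parser`. It works on the parser's event stream rather than a full parse tree, and the set of skipped tags in `SKIPPED` is an assumption chosen for illustration; the slides do not say which objects the project's Filter removes.

```python
from html.parser import HTMLParser

# Assumed examples of content a WML client cannot use.
SKIPPED = {"script", "style", "applet", "object"}


class TextFilter(HTMLParser):
    """Collect visible text while dropping content inside skipped tags."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def filter_html(html):
    """Return the text chunks that survive filtering, in document order."""
    parser = TextFilter()
    parser.feed(html)
    parser.close()
    return parser.chunks


sample = ("<html><head><script>var x = 1;</script></head>"
          "<body><h1>Hi</h1><p>Text</p></body></html>")
kept = filter_html(sample)
print(kept)
```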

  21. Components (cont’d) • Content Divider • Split a document into multiple small documents • Link Maker • Insert extra links so that the small documents can reach one another • WML Generator • Produce well-formed WML documents and return them to the Proxy Servlet
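
The Content Divider and Link Maker steps can be sketched as a greedy split under a byte budget followed by previous/next linking. The function names, the byte budget, and the `deckN.wml` naming scheme are assumptions for this example; the slides do not describe the actual splitting rule.

```python
def split_into_decks(paragraphs, max_bytes):
    """Greedily group paragraphs so each group stays within max_bytes.

    A single paragraph larger than max_bytes still gets its own group.
    """
    decks, current, size = [], [], 0
    for text in paragraphs:
        n = len(text.encode("utf-8"))
        if current and size + n > max_bytes:
            decks.append(current)
            current, size = [], 0
        current.append(text)
        size += n
    if current:
        decks.append(current)
    return decks


def link_decks(decks, base="deck"):
    """Attach previous/next file names so the small documents reach one another."""
    linked = []
    for i, body in enumerate(decks):
        linked.append({
            "name": f"{base}{i + 1}.wml",
            "body": body,
            "prev": f"{base}{i}.wml" if i > 0 else None,
            "next": f"{base}{i + 2}.wml" if i + 1 < len(decks) else None,
        })
    return linked


parts = split_into_decks(["aaaa", "bbbb", "cc", "dddddd"], max_bytes=8)
linked = link_decks(parts)
for deck in linked:
    print(deck["name"], deck["prev"], deck["next"], deck["body"])
```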

  22. HTML to WML Conversion Tools • Semi-automatic: • Used for rich HTML documents • The conversion form is specified manually with the help of analysis and editing tools. • The resulting forms are distributed to the gateway servers. • Automatic: • Used for simple documents, such as news and BBS pages, …

  23. HTML to WML Conversion Strategies • Strategy I: Tables to Lists • Simply remove all layout elements such as tables • Arrange all the contents into a single column with a fixed width • Strategy II: One Table One Deck • Extract each table to form a deck
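
Strategy I can be illustrated with a short sketch that discards table layout and emits every cell as one item of a single column. It uses Python's standard `html.parser`; the class `TableFlattener` and the rule of one list item per `td`/`th` cell are assumptions for this example, not the project's algorithm.

```python
from html.parser import HTMLParser


class TableFlattener(HTMLParser):
    """Drop table layout and emit each cell as one item of a single column."""

    CELLS = ("td", "th")

    def __init__(self):
        super().__init__()
        self.items = []
        self._buf = None  # text pieces of the cell being read, or None

    def _flush(self):
        if self._buf:
            self.items.append(" ".join(self._buf))
        self._buf = None

    def handle_starttag(self, tag, attrs):
        if tag in self.CELLS:
            self._flush()  # a nested cell closes any pending text first
            self._buf = []

    def handle_endtag(self, tag):
        if tag in self.CELLS:
            self._flush()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._buf is not None:
            self._buf.append(text)   # inside a cell: accumulate
        else:
            self.items.append(text)  # outside any table cell: keep as is


def tables_to_list(html):
    parser = TableFlattener()
    parser.feed(html)
    parser.close()
    return parser.items


page = ("<p>Intro</p><table><tr><td>A</td><td>B <b>bold</b></td></tr>"
        "<tr><td>C</td></tr></table>")
column = tables_to_list(page)
print(column)
```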

  24. HTML to WML Conversion Strategies (cont’d) • Strategy III: Preview First a. One Table One Deck b. Collect the first card of every deck as preview cards c. Arrange these preview cards to form a preview deck, which is transmitted first; every preview card has a link to its corresponding deck
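
Steps b and c of Preview First can be sketched as below, assuming step a has already produced one list of card texts per deck. The function `build_preview`, the dictionary layout, and the `deckN.wml` link targets are hypothetical names chosen for this example.

```python
def build_preview(decks):
    """Collect the first card of every deck and link it back to that deck.

    decks is a list of decks, each a list of card texts (the assumed output
    of the One Table One Deck step).
    """
    preview = []
    for index, cards in enumerate(decks, start=1):
        if cards:  # skip empty decks, which have nothing to preview
            preview.append({"summary": cards[0], "href": f"deck{index}.wml"})
    return preview


decks = [
    ["content 1_1", "content 1_2"],
    ["content 2_1", "content 2_2", "content 2_3"],
    ["content 3_1"],
]
preview_deck = build_preview(decks)
for card in preview_deck:
    print(card["summary"], "->", card["href"])
```

The preview deck is what the client would receive first; the full decks are fetched only when the user follows a link.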

  25. Original Document [Figure: tree of the original document, with a <document> root, three <table> nodes, four <section> nodes (sections 1 to 4), and content items 1_1, 1_2, 2_1 to 2_5, 3_1 to 3_7, and 4_1.]

  26. Tables to Lists [Figure: the same content items, 1_1 through 4_1, regrouped under the <document> root into three <deck> nodes, with the table and section nodes removed.]

  27. One Table One Deck [Figure: the same content items, 1_1 through 4_1, regrouped under the <document> root into five <deck> nodes.]

  28. Preview First [Figure: the same content items, 1_1 through 4_1, regrouped under the <document> root into five <deck> nodes, including the preview deck.]

  29. Strategy Evaluation • Assume a document has S sections and is translated into N WML cards. • Every deck contains at most C cards. • Assume that the contents of the same table are similar.

  30. Evaluation of Searching After Translation

      Criterion                  Tables to Lists   One Table One Deck   Preview First
      User Friendly              Worst             Best                 Good
      Average Deck Access Time   N/2               S/2                  S/2C

  31. Performance Evaluation

                      HTML Pages                                            WML Decks   Reduction
                      Source: Headers   Source: Text   Images    (bytes)     Without Images   With Images
                      (bytes)           (bytes)        (bytes)
      Experiment #1   24,359            9,471          176,361   7,440       22.0%            3.5%
      Experiment #2   17,937            6,137          126,740   11,232      46.7%            7.4%
      Experiment #3   21,203            8,325          280,727   16,891      57.2%            5.4%
      Experiment #4   9,568             20,363         17,966    12,062      40.3%            25.2%
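
The percentages in the table can be reproduced from the byte counts. The sketch below assumes the first two numbers of each row are the two parts of the HTML source (headers and text), the third is the image size, and the fourth is the WML size; that column assignment is inferred from the slide's header row. Under this reading, each "Reduction" figure is the WML size expressed as a fraction of the original size, so a smaller percentage means a smaller result.

```python
# (source headers, source text, images, WML decks), all in bytes.
# The column assignment is an assumption inferred from the slide's headers.
experiments = {
    1: (24359, 9471, 176361, 7440),
    2: (17937, 6137, 126740, 11232),
    3: (21203, 8325, 280727, 16891),
    4: (9568, 20363, 17966, 12062),
}


def size_ratios(headers, text, images, wml):
    """Return WML size as a percentage of the HTML size, without and with images."""
    source = headers + text
    without_images = 100 * wml / source
    with_images = 100 * wml / (source + images)
    return round(without_images, 1), round(with_images, 1)


ratios = {number: size_ratios(*row) for number, row in experiments.items()}
for number, (without_images, with_images) in ratios.items():
    print(f"Experiment #{number}: {without_images}% / {with_images}%")
```

The computed values match all eight percentages on the slide, which supports the column reading above.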

  32. Performance Evaluation (Experiment #1: What’s WAP) [Figure: the WAP Forum “What’s WAP” page translated into a Preview deck and Decks 1, 2, 3, 3.1, and 3.2.]

  33. Performance Evaluation (Experiment #2: NTHU Web Page) [Figure: the NTHU page and its About NTHU, History, and Current Status sections translated into Preview decks and Decks 1, 2.1, 2.2, 3.1, and 3.2.]

  34. Performance Evaluation (Experiment #3: NTHU CS Web Page) [Figure: the NTHU CS page and its Faculty section translated into Preview decks, Deck 1, and Decks 3.1 through 3.6.]
