Refactoring HTML Elliotte Rusty Harold elharo@metalab.unc.edu http://www.cafeconleche.org/
What to Refactor To • XHTML • CSS • REST
Move Away From • Tag soup • Presentation based markup • Stateful applications
REST • All resources are identified by URLs. • Safe, side-effect free operations such as querying or browsing operate via GET. • Non-safe operations operate via POST. • Each request is independent of all others.
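The REST rules above can be sketched as a tiny WSGI application. This is an illustrative sketch only, not code from the talk: the `NOTES` dictionary, the `/notes/<id>` URL layout, and the `request` helper are all assumptions made for the example. GET only reads, POST changes a resource, and nothing is remembered between requests beyond the resources themselves.

```python
import io

# Hypothetical in-memory resources, keyed by the last URL path segment.
NOTES = {"1": "first note"}

TEXT = [("Content-Type", "text/plain; charset=utf-8")]


def app(environ, start_response):
    """WSGI app: GET reads a note, POST replaces it; no session state."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "/")
    note_id = path.rstrip("/").rsplit("/", 1)[-1]

    if method == "GET":
        # Safe: only reads, so repeating or caching the request is harmless.
        if note_id not in NOTES:
            start_response("404 Not Found", TEXT)
            return [b"not found"]
        start_response("200 OK", TEXT)
        return [NOTES[note_id].encode("utf-8")]

    if method == "POST":
        # Non-safe: changes the resource, then redirects to its URL.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        NOTES[note_id] = environ["wsgi.input"].read(length).decode("utf-8")
        start_response("303 See Other", [("Location", path)])
        return [b""]

    start_response("405 Method Not Allowed", [("Allow", "GET, POST")])
    return [b""]


def request(method, path, data=b""):
    """Call the app directly with a hand-built environ (no server needed)."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(data)),
        "wsgi.input": io.BytesIO(data),
    }
    body = b"".join(app(environ, start_response))
    return seen["status"], body
```

Calling `request("GET", "/notes/1")` twice gives the same answer both times, while `request("POST", "/notes/2", b"second")` creates something a later GET can fetch. Each call carries everything the server needs in its URL and body.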
The Refactoring Process • Identify the problem. • Fix the problem. • Verify that the problem has been fixed. • Check that no new problems have been introduced. • Deploy the solution.
Things Can Go Wrong • Backups • Staging Servers • Source Code Control
Validators • W3C Markup Validation Service • LogValidator • xmllint • Editors: Dreamweaver, BBEdit, etc.
Testing • HtmlUnit • JsUnit • HttpUnit • jWebUnit • FitNesse • Selenium
Regular Expressions • Learn them! • But be cautious • Prefer parser-based solutions
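To illustrate why the slide above prefers parser-based solutions, here is a small sketch that extracts link targets two ways. The sample markup, the `NAIVE` pattern, and the function names are made up for this example. The regular expression silently misses links that differ in case, quoting, or attribute order, while Python's standard `html.parser` copes with all three.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: three links written in three different styles.
HTML = (
    "<p><a class=\"nav\" href='/one'>One</a> "
    '<A HREF="two.html">Two</A> '
    '<a href="/three">Three</a></p>'
)

# Naive pattern: assumes lower case, double quotes, and href as first attribute.
NAIVE = re.compile(r'<a href="([^"]*)"')


class LinkCollector(HTMLParser):
    """Collects the href value of every a element, however it is written."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def links_with_regex(html):
    return NAIVE.findall(html)


def links_with_parser(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```

On the sample above, `links_with_regex(HTML)` finds only `/three`, whereas `links_with_parser(HTML)` finds all three links. A regular expression can be widened to cover each case, but every widening adds another way for it to go wrong.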
Tidy • C (and PHP) • Custom API • Can handle most bad markup • Usually produces well-formed XHTML • Often produces valid XHTML • $ tidy -asxhtml -m index.html
TagSoup • Java and SAX • Can handle anything • Always well-formed • May not be valid • $ java -jar tagsoup.jar --encoding=ISO-8859-1 index.html
Well-formedness Defined • Every element except the root has exactly one parent element; elements do not overlap • Every start-tag has a case-sensitive matching end-tag • Attribute values are quoted • Entity references are defined • +Namespaces
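A quick way to test the rules above is to hand the markup to any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `well_formedness_error` and the sample strings are assumptions for illustration. It checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return None if markup parses as XML, else the parser's error message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None


# One well-formed sample followed by one violation of each rule on the slide.
GOOD = '<p class="note">fine<br /></p>'
OVERLAP = "<p><b>bold <i>both</b> italic</i></p>"
CASE_MISMATCH = "<P>text</p>"
UNQUOTED = "<a href=index.html>home</a>"
UNDEFINED_ENTITY = "<p>caf&eacute;</p>"
```

`well_formedness_error(GOOD)` returns `None`; each of the other samples returns an error message. The last one fails because `&eacute;` is defined by the XHTML DTDs rather than by XML itself, so a parser that has not read a DTD treats it as undefined.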
Well-formedness Refactorings • Make name lower case • Quote attribute value • Replace empty tag with empty-element tag • Add end-tag • Eliminate overlap • Convert text to UTF-8 • Escape < and & • Introduce an XHTML DOCTYPE • Introduce the XHTML namespace
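Three of the refactorings above (lower-case names, quoted attribute values, escaped < and &) are mechanical enough to sketch with Python's standard `xml.sax.saxutils`. The helper names `escape_text` and `quoted_attribute` are invented for this example, and in practice a tool such as Tidy would apply these fixes across a whole document.

```python
from xml.sax.saxutils import escape, quoteattr


def escape_text(text):
    """Escape &, < and > in raw text so it can sit inside an element."""
    return escape(text)


def quoted_attribute(name, value):
    """Render name="value" with a lower-case name and a safely quoted value."""
    return f"{name.lower()}={quoteattr(value)}"
```

For example, `escape_text("AT&T < IBM")` gives `AT&amp;T &lt; IBM`, and `quoted_attribute("WIDTH", "100")` gives `width="100"`. Note that `escape_text` must only be applied to raw text: running it over text that is already escaped would turn `&amp;` into `&amp;amp;`.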
Validity Defined • The document has a DOCTYPE <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "/dtds/xhtml1-strict.dtd"> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd"> • The document adheres to constraints expressed in the DTD • Nothing that’s not in the DTD • Not as important as well-formedness
Validity Refactorings • Introduce Transitional DOCTYPE • Introduce Strict DOCTYPE
Transitional • Eliminate bogons • Add alt attributes
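Before adding alt attributes, it helps to find the images that lack them. The following is a minimal sketch using Python's standard `html.parser`; the class and function names are hypothetical. It treats an empty `alt=""` as present, since that is the usual way to mark a purely decorative image.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every img element that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Also called for empty-element tags such as <img ... />.
        if tag == "img":
            attributes = dict(attrs)
            if "alt" not in attributes:
                self.missing.append(attributes.get("src"))


def images_missing_alt(html):
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```

Given `<img src="a.png" alt="Logo"><img src="b.png"><img src="c.gif" alt="" />`, the function reports only `b.png`. Writing useful alt text for each reported image still has to be done by a person.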
Strict • Replace center, b, i, font, etc. with CSS • Nest inline elements in block elements
Layout • Wrap related information in divs • Add ID attributes • Replace table layouts with CSS • Replace frames with CSS positioning • Put the content first • Mark up lists as lists • Replace blockquote/ul indentation with CSS • Replace spacer GIFs
Accessibility • Convert images to text • Add labels to forms • Standard names for input fields • Add tab indexes to forms • Add skip navigation • Add internal headings • Provide captions, summaries, and headers for tables • Identify acronyms
Web Applications • Replace GET with POST • Replace POST with GET • Replace Flash with HTML • Make web apps cache savvy • Provide ETags • Add Web Forms 2.0 Types • Block robots • Avoid SQL injection
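The last item on the slide above, avoiding SQL injection, is usually handled with parameterized queries. The sketch below uses Python's standard `sqlite3` module with a made-up `users` table; the table, data, and function names are assumptions for illustration, not something taken from the talk.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with two hypothetical users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return conn


def find_unsafe(conn, name):
    # Vulnerable: user input is pasted straight into the SQL text.
    query = "SELECT email FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_safe(conn, name):
    # Parameterized: the driver passes name as data, never as SQL.
    return conn.execute(
        "SELECT email FROM users WHERE name = ?", (name,)
    ).fetchall()


# Classic attack string: closes the quote and adds an always-true condition.
ATTACK = "x' OR '1'='1"
```

With `conn = make_db()`, `find_unsafe(conn, ATTACK)` returns every row in the table, while `find_safe(conn, ATTACK)` returns nothing because no user has that literal name.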
Content • Check spelling • Check links • Restructure sites but keep the URLs • Remove entry pages • Hide e-mail addresses from spambots
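One common way to hide e-mail addresses from naive spambots is to write every character as a numeric character reference. This is offered as an assumption about what the slide above has in mind, not as the technique the talk recommends, and the function names are invented. Browsers decode the references, so visitors see and can click a normal address.

```python
import html


def encode_address(address):
    """Write each character as a decimal character reference, e.g. &#64; for @."""
    return "".join(f"&#{ord(ch)};" for ch in address)


def mailto_link(address):
    """Build a mailto link whose href and text never contain the raw address."""
    encoded = encode_address(address)
    return f'<a href="mailto:{encoded}">{encoded}</a>'
```

`html.unescape(encode_address("user@example.com"))` returns the original address, which is what a browser does when rendering the page. A harvester that decodes character references will still find the address, so this deters only the simplest scrapers.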
Objections To Refactoring • We don’t have the time to waste on cleaning up the code. We have to get this feature implemented now! • Refactoring saves time in the long run. • You have more time than you think you do.
Further Reading • Refactoring HTML: Elliotte Rusty Harold • Refactoring: Martin Fowler • Designing with Web Standards: Jeffrey Zeldman • The Zen of CSS Design: Dave Shea & Molly Holzschlag