A Rule Based System For Web Site Verification. 1 Dip. Matematica e Informatica, Via delle Scienze 206, 33100 Udine, Italy. Email: demis@dimi.uniud.it . 2 DSIC, Universidad Politécnica de Valencia, Camino de Vera s/n, Apdo. 22012, 46071 Valencia, Spain. Email: jgarciavivo@dsic.upv.es.
A Rule Based System For Web Site Verification
D. Ballis (1), J. García-Vivó (2)
(1) Dip. Matematica e Informatica, Via delle Scienze 206, 33100 Udine, Italy. Email: demis@dimi.uniud.it
(2) DSIC, Universidad Politécnica de Valencia, Camino de Vera s/n, Apdo. 22012, 46071 Valencia, Spain. Email: jgarciavivo@dsic.upv.es
Motivations
• Web sites can have a very complex structure
• Development and maintenance of Web sites are difficult tasks
• Suitable methodologies are needed to verify Web sites w.r.t. syntactic as well as semantic properties
Formal Verification of Web Sites
• GVerdi, the graphical evolution of the VERDI system, implements:
  • a rule-based specification language for specifying properties of Web sites
  • a verification technique for automatically checking whether those properties are fulfilled and for helping to repair faulty Web sites
• It can detect incorrect Web pages as well as incomplete or missing ones
Web Site Denotation
• A Web page is a ground term. Consequently, we represent a Web site as a finite collection of ground terms of a suitable term algebra.

XML page:
<member>
  <name> Peter </name>
  <surname> Hawkins </surname>
  <status> Professor </status>
  <teaching>
    <course> Algebra </course>
  </teaching>
</member>

Corresponding term:
member(
  name("Peter"),
  surname("Hawkins"),
  status("Professor"),
  teaching(
    course("Algebra")
  )
)
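As an illustration of this encoding, the following Haskell sketch models pages as ground terms and renders them in the functional notation used on the slide. The type and function names (`Term`, `WebSite`, `peterPage`, `render`) are assumptions made for this example and are not taken from the GVerdi sources.

```haskell
import Data.List (intercalate)

-- Hypothetical representation of a Web page as a ground term.
data Term
  = Node String [Term]   -- an element such as member(...)
  | Text String          -- character data such as "Peter"
  deriving (Eq, Show)

-- A Web site is a finite collection of ground terms.
type WebSite = [Term]

-- The page shown on the slide.
peterPage :: Term
peterPage =
  Node "member"
    [ Node "name"     [Text "Peter"]
    , Node "surname"  [Text "Hawkins"]
    , Node "status"   [Text "Professor"]
    , Node "teaching" [Node "course" [Text "Algebra"]]
    ]

-- Render a term as f(t1,...,tn); text leaves are shown in quotes.
render :: Term -> String
render (Text s)    = show s
render (Node f ts) = f ++ "(" ++ intercalate "," (map render ts) ++ ")"
```

Evaluating `render peterPage` in GHCi gives the term on a single line, which makes it easy to compare against the XML source.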
Specification Language
• A Web specification is a triple (R, I_N, I_M) where
  • R is a set of user-defined function definitions
  • I_N is a set of correctNess rules
  • I_M is a set of coMpleteness rules
Specification Language: Correctness Rules
• A correctness rule has the following form: l -> error | C, where
  - l is a term
  - error is a reserved constant
  - C is a sequence of equations and membership tests w.r.t. regular languages
e.g. project(year(X)) -> error | X in [0-9]*, X<1990
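To make the example rule concrete, here is a minimal Haskell sketch that hard-codes project(year(X)) -> error | X in [0-9]*, X<1990 as a check over one page. It is not GVerdi's rule engine: the names `Term`, `subterms`, `projectYears`, `condition` and `incorrect` are hypothetical, and the empty string (which [0-9]* formally admits) is rejected here because it has no numeric value to compare.

```haskell
import Data.Char (isDigit)

-- Hypothetical term representation of a Web page.
data Term = Node String [Term] | Text String
  deriving (Eq, Show)

-- All subterms of a page, including the page itself.
subterms :: Term -> [Term]
subterms t@(Node _ ts) = t : concatMap subterms ts
subterms t             = [t]

-- Every X such that project(... year(X) ...) occurs in the page.
projectYears :: Term -> [String]
projectYears page =
  [ x | Node "project" kids <- subterms page
      , Node "year" [Text x] <- kids ]

-- Condition C: X is a digit string (assumed non-empty) and X < 1990.
condition :: String -> Bool
condition x =
  not (null x) && all isDigit x && (read x :: Integer) < 1990

-- The page is incorrect if some instance of the left-hand side satisfies C.
incorrect :: Term -> Bool
incorrect = any condition . projectYears
```

For instance, a page containing project(year("1985")) is flagged, while one containing project(year("1995")) is not.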
Specification Language: Completeness Rules
• A completeness rule has the following form: l -> r, where
  - l is a term
  - r is a term s.t. Var(r) ⊆ Var(l)
e.g. member(name(X),surname(Y)) -> hpage(name(X),surname(Y),status())
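The side condition Var(r) ⊆ Var(l) can be checked mechanically. The sketch below does so for the example rule; the types `Term` and `Rule` and the functions `vars` and `wellFormed` are illustrative assumptions, not part of GVerdi.

```haskell
import Data.List (nub)

-- Hypothetical terms with variables, as used in rules.
data Term = Var String | Fun String [Term]
  deriving (Eq, Show)

-- A completeness rule l -> r.
data Rule = Rule Term Term
  deriving (Eq, Show)

-- The set of variables occurring in a term (as a duplicate-free list).
vars :: Term -> [String]
vars (Var x)    = [x]
vars (Fun _ ts) = nub (concatMap vars ts)

-- A rule is well formed when every variable of r also occurs in l.
wellFormed :: Rule -> Bool
wellFormed (Rule l r) = all (`elem` vars l) (vars r)

-- member(name(X),surname(Y)) -> hpage(name(X),surname(Y),status())
memberRule :: Rule
memberRule = Rule
  (Fun "member" [Fun "name" [Var "X"], Fun "surname" [Var "Y"]])
  (Fun "hpage"  [Fun "name" [Var "X"], Fun "surname" [Var "Y"], Fun "status" []])
```

Here `wellFormed memberRule` holds, since X and Y both occur on the left-hand side; a right-hand side mentioning a fresh variable Z would be rejected.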
Specification Language: Completeness Rules
• Some symbols in the right-hand side of a rule can be marked with the symbol #
• Marks are used to select the Web pages on which we want to check a given condition: l -> #r
e.g. hpage(status("Professor")) -> #hpage(#status(#"Professor"),teaching)
Rule Interpretation
• Correctness rules, l -> error | C: if l is recognized in some Web page of W and all the expressions in C evaluate to True (or C is empty), then that Web page is incorrect
• Completeness rules, l -> #r: if l is recognized in some Web page of W, then r must be recognized in some Web page of W which contains the marked part of r
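The following Haskell sketch illustrates one possible reading of the completeness interpretation for the marked rule hpage(status("Professor")) -> #hpage(#status(#"Professor"),teaching). It rests on several assumptions: "recognized" is approximated by a simple embedding test on ground patterns, the marks are read as selecting every page that contains the marked part, and each selected page is then required to contain r. All names (`embeds`, `recognized`, `incompletePages`, and so on) are hypothetical; the actual system relies on partial rewriting.

```haskell
-- Hypothetical term representation of a Web page.
data Term = Node String [Term] | Text String
  deriving (Eq, Show)

-- All subterms of a page, including the page itself.
subterms :: Term -> [Term]
subterms t@(Node _ ts) = t : concatMap subterms ts
subterms t             = [t]

-- A ground pattern is embedded in a term when the roots agree and every
-- argument of the pattern is embedded in some argument of the term.
embeds :: Term -> Term -> Bool
embeds (Text a)    (Text b)    = a == b
embeds (Node f ps) (Node g ts) = f == g && all (\p -> any (embeds p) ts) ps
embeds _           _           = False

-- Simplified notion of "l is recognized in a page".
recognized :: Term -> Term -> Bool
recognized pat page = any (embeds pat) (subterms page)

-- Left-hand side, marked part of the right-hand side, and full right-hand side.
lhs, marked, rhs :: Term
lhs    = Node "hpage" [Node "status" [Text "Professor"]]
marked = Node "hpage" [Node "status" [Text "Professor"]]
rhs    = Node "hpage" [Node "status" [Text "Professor"], Node "teaching" []]

-- Pages selected by the marks in which the full right-hand side is missing.
incompletePages :: [Term] -> [Term]
incompletePages site
  | any (recognized lhs) site =
      [ p | p <- site, recognized marked p, not (recognized rhs p) ]
  | otherwise = []
```

Under these assumptions, a professor's home page without a teaching element is reported as incomplete, while pages of other staff are not selected at all.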
An Example…
First of all, some GVerdi details to be considered:
* XML/XHTML Web pages
* Function definitions of R are written in Haskell and saved in the file Defs.hs. Haskell functions defined in the standard Prelude library are available too.
* wxHaskell-0.8, RegexpLib98 and hxml-0.2
Conclusions
• We presented the system GVerdi
  • Written in Haskell
  • 1100 lines of code (approx.)
  • Parser for Web pages and specifications
  • Partial rewriting, verification technique and graphical user interface
• We are able to detect missing/incomplete and incorrect Web pages efficiently
Future Work
• Improving regular expression capabilities
• Verification of full XML code (namespaces, attributes, etc.)
• Moving towards the verification of dynamic Web sites