Web Databases and XML
Traditional DB Applications
• Typically business oriented
• Large amount of data
• Data is well-structured, normalized, with predefined schema
• Large number of concurrent users (transactions)
• Simple data, simple queries, and simple updates
• Typically update intensive
• Small transactions
• High performance, high availability, scalability
• Data integrity and security are of major importance
• Good administrative support, nice GUIs
Internet Applications
Challenges:
• Use heterogeneous, complex, hierarchical, fast-evolving, unstructured/semistructured data
• Access mostly read-only data
• Need 100% availability
• Manage millions of users worldwide
• Have high-performance requirements
• Are concerned with security (encryption)
• Like to customize data in a personalized manner
• Expect to gain users' trust for business-to-consumer transactions
Internet users choose speed and availability over correctness.
Electronic Commerce
• Currently, mostly business-to-business (B2B) rather than business-to-consumer (B2C) interactions
• Focus on selling and buying:
    • Order management
    • Product catalogs
    • Product configuration
    • Sales and marketing
    • Education and training
    • Service
    • Communities
Other Web Applications
• Web integration
    • Heterogeneous data sources and types
    • Thousands of web-accessible data sources
    • Dynamic data
    • Data warehouses
• Web publishing
    • Access different types of content from browsers (e.g., email, PDF, HTML, XML)
    • Structured, dynamic, customized/personalized content
    • Integration with applications
    • Accessible via major gateways and search engines
• Application integration
    • Transformation between different application data formats (e.g., XML, HTML)
    • Integration of multiple applications
Current Internet Application Architectures
Architecture:
• Server tier: relational databases and gateways to diverse data sources, such as files, OLE/DB, etc. Uses enterprise servers
• Middle tier: provides data integration and distribution, query processing, etc. Consists of a web server and an application server
• Client tier: mostly a web browser; may use CGI scripts or Java
Characteristics:
• Customization is achieved at the server site (customer data in a database), with some data at the client site (cookies)
• Load balancing is typically hardware based (multiple servers, DNS routers)
XML
XML (eXtensible Markup Language) is a textual language for representing and exchanging data on the web. It is designed to improve the functionality of the Web by providing more flexible and adaptable information identification.
• It is based on SGML and was developed around 1996.
• It is called extensible because it is not a fixed format like HTML (a single, predefined markup language). Instead, XML is a metalanguage (a language for describing other languages) that lets you design your own customized markup languages for many different types of documents.
• XML can be untyped (semistructured), but there are now standards for schema conformance (DTD and XML Schema).
• Without a schema, an XML document is well-formed if it satisfies simple syntactic constraints, such as proper nesting of start and end tags.
XML Syntax
• XML documents conform to the following grammar:
    XMLdocument ::= Pi* Element Pi*
    Element ::= Stag (char | Pi | Element)* Etag
    Stag ::= '<' Name Atts '>'
    Etag ::= '</' Name '>'
    Pi ::= '<?' char* '?>'
    Atts ::= ( Name '=' String )*
    String ::= '"' char* '"'
• XML consists of tags and text.
• Tags come in pairs, e.g. <date>8/25/2001</date>, and must be properly nested:
    <person> <name> ... </name> ... </person>   (valid nesting)
    <person> <name> ... </person> ... </name>   (invalid nesting)
• Text is bounded by tags. PCDATA: parsed character data. For example:
    <title> The Big Sleep </title>
    <year> 1935 </year>
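The nesting rule can be checked mechanically with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is an assumption made for illustration, not something from the slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for mismatched tags, stray characters in tags, etc.
        return False

# The two nesting examples from the slide.
valid_nesting = "<person> <name> ... </name> ... </person>"
invalid_nesting = "<person> <name> ... </person> ... </name>"

# An end tag may not contain whitespace between '</' and the name.
space_in_end_tag = "<year> 1935 </ year>"

for sample in (valid_nesting, invalid_nesting, space_in_end_tag):
    print(is_well_formed(sample), sample)
```

The parser only checks well-formedness here; it does not validate against a DTD or schema.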
Representing Data Using XML
• Nesting tags can be used to express various structures, such as a record:
    <person>
      <name> Ramez Elmasri </name>
      <tel> (817) 272-2348 </tel>
      <email> elmasri@cse.uta.edu </email>
    </person>
• We can represent a list by using the same tag repeatedly:
    <addresses>
      <person> ... </person>
      <person> ... </person>
      <person> ... </person>
      ...
    </addresses>
• An opening tag may contain attributes. These are typically used to describe the content of an element:
    <author id="2787901">Philip A. Bernstein</author>
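As a rough illustration of how records, lists, and attributes map onto program data, here is a sketch in Python using the standard xml.etree.ElementTree module. The second person entry is hypothetical filler added so that the list has more than one item.

```python
import xml.etree.ElementTree as ET

doc = """
<addresses>
  <person>
    <name> Ramez Elmasri </name>
    <tel> (817) 272-2348 </tel>
    <email> elmasri@cse.uta.edu </email>
  </person>
  <person>
    <!-- hypothetical second record, not from the slides -->
    <name> Jane Doe </name>
    <tel> 555-0100 </tel>
    <email> jane@example.org </email>
  </person>
</addresses>
"""

root = ET.fromstring(doc)

# A repeated tag behaves like a list; each <person> behaves like a record.
# Only element children are iterated (the parser drops comments by default).
records = [
    {field.tag: (field.text or "").strip() for field in person}
    for person in root.findall("person")
]

# Attributes are read separately from the element's text content.
author = ET.fromstring('<author id="2787901">Philip A. Bernstein</author>')
author_id = author.get("id")
author_name = author.text

print(records)
print(author_id, author_name)
```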
XML structure
XML:
    <person>
      <name> Ramez Elmasri </name>
      <tel> (817) 272-2348 </tel>
      <email> elmasri@cse.uta.edu </email>
    </person>
is Lisp-like:
    (person (name "Ramez Elmasri")
            (tel "(817) 272-2348")
            (email "elmasri@cse.uta.edu"))
and tree-like:
    person
      name
        Ramez Elmasri
      tel
        (817) 272-2348
      email
        elmasri@cse.uta.edu
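The correspondence between XML and a Lisp-style expression can be made concrete with a short recursive walk over the element tree. This is only a sketch: the function name to_sexpr is an assumption, and attributes and mixed content (text interleaved with child elements) are deliberately ignored.

```python
import xml.etree.ElementTree as ET

def to_sexpr(elem) -> str:
    """Render an element as a Lisp-like expression.

    Leaf elements become (tag "text"); inner elements become
    (tag child1 child2 ...). Attributes and mixed content are ignored.
    """
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        return f'({elem.tag} "{text}")'
    return f"({elem.tag} " + " ".join(to_sexpr(c) for c in children) + ")"

person = ET.fromstring("""
<person>
  <name> Ramez Elmasri </name>
  <tel> (817) 272-2348 </tel>
  <email> elmasri@cse.uta.edu </email>
</person>
""")

sexpr = to_sexpr(person)
print(sexpr)
# (person (name "Ramez Elmasri") (tel "(817) 272-2348") (email "elmasri@cse.uta.edu"))
```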
Complete Example
    <?xml version="1.0"?>
    <!DOCTYPE bib SYSTEM "bib.dtd">
    <bib>
      <vendor id="id0_1">
        <name>Amazon</name>
        <email>webmaster@amazon.com</email>
        <phone>1-800-555-9999</phone>
        <book>
          <title>Unix Network Programming</title>
          <publisher>Addison Wesley</publisher>
          <year>1995</year>
          <author>
            <firstname>Richard</firstname>
            <lastname>Stevens</lastname>
          </author>
          <price>38.68</price>
        </book>
        <book>
          <title>An Introduction to Object-Oriented Design</title>
          <publisher>Addison Wesley</publisher>
          <year>1996</year>
          <author>
            <firstname>Jo</firstname>
            <lastname>Levin</lastname>
          </author>
          <author>
            <firstname>Harold</firstname>
            <lastname>Perry</lastname>
          </author>
          <price>11.55</price>
        </book>
        …
      </vendor>
    </bib>
DTD: Document Type Definition
    <?xml encoding="ISO-8859-1"?>
    <!ELEMENT bib (vendor)*>
    <!ELEMENT vendor (name, email, phone?, book*)>
    <!ATTLIST vendor id ID #REQUIRED>
    <!ELEMENT book (title, publisher?, year?, author+, price)>
    <!ELEMENT author (firstname?, lastname)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT email (#PCDATA)>
    <!ELEMENT phone (#PCDATA)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT publisher (#PCDATA)>
    <!ELEMENT year (#PCDATA)>
    <!ELEMENT firstname (#PCDATA)>
    <!ELEMENT lastname (#PCDATA)>
    <!ELEMENT price (#PCDATA)>
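A DTD content model such as (title, publisher?, year?, author+, price) is essentially a regular expression over child element names. The sketch below illustrates that idea in Python for the book and author declarations only. It is not a DTD validator: it handles flat comma-separated sequences with ?, * and + suffixes, and the names CONTENT_MODELS, model_to_regex, and conforms are assumptions introduced for this example.

```python
import re
import xml.etree.ElementTree as ET

# Content models copied by hand from the DTD; only flat sequences are supported
# (no choice groups with '|', no nested parentheses, no #PCDATA handling).
CONTENT_MODELS = {
    "book": "title, publisher?, year?, author+, price",
    "author": "firstname?, lastname",
}

def model_to_regex(model: str) -> re.Pattern:
    """Translate a flat content model into a regex over 'name ' tokens."""
    parts = []
    for item in model.split(","):
        item = item.strip()
        suffix = item[-1] if item[-1] in "?*+" else ""
        name = item[:-1] if suffix else item
        # Each child name is followed by a space so tokens cannot run together.
        parts.append(f"(?:{re.escape(name)} ){suffix}")
    return re.compile("".join(parts))

def conforms(elem) -> bool:
    """Check an element's direct children against its recorded content model."""
    model = CONTENT_MODELS.get(elem.tag)
    if model is None:
        return True  # no model recorded: treated as unconstrained in this sketch
    child_names = "".join(child.tag + " " for child in elem)
    return model_to_regex(model).fullmatch(child_names) is not None

good_book = ET.fromstring(
    "<book><title>Unix Network Programming</title>"
    "<publisher>Addison Wesley</publisher><year>1995</year>"
    "<author><firstname>Richard</firstname><lastname>Stevens</lastname></author>"
    "<price>38.68</price></book>"
)
# Violates the model: author+ requires at least one author.
bad_book = ET.fromstring("<book><title>No Author</title><price>1.00</price></book>")

print(conforms(good_book), conforms(bad_book))
```

For real validation, a validating parser should be used instead; Python's standard library parsers check well-formedness only.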
Referencing Elements Using IDs/IDrefs
    <family>
      <person id="jane" mother="mary" father="john">
        <name> Jane Doe </name>
      </person>
      <person id="john" children="jane jack">
        <name> John Doe </name>
        <mother/>
      </person>
      <person id="mary" children="jane jack">
        <name> Mary Doe </name>
      </person>
      <person id="jack" mother="mary" father="john">
        <name> Jack Doe </name>
      </person>
    </family>
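One way to follow such references in a program is to index every element by its id and then look up each referenced value. The sketch below does this in Python. It assumes that id plays the role of an ID attribute and that mother, father, and children are IDREF/IDREFS attributes; without a DTD the parser sees them as plain strings. The helper names deref and name_of are made up for the example.

```python
import xml.etree.ElementTree as ET

family = ET.fromstring("""
<family>
  <person id="jane" mother="mary" father="john"><name> Jane Doe </name></person>
  <person id="john" children="jane jack"><name> John Doe </name><mother/></person>
  <person id="mary" children="jane jack"><name> Mary Doe </name></person>
  <person id="jack" mother="mary" father="john"><name> Jack Doe </name></person>
</family>
""")

# Index every person by its id so references can be resolved in one lookup.
by_id = {p.get("id"): p for p in family.iter("person")}

def deref(elem, attr):
    """Follow an IDREF or IDREFS attribute; return the referenced elements."""
    return [by_id[ref] for ref in elem.get(attr, "").split()]

def name_of(person) -> str:
    return person.findtext("name").strip()

mother_of_jane = [name_of(p) for p in deref(by_id["jane"], "mother")]
children_of_john = [name_of(p) for p in deref(by_id["john"], "children")]

print(mother_of_jane)    # ['Mary Doe']
print(children_of_john)  # ['Jane Doe', 'Jack Doe']
```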
OODB Schema
    class Movie ( extent Movies, key title )
    {
      attribute string title;
      attribute string director;
      relationship set<Actor> casts inverse Actor::acted_In;
      attribute int budget;
    };

    class Actor ( extent Actors, key name )
    {
      attribute string name;
      relationship set<Movie> acted_In inverse Movie::casts;
      attribute int age;
      attribute set<string> directed;
    };
In XML …
    <db>
      <movie id="m1">
        <title>Waking Ned Divine</title>
        <director>Kirk Jones III</director>
        <cast idrefs="a1 a3"></cast>
        <budget>100,000</budget>
      </movie>
      <movie id="m2">
        <title>Dragonheart</title>
        <director>Rob Cohen</director>
        <cast idrefs="a2 a9 a21"></cast>
        <budget>110,000</budget>
      </movie>
      <movie id="m3">
        <title>Moondance</title>
        <director>Dagmar Hirtz</director>
        <cast idrefs="a1 a8"></cast>
        <budget>90,000</budget>
      </movie>
      <actor id="a1">
        <name>David Kelly</name>
        <acted_In idrefs="m1 m3 m78"></acted_In>
      </actor>
      <actor id="a2">
        <name>Sean Connery</name>
        <acted_In idrefs="m2 m9 m11"></acted_In>
        <age>68</age>
      </actor>
      <actor id="a3">
        <name>Ian Bannen</name>
        <acted_In idrefs="m1 m35"></acted_In>
      </actor>
      :
    </db>
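The inverse relationship declared in the schema (casts versus acted_In) is not enforced by XML itself, so an application has to check it. Below is a sketch in Python over a trimmed copy of the fragment (titles, names, and references only). The function names are assumptions, and references to elements that fall in the elided part of the document are simply reported as unresolved.

```python
import xml.etree.ElementTree as ET

db = ET.fromstring("""
<db>
  <movie id="m1"><title>Waking Ned Divine</title><cast idrefs="a1 a3"/></movie>
  <movie id="m2"><title>Dragonheart</title><cast idrefs="a2 a9 a21"/></movie>
  <movie id="m3"><title>Moondance</title><cast idrefs="a1 a8"/></movie>
  <actor id="a1"><name>David Kelly</name><acted_In idrefs="m1 m3 m78"/></actor>
  <actor id="a2"><name>Sean Connery</name><acted_In idrefs="m2 m9 m11"/></actor>
  <actor id="a3"><name>Ian Bannen</name><acted_In idrefs="m1 m35"/></actor>
</db>
""")

movies = {m.get("id"): m for m in db.findall("movie")}
actors = {a.get("id"): a for a in db.findall("actor")}

def refs(elem, child_tag):
    """Set of ids listed in the idrefs attribute of the given child element."""
    child = elem.find(child_tag)
    return set(child.get("idrefs", "").split()) if child is not None else set()

def cast_names(movie_id):
    """Names of the cast members that are present in this fragment."""
    return sorted(
        actors[a].findtext("name")
        for a in refs(movies[movie_id], "cast")
        if a in actors
    )

def inverse_violations():
    """(movie, actor) pairs where cast lists the actor but acted_In omits the movie."""
    return sorted(
        (m_id, a_id)
        for m_id, movie in movies.items()
        for a_id in refs(movie, "cast")
        if a_id in actors and m_id not in refs(actors[a_id], "acted_In")
    )

# Cast references whose target actor is not in this fragment.
unresolved = sorted(
    a for m in movies.values() for a in refs(m, "cast") if a not in actors
)

print(cast_names("m1"))
print(inverse_violations())
print(unresolved)
```

Within this fragment every resolvable cast reference has a matching acted_In entry, so the violation list comes back empty; the unresolved ids point into the part of the document that the slide elides.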