IMS5401 Web-based Systems Development • Topic 2: Elements of the Web • (g) Metadata and meaning (the semantic web) • (h) Interactivity
Agenda • Topic 2 (g): The Semantic web: Rationale • Key elements of the semantic web • Implications for system developers • Topic 2 (h) Web interactivity • Types of web interactivity • Implications for web developers
Elements of the Web • THE WEB is made up of: • Connecting computers • Digital representation of documents • Linking documents • Display and organisation of documents
Topic 2(g): The Semantic Web: Rationale • The imprecision/variability of language • The importance of context in human communication and information • The inadequacy of computers in dealing with human communication • Topic-based searches in search engines • Attempted solutions and their failings • Metadata • “Smart (er)” search software • Semantics = the study of meaning; hence the semantic web
Including semantics in the web • Existing web elements: lots of information content, but little semantic content (HTML) • Therefore the user must interpret context and meaning to identify relevant pages; (problems for computers) • Want to include the meaning of information as well as the information • Computer-based searches of the web must be able to match the meaning of the query with the meaning of the web information
Changing the web • Should we change the web or build a second version? • Existing web works and has too much existing content to be re-built • Therefore, continue to use existing web as the method for storing and accessing documents • Existing web is poorly-designed for precise searching and data-centred transactions • Therefore, build semantic web separately from it • (Can this approach work?)
The web as a database • Databases: limited/defined scope; controlled; structured; precisely-defined data; structured queries (SQL) • The web: unlimited; uncontrolled; unstructured; poorly-defined (undefined?) data; unstructured queries (Google, etc)
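The contrast between structured and unstructured queries can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table name, column names and sample rows are invented, and the keyword match is a crude stand-in for a search engine, not a description of how Google works.

```python
import sqlite3

# Structured side: a table with precisely defined columns (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (title TEXT, subject TEXT)")
conn.executemany(
    "INSERT INTO page VALUES (?, ?)",
    [
        ("Java in Practice", "programming"),
        ("Coffee from Java", "agriculture"),
    ],
)
# The query states exactly what is wanted: pages whose subject is programming.
sql_hits = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM page WHERE subject = ?", ("programming",)
    )
]

# Unstructured side: free text with no declared meaning (hypothetical pages).
pages = [
    "Java in Practice: a guide to the programming language",
    "Coffee from Java: growing conditions on the island",
]
# A keyword match cannot tell the language from the island.
keyword_hits = [p for p in pages if "java" in p.lower()]

print(sql_hits)
print(len(keyword_hits))
```

The structured query returns only the programming page, while the keyword match returns both pages, because the word "Java" carries no declared meaning in plain text.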
2. Key elements of the semantic web • The semantic web concept has many theoretical elements. Key ones include: • Metadata, HTML and XML • XML and XML Schema • Resource Description Framework (RDF) and RDF Schema • Ontologies and OWL • Intelligent agents and computerised searches • Digital signatures • Level of theoretical and practical development of these elements is very variable
Metadata, HTML and XML • Remember the limitations of HTML for storing metadata? • XML (eXtensible Mark-up Language) gives improved mark-up capabilities to enable data in documents to be tagged to incorporate meaning • XML enables all page elements to be tagged • XML enables web page creators to define their own tags to define page elements which don’t fit ‘normal’ tags
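As a rough illustration of author-defined tags, the sketch below parses a small XML fragment with Python's standard `xml.etree.ElementTree` module. The tag names (`unit`, `code`, `title`, `lecturer`) are invented for this example and do not come from any standard vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: these tag names are made up for illustration.
DOCUMENT = """
<unit>
  <code>IMS5401</code>
  <title>Web-based Systems Development</title>
  <lecturer>Martin</lecturer>
</unit>
""".strip()

def read_unit(xml_text):
    """Return the tagged fields of a <unit> element as a dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

unit = read_unit(DOCUMENT)
print(unit["code"], "-", unit["title"])
```

Because each value sits inside a tag that names what it is, a program can ask for the lecturer or the unit code directly instead of guessing from the page layout.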
XML and XML Schema • XML tags to define meaning are not enough for true database functionality; need uniformity in document structures • XML Schema is a language for defining standard structures for XML documents • Document structures were originally defined in XML with document type definitions (DTDs), which were inherited from SGML • DTDs are generally too limited; hence other schema languages like XML Schema • A document is associated with a schema and can be checked for validity against it
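The idea of checking a document against an agreed structure can be sketched as follows. Note the assumption: this is a hand-written check in plain Python, standing in for real XML Schema validation (which Python's standard library does not provide), and the required structure is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule: a <unit> must contain exactly these children, in this order.
REQUIRED_CHILDREN = ["code", "title", "lecturer"]

def is_valid_unit(xml_text):
    """Return True if the text is well-formed XML with the required structure."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well-formed XML
    child_tags = [child.tag for child in root]
    return root.tag == "unit" and child_tags == REQUIRED_CHILDREN

good = "<unit><code>IMS5401</code><title>Web</title><lecturer>Martin</lecturer></unit>"
bad = "<unit><title>Web</title></unit>"
print(is_valid_unit(good), is_valid_unit(bad))
```

A real schema language expresses rules like `REQUIRED_CHILDREN` in a separate schema document, so that any validating tool can apply them without custom code.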
RDF and RDF Schema • RDF is a data modelling language used to identify specific types of objects and relationships between them in a form which can be represented in an XML document • RDF uses Uniform Resource Identifiers (URIs) to identify objects and relationships between them • eg <Martin> <created> <slides> can be represented as a set of URIs • RDF Schema defines object structures which enable checking that a given RDF description uses valid objects and relationships
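The <Martin> <created> <slides> statement can be sketched as a subject-predicate-object triple of URIs. This is a minimal model in plain Python rather than a real RDF library, and every `example.org` URI below is a placeholder invented for the illustration.

```python
# Hypothetical URIs: example.org addresses are placeholders only.
MARTIN = "http://example.org/people/Martin"
CREATED = "http://example.org/terms/created"
SLIDES = "http://example.org/docs/slides"
TUTE_NOTES = "http://example.org/docs/tute-notes"

# Each statement is a (subject, predicate, object) triple.
triples = [
    (MARTIN, CREATED, SLIDES),
    (MARTIN, CREATED, TUTE_NOTES),
]

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every part that is not None."""
    return [
        t
        for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]

# "What did Martin create?" becomes a pattern with the object left open.
created_by_martin = [o for (_, _, o) in match(triples, MARTIN, CREATED)]
print(created_by_martin)
```

Because subjects, relationships and objects are all identified by URIs, two documents that use the same URI are talking about the same thing, which is what lets statements from different sources be combined.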
Ontologies and OWL • An ontology defines what things (objects) can exist in the world • Create ontologies which provide standard terminologies for objects and their properties • Ontologies can be cross-referenced from XML documents to relate objects being described in different documents • OWL: Web Ontology Language for creating an ontology • Dublin Core (a standard set of metadata terms) = one simple example of an ontology
“Intelligent” agents and computerised searches • Semantic web envisages a high level of automated (machine-based) searches for objects • Agents: software which can do the searching, checking and retrieval of data objects on the semantic web • Agents can be user-driven (function on command) or built into physical devices (function automatically when triggered by some signal)
Digital signatures • How do we (or our agents) decide on the reliability of the information we get from the semantic web (ie who provided it)? • Digital signatures can be attached to documents to verify who created them • Agents can check the digital signature of the information they get and verify with the user that this is a satisfactory source • Digital signatures rely on encryption (they are created with a key which only the signer holds), which protects against forgery and fraud
Others • Lots and lots of variants of these plus different forms of implementation of them • Heavy overlap with the artificial intelligence community. Replace search for intelligent machines with search for means for embedding meaning into documents, so they can be accessed ‘intelligently’ by dumb machines • See tute material for a bewildering variety of initiatives/ideas/products
3. Implications for Systems Developers • Attempts to structure and classify knowledge have gone on for thousands of years. Level of success? … depends on your point of view! • Developers of web metadata and the semantic web are trying to repeat this work on an even larger range of media and information types • The anarchic nature of the web means that the standards people can work only by persuasion • Who will be willing to conform, and for what purposes?
Using the web • Can the web be made to function as: • A network of documents? • A network of indexed and classified documents? • A network of linked data-containing objects? • A network of “intelligent” data-containing objects structured to work like a giant database? • How achievable is each of these possibilities? • What does this mean for web usage and applications? • What does it mean for web development?
4. Topic 2 (h): Web Interactivity • We are used to computer systems which are designed to be interactive: • prompt us with options about what is possible • accept and store input from us as users • provide output as required • Eg automatic teller machine, reservation system, etc • System has a programmed interface which changes in response to our input, and which allows us to enter queries or data for storage • How well can the web support interactivity?
Interactivity in a ‘normal’ system • User: makes a query or provides input to the database • Interface: prompts the user for input (a query or new data for the database) and displays output • Database: receives the new data or query details and returns the response to the query
Programming and interactivity in ‘normal’ systems • User (client) creates connection to host machine (server) on which programme is running • Connection remains ‘live’ while user session is running • User is prompted by programme and provides input which the programme accepts and responds to • Programming languages enable an extremely wide range of types of user input and machine response
5. Web interactivity • Ideal model of dynamic web site content and user input: • Database to Web Page: up-dated database content to display • Web Page to User: prompt for input • User to Web Page: user input • Web Page to Database: user input to up-date database
Actual interactivity in web-based systems • The web has several features in its design which limit its ability to handle interactivity • It is designed around pages, not data • It is based on static, not dynamic page content • HTTP establishes no on-going connection between the client machine and the server machine. Therefore, requests for information by the client and responses and information from the server are passed separately and independently from one another • Interactivity is much harder to achieve - restricted and more messy
Web Interactivity: Two types • Client-side interactivity: designed to change the appearance or behaviour of the web page in response to the user’s input (ie left-hand side of earlier diagram) • Server-side interactivity: designed to enable a user’s input to be taken by web page and sent back to the server web site to be used as input to another program running on the server (ie right hand side of earlier diagram) • In both cases the interaction is done via scripts (a form of programming language)
5(a) Web interactivity: Client-side • Used to modify the way in which a page is displayed to the user • The script which does the work is embedded in the web page HTML • A variety of HTML tags allow you to use scripts in various ways to get different effects • Interface features commonly created using scripts include mouse roll-overs, animation, creation of dialogue boxes to accept user input and modify the way a page or its elements is displayed, etc
Scripting languages for client-side interactivity • The original was JavaScript (note, nothing to do with the Java programming language!) • ECMAScript is the Ecma International ‘standard’ for scripting, based around JavaScript • The Java programming language can be used to create applets, which are small programs embedded in the web page HTML - similar effect to JavaScript • JavaScript can be run by the browser itself; Java applets need the Java plug-in to run them
More fun with the marketplace • Netscape developed the first version of JavaScript for the Netscape browser • Microsoft followed soon afterwards with their own version, called JScript, for the IE browser • The two languages are similar, but not identical to each other NOR to the ECMAScript ‘standard’! • More problems for web page developers who want their page to work the same in any browser!
5(b) Web interactivity: Server-side • Used to enable a user to enter information on a web page and send it as input to a program running on the server (eg a database program) • Any web system must use a server-side script to get the user input to the main program and send a reply. For example: • a system which stores transaction details (order, payment, etc) in a database on a server; • any web interface set up to enable a user to interact with a system (eg a web-based user query of the Monash library catalogue)
Server-side scripts and CGI (Common Gateway Interface) • CGI is a standard interface which lets the web server pass data between a web page and a program running on the server • CGI scripts run on the server; activated by user input to a web page form • A CGI script can provide web form input data to another program running on the server (eg a database) • It can also accept data from a program on the server and provide it to the web page • CGI scripts can be written in many scripting languages - Perl, etc
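The exchange described above can be sketched as a small CGI-style script in Python. Under CGI, the web server hands the fields of a submitted GET form to the script in the `QUERY_STRING` environment variable, and the script writes headers, a blank line, and then the page. The form field name `title` and the wording of the reply are assumptions made for this example.

```python
import html
import os
from urllib.parse import parse_qs

def respond(environ):
    """Build a CGI response from the QUERY_STRING passed by the web server."""
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    # parse_qs maps each field name to a list of values; take the first.
    title = fields.get("title", ["(none)"])[0]
    # Escape user input before placing it in HTML.
    body = "<p>You searched the catalogue for: {}</p>".format(html.escape(title))
    # A CGI response is headers, then a blank line, then the document.
    return "Content-Type: text/html\n\n" + body

if __name__ == "__main__":
    # When run by a web server, the real environment supplies QUERY_STRING.
    print(respond(os.environ))
```

In a real system the script would pass `title` on to the catalogue database and format the rows it gets back; here the reply simply echoes the input so the data flow stays visible.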
Cookies: A way of achieving server-side interactivity • A cookie is a small piece of text data which a server sends you with the web page you requested • The cookie is stored by your browser; sometimes temporarily and sometimes permanently (or until you remove it) • The cookie stores information about the user computer or user input to the web page • When the browser asks for another page from that server, information in the cookie is sent to the server along with the request for the page
Using cookies: A simple example • When you buy something from Amazon.com: • Amazon puts a cookie on your machine to identify you • Amazon records your details, your purchase and your cookie details in its sales database • Next time you ask your browser to get you the Amazon web site: • the browser sends your customer number from the cookie back to a CGI script on the Amazon server • The CGI script looks you up in the customer database and uses your record to personalise the page it sends you
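The round trip in this example can be sketched with Python's standard `http.cookies` module. The cookie name `customer` and the customer number are invented for illustration; nothing here is taken from how Amazon actually implements its site.

```python
from http.cookies import SimpleCookie

def set_cookie_header(customer_id):
    """Server side, first visit: build the header that stores the cookie."""
    cookie = SimpleCookie()
    cookie["customer"] = customer_id   # hypothetical cookie name
    cookie["customer"]["path"] = "/"   # send it back for every page on the site
    return cookie.output(header="Set-Cookie:")

def read_customer(cookie_header):
    """Server side, later visit: recover the customer number, if any."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("customer")
    return morsel.value if morsel is not None else None

# First response: the server asks the browser to remember the customer.
print(set_cookie_header("12345"))
# Later request: the browser sends the stored value back in its Cookie header.
print(read_customer("customer=12345"))
```

Each HTTP request is independent, so the cookie is what lets the server recognise that the second request comes from the same customer as the first.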
6. Implications for web developers (1) • Building either client-side or server-side interactivity needs programming skills to write the scripts • Programming languages are not well-structured to deal with web interactivity needs • A range of scripting languages have been developed and new scripting capabilities built into HTML, new plug-ins for browsers, etc • Now we have many different options for writing scripts - Microsoft’s Active Server Pages, PHP, Perl, Python, etc etc …. Very confusing!
Implications for web developers (2) • As with writing HTML, products have also been developed to generate scripts for database interactivity, so you don’t have to program • However, these products apparently generate code like HTML generators generate HTML - yes, it’s there, but does it work? … and with what browsers? • Therefore, texts still recommend you write it yourself, or at least patch it yourself
Summary • The web is not well designed for dealing with interactivity • It’s messy, inefficient and limited in flexibility (compared to ‘normal’ system connectivity and interactivity) • It requires significant technical skills (programming, etc) which are not easily learned (unlike HTML, for example) • Products which make it easier are coming, but they still fall well short of making it easy • Implications for web systems?