Dive into Java, XML, and distributed computing concepts to design and build distributed systems. Practical issues, book recommendations, and grading details are included.
Technologies for an Information Age: Building the Distributed Object Grid with Java and XML (.opennet) Fall Semester 2001 MW 5:00 pm - 6:20 pm CENTRAL (not Indiana) Time Geoffrey Fox and Bryan Carpenter PTLIU Laboratory for Community Grids Computer Science, Informatics, Physics Indiana University Bloomington IN 47404 gcf@indiana.edu uri="gxos://ptliu/communitygrid/courses/it1" title="OpenNetTechnologies"
Abstract of PTLIU Fall 2001 Introductory Lecture • This Foilset contains introductory material on PTLIU Course IT1 for Fall 2001 • Some Aspects of Course Logistics -- all students must go to web site for complete discussion of this • http://aspen.csit.fsu.edu/ptliu (Temporary) • We give an overview of material covered in the course • The Internet is the most important and by far the largest distributed computer system and it has spawned the most remarkable and general purpose software ever seen • So in studying the Internet, we study distributed computing (hardware and software) • After this course, students should be able to design and build any distributed system • There is not enough time to give a huge amount of programming experience • We will give a summary of Base Distributed Object Web and Internet Technologies
Leave Now Unless …… • You are practically minded and wish to learn how to write real software to solve real distributed systems • Your software should work and be documented! • We will (depending on survey of students) cover enough Java to be able to build systems but focus will be XML and overall system architecture • You need to be experienced enough in programming to “cope” as tools for server side are quite sophisticated • You must be able to tolerate initial confusion as I am at a new institution and the technology is new
Practical Issues: Books • Inside XML, by Steven Holzner, New Riders Publishing; ISBN: 0735710201, November 2000 (4.5 star, #1979 Amazon) • This book is slightly out of date as some key concepts (Schema) were not finalized when book went to press • There are some 280 XML books at amazon.com – 25% are “not yet published” • Core Java 2 is one of the best Java book(s) – chosen from the 1670 available at Amazon. Others are also excellent (e.g. Java How to Program by Paul J. Deitel, Harvey M. Deitel) • Volume 1-Fundamentals (4.5 star, #684 Amazon) • Volume 2-Advanced Features (3.5 star, #3134 Amazon) • The Sun Microsystems Press Java Series (Prentice Hall) • Cay S. Horstmann and Gary Cornell • Vol 1: ISBN: 0130894680 5th Edition December 2000 • Vol 2: ISBN: 0130819344 4th Edition December 1999
More Books • Other specialized books cover JavaScript, Dynamic HTML, Enterprise Javabeans and J2ME (Java 2 Microedition for PDA’s) – for some of this, can use Web • JavaScript Bible, 4th Edition, Gold Edition, Danny Goodman, Hungry Minds, Inc; ISBN: 0764547186, January 2001 (4 star, #19,621 Amazon) • Dynamic Html : The Definitive Reference by Danny Goodman, O'Reilly & Associates; ISBN: 1565924940, August 1998 (4.5 star, #1190 Amazon) • Enterprise Javabeans by Richard Monson-Haefel, O'Reilly & Associates; ISBN: 1565928695, March 2000 (4.5 star, #889 Amazon) • Java 2 Micro Edition (Professional Developer's Guide Series) by Eric Giguere, John Wiley & Sons; ISBN: 0471390658, November 2000 (4.5 star, #5488 Amazon)
Grading and Support • Course Assistant is Xi Rao, Computational Science and Information Technology • We will use a web-linked database (built by previous students of this class sequence at Syracuse using technologies you are learning) • Grade will be based on about 7 homework sets. The first of these will be a report and the last a major project which will be larger than for IT1. The rest will involve various practical activities in XML and Java
Overview of .opennet Technologies Course - I • We will NOT discuss the beat up client side (in Microsoft-Netscape battle won by MSFT) – Applets, Dynamic HTML and JavaScript (good ideas albeit victims of the battle) • Course could be useful even if you know Java – we will discuss topics like • Servlets – Simple way of building Java Server side applications • RMI – Foundation of pure Java distributed objects and systems built in these terms • JDBC (Java Database Connectivity) – Universal interface between Java and databases • Java Server Pages (how to build client software if you sell servers and don’t like MSFT) • Enterprise Javabeans: building blocks of middle tier software
Overview of .opennet Technologies Course - II • We will start by discussing XML and some exemplar applications such as RDF, SMIL, WSDL and SVG • We will discuss Webs and Grids • The four approaches to the Object Grid • CORBA from the Object Management Group • SOAP (Simple Object Access Protocol) from W3C – the pure web approach • RMI, Enterprise Javabeans (EJB) and Jini – the pure Java approach • COM from Microsoft • We will build on discussion of XML as a technology to show how it defines Objects and then how you make these objects useful by manipulating them with Java and accessing them with portals
Distributed Objects • Anything with a digital signature is an object • Examples of current object technologies • Documents -- URL • "General Programs including database invocations" • Old Style Web -- CGI • New Style Web -- SOAP makes server side objects look like HTML tags as far as invocation goes • CORBA and COM -- special "interface definition language" (IDL) defines invocation in C++ like syntax • RMI uses Java language as IDL language • Benefits of distributed objects • allows objects written in different languages to communicate seamlessly via standardized messaging protocols embodied by middleware. • Higher levels of transparency of interoperability • Objects can be “managers” of resources like telescopes, satellites, computers, databases, medical devices …. • provides flexible grain of decomposition for building complex systems
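The point that RMI uses the Java language itself as its IDL can be illustrated with a minimal sketch. The `Telescope` interface, its `pointAt` method and the `LocalTelescope` class are hypothetical names chosen to echo the "managers of resources like telescopes" bullet; only `java.rmi.Remote` and `java.rmi.RemoteException` come from the standard library. Exporting the object through an RMI registry is deliberately left out.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

public class RmiInterfaceSketch {

    /**
     * Hypothetical remote interface. With RMI the "IDL" is an ordinary Java
     * interface that extends Remote and whose methods declare RemoteException.
     */
    public interface Telescope extends Remote {
        String pointAt(double rightAscension, double declination) throws RemoteException;
    }

    /** Plain local implementation; registry export is omitted in this sketch. */
    public static class LocalTelescope implements Telescope {
        @Override
        public String pointAt(double rightAscension, double declination) {
            return "pointing at RA=" + rightAscension + " Dec=" + declination;
        }
    }

    public static void main(String[] args) throws RemoteException {
        // A client programs against the interface, not the implementation.
        Telescope telescope = new LocalTelescope();
        System.out.println(telescope.pointAt(5.5, -5.4));
    }
}
```

A CORBA or COM version of the same contract would be written in a separate IDL file and compiled into language bindings; here the interface declaration plays that role directly.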
Today’s Distributed Object Web: The Confusing Multi-Technology Real World [Diagram with three tiers: Clients; Middle Layer (Server Tier), labeled Middleware Server Layer; Third Backend Tier. Legend: W is Web Server; PD Parallel Database; DC Distributed Computer; PC Parallel Computer; O Object Broker; N Network Server e.g. Netsolve; T Collaboratory Server]
Multi-Tier Client Server Service [Diagram labels: Client Tier; Middle Tier Servers (Object Broker, Web Server, Specialized Java Server; Javabean, Enterprise Javabean); protocols shown: IIOP, HTTP, RMI (IIOP) or Custom; Back-end Tier Services (Relational Database, Object Store, Old and New Useful Backend Systems)]
Distributed Object Web Approach • Need to use mix of approaches -- choosing what is good and what will last • For example develop Web-based databases with Java objects using standard JDBC (Java Database Connectivity) interfaces • Oracle, DB2, Informix, Sybase, Lotus Notes, Object database choice becomes an issue of performance/robustness NOT functionality • Use CORBA (C++) or Java as software to wrap existing applications with XML as syntax to define these distributed objects • Note Middle tier insulates client from backend -- can use one object model for user level (object functionality) and different one for backend (object access and persistent store) • specialized object databases getting “overwhelmed” by multi-tier approach with Oracle etc. traditional backends • Write Software in Java but define data and interfaces in XML
3-Tier Architecture and Different Object Models • There are several important Object Models: COM, CORBA, Java, Web, Oracle Database …… • But it doesn’t matter!! [Diagram labels: Object Repository; Database; XML File System (Web Site); Request Or Export/Import Information; Middle Tier “Business Logic” dissociates User and Back End]
Emerging Object Grid Service Model [Diagram labels: Clients and their servers; Middle Tier Services hosted on Web servers; Back End Servers and their services]
GEM Portal Architecture [Diagram labels: Geophysical “Web” Info; Seismic Sensors; Field Data; General “Web” Info; Databases; (HPCC) Computers; (Java) Interactive Analysis Client; Visualization; Backend Services; Middleware; Bunch of Web Servers and Object Brokers; Collaboration; Security; Lookup; Registration; Agents/Brokers; Application Integration; Visualization Server; Seamless Access; Clients]
Computational Science Portal: The Computing Service [Diagram labels: Multidisciplinary Control (WebFlow); Portal Control; Parallel DB Proxy; Database; NEOS Control Optimization; Optimization Service; Origin 2000 Proxy; MPP; NetSolve Linear Alg. Server; Matrix Solver; Agent-based Choice of Compute Engine; IBM SP2 Proxy; Data Analysis Server; MPP]
Services in Computing Portals • Security • Fault Tolerance • Object Lookup and Registration • Object Persistence and Database support (as in EIP’s) • Event and Transaction Services • Collaboration among scientists around world • Job Status as in HotPage (NPACI) and myGrid (NCSA) • File Services (as in NPACI Storage Resource Broker) • Support (XML based) computational science specific metadata like MathML, XSIL • Visualization • Programming • Application Integration (chaining services viewed as backend compute filters) • “Seamless Access” and integration of resources between different users/application domains • Parameter Specification Service (get data from Web form into Fortran program wrapped as backend object) AnyPortal
What is a Web Client I? • Originally we thought of Web Systems as a set of communicating objects with • Not much on client linking to UNIX processes invoked by CGI • Then we excitedly got balanced client server applications with JavaScript and Java applets on client which was faster as no network traffic for “small” local actions • Servlets, Enterprise Javabeans and CORBA provided robust middle tier programming model • But browsers never became a good programming environment as actions (say of JavaScript) undefined or quality (of Java virtual machine in browser) poor. • So browsers are just display technology and one should use servers or applications for software • HTML SVG XHTML WML are used to define what client is to display
What is a Web Client II? • Gilder’s law of the Telecosm (September 2000, Free Press; ISBN: 0684809303, #3557 in Amazon Sales) says network bandwidth is improving 3 times faster than CPU performance • One can make dynamic clients with either client side JavaScript (or equivalent) or with server side Java Server Pages (JSP) • JSP provides similar functionality to Java Applets with Java running outside browser in a nice robust server • This is the old way we built applications done with faster networks and more elegant implementation (we used to invoke Perl CGI scripts to provide dynamic web pages but this was too slow) • Gilder’s law supports JSP approach
Palm Tops help define Client Model • There is growing interest in wireless portable displays in the confluence of cell phone and personal digital assistant markets • By 2005, 60 million internet ready cell phones sold each year • 65% of all Broadband Internet accesses via non desktop appliances • One needs to design web systems so they can be accessed from either a PDA or a PC or a Powerwall • This implies that only code in browser should be that immediately needed to relay events between user and web system – all “logic” (state) should be outside browser.
Web Technologies in a Nutshell -- Java • Java -- Object Oriented version of C/C++ supporting Interactive Distributed Computing. • Original Web architecture (e.g. CGI) was server-side. Java allowed design and implementation of balanced Client Server Applications but this original motivation is less important now • Java likely to be a dominant software engineering and Scientific Computing language -- see http://www.javagrande.org • This course discusses Java as a language in context of a system building tool • Java will probably be preferred language for development of next generation general or custom Web servers and clients • Programmers more productive in Java • Java has frameworks (libraries) for key Internet functionalities • Java can build client side customized GUI's and graphics/image processing but Microsoft JavaScript and DHTML compete here and MOST Industry use of Java is in middle tier • New Java 2 has several enhancements including very many specialized API’s • Javabeans are (visual) component model for Java applications • Enterprise Javabeans are Java middleware containers • Jini and RMI allow distributed objects to be found and communicate
Web Technologies in a Nutshell - JavaScript • JavaScript -- only superficially related to Java and was called LiveScript -- is Netscape's (somewhat supported by Microsoft) fully interpreted Client side extension of HTML. This is a good Client Window integration/customization technology where flexibility more important than performance • i.e. use JavaScript for Rapid Prototyping of Complex User Interfaces • First examples use JavaScript together with frames (HTML extension) for interactive multi-window technologies • JavaScript is roughly equivalent to "Abstract Windowing Toolkit/ Layout Manager" in Java but applied to Browser Frames and not Java windows • JavaScript cannot build complex filters or simulations as it is slow • But JavaScript with dynamic HTML is powerful client technology which is often easier and faster than Java -- it is faster as it invokes optimized browser functions • both Internet Explorer 4 and Netscape have excellent JavaScript support • Server side version of JavaScript called LiveWire runs on Netscape Servers -- unsuccessful • Originally expected client side use of JavaScript to grow in importance but new view of Web clients limits use of JavaScript to small critical event handling • JavaScript on Palmtops called WMLScript
Web Technologies in a Nutshell - DHTML • There is an emerging DOM or Document Object Model which will be uniform model used by W3C, Netscape, Microsoft • It allows you to address individual components of a page e.g. text box, image or collections thereof as separate entities • DOM is quite close to IE 5 conventions and is based on XML • DOM ought to be critical for publishing industry – Microsoft Word does not use it except implicitly in Web export • Cascading Style Sheets allow more powerful ways of assigning properties (such as color, fonts etc.) to these components using either name (id) or type (<h2> tag etc.) • DHTML or dynamic HTML allows one to address the components of document and change on the fly (without reloading page) the properties of these components • This includes not only natural style properties but also position, size and “visibility” • DHTML currently handicapped by major differences between IE5 and Netscape 4 -- functionalities are similar but syntax very different • JavaScript combined with DHTML allows animations, graphs and replacement of just parts of text
Web Technologies in a Nutshell - XML • HTML is powerful but does not separate display and form (structure of document component as an object) • XML is a generalization of HTML which allows definition of arbitrary tags • e.g. <student name="Jane Doe" class="CSIT:IT1" grade="…">Working Hard</student> is more elegant way of capturing information in a reliable fashion than HTML • <h2>Students</h2><ul><li>Jane Doe: Working Hard</li><ul> <li>Class: IT1</li> <li>Grade: …</li> …. </ul></ul> with a PERL program to extract data • XML allows powerful way of defining dynamic ascii databases useful for “modest size data” such as people, document citations etc. • XML parsers map XML tags into HTML for display or hand to programs to interpret • XML can also be used to define extensions to HTML such as special tags for mathematics (MathML) or chemistry (CML) or ….. • XML defines syntax for “serializing” Web objects and transmitting between clients and servers (as in SOAP)
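To make the contrast with scraping HTML concrete, here is a minimal sketch of reading the student record with the standard Java DOM parser (JAXP). The class name `StudentRecordSketch` and its helper are invented for illustration, and the `grade` attribute is left out because the slide elides its value.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class StudentRecordSketch {

    // Record modeled on the slide's example; the elided grade attribute is omitted.
    public static final String XML =
        "<student name=\"Jane Doe\" class=\"CSIT:IT1\">Working Hard</student>";

    /** Parses XML text with the JAXP DOM parser and returns the root element. */
    public static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element student = parseRoot(XML);
        // Fields are addressed by name; no pattern matching on display markup is needed.
        System.out.println(student.getAttribute("name"));
        System.out.println(student.getAttribute("class"));
        System.out.println(student.getTextContent());
    }
}
```

The same data embedded in `<h2>`/`<li>` markup would need a text-extraction program tuned to that particular page layout, which is the fragility the slide is pointing at.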
Example from Special Edition Using XML • HTML Version of Sales Sheet • <dl> • <!-- Fruit --> <dt>Apples</dt> • <!-- Price --> <dd> $1</dd> • <!-- Fruit --> <dt> Oranges </dt> • <!-- Price --> <dd> $2 </dd> • </dl> • XML Version of Price list • <FruitPriceList> • <fruit><fruitname>Apples</fruitname> • <Price> $1</Price> </fruit> • <fruit><fruitname>Oranges</fruitname> • <Price> $2</Price> </fruit> • </FruitPriceList>
Example from Special Edition Using XML • <bottle> • <top> type 3 childsafe </top> • <body><body-type> 100 count plastic </body-type> • <contents> <count> 100 </count> • <content-type> aspirin </content-type> • </contents></body> • <labeling> • <frontlabel> XYZ brand generic </frontlabel> • <rearlabel> XYZ directions and warning </rearlabel> • </labeling> • </bottle>
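Because the bottle description is a tree, any value in it can be addressed by its path from the root. The sketch below does this with the XPath API that ships with the JDK (`javax.xml.xpath`); the class name `BottleXPathSketch` and the `select` helper are hypothetical names used only for this illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class BottleXPathSketch {

    // The bottle document from the slide, written as one Java string.
    public static final String BOTTLE =
        "<bottle>"
        + "<top> type 3 childsafe </top>"
        + "<body><body-type> 100 count plastic </body-type>"
        + "<contents> <count> 100 </count>"
        + "<content-type> aspirin </content-type>"
        + "</contents></body>"
        + "<labeling>"
        + "<frontlabel> XYZ brand generic </frontlabel>"
        + "<rearlabel> XYZ directions and warning </rearlabel>"
        + "</labeling>"
        + "</bottle>";

    /** Evaluates an XPath expression against the XML text and returns the trimmed string value. */
    public static String select(String xml, String path) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(path, new InputSource(new StringReader(xml))).trim();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath or XML: " + path, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(BOTTLE, "/bottle/body/contents/content-type"));
        System.out.println(select(BOTTLE, "/bottle/labeling/frontlabel"));
    }
}
```

Each path names the nesting explicitly, so a program can tell the count inside `<contents>` apart from the "100 count" text inside `<body-type>`.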
XML Topics • Syntax and Examples • Types of Tools Available • How to define well formed and Validated XML – DTD, Namespaces and Schema • Events in XML and HTML: JavaScript DHTML • XSL and CSS Style sheets including XPATH (how to specify location in XML document) • Parsing XML from Java and .. (SAX and DOM) • XLINK and XPOINTER – XML hyperlinks • Applications of XML: XHTML RDF WSDL SMIL SVG Dublin Core • Mapping XML to Java: Castor
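As a small taste of the "Parsing XML from Java" topic, the following sketch uses SAX, the event-driven parser, on a fruit price list of the kind used in this lecture. SAX calls back on each start tag, text run and end tag instead of building a tree in memory. The class name `FruitSaxSketch` and the `fruitNames` helper are hypothetical; the parser classes are the standard JAXP ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class FruitSaxSketch {

    public static final String PRICE_LIST =
        "<FruitPriceList>"
        + "<fruit><fruitname>Apples</fruitname><Price> $1</Price></fruit>"
        + "<fruit><fruitname>Oranges</fruitname><Price> $2</Price></fruit>"
        + "</FruitPriceList>";

    /** Collects the text of every fruitname element using SAX callbacks. */
    public static List<String> fruitNames(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a fruitname element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals("fruitname")) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("fruitname")) {
                    names.add(text.toString().trim());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(fruitNames(PRICE_LIST));
    }
}
```

SAX suits large documents read once from start to finish; DOM is usually more convenient when the program needs to navigate or modify the whole tree.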
Example XML Software Application Descriptor
<?xml version="1.0"?>
<!DOCTYPE application SYSTEM "ApplDescV2.dtd">
<application id="disloc">
  <target id="osprey4.npac.syr.edu">
    <status installed="Yes"/>
    <installed>
      <CmdLine command="/npac/home/webflow/GEM/JAY/dis2loc"/>
      <input>
        <inFile Path="/npac/home/webflow/GEM/JAY/" Name="disloc.output"/>
        <source Host="osprey4.npac.syr.edu" Path="/npac/home/Jigsaw/WWW/tmp" Name="disloc.out"/>
      </input>
      <output>
        <outFile Path="/npac/home/webflow/GEM/JAY/" Name="simplex.input"/>
        <dest Host="osprey4.npac.syr.edu" Path="/npac/home/webflow/GEM/JAY/simplex/" Name="s.in"/>
      </output>
      <stdout Host="aga.npac.syr.edu" Path="/npac/home/haupt/webflow/history/" Name="job2001.out"/>
      <stderr Host="aga.npac.syr.edu" Path="/tmp/" Name="haupt_job2001.err"/>
    </installed>
  </target>
</application>
XML Applications • XHTML: HTML “done correctly” in stricter XML Syntax • SMIL: Syntax to specify multimedia data including timing of “parallel” and “sequential” displays • MathML: Syntax to specify either content or presentation of Mathematics (TeX in XML) • SVG: 2D graphics (compare Java2D) • RDF: Specify Information resources • WML: Specify how to transmit information to Cell Phones or PDA’s • WSDL: Define Grid Services so they can be accessed uniformly • CML: Specify chemistry (e.g. molecules) • XSIL: Specify Scientific data • For instance this can be used as basis of X“weather” to specify data from sensors; X“seismic” for data from Seismic sensors etc.
The first Homework • Read Chapter 1 of Holzner’s book Inside XML • Browse http://www.w3.org/XML/ • Think about this question: • You are a librarian and wish to convert to an XML based card catalog and keyword/metadata system. • Your library has books but also a collection of 13th century art and a collection of MP3 audio files. • This is a set of tags and attributes such as the following example • <artifact type="mp3" sku="1110.223.334"> <author>Beatles</author> <daterecorded status="unknown" /> ... </artifact> • Discuss and suggest further elements and attributes
Basic Architecture • This is .opennet structure [Diagram labels: Database; (Virtual) XML Layer; Enterprise Javabeans; Java Servlet; Persistent Managed Store; Object layer; Virtual Machine; Control; Form; Output Page viewed by user]
Web Technologies in a Nutshell - PERL • PERL is a C like Interpreter with powerful direct access to UNIX system commands and very easy ways of processing text files • PERL is a relatively old technology which has been overtaken by Java tidal wave. • Still PERL has significantly better Systems and Document handling capability than Java • Very good for UNIX as much easier than Shell for system scripts -- PC versions exist but not so well integrated into O/S • Wonderful regular expression handling • PERL is traditional but not best choice for server CGI extensions and development of filters even for simpler cases involving text documents • PERL5 is object oriented but much less elegant (in my opinion) than Java • PERL5 has very useful multidimensional associative and regular arrays • Use PERL for UNIX batch jobs to edit text files (e.g. map www.npac.syr.edu to aspen.csit.fsu.edu) and quick simple Web server extensions – Convert latter to Java for production
Java Message Service JMS In a Nutshell • Supports MOM – Message Oriented Middleware supporting either • Point to point: One system sends a message to another system • Publish/Subscribe: There is a server with “labeled (by topic) queues” • A given queue could contain all messages on “Korean Recipes” • A provider sends messages to appropriate queue • Any number of subscribers register interest in topics with possible sophisticated “selectors” • When a relevant message is generated for a given topic, all subscribers are sent this message. They can do what they like with it
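The publish/subscribe pattern described above can be sketched in a few lines of plain Java. This is NOT the JMS API: `TopicBrokerSketch` and its methods are hypothetical names, everything runs inside one process, and "selectors" are not modeled. It only illustrates the idea of topic-labeled queues with any number of subscribers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class TopicBrokerSketch {

    // Maps each topic label to the subscribers that registered interest in it.
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    /** Registers a subscriber's interest in a topic. */
    public void subscribe(String topic, Consumer<String> subscriber) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(subscriber);
    }

    /** Sends the message to every subscriber of the topic; returns how many received it. */
    public int publish(String topic, String message) {
        List<Consumer<String>> interested = subscribers.getOrDefault(topic, new ArrayList<>());
        for (Consumer<String> subscriber : interested) {
            subscriber.accept(message); // each subscriber does what it likes with the message
        }
        return interested.size();
    }

    public static void main(String[] args) {
        TopicBrokerSketch broker = new TopicBrokerSketch();
        broker.subscribe("Korean Recipes", m -> System.out.println("cook received: " + m));
        broker.subscribe("Korean Recipes", m -> System.out.println("archive stored: " + m));
        // The message text is an invented example.
        broker.publish("Korean Recipes", "new recipe posted");
    }
}
```

In a real JMS deployment the broker is a separate server, messages survive subscriber outages depending on configuration, and the provider never needs to know who the subscribers are; only the last property is reflected in this sketch.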
Web Technologies in a Nutshell - Databases • The Web provides a convenient integration environment for "mature" technologies migrating from existing computer environments. • Object Relational databases are a good example where it is now straightforward in Microsoft Access, Oracle, DB2, Informix, Sybase etc. to provide a Web Interface to access and edit database with Java/JavaScript/Forms based Interfaces • Object databases such as Illustra also interfaced to Web but this is wrong way to think about problem • Systems such as Cold Fusion and Dreamweaver provide convenient high level interfaces to Web-linked databases • Note Web Authoring “confusion” is another result of unfortunate browser war lost by Netscape • Several excellent Java to Database packages becoming available with the JDBC standard based on ODBC -- more powerful but lower level than systems like Cold Fusion • CORBA will have good Web and Java Interfaces and in IT2 we will discuss integration of Web CORBA and database technologies • CORBA views a database as a managed persistent object
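A minimal sketch of what "lower level" means for JDBC: the program writes SQL itself and walks the result rows. The table `students` and its columns `name` and `course` are hypothetical, as is the class name `JdbcQuerySketch`; the `java.sql` types are the standard JDBC interfaces. Obtaining the `Connection` requires a vendor driver and URL, which are deliberately not shown.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class JdbcQuerySketch {

    // Hypothetical schema: a table "students" with columns "name" and "course".
    public static final String QUERY = "SELECT name FROM students WHERE course = ?";

    /**
     * Runs the query on an already-open connection. The same code works against
     * any database with a JDBC driver, since only java.sql interfaces are used.
     */
    public static List<String> studentsInCourse(Connection conn, String course) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement stmt = conn.prepareStatement(QUERY)) {
            stmt.setString(1, course); // bind the parameter rather than concatenating SQL text
            try (ResultSet rows = stmt.executeQuery()) {
                while (rows.next()) {
                    names.add(rows.getString("name"));
                }
            }
        }
        return names;
    }
}
```

Because the method depends only on the `Connection` interface, switching between Oracle, DB2 or another backend is a matter of which driver supplies the connection, which matches the point that database choice becomes a question of performance and robustness rather than functionality.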
Web Technologies in a Nutshell – VRML/SVG • VRML plays same role to 3D worlds that HTML does to documents • VRML 1.0 has been widely available and specifies static 3D scenes through which you can navigate. Already provides universal visualization environment and we have examples of use in Geographical Information Systems • Note can embed clickable URL's as with ImageMaps which can be used to annotate images to provide interactive resources • VRML 2.0 is now the standard with critical enhancements so that individual elements of 3D world are dynamic and can be programmed • It is designed to support full interactivity (televirtuality) with texture mapped video, avatars etc. • VRML 2.0 could require huge computing resources whether used as the virtual car-dealership / interactive gaming or more academic uses such as collaboration between teachers and students in 3D virtual classroom • Bandwidth and computing needs of VRML are handicapping acceptance and it appears that VRML will NOT “make it” -- replacement unclear • Microsoft ChromeEffects (XML based) and • Java3D address some but not all VRML applications • X3D is XML syntax for VRML • SVG is XML for Vector Graphics Primitives (much more limited but perhaps more realistic than VRML)