Developing Course Management Systems using the Semantic Web (Chapter 8)
Evolution of Internet Computing
[Figure: stages of Internet computing plotted against time (horizontal axis) and scale (vertical axis). Stage labels: Automate (discovery), Discover (intelligence), Transact, Integrate, Interact, Inform, Publish. Other labels: Parallel HPC, Semantic discovery, ??????]
Introduction
• The next-generation web (Web 2.0) will be based on the semantic web.
• The success of the Web 2.0 vision depends on the development of practical and useful semantic web-based applications.
• Semantic web:
  • We have:
    • Semantic technologies (OWL, RDF, SPARQL, RDQL)
    • Languages to represent knowledge
    • Languages to query knowledge bases
    • Languages to describe business rules
  • We need: application models and applications in commercial settings
Topics for discussion
• A course management system based on the semantic web: S-CMS
• Well-known CMSs include Blackboard.com (UBlearns) and WebCT.com
• S-CMS provides:
  • Information and solution management for students and faculty
  • Automation of procedures such as enrolling in or registering for classes
S-CMS Architecture
[Figure: layered architecture diagram, built up over five successive slides. Labels in the final build, from bottom to top:]
• Source layer: university database, XML feeds, HTML feeds, other data source formats
• Connection layer ("Conn. layer"): connectivity layer
• Instance layer: RUD (OWL schema), SUD (OWL schema), instance generator, OWL instances, RUD/SUD warehouse
• Query layer: semantic query engine; RQL, RDQL, Buchingae
• Inference layer: Rules Repository (SWRL), Rules Engine (Bossam)
• Application layer: course management, report generator, rules editor, query editor, dynamic web site
• Users: teacher, student, admin
S-CMS Architectural layers
• Seven distinct layers: source, connection, instance, query, inference, application, presentation
The Source Layer
• Responsibility: the data and information needed for course management
• Data is stored in an RDBMS
• Contents: faculty, courses, students, degrees, enrollment relations, and personal information about students and teachers
• Exercise: draw a relational model to represent the details of this layer.
The Source Layer (contd.)
• HTML data: an Eclipse plug-in was built to handle the information available on web pages in HTML format (suggestion: request that sites provide RSS feeds)
• Databases: many university-wide databases
• Data integration coded directly into the application is expensive in time and skill, not extensible, and fragile: every new source means new code
• Semantic layer: the use of semantics and ontologies makes the integration process automatic
• Scale of the project: this case study used 200 tables and 600 views, with about 13,000 students
The Connection Layer
• Responsibility: connecting to the various data sources using a variety of protocols
• SQL connectors, XML connectors, Web connectors
• In other applications:
  • Web-services-enabled data (SOAP/REST over HTTP)
  • RSS feeds (XML over HTTP)
  • Pure text
  • Legacy databases
  • Pure HTML from public sites
The Connection Layer (contd.)
• Maintains a pool of connections to several data sources
• Connects to relational data sources and to HTML web sites(!): an HTML Eclipse plug-in was developed.

public class CMSServerImpl extends UnicastRemoteObject implements CMSServer
    ….. // a method in this class
    URL url = new URL("http://www.buffalo.edu/etc.html");
    // java.io.Reader is abstract and has no readLine(); BufferedReader provides it
    BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
    String inputLine = in.readLine();
    // parse and extract data from inputLine, then loop until readLine() returns null
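The idea of keeping a pool of connections per data source can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the class name ConnectionPoolSketch, its methods, and the source name "university-db" are hypothetical and do not come from S-CMS, and a real pool would also bound its size and validate connections.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch: idle connections are kept per named data source and reused. */
final class ConnectionPoolSketch<C> {
    private final Map<String, Supplier<C>> factories = new HashMap<>();
    private final Map<String, Deque<C>> idle = new HashMap<>();

    /** Registers how to open a connection for a named source (for example SQL or HTML). */
    synchronized void register(String source, Supplier<C> factory) {
        factories.put(source, factory);
        idle.put(source, new ArrayDeque<>());
    }

    /** Hands out an idle connection if one exists, otherwise opens a new one. */
    synchronized C acquire(String source) {
        Deque<C> queue = idle.get(source);
        if (queue == null) {
            throw new IllegalArgumentException("Unknown source: " + source);
        }
        C conn = queue.poll();
        return conn != null ? conn : factories.get(source).get();
    }

    /** Returns a connection to the pool so that a later caller can reuse it. */
    synchronized void release(String source, C conn) {
        Deque<C> queue = idle.get(source);
        if (queue == null) {
            throw new IllegalArgumentException("Unknown source: " + source);
        }
        queue.push(conn);
    }

    /** Number of connections currently waiting to be reused for this source. */
    synchronized int idleCount(String source) {
        Deque<C> queue = idle.get(source);
        return queue == null ? 0 : queue.size();
    }

    public static void main(String[] args) {
        // Strings stand in for real connection objects in this demonstration.
        ConnectionPoolSketch<String> pool = new ConnectionPoolSketch<>();
        pool.register("university-db", () -> "connection to university-db");
        String conn = pool.acquire("university-db");
        pool.release("university-db", conn);
        System.out.println("idle connections: " + pool.idleCount("university-db"));
    }
}
```

Each connector type (SQL, XML, Web) would register its own factory, so the upper layers ask for a source by name and never open connections themselves.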
The Connection Layer (contd.)
• Major difficulty: obtaining a copy of the university database with real data from the administration.
• This requires authorization, which takes about 3 months.
• Sensitive fields in the database (PINs and phone numbers) were blacked out, so the work had to be done with pseudo data.
The Instance Layer
• Responsibility: managing semantic web information, such as the ontologies that describe university domain concepts (courses, students, projects, and teachers).
• It is also in charge of transforming relational data into ontological instances.
• This layer creates the knowledge base that is used by the upper layers.
The Instance Layer (contd.)
• Major challenge: the need to query across multiple heterogeneous, autonomous, and distributed (HAD) university data sources produced by multiple organizational units.
• Different types of data, varying formats, different meanings, and different names for the same thing all create integration challenges.
• Four types of heterogeneity were identified:
Heterogeneities Identified
• System heterogeneity: applications reside on different hardware and operating system platforms
• Syntactic heterogeneity: information sources use different representations and encodings
• Structural heterogeneity: different document layouts, formats, and schemas
• Semantic heterogeneity: the intended meanings are different
• Solution for seamless integration: semantic integration
  • A global schema was used to address structural and syntactic heterogeneity.
  • Ontologies were used to overcome semantic heterogeneity.
• Ontologies can be used in communication between humans and among software systems, and to improve the design and quality of software systems.
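The global-schema part of this solution can be illustrated with a minimal Java sketch: each source declares how its local field names map onto shared property names, and records are renamed before they reach the upper layers. All names here (GlobalSchemaSketch, the sources "registrar" and "library", the fields "stud_name" and "FullName") are hypothetical, and the sketch covers only renaming; it does not attempt the ontology-based handling of semantic heterogeneity.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: per-source field names are mapped onto one global schema. */
final class GlobalSchemaSketch {
    // source name -> (local field name -> global property name)
    private final Map<String, Map<String, String>> mappings = new HashMap<>();

    /** Declares that a local field of one source corresponds to a global property. */
    void map(String source, String localField, String globalProperty) {
        mappings.computeIfAbsent(source, s -> new HashMap<>()).put(localField, globalProperty);
    }

    /** Renames the fields of one source record; fields without a mapping are dropped. */
    Map<String, String> toGlobal(String source, Map<String, String> record) {
        Map<String, String> fieldMap = mappings.getOrDefault(source, new HashMap<>());
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : record.entrySet()) {
            String property = fieldMap.get(field.getKey());
            if (property != null) {
                out.put(property, field.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        GlobalSchemaSketch schema = new GlobalSchemaSketch();
        schema.map("registrar", "stud_name", "StudentName");
        schema.map("library", "FullName", "StudentName");

        Map<String, String> registrarRow = new LinkedHashMap<>();
        registrarRow.put("stud_name", "Lee Hall");
        Map<String, String> libraryRow = new LinkedHashMap<>();
        libraryRow.put("FullName", "Lee Hall");

        // Both records now use the same property name.
        System.out.println(schema.toGlobal("registrar", registrarRow));
        System.out.println(schema.toGlobal("library", libraryRow));
    }
}
```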
Ontology Creation
• OWL was the chosen language.
• The development of an ontology-driven application starts with the creation of an ontology schema.
• The ontology schemas in this application contain the definitions of the various classes, attributes, and relationships.
• Protégé was selected as the ontology editor.
Ontology Creation (contd.)
• OWL documents: University Resource Descriptor (RUD) and Student University Descriptor (SUD)
• Inference over the OWL documents is needed to answer questions such as:
  • Who are the teachers and students?
  • What courses are offered by a department?
  • Which courses are assigned to a specific teacher?
  • In which courses is a student enrolled?
  • Which projects are assigned to a course?
  • What are the students' grades in a course?
RUD: University Resource Descriptor Ontology
• All the information mentioned earlier is represented in OWL.
• It also has hundreds of relationships between concepts.
• The relationships are used in the inference layer to infer new knowledge.
SUD: Student University Descriptor Ontology
• A SUD is a resource that describes a university student.
• Each student has a SUD.
• It holds the name, ID, courses taken, degree, telephone number, age, etc.
• The SUD is described using OWL, as shown on the next slide.
SUD OWL definition
…
<owl:Class rdf:ID="Student">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:cardinality rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1</owl:cardinality>
      <owl:onProperty>
        <owl:DatatypeProperty rdf:ID="AverageScore"/>
      </owl:onProperty>
    </owl:Restriction>
…
Ontology Population
<Student rdf:ID="LeeHall11204199">
  (…)
  <StudentID rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">2041999</StudentID>
  <StudentName>Lee Hall</StudentName>
  <Degree>Computer Science</Degree>
  <StudentEmail>lhall@mail.edu</StudentEmail>
  <Studies>
    <Subject rdf:ID="Service_Oriented_Systems">
      <SubjectName>Service Oriented Systems</SubjectName>
      (...)
    </Subject>
  </Studies>
  (…)
</Student>
• This is a SUD instance.
• RUD and SUD instances are created automatically.
• The book lists the difficulties the authors faced when creating this ontology.
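How a relational row might be turned into such an instance can be sketched in plain Java. This is a minimal illustration, not the S-CMS instance generator: the class SudInstanceSketch and its methods are hypothetical, and it assumes the column names have already been renamed to the ontology's property names. A real generator would use an RDF/OWL library rather than string concatenation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: renders one relational row as a Student instance in RDF/XML. */
final class SudInstanceSketch {

    /** Escapes the XML special characters that may occur in database values. */
    static String escape(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;");
    }

    /** Builds the instance; each column becomes one property element, in column order. */
    static String toStudentInstance(String id, Map<String, String> row) {
        StringBuilder sb = new StringBuilder();
        sb.append("<Student rdf:ID=\"").append(escape(id)).append("\">\n");
        for (Map.Entry<String, String> column : row.entrySet()) {
            sb.append("  <").append(column.getKey()).append('>')
              .append(escape(column.getValue()))
              .append("</").append(column.getKey()).append(">\n");
        }
        sb.append("</Student>");
        return sb.toString();
    }

    public static void main(String[] args) {
        // The row stands in for one record fetched through the connection layer.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("StudentName", "Lee Hall");
        row.put("Degree", "Computer Science");
        row.put("StudentEmail", "lhall@mail.edu");
        System.out.println(toStudentInstance("LeeHall11204199", row));
    }
}
```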
The Query Layer
• Responsibility: allow querying of the knowledge base
• The query layer provides an interface to the knowledge base formed by all the automatically generated SUD and RUD ontology instances.
• This interface understands four languages:
  • RDF Query Language (RQL)
  • RDF Data Query Language (RDQL)
  • Buchingae
  • SPARQL
Sample queries (RDQL & Buchingae)
• Buchingae is a simple web-oriented rule language. "Web-oriented" means that rules can refer directly to URI resources such as web ontology elements or RDF resources. Buchingae is appropriate for specifying production rules.
• RDQL is a query language that treats RDF as pure data.
Sample queries (contd.)

SELECT ?X, ?C, ?Z
WHERE (?X <http://apus.uma.pt/RUD.owl#HasGPA> ?Y),
      (?X <http://apus.uma.pt/RUD.owl#Studies> ?C),
      (?Y <http://apus.uma.pt/RUD.owl#Value> ?Z)
AND ?Z > 3.0
// this is an RDQL query that selects all students with a GPA > 3.0

Query qu is p:Studies(?st, ?course) and p:Teaches(?prof, ?course);

• Can you guess what this query does?
• These queries are NOT meant to be written by end users of the system. The system administrator creates a set of queries and makes them available through a user-friendly interface.
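What the RDQL query computes can be traced with a small in-memory sketch in plain Java. It is not an RDQL engine and uses no S-CMS code: the class GpaQuerySketch, the shortened predicate names, and the sample data are assumptions made for illustration. The three nested loops correspond to the three triple patterns, joined on the shared variables ?X and ?Y, with the filter on ?Z.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: evaluates the GPA query by joining triple patterns in memory. */
final class GpaQuerySketch {

    /** A subject-predicate-object statement; the object is kept as a string. */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    /** Returns "student, course, gpa" rows for students whose GPA value exceeds the threshold. */
    static List<String> studentsAbove(List<Triple> kb, double threshold) {
        List<String> rows = new ArrayList<>();
        for (Triple gpa : kb) {                          // (?X HasGPA ?Y)
            if (!gpa.p.equals("HasGPA")) continue;
            for (Triple value : kb) {                    // (?Y Value ?Z)
                if (!value.p.equals("Value") || !value.s.equals(gpa.o)) continue;
                if (Double.parseDouble(value.o) <= threshold) continue;   // AND ?Z > threshold
                for (Triple studies : kb) {              // (?X Studies ?C)
                    if (studies.p.equals("Studies") && studies.s.equals(gpa.s)) {
                        rows.add(gpa.s + ", " + studies.o + ", " + value.o);
                    }
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up knowledge base with two students.
        List<Triple> kb = Arrays.asList(
            new Triple("lee", "HasGPA", "gpa1"), new Triple("gpa1", "Value", "3.6"),
            new Triple("lee", "Studies", "Service_Oriented_Systems"),
            new Triple("ann", "HasGPA", "gpa2"), new Triple("gpa2", "Value", "2.9"),
            new Triple("ann", "Studies", "Service_Oriented_Systems"));
        for (String row : studentsAbove(kb, 3.0)) {
            System.out.println(row);
        }
    }
}
```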
The Inference Layer
• Responsibility: carrying out inference on the knowledge base using semantic rules.
• For example: infer whether the changes in the program were effective. (Of course, effectiveness must first have been defined using the rules.)
The Inference Layer (contd.)
• A rule management system was implemented to extract and isolate the course management logic from procedural code.
• Since the rules for enrollment may change often, it is not good practice to embed them in the source code.
• Detaching the enrollment rules from the application logic gives administrators an effective way to create a rule base and to change the rules.
The Inference Layer (contd.)
• The rules are defined using SWRL (Semantic Web Rule Language).
• They correspond to axioms about the classes or instances stored in the ontology warehouse (knowledge base).
• By applying the rules we can infer new facts.
SWRL query sample

rulebase rb01 {
  (….)
  rule R01 is
    if classTaken(?x, RUD:CS6100) and classTaken(?x, RUD:CS6550)
    then eligible(?x, RUD:CS8050)
}
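The effect of applying rule R01 can be sketched in plain Java: for every student whose classTaken facts cover all prerequisites, a new eligible fact is inferred. This is only an illustration of the inference step. It does not use SWRL, the rules engine, or any S-CMS code, and the class EligibilityRuleSketch and the sample students are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: infers eligible(student, course) facts from classTaken facts. */
final class EligibilityRuleSketch {

    /** If a student has taken all prerequisites, the student is eligible for the target course. */
    static final class Rule {
        final Set<String> prerequisites;
        final String target;
        Rule(String target, String... prerequisites) {
            this.target = target;
            this.prerequisites = new HashSet<>(Arrays.asList(prerequisites));
        }
    }

    /** Applies every rule to every student and returns the inferred facts in sorted order. */
    static Set<String> infer(Map<String, Set<String>> classTaken, List<Rule> rules) {
        Set<String> facts = new TreeSet<>();
        for (Map.Entry<String, Set<String>> student : classTaken.entrySet()) {
            for (Rule rule : rules) {
                if (student.getValue().containsAll(rule.prerequisites)) {
                    facts.add("eligible(" + student.getKey() + ", " + rule.target + ")");
                }
            }
        }
        return facts;
    }

    public static void main(String[] args) {
        // Rules live in data, not in control flow, so they can change without new code.
        List<Rule> rules = Arrays.asList(new Rule("CS8050", "CS6100", "CS6550"));

        Map<String, Set<String>> classTaken = new HashMap<>();
        classTaken.put("lee", new HashSet<>(Arrays.asList("CS6100", "CS6550")));
        classTaken.put("ann", new HashSet<>(Arrays.asList("CS6100")));

        System.out.println(infer(classTaken, rules));
    }
}
```

Keeping the rules as data mirrors the point made for the inference layer: an administrator can add or change a prerequisite rule without touching the application logic.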
The Application Layer
• Responsibility: the interface for teachers and students
• Teachers can create projects associated with a course and define semantic rules for enrollment (say, prerequisites).
• Students can interact with the courses and projects.
The Application Layer (contd.)
• The whole project was implemented using the Eclipse IDE (it could just as easily be done using the NetBeans IDE).
• S-CMS Manager: used by teachers for viewing courses, adding rules related to courses, adding projects and rules related to projects, etc.
• Dynamic Web enrollment: used by students to enroll in courses.
• Report Generator: used by teachers to list a class, grades, etc.
• Grading ontology and its plug-in: the grading feature is a plug-in that allows a teacher to define a grading policy.
Evaluation
• S-CMS was benchmarked in a university environment for a specific department.
• Setup: a server with SQL 2000 and a client machine running S-CMS.
• Hardware: Intel P4 3.0 GHz, 512 MB, 40 GB disks, MS Windows XP, 100 Mb LAN connectivity
• 13,000 ontological instances
• Scaled well for the student population
• Start-up loading of the data took about 7 min 32 sec
Summary
• The development of the semantic web has the potential to revolutionize the WWW.
• The S-CMS discussed here is fully based on semantic technologies.
• It is easily adaptable and extensible.
• It is the wave of the future, if not the current trend.