
Department of Computer Science and Engineering

A Framework for Community based Distributed and Semantically Annotated Course-ware Development, Sharing and Quality Management for Higher Technical education over Publish/subscribe P2P Overlay. Department of Computer Science and Engineering

dillian

Presentation Transcript


  1. A Framework for Community based Distributed and Semantically Annotated Course-ware Development, Sharing and Quality Management for Higher Technical education over Publish/subscribe P2P Overlay Department of Computer Science and Engineering Motilal Nehru National Institute of Technology Allahabad and Applied Artificial Intelligence Group Centre for Development of Advanced Computing, Pune

  2. Higher Technical Education: Observations • Engineering Institutions : 2,500 approx • Annual output: 400,000 approx • Computer Science graduates : 300,000 approx • Growth rate: 20% expected (NASSCOM) • Employable Output: 25% only (McKinsey Global)

  3. Higher Technical Education: Observations • M.Tech. Output: 20,000 • Ph.D. Output • Engineering: less than 1000 • Basic Sciences: around 5,000.

  4. Higher Technical Education: Observations Number of researchers (2007-08) • India: about 154,800 • China: 1,423,000 • US: 1,571,000

  5. Higher Technical Education: Observations Needs: Order-of-magnitude growth in quantity and quality • Rapid and large-scale growth of • Student enrollment • Institutes/universities • Research scholars • Total quality management of • Outputs: Publications, Patents, Personnel • Resources: Courseware, Training material, Labs and Evaluation • Services

  6. Impact of Internet • Highly scalable, anywhere/anytime access • Very large volume of: • Courseware • Research papers • Training materials • No positive impact on quality of education. • Points to a disconnect between needs and availability

  7. Possible Reasons for Disconnects • Resources are targeted at specific groups • May not be suitable for academically, linguistically and culturally different groups of users • Disproportionately large effort required to search • Lack of semantic annotation • Lack of quality assessment and indicators

  8. Learning Methodologies • Traditional class room teaching with/without ICT • Face to face interaction with teacher and peers • Valuable learning experience • Peer interaction dominant • E-learning: • Unsupervised: No interaction, Learners work in isolation • Supervised: Limited interaction • Static resources: • Very limited support for evolving heterogeneous needs of learners.

  9. E-Learning Infrastructure • Content Delivery : Client/Server Mode • Dedicated Servers in LAN Environment • Through Portals on WWW • Communication Paradigm • Request/Reply • Synchronous • Coupled • Scalability: Limited • Fault Tolerance: Limited

  10. Latent Knowledge Resources • Every institution has a large number of hosts. • Each host contains valuable knowledge resources. • Latent: search engines cannot index them • Reason: • Hosts do not have public IP addresses • Hosts are not servers • Hidden behind Proxy/NAT

  11. Sharing Latent Knowledge Resources • Interest based cooperative sharing is desirable • Difficulties: • Heterogeneity of interest • Dynamic interest evolution • Rendezvous of availability and interest • Hosts are widely distributed

  12. Sharing Latent Knowledge Resources • Visibility of interests and contents • Resource owners declare availability • Interested users submit their interests • Dynamic evolution of interest-based communities

  13. Our Vision • Decentralized and autonomous middleware • Highly Scalable • Fault-tolerant • Minimal management and maintenance overhead • Support dynamic evolution of interest based communities for • Collaborative generation of: • Content • Meta-data • Domain ontology • Seamless sharing of resources • Peer interaction

  14. Our Vision • Semantic searching based on • Meta-data • Domain ontology • Quality assessment of resources by community • Behavioral Mining

  15. Challenges • Heterogeneity • Users: Interest and content • Host: uptime, memory, CPU, bandwidth • Scalability and interoperability • Hosts without Public IP • Management of dynamics • content, user group and their behaviors • Absence of domain ontology and meta-data

  16. Requirements • Communication paradigm to support scalability • Decoupling: Time, space and synchronization • Anonymity • Network Infrastructure to support • Peer-to-peer interaction • Dynamic evolution of interest based communities • Interoperability • Seamless dynamic leaving and joining of nodes

  17. Decoupling : • Between providers and consumers • Increase scalability • No dependencies • No coordination & synchronization. • Create highly dynamic, decentralized systems

  18. Dimensions Of Decoupling: • Three dimensions • Space - No need to hold references or even know each other • Time - No need to be available at the same time • Synchronization (flow) - Control flow is not blocked by the interaction
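
The three dimensions above can be illustrated with a toy in-memory event service. This is a minimal sketch, not the framework's implementation: the class `EventService`, its method names, and the topic and file names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


class EventService:
    """Toy in-memory event service (hypothetical, for illustration only)."""

    def __init__(self):
        self._pending = defaultdict(deque)   # topic -> events not yet delivered
        self._mailboxes = defaultdict(list)  # topic -> subscriber mailboxes

    def publish(self, topic, event):
        # Space decoupling: the publisher names a topic, never a subscriber.
        # Synchronization decoupling: publish only enqueues and returns.
        self._pending[topic].append(event)

    def subscribe(self, topic):
        # The subscriber gets a mailbox that it can read whenever it likes.
        mailbox = []
        self._mailboxes[topic].append(mailbox)
        return mailbox

    def dispatch(self):
        # Runs independently of publishers and subscribers.
        for topic, queue in self._pending.items():
            if not self._mailboxes[topic]:
                continue  # Time decoupling: hold events until someone subscribes.
            while queue:
                event = queue.popleft()
                for mailbox in self._mailboxes[topic]:
                    mailbox.append(event)


service = EventService()
service.publish("compilers", "lecture-01.pdf")  # no subscriber exists yet
service.dispatch()                              # nothing can be delivered
inbox = service.subscribe("compilers")          # subscriber arrives later
service.dispatch()
print(inbox)  # ['lecture-01.pdf']
```

The publisher finishes before the subscriber even exists, yet the event still arrives, which is the point of time decoupling. A real deployment would also need to bound how long undelivered events are kept.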

  19. Publish/Subscribe • Paradigm for scalable distributed applications • Provides • Decoupling • Anonymity • Asynchrony

  20. Publish/Subscribe: High Level View

  21. Publish/subscribe: Subscription Model • Topic (subject) -based • Content-based • Type based
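
The difference between the three subscription models is easiest to see as three filter functions over the same events. The sketch below is illustrative only: the event classes `Lecture` and `Quiz`, their fields, and the function names are assumptions, not part of the presented framework.

```python
from dataclasses import dataclass


@dataclass
class Lecture:
    topic: str
    language: str


@dataclass
class Quiz:
    topic: str
    questions: int


def topic_match(event, topic):
    """Topic-based: compare only the event's subject label."""
    return event.topic == topic


def content_match(event, **constraints):
    """Content-based: every named attribute must equal the requested value."""
    missing = object()  # sentinel for attributes the event does not have
    return all(getattr(event, name, missing) == value
               for name, value in constraints.items())


def type_match(event, event_type):
    """Type-based: filter on the event's class."""
    return isinstance(event, event_type)


events = [
    Lecture(topic="databases", language="Hindi"),
    Lecture(topic="databases", language="English"),
    Quiz(topic="databases", questions=10),
]

by_topic = [e for e in events if topic_match(e, "databases")]
by_content = [e for e in events
              if content_match(e, topic="databases", language="Hindi")]
by_type = [e for e in events if type_match(e, Quiz)]
print(len(by_topic), len(by_content), len(by_type))  # 3 1 1
```

Topic-based matching is the cheapest but the coarsest; content-based matching is more expressive at the cost of inspecting each event's attributes.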

  22. Implementation of Event Service • Centralized Implementation • Event matching is easy • No Scalability • No fault Tolerance • Distributed Implementation • Set of nodes designated as Brokers • Improved Scalability and fault tolerance • Routing and matching of events is difficult

  23. Implementation of Event Service • Role based Implementation • Every node can take any role based on context • Broker • Publisher • Subscriber • Highly scalable and fault tolerant

  24. Role based Implementation: Challenges • Management of scalability and fault-tolerance • Application Layer Overlay Hierarchy • Informed/Un-informed leaving • Routing of Publications and subscriptions • Location of rendezvous • Life span

  25. Role based Implementation: Challenges • Role assignment • Designated (fix role) • Dynamic • Matching • Content based • Type based • Notification • Service Guarantee (at least once, at most once etc.)

  26. Current Network Infrastructure

  27. Current Network Infrastructure Within Institute/Organization: • Nodes are assigned Private IPs • Grouped in IP based subnets • Physically connected with each other through layer-2 and layer-3 switches. • Not visible to outside world • Connect to outside world through NAT/Proxy

  28. Our Network Architecture Within LAN of Institute/Organization • Nodes having same interest: • Not aware of each other • May be physically distant • Some virtualization is required • Formation of interest based virtual rings • Virtual rings are formed using virtual (e.g., TCP) links • Virtual ring termed as Overlay.

  29. Our Network Architecture Within LAN of Institute/Organization

  30. Our Network Architecture Node visibility • Nodes hidden behind Proxy/NAT • Virtual rings of same interest may be behind different proxy/NAT • Isolated rings • Resource sharing not possible: Invisibility • Have to come under one umbrella

  31. Our Network Architecture • Virtual Ring of Proxies too. • This makes it a 2-tier Overlay

  32. Our Network Architecture Dynamic Community Evolution • Abstraction over the 2-tier overlay • Isolated rings form communities • Virtual Interest based proximity: Physically nodes may be far apart

  33. Our Overall Network Architecture

  34. Pub/Sub on our Network Architecture: • Every Node acts as: • Publisher, Subscriber, Broker • Rendezvous Point based Matching • Distributed Hash Table (DHT) • Nodes: • Majority are short lived and have minimal capabilities • Small percentage • Remain up for long periods • Relatively better storage, bandwidth and memory • Termed as Super nodes.
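
Rendezvous-point matching over a DHT can be sketched with consistent hashing: publisher and subscriber hash the same topic and independently arrive at the same node, which then does the matching. The slides do not say which DHT is used, so the successor rule below is an assumption borrowed from Chord-style rings, and the node names, the `Ring` class, and the 32-bit identifier space are hypothetical.

```python
import hashlib
from bisect import bisect_left


def ring_id(key: str, bits: int = 32) -> int:
    """Map a node name or topic onto the identifier ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    """Hypothetical identifier ring; each topic is owned by its successor node."""

    def __init__(self, node_names):
        self._nodes = sorted((ring_id(name), name) for name in node_names)

    def rendezvous(self, topic: str) -> str:
        # First node whose identifier is >= hash(topic), wrapping around.
        ids = [node_id for node_id, _ in self._nodes]
        position = bisect_left(ids, ring_id(topic))
        return self._nodes[position % len(self._nodes)][1]


nodes = ["peer-a", "peer-b", "peer-c", "proxy-1"]
ring = Ring(nodes)

# Publisher and subscriber never talk to each other, yet both
# compute the same rendezvous point for the topic.
publisher_view = ring.rendezvous("operating-systems")
subscriber_view = ring.rendezvous("operating-systems")
print(publisher_view == subscriber_view)  # True
```

A useful property of this scheme is that a node leaving the ring only moves the topics it owned; topics whose rendezvous point is still alive keep the same owner.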

  35. Super Nodes • Candidate Super Nodes: • May get elected dynamically • Proxy Nodes • GARUDA nodes/ NKN nodes • May act as Brokers for • Popular content (temporal locality) • Hot contents are automatically cached
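
One simple way for a super node to exploit temporal locality is a least-recently-used (LRU) cache. The slides do not specify the caching policy, so LRU is an assumption here, and `HotContentCache`, `fetch_from_owner`, and the content keys are hypothetical names.

```python
from collections import OrderedDict


class HotContentCache:
    """LRU cache sketch for a super node (the policy is an assumption)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch            # called on a miss to get the content
        self._items = OrderedDict()   # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        value = self.fetch(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the coldest entry
        return value


def fetch_from_owner(key):
    # Stand-in for retrieving the resource from the peer that holds it.
    return f"<content of {key}>"


cache = HotContentCache(capacity=2, fetch=fetch_from_owner)
for request in ["dsa", "os", "dsa", "networks", "dsa", "os"]:
    cache.get(request)
print(cache.hits, cache.misses)  # 2 4
```

The frequently requested item `dsa` stays cached throughout, while items requested only occasionally are evicted.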

  36. Finding Content • Push/Pull Model • Subscription Instead of Searching • Learner need not make search effort • Learner subscribes for content • System provides matching Publication

  37. Finding Content • Semantic Support • Publication with/without meta-data • Subscription with/without meta-data • Knowledge Resources enriched with meta-data • Use of domain specific ontology
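
As a rough illustration of how a domain ontology can widen matching, the sketch below lets a subscription to a general concept match publications tagged with a more specific one. The tiny `IS_A` hierarchy and the function names are invented for this example; a real ontology would be far richer.

```python
# Hypothetical mini-ontology: child concept -> parent concept.
IS_A = {
    "quicksort": "sorting",
    "sorting": "algorithms",
    "algorithms": "computer science",
}


def ancestors(concept):
    """Return every broader concept reachable from `concept`."""
    found = []
    while concept in IS_A:
        concept = IS_A[concept]
        found.append(concept)
    return found


def semantic_match(subscription, publication_tags):
    """Match when a tag equals the subscribed concept or is narrower than it."""
    return any(tag == subscription or subscription in ancestors(tag)
               for tag in publication_tags)


print(semantic_match("algorithms", ["quicksort"]))  # True
print(semantic_match("quicksort", ["algorithms"]))  # False
```

A plain keyword match would miss the first case, because the word "algorithms" never appears in the publication's meta-data.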

  38. Meta-data • Meta-data can be created in distributed manner by: • Content creator • Some designated meta-data expert from the community • Automatic or semi-automatic • Meta-data: Published/subscribed, stored, retrieved as usual knowledge resource.

  39. Ontology • Distributed Ontology creation by • Some experts from community • Published/subscribed, stored, retrieved as usual knowledge resource.

  40. Our Universal Client • Every node will run a generic client application • Universal client provides an interface for: • Joining, Leaving: virtual ring maintenance • Fault tolerance: replication, caching • Publishing, Subscribing content • Event Brokering • Meta-data creation • Ontology creation • Behavior mining and Quality assessment

  41. Our Software Architecture

  42. Layer 1: Distributed and Federated Database It Contains: • Meta-data base • Ontology base • Knowledge Resource base • Access log • Base for user profiles

  43. Layer 1: Distributed and Federated Database It also contains: • Publication base • Subscription base • Base for event brokering

  44. Layer 2: Publish/Subscribe, Overlay Layer It has three sub-layers: • Sub-layer 1 : Overlay sub-layer • Sub-layer 2 : Community Management sub-layer • Sub-layer 3 : Publish/Subscribe sub-layer

  45. Layer 3: Service Layer Provides Services for • Distributed Ontology Creation • Metadata Harvesting • Inference Engine • Multilingual Subscription/Publication Support

  46. An Example Demonstration • Layer 3 of our Software Architecture • Presentation by C-DAC

  47. Design Challenges and Trade-offs • Overlay Architecture: Structured/Unstructured/Hybrid • Unstructured • Stateless, Maintenance cost minimum • Flooding instead of routing, bandwidth wastage • Structured • Stateful, Maintenance required • No flooding, saves bandwidth
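
The bandwidth trade-off can be made concrete by counting messages. The sketch below compares flooding on a randomly wired overlay with greedy routing over Chord-style fingers (links at distances 1, 2, 4, ...). Both topologies, the node count, and the function names are assumptions for illustration, not the design evaluated in the presentation.

```python
import random
from collections import deque


def flood(neighbors, source):
    """Count messages when each node, on first receipt, forwards to all its neighbours."""
    seen = {source}
    queue = deque([source])
    messages = 0
    while queue:
        node = queue.popleft()
        for peer in neighbors[node]:
            messages += 1  # every forward costs one message
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return messages


def finger_route(n, source, target):
    """Count hops when each node jumps the largest power of two not past the target."""
    hops, node = 0, source
    while node != target:
        distance = (target - node) % n
        step = 1 << (distance.bit_length() - 1)  # largest power of two <= distance
        node = (node + step) % n
        hops += 1
    return hops


n = 64
rng = random.Random(0)
# Unstructured overlay: a ring for connectivity plus one random link per node.
neighbors = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
for i in range(n):
    j = rng.randrange(n)
    if j != i:
        neighbors[i].add(j)
        neighbors[j].add(i)

flood_messages = flood(neighbors, source=0)
structured_hops = finger_route(n, source=0, target=37)
print(flood_messages, structured_hops)
```

Flooding costs at least one message per link endpoint, so it grows with the size of the overlay, while finger routing needs at most log2(n) hops. The price, as the slide notes, is that every node must maintain its finger state as peers join and leave.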

  48. Design Challenges and Trade-offs • Implementation of event service • Purely Distributed • Every node can be broker • High scalability • Higher cost of event management, routing and matching • Partially Distributed • Only Proxies as brokers • Scalability is reduced • Lower cost of event management, routing and matching

  49. Simulation • To evaluate design alternatives: • Role: • Assignment vs. acquisition • Static vs. dynamic • Utilization of skew in subscriptions • Replication of Hot Content • Service Guarantee • Life span of Knowledge resources • Informed and Uninformed Leaving

  50. Strengths: MNNIT • Implicit Invocation Systems and Semantic Web • Group of faculty members and research scholars (PhD, MTech) engaged in: • Large scale Publish/Subscribe for dynamic topologies • Automatic meta-data extraction and generation. • Networking and Distributed Computing • Group of faculty members and research scholars (PhD, MTech) engaged in: • Peer-to-Peer computing • Cloud Computing
