NUWeb, Net User’s Web, A New Web System sw@gais.cs.ccu.edu GAIS Lab, Department of Computer Science and Information Engineering, National Chung Cheng University 新典資訊 八通資訊 2008.11.05
WWW Architecture • Web Server (e.g., Apache, IIS) • Browser (e.g., IE, Firefox) • Addressing and Information Channel (DNS, URL, SearchEngine) • Abstract Model: • Provider (server), Consumer (client), Channel • Client-Server architecture, Centralized Service
Web servers • Web servers are the foundation of the Web space. • Web services are provided through the Web server architecture. • Web servers are the platform for providing information sharing, web community, and all kinds of web services.
Interestingly • Has the WWW solved the problem Berners-Lee intended to solve in the first place? • Inside an organization, corporation, community, or intranet, • can we search/browse for what we want as easily and effectively as we do on the Internet? • Can we share as conveniently as we do on the Web?
Knowledge Management • Has been a hot term for a long time • There have been many KM products on the market for many years • But very few companies, if any, have been successful in knowledge management!
Enterprise Information Portal • Has been a hot term for many years • But, has not been successful yet!
Ironically • Although we have great experience with information browsing, searching, sharing, and community in the Internet space (WAN), we don’t have such experience in our intranet or LAN space. • While the Internet is such a prosperous information wonderland, the intranet is still a desert. • Why?
Nightmares for running a corporation • A top manager such as the COO, Sales VP, or Engineering VP leaves the company. • Some critical documents, knowledge, and relationships are gone as well. • A real example: a COO left, and the returned PC had an empty disk! • MIS headache
Key factors • There is no Web Space in IntraNet. • Is the system easy to use? • Do the managers own web power? Can they do the management through a Web platform?
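Knowledge Management • Scope: • Knowledge input, editing, organization, sharing, search • Forms of knowledge: • Articles, directories, tables, links, presentations, multimedia • In practice, most personal knowledge management is done mainly through Windows, plus some web services • File folder management • Office tools to edit and organize • Bookmarks • Calendar, address book
What if • Every computer inside an enterprise or organization became a web server that could provide information sharing and interaction services • Every information/knowledge owner had a good platform to manage information/knowledge and to share it
Knowledge Web (KnoWeb) • Knowledge • The content of the knowledge itself • The relationships among pieces of knowledge • The structure of knowledge • The community, interaction, and evaluation around knowledge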
Ook, Ooki, Ookon • Ook: Orchestrated Open Knowledge, the unit of knowledge in the knowledge web (KnoWeb). • Ooki: Ook Editor • Ookon: Orchestrated Open Knowledge On Net. A knowledge base and search engine.
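Digital Life, Digital Home • Is it a trend? • Will individuals and families have digital assets? • How should they be preserved, managed, shared, and passed on? • In the future, will most homes have a server? A digital home server? • In the future, will most individuals have a website, and will his/her PC/notebook have a Web OS?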
Problems of the WWW due to the fundamental design • Naming/Addressing problem: • Physical naming/addressing • Static Binding through DNS • URL may not be a good design, (hard-to-remember) • DNS could be slow • Information flow organization not designed in the first place, • Hotspot bottleneck problem, bandwidth waste problem, • Cache and Proxy tech are added separately afterwards, • Linkrot problem • Dead links, wrong links, faked links, • Approximately up to 15% of links • Need static IP, need to apply for URL, need knowledge in building up and managing Websites • Creating and maintaining a website is costly • Webpage creation is not easy • Divide the computer world into two hierarchies • Server: Website owners, service providers • Client: ordinary users
Weaving the Web (quoted from Wikipedia) • In Berners-Lee's book, Weaving the Web, several recurring themes are apparent: • It is just as important to be able to edit the Web as browse it. Wikis are a step in this direction, although Berners-Lee considers them merely a shadow of the WYSIWYG functionality of his first browser. • Computers can be used for background tasks that enable humans to work better in groups. • Every aspect of the Internet should function as a Web, rather than a hierarchy. Notable current exceptions are the Domain Name System and the domain naming rules managed by ICANN. • Computer scientists have a moral responsibility as well as a technical responsibility.
What Is NUWeb? • Marriage of WWW with P2P • Technologically: • NUWeb = WebServer + Browser + WNS + SearchEngine + Proxy/Cache + WebBuilder + Blog + CommunityEngine + KIM + P2P – URL – DNS and – Cost • Logically: • A New Web System for any net user to build his/her own web in an extremely easy-to-use way. • A platform for web-building, information sharing, information management, community, and service management • A platform for Webilization • A project to pursue Wemocracy
NUWeb Features(1) • Can set up one’s own website on one’s PC for free • Without the need to apply for URL and no need for static IP, • Can be set up in a few minutes, like setting up an instant messenger • No need of knowledge of web server administration • Content can be cached, accessible even if the PC is off line. • Can create web pages in an extremely easy way • Can share publicly like Youtube, flickr, blogger, etc., with a full-text search engine. • Can set up a blog on one’s own web site, with content cached in NUWeb space • Can share directly with friends through PC 2 PC connection without size limitation.
NUWeb Features (2) • Can easily manage the shared directories: what to share and with whom! • Can set up a community for friends/relatives on one’s own PC. • A browser with information management functions. • Can set up a portal for a community such as a school • A decentralized portal, which is a federation of collaborative regional portals and personal portals, while the center is a community center and search engine for the NUWeb space. • A platform for sharing, searching, web service, community, and knowledge management
NUWeb Software Architecture • The NUWeb system is composed of three subsystems • NUWeb.CC CyberCenter • WNS (Web Name Service), • Search engine, Cache • Community services (Photo, Blog, Video…) • NUPedia, • NUWeb CP (Community Portal) • Community services (Blog, Photo, Video…) • Search Engine service, • Proxy and Cache • NUWeb PP (Personal Portal) • NUWeb browser, PKM, • NUWeb server, • NUWeb personal portal/blog builder
How it works • Personal Web server on the Windows platform • Auto indexing, thumbnails, • Auto page generation and run-time rendering • Auto caching, • Bundled with a PHP/Perl platform • Registration with the WNS during setup, • Site name, user-account, SiteKey, … • UPnP to handle firewall/NAT • Each time a client comes online, it sends its current IP and name/key info to the WNS center. • A connection request to a personal site first sends the name of the site to the WNS to obtain the current IP of the target site (dynamic binding)
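The register / report / resolve cycle described on this slide can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual WNS implementation: the class name `WebNameService`, its method names, and the example site name, key, and addresses are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class SiteRecord:
    site_key: str                              # secret chosen at registration (the "SiteKey")
    address: Optional[Tuple[str, int]] = None  # last reported (ip, port); None until first check-in


class WebNameService:
    """Toy stand-in for the WNS center (hypothetical API)."""

    def __init__(self) -> None:
        self._sites: Dict[str, SiteRecord] = {}

    def register(self, site_name: str, site_key: str) -> None:
        # Done once, at site setup time.
        if site_name in self._sites:
            raise ValueError(f"site name already taken: {site_name}")
        self._sites[site_name] = SiteRecord(site_key=site_key)

    def report_online(self, site_name: str, site_key: str, ip: str, port: int) -> bool:
        # Called by the personal server each time it comes online.
        record = self._sites.get(site_name)
        if record is None or record.site_key != site_key:
            return False  # unknown site or wrong key: ignore the update
        record.address = (ip, port)
        return True

    def resolve(self, site_name: str) -> Optional[Tuple[str, int]]:
        # Dynamic binding: a visitor asks for the site's *current* address.
        record = self._sites.get(site_name)
        return record.address if record else None


wns = WebNameService()
wns.register("alice-site", "s3cret")
wns.report_online("alice-site", "s3cret", "203.0.113.7", 8080)
first = wns.resolve("alice-site")
# The PC reconnects later with a different dynamic IP.
wns.report_online("alice-site", "s3cret", "198.51.100.42", 8080)
second = wns.resolve("alice-site")
```

Because the name is bound to an address only at lookup time, the site keeps the same name even when its IP changes, which is why no static IP is required.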
Naming and Dynamic Addressing • A page is a textual web document. It contains UltraLinks or tags, and displaying such a page may trigger the display of other objects such as included images. • An object is either a rich-text document such as PDF, MS-DOC, MS-PPT, etc., a multimedia file, or any single file that can be accessed in the web space. • A resource is either a page or an object • GRN, global resource naming • SiteUniqName#objectname[#class#type#location] • A fixed IP is not necessary • ABN (AddressByName), ABI (AddressById), ABC (AddressByContent) • USI (UniversalSiteId)
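As an illustration of the GRN syntax `SiteUniqName#objectname[#class#type#location]`, here is a small parsing sketch. It assumes that `#` never appears inside a field and that the optional part is given either in full (all three fields) or not at all; the slides do not specify either point, and the example values for class, type, and location are invented.

```python
from typing import NamedTuple, Optional


class GRN(NamedTuple):
    site: str                       # SiteUniqName
    object_name: str                # objectname
    cls: Optional[str] = None       # optional "class" field
    kind: Optional[str] = None      # optional "type" field
    location: Optional[str] = None  # optional "location" field


def parse_grn(text: str) -> GRN:
    """Split a GRN string into its fields (assumes '#' is only a separator)."""
    parts = text.split("#")
    if len(parts) not in (2, 5) or not parts[0] or not parts[1]:
        raise ValueError(f"not a valid GRN: {text!r}")
    return GRN(*parts)


def format_grn(grn: GRN) -> str:
    """Inverse of parse_grn; the optional fields are emitted only if present."""
    fields = [grn.site, grn.object_name]
    if grn.cls is not None:
        fields += [grn.cls or "", grn.kind or "", grn.location or ""]
    return "#".join(fields)


short = parse_grn("alice#photos/cat.jpg")
full = parse_grn("alice#photos/cat.jpg#object#image#cache")
```

Note that the name carries no IP address or host name, so the same GRN stays valid wherever the site or a cached copy currently lives.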
NUWeb CyberCenter • GRI: Global Resource Index • A distributed index structure for objects/pages on the NuWeb space • Use hash data structure • Search engine • Collaborative proxy • Content enhancement • Info filtering, content switching • Relay casting • Hierarchical search • Collaborative cache (super cache) • P2P UMTP protocol
Site Initialization • When a new site is installed: • Register the following info • SiteUniqName, used by the center to interact with the site • Titles of the site (at most T bytes) • Abstract of the site (at most P bytes) • Tags (inappropriate tags, such as those infringing others' rights, will be removed by the center) • Country/city/county, real-world geographic info • Profile of personal info • Residents: SUN.resident will identify a user • Decide which directories to open to the public • Decide which directories to open to private connections • Decide whether to allow caching of the public directory
Site Initialization • The server will build an index for the pages/objects covered by the site. The indices for the public and private areas are kept separate so that privacy is protected. • The index covers names and signatures, plus the content of pages; content indexing for objects such as MS-DOC and PDF files is optional. • After the site is set up, the user will be asked to provide a list of friends to whom the system will send invitation letters.
NUWeb PP • Service Manager (starting from search) • Grabbing, Caching, • Personal Portal, blog, … • NUMail, p2p secure mail system • Share, file transfer • Information Watch-dog • Information filter • Information/Knowledge management • Relay casting, streaming, • Web site builder, page creator, • Knowledge management
NUWeb CP • A suite of programs for setting up Community Portal in NUWeb space • The proxy and caching nodes in the NUWeb cyberspace • Community services: • Mail, Blog, BBS, … • Web HD • Search engine, …
Searching • Search in the NUWeb center includes: • Search pages/objects by name (WNS) • Page content search • Attribute search, for example, search for pages authored by Hamming • The indexer in each nusite sends the raw index to the center, and the center builds an index. The raw index is a record containing indexable text for each page or object. A text extractor is used to extract text from rich-text documents such as MS-DOC/PPT documents. The raw index is uploaded only after the user approves. • Before rendering the search result to the user, the searcher needs to check whether the result page/object exists at that moment. • It uses the SSN to check the SiteDB and see whether that site is available. It also uses the GRN to check whether the resource is available in the cache.
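The availability check before rendering results might look something like the sketch below. The `Hit` record and the two callbacks are hypothetical: `site_is_alive` stands in for the SiteDB lookup by SSN and `in_cache` for the cache lookup by GRN. The slides do not define these interfaces, and the SSN is treated here simply as an opaque site identifier.

```python
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    grn: str  # resource name, e.g. "SiteName#objectname"
    ssn: str  # site identifier used to query the SiteDB


def annotate_availability(
    hits: Iterable[Hit],
    site_is_alive: Callable[[str], bool],
    in_cache: Callable[[str], bool],
) -> List[Tuple[Hit, str]]:
    """Label each hit with where it can be fetched from right now."""
    annotated = []
    for hit in hits:
        if site_is_alive(hit.ssn):
            status = "origin"          # the owning site is online
        elif in_cache(hit.grn):
            status = "cache"           # site is offline, but a cached copy exists
        else:
            status = "not available"   # neither source can serve it at the moment
        annotated.append((hit, status))
    return annotated


# Example with in-memory stand-ins for the SiteDB and the cache index.
alive_sites = {"s1"}
cached_grns = {"s2#b.pdf"}
hits = [Hit("s1#a.html", "s1"), Hit("s2#b.pdf", "s2"), Hit("s3#c.jpg", "s3")]
result = annotate_availability(hits, alive_sites.__contains__, cached_grns.__contains__)
```

The sketch prefers the origin site over the cache when both are available; that ordering is a design choice for the example and is not stated in the slides.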
Caching • Every site page will be automatically cached, unless explicitly disabled • In the first phase, caching will be done in the center and the NUWeb CP cache spaces. Objects will be cached when accessed • The client will cache the object in its cache spool, and an index will be sent to the center to notify it that the client holds that object in its cache. • In the second phase, caching will also be done collaboratively in the p2p space, assuming that some of the personal sites are willing to participate. • Cached objects will be indexed by GRN and MD5 • Note that if an object is modified, it will trigger an update to the global cache space to remove the original cache entry indexed by GRN • Each cached object will record a timestamp of the content (the time the content was created).
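A minimal sketch of a cache indexed by both GRN and MD5, with the stale entry dropped when an object is modified, could look like this. The `GlobalCache` class and its methods are invented for illustration; the real cache is distributed across the center, CP nodes, and (in the second phase) peers, which this single-process example does not attempt to model.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class CacheEntry:
    grn: str
    md5: str
    content: bytes
    created_at: float  # timestamp of the content, as supplied by the site


class GlobalCache:
    """Toy single-process cache indexed by GRN and by MD5 (hypothetical API)."""

    def __init__(self) -> None:
        self._by_grn: Dict[str, CacheEntry] = {}
        self._by_md5: Dict[str, Set[str]] = {}  # digest -> GRNs holding that content

    def invalidate(self, grn: str) -> None:
        old = self._by_grn.pop(grn, None)
        if old is None:
            return
        grns = self._by_md5[old.md5]
        grns.discard(grn)
        if not grns:
            del self._by_md5[old.md5]

    def put(self, grn: str, content: bytes, created_at: float) -> CacheEntry:
        # A modified object replaces the previous entry stored under the same GRN.
        self.invalidate(grn)
        digest = hashlib.md5(content).hexdigest()
        entry = CacheEntry(grn, digest, content, created_at)
        self._by_grn[grn] = entry
        self._by_md5.setdefault(digest, set()).add(grn)
        return entry

    def get_by_grn(self, grn: str) -> Optional[CacheEntry]:
        return self._by_grn.get(grn)

    def get_by_md5(self, digest: str) -> Optional[CacheEntry]:
        grns = self._by_md5.get(digest)
        return self._by_grn[next(iter(grns))] if grns else None


cache = GlobalCache()
v1 = cache.put("alice#notes.txt", b"version 1", created_at=1.0)
v2 = cache.put("alice#notes.txt", b"version 2", created_at=2.0)
```

After the second `put`, a lookup by the old MD5 finds nothing, while a lookup by GRN returns the new content together with its timestamp.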
GRI & Collaborative Proxy • GRI: • Object indexed by MD5-signature & GRN • Home page indexed by GRN • Instance indexed by MD5 • Syntax: • GRN: SUN#OBN • Distributed/Collaborative GRI • Multi-tier Collaborative Proxy
Indices (1) • In the NUWeb center, there are several indices: • SiteDB: indexed by SSN • Last live time, access count, data size, • While alive, each site will periodically send alive info to the center (every K minutes) • NameDB: indexed using gaisindex • Each name is associated with an SSN, by which we can check whether the page/object exists. • Each name will have a record, which will have an SSN value and a GRN cache flag • In the search result of the NameDB, if a record does not have an online instance (either the original site or the cache copy), it will have a flag indicating “not available”
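The SiteDB liveness bookkeeping can be sketched as below. The field names follow the slide (last live time, access count, data size), but the `SiteDB` class, the `grace` factor that tolerates one missed report, and the explicit `now` parameter (used instead of the system clock so the example is deterministic) are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SiteStatus:
    last_alive: float      # time of the most recent alive report, in seconds
    access_count: int = 0
    data_size: int = 0


class SiteDB:
    """Toy SiteDB keyed by SSN (hypothetical API)."""

    def __init__(self, heartbeat_minutes: float, grace: float = 2.0) -> None:
        # A site counts as offline once it has been silent for grace * K minutes.
        self._timeout = heartbeat_minutes * 60.0 * grace
        self._sites: Dict[str, SiteStatus] = {}

    def heartbeat(self, ssn: str, now: float, data_size: int = 0) -> None:
        status = self._sites.get(ssn)
        if status is None:
            self._sites[ssn] = SiteStatus(last_alive=now, data_size=data_size)
        else:
            status.last_alive = now
            status.data_size = data_size

    def record_access(self, ssn: str) -> None:
        if ssn in self._sites:
            self._sites[ssn].access_count += 1

    def is_alive(self, ssn: str, now: float) -> bool:
        status = self._sites.get(ssn)
        return status is not None and now - status.last_alive <= self._timeout


db = SiteDB(heartbeat_minutes=5)       # K = 5 minutes, so the timeout is 600 seconds
db.heartbeat("site-001", now=0.0)
alive_soon_after = db.is_alive("site-001", now=300.0)
alive_much_later = db.is_alive("site-001", now=601.0)
```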
Indices (2) • MD5 index: objects/pages indexed by MD5 signature. Each site will produce an MD5 signature for each object, and the (GRN, MD5) info will be sent to the center to be indexed. An MD5 lookup returns the source SSN/IP or the cache site's IP • Page/document content index • Indexed through the GAIS search engine
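To illustrate the MD5 index, the sketch below computes a signature on the site side and keeps a center-side map from signature to the places that hold the content. The `MD5Index` class and the location strings are hypothetical; the slides only say that a lookup returns the source SSN/IP or a cache site's IP.

```python
import hashlib
from collections import defaultdict
from typing import DefaultDict, List, Set


def md5_signature(content: bytes) -> str:
    """Site side: the signature a site would send along with the GRN."""
    return hashlib.md5(content).hexdigest()


class MD5Index:
    """Center side: signature -> GRNs and locations holding that content."""

    def __init__(self) -> None:
        self._locations: DefaultDict[str, Set[str]] = defaultdict(set)
        self._grns: DefaultDict[str, Set[str]] = defaultdict(set)

    def add(self, grn: str, md5: str, location: str) -> None:
        self._grns[md5].add(grn)
        self._locations[md5].add(location)

    def lookup(self, md5: str) -> List[str]:
        # .get avoids creating empty entries for unknown signatures.
        return sorted(self._locations.get(md5, ()))

    def names_for(self, md5: str) -> List[str]:
        return sorted(self._grns.get(md5, ()))


index = MD5Index()
sig = md5_signature(b"hello")
index.add("alice#hello.txt", sig, "alice-site")
index.add("bob#copy-of-hello.txt", sig, "cache-node-1")
sources = index.lookup(sig)
```

Since identical content yields the same signature regardless of its name, the two GRNs above resolve to one index entry with two candidate sources, which is what makes content-based addressing and cache sharing possible.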
NUWeb Portal Service • Search engine for the NUWeb cyberspace • Websites, pages, pictures, videos, documents, articles, etc., … • Browsing and Viewing • What’s hot, what’s new, what’s cool, • Automatically generated through page rendering tool based on a CountDB and list manager.
NUWeb: A New Web System, A Web Weaver! • A New Web System for Net Users to build their own Web Space. • A New Web System for a Community to build a Web space for the community members. • A solution for building the Web space within an IntraNet, in which the information sharing, community service, knowledge management, can be done easily and effectively. • Software_Goal: to be a WeBOSS, A Web Operating and Service System!
NUWeb Major Functions • Web Builder, Blog Builder/Manager • Sharing Management & Interaction • Community Engine • Search Engine and Caching (Backup) • Web/Knowledge Management
Web Builder, Blog Builder/Manager • To set up one's own website in minutes, without the need for • Static IP, • URL name application, which requires a yearly fee. • Professional knowledge of website management • Managing a web site is like managing one's directory space. Just define the skeleton, and then feed in the content by drag & drop, or input the content with a powerful web content editor named Ooki. • Can create multimedia web content easily and efficiently. • Can set up one's blog on one's own server. NUBlog is implemented with AJAX, and provides a very handy and productive UI to create richer content more efficiently. (*)
Web Builder, Blog Builder/Manager • NUBlog provides a multi-blog management function that lets the user manage one's blogs at different BSPs • One can create an article in the home blog, and then easily post it to multiple blog sites on the web. • By setting up a web server for every community member, the community becomes a web space where information sharing and community services can be done easily and naturally. • Corporate management becomes easier and more efficient, as the manager can easily view the reports/documents/resources on the members' websites.
Sharing Management & Interaction • Drag & Share, EasySharing • Internet/Web File Manager. • Sharing permission settings: access can be restricted to a name list or protected by password • MyPush, to share good stuff with friends/the public easily and handily • Desktop sending (convenient for bug/problem reports) • MyWatch, to watch for web content of interest • Ubiquitous Subscription
Community Engine • Everyone can set up a community easily, on one's own server or in the NUWeb Community Center. • BBS/Forum powered by the Ooki editor • Vote Service, can support more flexible vote functions • Calendar • Address Book
Search Engine and Caching (Backup) • A search engine for the intranet/community web space. Just as with a web search engine, we will be able to search the shared information residing on every PC in the intranet/community. • Auto-Caching and Smart Backup
Web/Knowledge Management • Search Management • MyDB, a very easy-to-use personal DB manager for managing structured data • Ooki, a knowledge editing tool to create articles with richer content. • Simplified-to-Traditional Chinese conversion • Grabbing / Crawling / Agents • Personal Relation Manager
NUWeb Components • WNS (Web Name Service) server, for resolving the website name to its physical IP/port address. • Cache/Proxy server: providing cached access to websites that are not online at the time of access, as well as providing gateway access to websites behind a firewall. • Search Engine, for searching the information in the NUWeb space. • NUBlog, a new Blog system that provides multi-blog management capability for the user to build a home base for his/her multiple blogs around the web. • NUGroup, a community engine for users to build communities. • NUPedia, a knowledge editing and management system. • NUServer, a personal web server to be run on one's PC. • NUPush, a service for sharing interesting stuff on the web. • NUWatch, a service for monitoring web content of interest. • NUDB, a simple DB system for net users. • NUMail, a unified message system • NUBraim, a browser and the user interface of the NUWeb Personal Portal.
Thanks http://tw.nuweb.cc