PDSC: P2P Document Sharing Community

Presentation Transcript


  1. PDSC: P2P Document Sharing Community Team No. 4 • R91922001 黃振修 (PM) • R91922020 羅婉琪 (RD) • B89902012 葉家齊 (RD) • R91725032 李宜儒 (RD) • R91922015 張燕君 (QA) • R91922028 張靜雯 (QA)

  2. Introduction • The original idea comes from research groups such as the CML Laboratory of NTU. • People want to share their documents over the Internet and need keyword search. • We therefore need a peer-to-peer mechanism for document exchange to support knowledge management, and full-text search to find and filter shared resources before downloading.

  3. Features • Peer-to-peer document sharing over the Internet. • Full-text keyword search with ranked results within the community. • Direct document exchange: send files to, and download files from, other peers. • A custom URL format developed for the project • Ex: dsc://download/hostname/path/to/file
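The slide gives only one example of the custom URL format, so the following is a minimal sketch of how such a URL might be split into parts. The field names (`action`, `host`, `path`) and the helper `parse_dsc_url` are assumptions for illustration, not taken from the project's code.

```python
from typing import NamedTuple


class DscUrl(NamedTuple):
    # Hypothetical field names inferred from the single example on the slide.
    action: str  # e.g. "download"
    host: str    # peer that holds the file
    path: str    # file path relative to that peer's sharing folder


def parse_dsc_url(url: str) -> DscUrl:
    """Split dsc://<action>/<host>/<path> into its three parts."""
    prefix = "dsc://"
    if not url.startswith(prefix):
        raise ValueError("not a dsc:// URL: " + url)
    # Split at most twice so that slashes inside the file path are preserved.
    parts = url[len(prefix):].split("/", 2)
    if len(parts) < 3 or not all(parts):
        raise ValueError("expected dsc://action/host/path: " + url)
    return DscUrl(*parts)


if __name__ == "__main__":
    print(parse_dsc_url("dsc://download/hostname/path/to/file"))
```

Limiting the split to two separators keeps nested directories in `path` intact, which matters once sharing folders are searched recursively.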

  4. Market Requirement • A simple application that can be installed to connect to the community. • Users can enter or leave the community at any time. • Users can share documents with each other. • Shared resources must be kept up to date. • It is easy to see what is available in the community. • Users can enter keywords to search the community for documents. • Users can send files directly to each other.

  5. Project Roadmap • Version 1.0 • Basic functionality: • Version 2.0 • Replication of multiple document copies within the community • Central backup mechanism • Version 3.0 • User management/authentication • User acknowledgment of document exchange • More document formats will be supported in the future

  6. Stage Goals • Stage 1: Community browsing • Stage 2: Search functionality • Stage 3: Download/send file functionality

  7. Schedule Notes • 5/3: 黃振修 should finish the document digest module • 5/10: 葉家齊 should finish the architecture prototype and server-side protocol communication • 5/12: 羅婉琪 should finish the client browsing functionality • 5/10: QA finishes document conversion testing (binary/code) • 5/10: 李宜儒 should finish the Win32 file hook mechanism • 5/13: Download/send file should be working • 5/24: QA finishes document search testing • 5/24: Search results should be working • 5/28 ~ 6/4: Code freeze and final testing

  8. Project Meetings • Two types of meetings are defined: • [PRJ]: Project meeting • [DEV]: Development meeting • Meeting dates: • [PRJ] 4/15, Tue. R319 of CSIE building • [DEV] 4/23, Wed. R505 of CSIE building • [DEV] 4/28, Mon. R505 of CSIE building • [DEV] 4/29, Tue. R107 of CSIE building • [DEV] 4/29, Tue. R105 of CSIE building • [DEV] 5/6, Tue. R519 of CSIE building • [PRJ] 6/9, Mon. R503 of CSIE building After 5/6, no formal meetings were held until the final one. Instead, the QAs and RDs held several small meetings among themselves, and the PM sometimes called the RDs and QAs together to coordinate.

  9. Documentation • MRD: Market Requirement Document [PM] • PRD: Project Requirement Document [PM] • PED: Project Execution Document [PM] • PDD: Project Development Document [RD] • QAD: Quality Assurance Document [QA] • BTD: Bug Tracking Document [QA] • WDD: Working Discussion Document [PM] • User’s Manual [QA]

  10. Development Tools • Microsoft VC++ 6.0 • Borland C++ Builder • CVS for source control • Central FTP server for file exchange • InstallShield for the setup program

  11. Architecture (diagram; component labels) • Graphics User Interface • Kernel • Protocol API for GUI • Client • Host Lookup Thread • Server • Server Thread • Document Keyword Processor • Database • Local Shared File Database • Host Database • Task Database

  12. Technical Notes (1/2) See PDD for more detail • A pure peer-to-peer mechanism is implemented: each application embeds both a client and a server (for efficiency). • When a search request is issued, the application searches its own document collection and then forwards the request to other computers. • The sharing folder is monitored dynamically: once documents in it are modified, the digest module re-digests them in real time, keeping the information offered to the community up to date.
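The search flow on this slide (search locally, then forward to other computers) can be sketched with in-memory peers. This is an illustration under stated assumptions, not the project's protocol: the `Peer` class, the substring match, and the `visited` set used to stop a request from circulating forever are all hypothetical, and a real implementation would forward the request over the network.

```python
class Peer:
    """Toy peer holding both the 'client' and 'server' roles in one object."""

    def __init__(self, name):
        self.name = name
        self.documents = {}   # file name -> extracted text (hypothetical store)
        self.neighbors = []   # other Peer objects this peer knows about

    def search(self, keyword, visited=None):
        # 'visited' is an assumption: it keeps a forwarded request from
        # being answered twice when peers reference each other in a cycle.
        if visited is None:
            visited = set()
        if self.name in visited:
            return []
        visited.add(self.name)

        # Step 1: search this peer's own document collection.
        needle = keyword.lower()
        hits = [(self.name, fname)
                for fname, text in sorted(self.documents.items())
                if needle in text.lower()]

        # Step 2: forward the same request to the other peers.
        for neighbor in self.neighbors:
            hits.extend(neighbor.search(keyword, visited))
        return hits


if __name__ == "__main__":
    a, b = Peer("a"), Peer("b")
    a.documents["p2p.doc"] = "peer to peer networks"
    b.documents["notes.pdf"] = "Peer review notes"
    a.neighbors, b.neighbors = [b], [a]
    print(a.search("peer"))
```

Local hits come first in the result, mirroring the order described on the slide.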

  13. Technical Notes (2/2) See PDD for more detail • Three main document formats are supported: MS Word, MS PowerPoint, and PDF (no Chinese support). • A digest is a feature vector extracted from a document; searching is based on these digest vectors. • A scoring algorithm rates each match, and results are ranked by score. • Digests of the shared documents are saved when the program exits, so full initialization is needed only on the first run.
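The slides do not describe how the digest vector or the rating algorithm work (the PDD is cited for that), so the sketch below swaps in a simple stand-in: a term-frequency vector as the digest, and a score equal to the summed frequencies of the query terms. The function names and the ASCII-only tokenizer are assumptions for illustration.

```python
import re
from collections import Counter


def digest(text):
    """Stand-in digest: term-frequency vector over ASCII word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def score(query, doc_digest):
    """Stand-in rating: sum of the document's counts for each query term."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    return sum(doc_digest[t] for t in terms)  # Counter yields 0 if absent


def rank(query, digests):
    """Return matching document names, highest score first (ties by name)."""
    scored = [(score(query, d), name) for name, d in digests.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for points, name in scored if points > 0]


if __name__ == "__main__":
    digests = {
        "a.doc": digest("peer to peer sharing"),
        "b.pdf": digest("document sharing"),
        "c.ppt": digest("databases"),
    }
    print(rank("peer sharing", digests))
```

Because only the digests are consulted at query time, they can be computed once and stored, which fits the note that digests are saved on exit.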

  14. Demonstration

  15. Testing Plans See QAD for more detail • What is to be tested? • Platform • Network status • Command • File conversion • Download/upload • Where will it be tested? • Win32 environment, Windows 2000 OS • PIII 500 CPU, 256 MB RAM, 100 Mbps Ethernet

  16. Test Cases See QAD for more detail • Document format conversion (testing the binary tools) • Document format conversion (integrated as a program module; tested for robustness and accuracy) • P2P sharing community (testing the feature functionality of the UI program) • The sharing module (testing digest/search and sharing folder monitoring) • Setup program (testing the installer’s functionality) • Performance report (memory usage, CPU utilization, memory leaks)

  17. Bug Tracking (1/2) See BTD for more detail • Empty document files may cause a fatal error • Solved by checking file completeness first. • Some PDF files may cause the conversion module to get the wrong page number, causing a fatal error. • Solved by checking the validity of the page number first. • Duplicate entries in the list when browsing • Stupid bug • Getting the file list takes too long • Stupid bug
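The first two fixes above are guard checks run before conversion. A minimal sketch follows, assuming that "completeness" means at least "exists and is not empty" and that pages are numbered from 1; both helper names are hypothetical and the project's actual checks may differ.

```python
import os


def is_nonempty_file(path):
    """Guard for the empty-document bug: the file must exist and have content."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


def is_valid_page(page, page_count):
    """Guard for the PDF bug: the page must lie within 1..page_count."""
    return (isinstance(page, int) and isinstance(page_count, int)
            and page_count > 0 and 1 <= page <= page_count)


if __name__ == "__main__":
    print(is_nonempty_file(__file__), is_valid_page(4, 3))
```

Running both checks before handing a file to the conversion module turns a fatal error into a skipped document.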

  18. Bug Tracking (2/2) See BTD for more detail • Downloading/sending files is too slow • Stupid bug (sleeping in the sending loop) • Cannot get the file list or browse when clients use DHCP • Not solved because of the time limit. • Keyword search was not applied recursively to the sharing folder • Solved by writing the recursive code • Keyword search is too slow • Solved by improving the algorithm
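The recursive-search fix amounts to visiting every subfolder of the sharing folder rather than only its top level. A minimal sketch using the standard library's `os.walk`; the helper name `iter_shared_files` is hypothetical and this is not the project's Win32 code.

```python
import os


def iter_shared_files(root):
    """Yield every file under root, descending into all subfolders."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subfolders in a stable order
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    for path in iter_shared_files("."):
        print(path)
```

Each yielded path can then be digested or searched exactly as a top-level file would be.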

  19. Bug Statistics (chart; dates on the axis: 5/3, 5/7, 5/10, 5/24, 6/8, 6/9)

  20. Change Control History • Changed from a client-server architecture to a peer-to-peer architecture [4/23] • Changed the document digest from full-text based to digest-vector based [5/6] • Decided to allow recursive sharing in the sharing folder [6/1]

  21. Future Plan • Version 2.0 • Replication of multiple document copies within the community • Central backup mechanism • Version 3.0 • User management/authentication • User acknowledgment of document exchange • Bug fixes and support for more document formats

  22. The END Project Shipping Checklist: • Source code • Includes all surveyed components and the CVS repository. • Development documents • MRD, PRD, PED, PDD, QAD, BTD, and WDD • User’s Manual • Presentation file • Install program • Project CD containing all of the above
