ESTMD System -- A Web-based EST Model Database System

ESTMD is a web-based, user-friendly Expressed Sequence Tags model database system that helps biologists search expressed sequences and related information. It is built with HTML, Java Servlets, and MySQL.


Presentation Transcript


  1. ESTMD System -- A Web-based EST Model Database System
     Yinghua Dong
     MSE Presentation I

  2. Outline
     • Project Overview
     • Requirements
     • Cost Estimation
     • Project Plan
     • Potential Risks
     • Demonstration
     • References
     • Acknowledgments

  3. Project Overview -- Objective
     Build a web-based, user-friendly Expressed Sequence Tags model database (ESTMD) system that helps biologists search expressed sequences and related information in order to make further decisions.

  4. Project Overview -- Background
     • ESTs (Expressed Sequence Tags) are partial sequences of randomly chosen cDNA, each obtained from a single DNA sequencing reaction.
     • EST processing typically includes cleaning the raw sequences, assembling the cleaned sequences, and annotating and assigning functions to the unique sequences.
     [Figure: EST processing pipeline. Trace files → Phred → raw (clone) sequences → Cross_match & PERL program → cleaned (EST) sequences → Cap3 → assembled (unique) sequences → Blast → unique sequences with hit]

  5. Project Overview -- Background (cont’d)
     • Gene Ontology: a set of controlled vocabularies used to describe biological features within a specified domain of biological knowledge. Gene Ontology describes the molecular functions, biological processes, and cellular components of genes.
     • Pathway: the sequence of enzyme-catalyzed reactions by which an energy-yielding substance is utilized by protoplasm.

  6. Project Overview -- System Architecture
     Three-tier architecture:
     • Client tier: presents data and receives user input
     • Application-server tier: records and abstracts business processes
     • Data-server tier: stores data
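The separation between the three tiers can be sketched in plain Java. This is only an illustration under stated assumptions: every class and method name below is hypothetical, an in-memory map stands in for the MySQL data tier, and a string-building method stands in for the HTML client tier.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical three-tier sketch; all names are illustrative, not ESTMD's. */
final class ThreeTierSketch {

    /** Data-server tier: owns storage. */
    interface SequenceStore {
        String findSequence(String sequenceId); // returns null when absent
    }

    /** In-memory stand-in for the database (assumption for this sketch). */
    static final class InMemoryStore implements SequenceStore {
        private final Map<String, String> rows = new LinkedHashMap<>();

        InMemoryStore put(String id, String sequence) {
            rows.put(id, sequence);
            return this;
        }

        @Override
        public String findSequence(String sequenceId) {
            return rows.get(sequenceId);
        }
    }

    /** Application-server tier: business logic, no storage or markup details. */
    static final class SearchService {
        private final SequenceStore store;

        SearchService(SequenceStore store) {
            this.store = store;
        }

        /** Returns the sequence length, or -1 when the ID is unknown. */
        int sequenceLength(String sequenceId) {
            String seq = store.findSequence(sequenceId);
            return seq == null ? -1 : seq.length();
        }
    }

    /** Client tier: presentation only; it calls the service, never the store. */
    static String render(SearchService service, String sequenceId) {
        int length = service.sequenceLength(sequenceId);
        return length < 0
                ? sequenceId + ": not found"
                : sequenceId + ": " + length + " bp";
    }

    public static void main(String[] args) {
        // "CGCGGCCGCG" is just a short illustrative sequence.
        SearchService service =
                new SearchService(new InMemoryStore().put("Contig1", "CGCGGCCGCG"));
        System.out.println(render(service, "Contig1"));
        System.out.println(render(service, "Contig2"));
    }
}
```

Because the client code depends only on `SearchService`, the in-memory store could be swapped for a JDBC-backed one without touching the presentation code.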

  7. Project Overview -- Technologies and Tools
     • HTML with JavaScript will be used to build the client interfaces
     • Java Servlets, JSP (JavaServer Pages), and JDBC will be used on the server side
     • XML and XSLT will be used to describe and present the Gene Ontology tree structure
     • MySQL 4.0 is the database management system

  8. Project Overview -- Technologies and Tools (cont’d)
     • JBuilder Enterprise 9 is the development tool
     • Rational Rose is used to create the UML models
     • MS-Project is used for the project plan
     • A verification and validation tool (such as Alloy, USE, or SPIN) will be used for the formal requirements specification

  9. Project Overview -- E-R Model

  10. Requirements

  11. Requirements (cont’d)
     • Search in Detail
       • Users search for detailed information by gene name or symbol, sequence ID, FlyBase ID, or GenBank ID
       • Users can choose which fields are shown in the result
       • The output format is HTML or plain text
     • Sample output:
       • unisequenceID: Contig1
       • uniSeq: CGCGGCCGCGTCGACGAGATTCGGAGGTTAGAAACATGACTCGCAAACGCCGTAATGGAGGACGGGCTAAGCACGGCCGTGGCCACGTTAAGGCGGTGAGATGCACCAACTGCGCGCGTTGCGTGCCTAAGGACAAAGCTATCAAAAAGTTCGTGATCAGGAATATTGTCGAAGCGGCTGCCGTCAGGGATATCAACGAAGCTTCCGTATATGCATCATTCCAGCTGCCGAAGCTGTATGCAAAGCTCCACTACTGCGTCTCCTGCGCCATCCACAGCAAAGTTGTGCGCAACAGGTCTAAGAAGGACAGGAGAATCCGCACACCACCCAAGAGCACCTTCCCCAGGGACATGCAGCGCCCACAGAATGTGCAAAGGAAGTGAAGTGATTTACAATAAATTTTAAGAAAACCC
       • flybaseID: FBgn0004413
       • evalue: 2.00E-49
       • hitLength: 114
       • bitScore: 190
       • identity: 93/115
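One way to support user-selected result fields is to build the SELECT list from a whitelist and leave the search value as a JDBC placeholder. The sketch below is an assumption, not the ESTMD implementation: the table name `unisequence`, the choice of searchable keys, and the class itself are hypothetical; only the field names come from the sample output above.

```java
import java.util.List;
import java.util.Set;
import java.util.StringJoiner;

/** Hypothetical query builder; table and key names are assumptions. */
final class DetailQuery {

    // Fields a user may display, taken from the sample output on the slide.
    static final Set<String> FIELDS = Set.of(
            "unisequenceID", "uniSeq", "flybaseID",
            "evalue", "hitLength", "bitScore", "identity");

    // Columns a user may search by (an assumed subset).
    static final Set<String> KEYS = Set.of("unisequenceID", "flybaseID");

    /**
     * Builds SQL text for java.sql.PreparedStatement. Identifiers cannot be
     * bound as parameters, so they are checked against a whitelist; the
     * search value itself stays a '?' placeholder.
     */
    static String build(List<String> shownFields, String searchKey) {
        if (shownFields.isEmpty()) {
            throw new IllegalArgumentException("select at least one field");
        }
        if (!KEYS.contains(searchKey)) {
            throw new IllegalArgumentException("unknown search key: " + searchKey);
        }
        StringJoiner columns = new StringJoiner(", ");
        for (String field : shownFields) {
            if (!FIELDS.contains(field)) {
                throw new IllegalArgumentException("unknown field: " + field);
            }
            columns.add(field);
        }
        return "SELECT " + columns + " FROM unisequence WHERE " + searchKey + " = ?";
    }

    public static void main(String[] args) {
        System.out.println(
                build(List.of("unisequenceID", "evalue", "bitScore"), "flybaseID"));
    }
}
```

The returned string would be passed to `Connection.prepareStatement`, with the user's search term bound through `setString(1, ...)`, so user input never becomes part of the SQL text.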

  12. Requirements (cont’d)
     • Search by Keyword
       • Users search the sequences at each stage by keyword
       • The output includes sequence ID, length (with a link to the sequence), gene name, symbol, and a link to the contig view image
     [Figure: sample output]

  13. Requirements (cont’d)
     • Gene Ontology Search
       • Users search gene ontology information by gene names, symbols, IDs, or a text file
       • The output is a table including GO ID, term, type, sequence ID, hit ID, and gene symbol
       • The hyperlinks on terms show the gene ontology tree structure
     [Figure: sample output]
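Presenting a Gene Ontology subtree with XML and XSLT could look roughly like the following. This is a sketch under assumptions: the nested `<term>` XML layout and the stylesheet are invented for illustration, and the slides do not show ESTMD's actual encoding. It uses only the standard `javax.xml.transform` API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch: XML layout and stylesheet are illustrative only. */
final class GoTreeXslt {

    // Assumed encoding of a GO subtree: nested <term> elements.
    static final String SAMPLE_XML =
            "<term id='GO:0003674' name='molecular_function'>"
          + "<term id='GO:0005488' name='binding'>"
          + "<term id='GO:0003676' name='nucleic acid binding'/>"
          + "</term></term>";

    // Stylesheet that turns nested <term> elements into nested HTML lists.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='html' indent='no'/>"
          + "<xsl:template match='/'>"
          + "<ul><xsl:apply-templates select='term'/></ul>"
          + "</xsl:template>"
          + "<xsl:template match='term'>"
          + "<li><xsl:value-of select='@id'/><xsl:text> </xsl:text>"
          + "<xsl:value-of select='@name'/>"
          + "<xsl:if test='term'>"
          + "<ul><xsl:apply-templates select='term'/></ul>"
          + "</xsl:if></li>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML and returns the HTML text. */
    static String toHtml(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(SAMPLE_XML));
    }
}
```

The recursive `term` template mirrors the tree: each term becomes a list item, and child terms become a nested list inside it, so the depth of the ontology does not need to be known in advance.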

  14. Requirements (cont’d)
     • Gene Ontology Classification
       • Users input a batch of gene names or symbols, or a local text file containing sequence IDs
       • Users can choose the gene ontology types they want to classify by
       • The output is a table including gene ontology type, subtype, sequence count, and percentage of sequences
     [Figure: sample output]
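The classification table (type, subtype, sequence count, percentage) can be computed with a simple grouping step. The sketch below makes two assumptions that the slides do not state: each annotation arrives as a `{sequenceId, type, subtype}` triple, and the percentage is taken over the number of distinct input sequences. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch of the GO classification table. */
final class GoClassification {

    /**
     * Returns tab-separated rows "type, subtype, count, percent", sorted by
     * type and then subtype. Each input element is {sequenceId, type, subtype}.
     */
    static List<String> classify(List<String[]> annotations) {
        Set<String> allSequences = new TreeSet<>();
        Map<String, Set<String>> byCategory = new TreeMap<>();
        for (String[] a : annotations) {
            allSequences.add(a[0]);
            // Sets make repeated annotations of one sequence count once.
            byCategory.computeIfAbsent(a[1] + "\t" + a[2], k -> new TreeSet<>())
                      .add(a[0]);
        }
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : byCategory.entrySet()) {
            int count = e.getValue().size();
            // Assumed denominator: number of distinct input sequences.
            double percent = 100.0 * count / allSequences.size();
            rows.add(String.format(Locale.ROOT, "%s\t%d\t%.1f%%",
                    e.getKey(), count, percent));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> annotations = List.of(
                new String[] {"Contig1", "molecular_function", "binding"},
                new String[] {"Contig2", "molecular_function", "binding"},
                new String[] {"Contig2", "molecular_function", "catalytic activity"},
                new String[] {"Contig3", "biological_process", "metabolism"});
        classify(annotations).forEach(System.out::println);
    }
}
```

Because one sequence can carry several annotations, the percentages under this assumption need not add up to 100.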

  15. Cost Estimation
     The project effort is estimated with:
     • Function Point Analysis (FPA)
     • COCOMO II Model

  16. Cost Estimation -- Function Point Analysis
     • Unadjusted Function Points

  17. Cost Estimation -- Function Point Analysis (cont’d)
     • Total Unadjusted Function Points (UFP) = 138
     • Product Complexity Adjustment (PC) = 0.65 + (0.01 × 40) = 1.05
     • Total Adjusted Function Points (FP) = UFP × PC = 144.9
     • Language Factor (LF) for Java assumed as 35
     • Source Lines of Code (SLOC) = FP × LF = 5071.5
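The function point arithmetic above can be reproduced in a few lines. In this sketch the class and method names are hypothetical, and the slide's 40 is read as the total degree of influence from standard function point analysis, which is an interpretation rather than something the slide states.

```java
import java.util.Locale;

/** Hypothetical helper that replays the slide's function point arithmetic. */
final class FunctionPoints {

    /** PC = 0.65 + 0.01 * (total degree of influence, assumed meaning of 40). */
    static double complexityAdjustment(int totalDegreeOfInfluence) {
        return 0.65 + 0.01 * totalDegreeOfInfluence;
    }

    /** FP = UFP * PC. */
    static double adjustedFunctionPoints(int ufp, int totalDegreeOfInfluence) {
        return ufp * complexityAdjustment(totalDegreeOfInfluence);
    }

    /** SLOC = FP * LF, where LF is source lines per function point. */
    static double sourceLines(double functionPoints, int languageFactor) {
        return functionPoints * languageFactor;
    }

    public static void main(String[] args) {
        double pc = complexityAdjustment(40);
        double fp = adjustedFunctionPoints(138, 40);
        double sloc = sourceLines(fp, 35);
        System.out.printf(Locale.ROOT, "PC = %.2f, FP = %.1f, SLOC = %.1f%n",
                pc, fp, sloc);
    }
}
```

With the slide's inputs (UFP = 138, LF = 35) this gives PC = 1.05, FP = 144.9, and SLOC = 5071.5 after rounding.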

  18. Cost Estimation -- COCOMO II
     For application programs:
     • Delivered Source Instructions in thousands (KDSI) = 5.0715
     • Programmer Effort (PM) = 2.4 × (KDSI)^1.05 = 13.2 person-months
     • Development Time (TDEV) = 2.5 × (PM)^0.38 = 6.66 months
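The effort and schedule figures can be checked numerically. The sketch below uses exactly the coefficients given on the slide (2.4, 1.05, 2.5, 0.38), which coincide with the organic-mode constants of Boehm's basic COCOMO model; the class and method names are hypothetical.

```java
import java.util.Locale;

/** Hypothetical helper that replays the slide's effort and schedule formulas. */
final class CocomoEstimate {

    /** PM = 2.4 * KDSI^1.05, in person-months. */
    static double effortPersonMonths(double kdsi) {
        return 2.4 * Math.pow(kdsi, 1.05);
    }

    /** TDEV = 2.5 * PM^0.38, in months. */
    static double developmentMonths(double personMonths) {
        return 2.5 * Math.pow(personMonths, 0.38);
    }

    public static void main(String[] args) {
        double kdsi = 5071.5 / 1000.0; // SLOC from the function point estimate
        double pm = effortPersonMonths(kdsi);
        double tdev = developmentMonths(pm);
        System.out.printf(Locale.ROOT, "PM = %.1f person-months, TDEV = %.2f months%n",
                pm, tdev);
    }
}
```

Dividing PM by TDEV gives an average staffing level of roughly two people, which is a useful sanity check on the estimate.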

  19. Project Plan
     • Phase I: Requirement (1/12/04 ~ 3/1/04)
     • Phase II: Design (2/23/04 ~ 4/23/04)
     • Phase III: Implementation and Test (4/26/04 ~ 7/30/04)

  20. Project Plan (cont’d)

  21. Potential Risks
     • The requirements may change continually
     • Some biology knowledge is needed
     • Some new technologies, such as XML and XSLT, need to be learned

  22. Demonstration
     http://129.130.115.72:8080/estmd/index.html

  23. References
     • IEEE Std 830-1998, IEEE Recommended Practice for Software Requirements Specifications, IEEE, 1998
     • IEEE Std 730-1998, IEEE Standard for Software Quality Assurance Plans, IEEE, 1998
     • Walker Royce, Software Project Management: A Unified Framework, 1998
     • Marty Hall, Core Servlets and JavaServer Pages, 2000
     • Roger S. Pressman, Software Engineering: A Practitioner’s Approach, 5th Edition
     • Dr. Gustafson, CIS 540 lecture
     • http://sunset.usc.edu/research/COCOMOII/index.html

  24. Acknowledgments
     Committee:
     • Dr. Mitchell L. Neilsen
     • Dr. Gurdip Singh
     • Dr. Daniel Andresen

  25. Suggestions and Comments
     Thank You!
