
Project Presentation Document Optimization 11 May 2007


Presentation Transcript


    1. Project Presentation Document Optimization 11 May 2007 Team members: Chris Catalano, Chun-Yu Chang, Chris Joson, David Matthes. Hello… We’re the Forensic Doc Optimization group. Introduce team members.

    2. EDD Background: Electronic Data Discovery (EDD) is the systematic collection, processing and review of electronic files to support the litigation process. EDD is used in: alleged stock option backdating; government reviews of mergers and acquisitions; other dirty deals, e.g. blackmail, fraud, embezzlement. The current processing system was designed for component flexibility and variability. The marketplace is shifting to an environment that holds speed and automation paramount.

    3. Project Objectives: Evaluate the current EDD system against two alternatives (client: Huron Consulting Group). Evaluate SysML as an effective modeling language for systems engineering (client: Aerospace Corporation).

    4. Approach: Modeled and compared three EDD systems in SysML. Evaluated the EDD systems from a capital budgeting perspective. Quantitatively evaluated our experience with SysML.

    5. Agenda: Approach - SysML Model; Analysis - Trade Study; Evaluation - SysML Usability. Thank you, Chris. Hi, I’m Chun-Yu, and I will be discussing the approach we took to meet the objectives given by our sponsors. As Chris mentioned earlier, one of the team’s objectives is to analyze and compare the performance of three different EDD systems. Because many of the team members had no prior domain knowledge of electronic data discovery systems, we needed an approach to help us understand, capture, and communicate the intricacies of the EDD systems to each other as well as to the sponsors. The approach we took was to capture and model the details of the EDD systems using SysML.

    6. Approach - SysML Model: SysML is a modeling language that allows system designers to define, analyze and communicate different viewpoints of a system with various diagrams in a model-driven environment. The document optimization team used several SysML diagrams to model and gain insights into the EDD systems. I will be introducing the requirement diagram, the use case diagram, and the block definition diagram. My teammate Chris Joson will discuss the activity diagrams a little later.

    7. The Requirements Perspective: One of the first steps of systems engineering is to understand the requirements. The requirement diagram captures the requirements of the EDD system and provides the following information: hierarchy of requirements (requirements can be decomposed) and traceability of requirements (requirements can be allocated to EDD components). This allows us to verify whether we covered all the EDD specifications.

    8. Contextual Perspective: We want to understand how the EDD system is used. The use case diagram provides us with a basic understanding of the context of EDD and shows how external entities interact with EDD. It shows the following: the Customer provides data; the Processing Team processes the data; the Review Team reviews the results of the processed data.

    9. Structural Perspective: We want to understand the role of each component in EDD. The block definition diagram groups and describes the characteristics and behaviors of components in the EDD System. This provides us with an understanding of the role and functionality of each component in the EDD System.

    10. Activity Perspective: Describes the activities of the EDD System. This is our activity model, which shows the flow of activities performed in the current EDD process. It starts with getting the native files and creating a source inventory. Then, if there is email, an extraction program extracts all the email from Lotus Notes, Microsoft Outlook, etc. All the email and edocs are compiled into working data (WD). Any archived data are then unarchived, an inventory is taken, and any duplicate emails, documents, files, and pictures are culled. A worker copies the WD to servers, and another software program searches and indexes all of the data. The searchable, indexed WD is then checked for Excel files, and a Format XLS macro gives all the Excel spreadsheets the same format (i.e. comma separated). Express then extracts TIFFs and text. Finally, Genjob, Gen Output, Infomatik, and Branded format the TIFFs and text into the final delivery format.
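
The culling step in the activity flow above could be sketched as follows. The slides do not say how duplicates are detected, so hashing file contents is an assumption here, and the function name and sample data are hypothetical.

```python
import hashlib


def cull_duplicates(files):
    """Return the names of files to keep, dropping exact content duplicates.

    `files` maps a file name to its raw bytes. This in-memory dict is a
    hypothetical stand-in for the working data (WD); the real system's
    duplicate test is not described in the slides.
    """
    seen_digests = set()
    kept = []
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest not in seen_digests:
            # First time this content appears: keep the file.
            seen_digests.add(digest)
            kept.append(name)
    return kept


working_data = {
    "memo.doc": b"quarterly results",
    "memo_copy.doc": b"quarterly results",  # same bytes as memo.doc
    "photo.jpg": b"\x89binary",
}
print(cull_duplicates(working_data))  # ['memo.doc', 'photo.jpg']
```

Content hashing only catches byte-identical copies; near-duplicates (for example, the same email stored in two mail formats) would need a different comparison.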

    11. Partitions

    13. Advantages of Alternative Process: Fewer manual steps; reduced probability of error; simpler to maintain; easier to train; less rigid process; shorter time to process documents.

    14. Agenda: Approach - SysML Model; Analysis - Trade Study; Evaluation - SysML Usability. We’re going to present the following topics.

    15. Net Present Value Probability Distribution: The goal was to model the financial impact of each alternative over three years using Net Present Value (NPV). NPV is a capital budgeting technique used to estimate and compare cash flows for competing systems and projects. For each system, the Net Cash Flow was decomposed, modeled, and run in a Monte Carlo simulation to generate NPV estimates. The results are NPV probability distributions for each alternative.
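
A minimal sketch of how a Monte Carlo run can turn uncertain cash flows into an NPV distribution. The team's actual model is not shown in the transcript, so the normal distribution, the 10% discount rate, the cash-flow figures, and all function names below are illustrative assumptions.

```python
import random
import statistics


def simulate_npv(initial_cost, mean_flow, sd_flow,
                 rate=0.10, years=3, runs=10_000, seed=0):
    """Draw `runs` NPV samples for one system.

    Each year's net cash flow is drawn from a normal distribution
    (an assumption; the slides do not name the distributions used).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    samples = []
    for _ in range(runs):
        discounted = 0.0
        for t in range(1, years + 1):
            cash_flow = rng.gauss(mean_flow, sd_flow)
            discounted += cash_flow / (1 + rate) ** t
        samples.append(discounted - initial_cost)
    return samples


# Hypothetical inputs, chosen only to exercise the function.
samples = simulate_npv(initial_cost=2_000_000,
                       mean_flow=1_000_000,
                       sd_flow=300_000)
print("mean NPV:", round(statistics.mean(samples)))
print("std dev :", round(statistics.stdev(samples)))
print("P(NPV < 0):", sum(s < 0 for s in samples) / len(samples))
```

Running the same function once per alternative, each with its own cost and cash-flow inputs, gives one sample list per system; histograms of those lists are the probability distributions being compared.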

    16. Net Present Value: Compared to the baseline, the alternative systems increase the processing speed and the ability to accept projects. The trade-off is increased costs. Autonomy: $2,000,000 initial cost and $250,000 annual maintenance cost. Attenex: $500 per gigabyte processed (operational cost). How do the increased ability to accept new projects and the increased costs impact the profitability of the systems?
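
As a rough illustration of the cost trade-off, the arithmetic below compares the two cost structures using only the figures quoted on the slide. It ignores discounting and revenue, so it is a simplification and not the team's NPV model; the function names are hypothetical.

```python
AUTONOMY_INITIAL = 2_000_000  # dollars, from the slide
AUTONOMY_ANNUAL = 250_000     # dollars per year, from the slide
ATTENEX_PER_GB = 500          # dollars per gigabyte processed, from the slide


def autonomy_cost(years):
    """Undiscounted Autonomy cost over `years` years."""
    return AUTONOMY_INITIAL + AUTONOMY_ANNUAL * years


def attenex_cost(gigabytes):
    """Attenex cost for processing `gigabytes` of data."""
    return ATTENEX_PER_GB * gigabytes


def break_even_gigabytes(years):
    """Volume at which Attenex costs as much as Autonomy over `years`."""
    return autonomy_cost(years) / ATTENEX_PER_GB


print(autonomy_cost(3))         # 2750000
print(break_even_gigabytes(3))  # 5500.0
```

Under these simplifications, Attenex is the cheaper option over three years only while total volume stays below 5,500 GB.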

    17. NPV – Results

    18. Conclusions & Recommendations: The model shows that by increasing the opportunity to accept new projects, the alternative systems can overcome the increased costs! The future system for Huron will be a hybrid of the alternatives. The process used for a particular project will depend on the client’s requirements. The baseline system, while slower, provides a reliable and cost-effective solution. For clients who choose higher speeds at higher costs, Attenex would be an ideal fit. (Huron already owns licenses for the software!) It is critical to spread the costs of Autonomy across the three EDD groups, in effect distributing the responsibility for recouping the investment!

    19. Agenda: Approach - SysML Model; Analysis - Trade Study; Evaluation - SysML Usability. We’re going to present the following topics.

    20. Purpose of the SysML Evaluation: Aerospace asked us to evaluate SysML to determine how effectively SysML and Rational System Developer worked. Evaluate SysML as a modeling language for designing systems: evaluate SysML maturity; determine how useful SysML is for systems engineering design and evaluation. Evaluate IBM Rational System Developer: determine how well it supports SysML usage.

    21. Approach: Survey. Created a Multi-Attribute Utility Assessment Evaluation Hierarchy survey. The survey contained 41 questions developed to assess the strengths and weaknesses of SysML and Rational System Developer. Questions were answered on a 1 to 5 Likert scale, with 5 indicating a positive response. Surveyed 8 OR680 students using SysML on two projects: Electronic Data Discovery (EDD) and Tactical Surveillance Satellite (TSS).

    22. Multi-Attribute Utility Assessment Evaluation Hierarchy: Dr. Adelman provided the team with a multi-attribute utility assessment evaluation hierarchy. We are currently refining questions to make a questionnaire with ratings of 1 to 5 for all the SysML users to take so that we can attempt to measure SysML performance and usability.

    23. Utility Results

    24. Survey Analysis. SysML strengths: Overall, respondents felt SysML was a good language; it scored well in usability and flexibility. SysML weaknesses: The main weakness in SysML is that it is difficult to learn; respondents took 20-40 hours to become a functional user. Rational System Developer strengths: Rational System Developer scored highest in usability; the survey indicates that people found Rational System Developer fairly easy to use. Rational System Developer weaknesses: The survey indicated low scores for ease of training; the interface and product quality also scored lower than other areas.

    25. Recommendation to Aerospace. SysML: SysML is difficult to learn and will require investment in training and time. It may not be practical for smaller systems or processes with limited complexity. However, if people are already trained, SysML diagrams ensure consistency and provide effective communication across multiple disciplines. Rational System Developer: Rational supported the creation of models and helped maintain consistency. Process descriptions were created and analysis performed using Rational and SysML. SysML is well suited for complicated systems with significant hierarchical decomposition, systems common in the National Security Space domain.

    26. Summary: Huron asked us to evaluate their current EDD system and two alternatives. We used SysML and NPV to perform the analysis and determined that the best solution is a mix of the current system for most clients and Autonomy for clients that require faster processing and can afford the increased cost. Aerospace asked us to evaluate SysML to determine how effectively it can support system engineering design and analysis. We conducted a survey to help answer this question. The survey found that SysML is a useful tool, but the learning curve is steep.

    27. Acknowledgements: Heather Howard, Shana Lloyd, and Julie Street, Aerospace Corporation; Chris Genter, Huron Consulting Group; Professor Laskey, George Mason University; Sanford Friedenthal, Lockheed Martin; Professor Adelman, George Mason University; the TSS Team: David Alexander, Kevin Sadeghian, Siroos Sekhavat, and Tom Saltysiak.

    28. Future Work: Optimize the Parametric Diagram to make the model executable. Run the executable model. Compare the executable model results with results obtained from Microsoft Excel. Distribute the SysML survey to future students for a larger sample and further analysis.

    29. Questions?

    30. Backup

    31. Decomposition of components to provide more detail. What are the detailed activities in the operations? How the components in the system perform the operations or activities will be described with the next set of diagrams (by Chris). What are the attributes used for? We may choose to optimize some of the attributes in the subcomponents of EDD.

    32. Parametric Diagram: Parametric diagrams were created to express constraints between value properties and to support an executable model. An executable model can provide analysis of performance, safety, reliability, throughput, weight, cost, etc. High learning curve; lack of time (an estimated 20+ additional hours to learn SysML limitations); inexperience with the simulation toolkit (an estimated 30+ hours to execute with the toolkit); team inexperience with Java (an estimated 70+ hours to learn Java).

    33. Sample Survey Questions: Questions focus on either SysML as a language or IBM Rational System Developer as a tool. Most questions will be rated on a scale of 1 to 5. Responses will be averaged together to determine a score for each category. Sample questions: Overall, SysML improves the system design process. Rational System Developer provides feedback when processing user commands. SysML was easy to learn. I can easily add model elements to the System model.
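
The averaging step described above can be sketched in a few lines. The category names and ratings below are made up for illustration; the transcript does not include the actual survey data or category list.

```python
from collections import defaultdict
from statistics import mean


def category_scores(responses):
    """Average 1-to-5 ratings per category.

    `responses` is an iterable of (category, rating) pairs, one pair per
    answered question per respondent.
    """
    ratings_by_category = defaultdict(list)
    for category, rating in responses:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range: {rating}")
        ratings_by_category[category].append(rating)
    return {cat: mean(vals) for cat, vals in ratings_by_category.items()}


# Hypothetical responses, not taken from the team's survey.
sample = [
    ("SysML usability", 4),
    ("SysML usability", 5),
    ("Ease of learning", 2),
    ("Ease of learning", 3),
]
print(category_scores(sample))
# {'SysML usability': 4.5, 'Ease of learning': 2.5}
```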

    34. The survey will have participants answer a series of questions.

    35. Webpage: mason.gmu.edu/~cchang7 Transition to webpage development.

    36. This is a screenshot of our website. We have it running on our PCs. We just have to do the work to post this to a host site for the public. Demo of website to be performed in class from local PC.

    37. General Status

    38. Schedule: We’re still on schedule. The Attenex model needs more work.

    39. NPV Backup

    40. NPV - Formula: NPV = Σ (t = 1 to n) [ Ct / (1 + r)^t ] − Co, where t is the time period, n is the total project time, r is the discount rate, Ct is the net cash flow at time t, and Co is the initial capital expenditure at time zero.
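
The symbols defined on this slide match the standard NPV form, in which each period's net cash flow Ct is discounted by (1 + r)^t and the initial expenditure Co is subtracted. A minimal sketch under that assumption, with made-up numbers:

```python
def npv(rate, initial_cost, cash_flows):
    """Net present value.

    rate         -- discount rate r per period
    initial_cost -- Co, capital expenditure at time zero
    cash_flows   -- [C1, C2, ..., Cn], net cash flow for periods 1..n
    """
    discounted = sum(
        c_t / (1 + rate) ** t
        for t, c_t in enumerate(cash_flows, start=1)
    )
    return discounted - initial_cost


# Hypothetical example: spend 1000 now, receive 500 a year for 3 years at 10%.
print(round(npv(0.10, 1000, [500, 500, 500]), 2))  # 243.43
```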

    41. NPV - Assumptions. Number of Projects Limitations: The number of projects entering the system cannot be greater than the maximum level of availability. Project Start and Completion Time: All projects started in a month are assumed to be completed within that month. In practice, this assumption can be interpreted as larger-scale projects starting early in the month and smaller projects starting later in the month. Minimum Revenue: $2,500 is the minimum amount of revenue accepted for a job. Autonomy Costs: The Autonomy system has an initial cost of $2 million and an operational cost of $250,000 annually. Attenex Costs: The Attenex system has an operational cost of $500 per GB processed. Prospective Projects: The level at which prospective projects are found is consistent for all systems. Availability Parameter: The availability parameter is used to model the size and availability of the queue for incoming projects. Pricing Scheme: The pricing scheme is constant for each system over the three-year period. No adjustments have been made to the pricing schemes of the higher-cost alternatives. Migration Costs: With the exception of initial software costs, all migration costs are ignored in this model.

    42. NPV - Revenue Inputs (1). Annual Revenue: The annual revenue is the sum of twelve monthly revenue estimates. Monthly Revenue: The monthly revenue is the sum of the revenue for each job accepted and completed in a month. Revenue per Project: The revenue per project is the amount of revenue in dollars that is generated by a project. Projects Accepted: This value is the total number of projects entered into the system each month.

    43. NPV- Revenue Inputs (2) Maximum level of System Availability: The maximum level of system availability is the largest number of projects that can enter into the system each month. Number of Prospective Projects: The number of prospective projects describes the number of projects that are available to be entered into the system. Number of Staff: The number of staff plays a critical role in limiting the number of jobs that can be entered into the system each month. Processing Speed: Processing speed describes the rate at which projects can be pulled through the system.
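
The revenue inputs can be tied together in a short sketch. It reads the $2,500 minimum as a filter on which jobs are accepted and takes jobs in arrival order up to the availability cap; both readings are assumptions, as are the function names and sample figures.

```python
MINIMUM_REVENUE = 2_500  # dollars per job, from the assumptions slide


def projects_accepted(prospective, max_availability):
    """Projects entering the system cannot exceed maximum availability."""
    return min(prospective, max_availability)


def monthly_revenue(project_revenues, max_availability):
    """Sum the revenue of the jobs accepted and completed in one month.

    Jobs below the minimum revenue are skipped (assumed interpretation),
    and the remaining jobs are accepted in arrival order up to the cap.
    """
    eligible = [r for r in project_revenues if r >= MINIMUM_REVENUE]
    count = projects_accepted(len(eligible), max_availability)
    return sum(eligible[:count])


def annual_revenue(monthly_revenues):
    """Annual revenue is the sum of twelve monthly revenue estimates."""
    if len(monthly_revenues) != 12:
        raise ValueError("expected twelve monthly estimates")
    return sum(monthly_revenues)


# Hypothetical month: four prospective jobs, room for two.
month = monthly_revenue([10_000, 1_000, 4_000, 8_000], max_availability=2)
print(month)                         # 14000
print(annual_revenue([month] * 12))  # 168000
```

In this reading, staffing and processing speed act on revenue only through `max_availability`: a faster system raises the cap and so lets more prospective projects through.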

    44. NPV- Cost Inputs Initial Costs: The costs used to procure new software and equipment for the alternative systems at the onset of the migration. The initial costs are incurred once at the beginning of the project. Maintenance Costs: Monthly costs associated with maintaining the software and hardware systems. The maintenance costs include repairing machines, software upkeep and spare parts. Salary Costs: Monthly costs related to employee salaries. Operational: Monthly costs related to procuring additional equipment, software and the overhead costs related to the building and facilities.

    45. NPV – Parametric Diagram
