
Processing and Analyzing Electronic Data

Processing and Analyzing Electronic Data. Arizona Paralegal Association Phoenix, September 12, 2006. Cliff Shnier, JD Director, Business Development Cataphora Inc Scottsdale, AZ 480-661-6183 cliff@cataphora.com. The New Rules: they’re He-e-e-e-ere!.


Presentation Transcript


  1. Processing and Analyzing Electronic Data Arizona Paralegal Association Phoenix, September 12, 2006 Cliff Shnier, JD Director, Business Development Cataphora Inc Scottsdale, AZ 480-661-6183 cliff@cataphora.com

  2. The New Rules: they’re He-e-e-e-ere! • The Supreme Court approved the changes and transmitted them to Congress on April 12, 2006. • All that’s needed is the enabling legislation. • These rule changes affect Rules 16, 26, 33, 34, 37, 45 and Form 35.

  3. FRCP 26(a) [amended] Rule 26. General Provisions Governing Discovery; Duty of Disclosure • REQUIRED DISCLOSURES… a party must, without awaiting a discovery request, provide a copy of all documents, electronically stored information, and tangible things… in its control that it may use to support its claim or defense. • This is as far as many countries go! • Quebec Code of Civil Procedure, Art. 331.1 and 402 – 403; • French Code of Civil Procedure, Article 753.

  4. English Translation of Art. 753 of French Code of Civil Procedure: • Pleadings shall set out expressly the claims of the parties as well as the issues of law and fact which are the basis of each claim. A memorandum listing the documents in support of these claims shall be annexed to the pleadings. • And that’s all you have to produce, mon ami.

  5. FRCP 26(b) • (b) DISCOVERY SCOPE AND LIMITS. … the scope of discovery is as follows: • (1) In General. Parties may obtain discovery regarding any matter, not privileged,... relevant to the claim or defense of any party, including… any books, documents... • This is much more than 26(a), and this is where the U.S. goes much further than most other jurisdictions.

  6. The view from “over there” • “In the United Kingdom, extensive American-style discovery is viewed as a cultural anomaly and a wasteful extravagance. Computer-based discovery is viewed as particularly obtrusive.” • Ken Withers, address at the University of Edinburgh, April 2001.

  7. “exponentially greater volume” From page 22 of the Commentary by the Rules Committee, Sept 2005

  8. Electronic Data Volumes • 1 MB = roughly 75 pages • 1 GB = roughly 75,000 pages • Therefore, 30 GB = 2.25M pages = 1,000 boxes = 250 lineal feet of five-tier shelves = 50 file cabinets. • “On a single ten square inch hard drive, more data can be stored than would fit on the entire floor of a building.” Arkfeld, Electronic Discovery and Evidence, p. 1-9, quoting a 1999 article by Kimberly Richard in 21 Whittier L. Rev. 463 • QUIZ: A company with 10,000 employees generates 2.5 million e-mail messages per: • ___ Year? ___ Month? ___ Week?
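The page arithmetic on this slide can be checked with a short sketch. It assumes 1 GB is treated as 1,000 MB, which is what the slide's own figures imply; the function name is hypothetical, and the pages-per-box figure is simply derived from the slide's "1,000 boxes".

```python
# Rule-of-thumb conversion taken from the slide: 1 MB is roughly 75 pages.
PAGES_PER_MB = 75
# Assumption: 1 GB = 1,000 MB, implied by "1 GB = roughly 75,000 pages".
MB_PER_GB = 1000


def pages_for_gigabytes(gigabytes: float) -> int:
    """Estimate printed pages for a data volume given in gigabytes."""
    return round(gigabytes * MB_PER_GB * PAGES_PER_MB)


pages = pages_for_gigabytes(30)
print(pages)          # 2250000, i.e. the slide's 2.25M pages
print(pages / 1000)   # 2250.0 pages per box if spread over 1,000 boxes
```

These are planning estimates only; real page counts depend heavily on file types.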

  9. We’ve had PCs since the early ’80s • So why didn’t e-Discovery show up until the mid-1990s?

  10. The answer of course, is… • Connectivity -- the Internet! • Until mid-90’s, computers were just tools to create paper documents. • Then very quickly, business switched to written communication without paper. • e-mail replaced paper (and fax). • By 2000, paper “a superfluous by-product.” • e-mail even replaced the telephone. “Many informal messages that were previously relayed by telephone or at the water cooler are now sent via email.” Byers v Illinois State Police (N.D. Ill, 2002)

  11. The explosion of electronic data • Over 95% of corporate documents are now electronic • Email has become indispensable • All electronic documents are discoverable • No more “I won’t ask if you won’t ask”. They’re asking. • [Chart: U.S. Corporate E-mail Volume Growth, in trillions; labeled value 3.400 trillion. Source: Wall Street Journal, January 10, 2000; IDC]

  12. The path not taken… • The committee might easily have decided that broad scope was no longer tenable. • Instead, they mostly preserved modern US-style broad discovery • and recognized that technology, the source of the problem, is also the source of the solution …

  13. FRCP 26(f): “Meet and Confer” Rule 26. General Provisions Governing Discovery; Duty of Disclosure • (f) to discuss any issues relating to preserving discoverable information…and to develop a proposed discovery plan concerning… • (3) discovery of electronically stored information including the form/s in which it should be produced… • (4) relating to privilege or protection as trial-preparation including asserting such claims after [inadvertent] production

  14. The Stages of Discovery when it was Paper • 1. Collect: Physical collection (or delivery) of documents • 2. Organize: Photocopy or scan, Bates number, track documents in boxes or, by the ’90s, code into a database • 3. Review: Evaluate for production; decide relevance; decide privilege • 4. Produce: Ultimate physical delivery of documents; receiving from the other side

  15. Except with Electronic Data, there’s also an earlier step: Preservation • 0. Preserve: Ensure the electronic data you need is kept intact • 1. Collect: How to find/copy/compile responsive ESI? • 2. Organize: How to process ESI (and its much greater volume) so you can review it and utilize it? • 3. Review: How to review ESI (and its much greater volume)?

  16. Obligation to Preserve (1)

  17. Obligation to preserve (2):

  18. The Stages of Discovery: the challenges when the information is Electronic • 1. Collect: How to find/copy/compile responsive EDD? • 2. Organize: How to process EDD (and its much greater volume) so you can review it and utilize it? • 3. Review: How to review EDD (and its much greater volume)? • 4. Produce: What is the best method for producing EDD? (And how would you like to receive it?)

  19. The Stages of Discovery: Moving on from “Step Zero”, Preservation, to “Step 1”, Collection • 1. Collect: How to find/copy/compile responsive EDD? • 2. Organize: How to process EDD (and its much greater volume) so you can review it and utilize it? • 3. Review: How to review EDD (and its much greater volume)? • 4. Produce: What is the best method for producing EDD?

  20. After you’ve collected the electronic data… • “…remember, that’s all you’ve got at that point. A whole lot of messy electronic data.” William Cwiklo, Panelist on Electronic Data Discovery, Glasser LegalWorks, Fairmont Hotel, San Francisco, February 1999.

  21. Discovery Stage 2: Organize that Electronic Data, meaning Process it somehow to make it usable • 1. Collect: How to find/copy/compile responsive EDD? • 2. Organize: How to process E-Data (and its much greater volume) so you can review it and use it? • 3. Review: How to review EDD (and its much greater volume)? • 4. Produce: What is the best method for producing EDD?

  22. The Options for Processing Electronic Data (1-2) • 1. Print Everything: Print out the entire collection (from native app) and review paper for relevancy. • 2. Print->Scan->Code: The “1997” model • “In the shift to a new medium, the content reflects the previous medium.” -- Marshall McLuhan • Example: the first ten years of television were visual radio. (Acknowledgment to Michelle Ostrom of Attenex.) http://www.mcluhan.utoronto.ca/mcluhanprojekt/allen2.htm

  23. 1997 processing: Print-Scan-Code; Electronic to Paper to Electronic • [Flow diagram; labels as extracted: Paralegal/Word Processing; Print out all files; Paper; Scanner; Results; Responsive review; (OCR); Coder; Litigation Database; Production]

  24. The Options for Processing Electronic Data (3): Why “process” electronic data at all? • 1. Print Everything: Print out the entire collection (from native app) and review paper for relevancy. • 2. Print->Scan->Code: The “1997” model • 3. “Do Nothing”: Review each custodian’s files in their Native format, and using the Native application software itself. • So what’s wrong with “doing nothing”?

  25. The No-Process “Do nothing” approach: Using Outlook to review Outlook • No tagging, No annotating • No Redacting • Merely moving the data to another machine changes its appearance.

  26. Using Outlook to review Outlook: “Advanced Find” • Slow • Limited search flexibility • Responses are simply a listing of e-mails; can’t format reports • Will NOT search attachments

  27. The Options for Processing Electronic Data (4) • 1. Print Everything: Print out the entire collection (from native app) and review paper for relevancy. • 2. Print->Scan->Code: The “1997” model • 3. “Do Nothing”: Review each custodian’s files in their Native format, and using the Native application software itself. • 4. Convert (‘process’) electronic data to another electronic form better suited to reviewing: Then review entire collection either with in-house litigation support software or on-line through an ASP Repository

  28. Processing Electronic Data – Conversion to TIFF in the late 1990’s • Conversion of e-mails and e-docs to: • a TIFF image, linked to • indexed bibliographic information; • with full text; • and maintains parent/attachment relation. • A faster, cheaper way to convert e-data to the model we had gotten used to with paper – a database record linked to a scanned image.

  29. By 1999, processing that Electronic Data meant Converting it to TIFF • 1. Collect: How to find/copy/compile responsive EDD? • 2. Process: How to process E-Data so you can review it and use it? The answer for a while was Convert to TIFF • 3. Review: How to review EDD (and its much greater volume)? • 4. Produce: What is the best method for producing EDD?

  30. But the volume kept growing! • [Chart: U.S. Corporate E-mail Volume Growth, in trillions; labeled value 3.400 trillion. Source: Wall Street Journal, January 10, 2000; IDC]

  31. Sedona Conference, Search and Information Retrieval, Principle 1: In litigation… where the volume of discoverable electronically stored information is large, it may not be feasible to perform human review of every document for responsiveness or privilege, and automated search and information retrieval methods and tools may be necessary and valuable. • This isn’t just a brainstorm of words and phrases.

  32. Courts now expect automated processes to identify responsive data • “A responding party may satisfy its good faith obligation to preserve and produce potentially responsive electronic data and documents by using electronic tools and processes, such as data sampling, searching, or the use of selection criteria, to identify data most likely to contain responsive information.” (emphasis added) • Zakre v. Norddeutsche Landesbank Girozentrale, 2004 WL 764895 (S.D.N.Y. Apr. 9, 2004) adopting Sedona Principle 11 verbatim.

  33. Automated tools in e-discovery • De-duplication • Keywords and Boolean • Statistical Clustering • Natural Language and fuzzy searching • Concept search tools • Taxonomies and Ontologies
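As an illustration of the first item, de-duplication, here is a minimal sketch. The slide does not say how any particular tool detects duplicates; hashing each document's content with SHA-256 and keeping the first copy is an assumption made for the example, and the function name and sample data are hypothetical.

```python
import hashlib


def dedupe(documents):
    """Return the IDs of documents whose content has not been seen before.

    `documents` is an iterable of (doc_id, text) pairs. Two documents count
    as duplicates only if their text is byte-for-byte identical (assumption).
    """
    seen_digests = set()
    unique_ids = []
    for doc_id, text in documents:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen_digests:
            seen_digests.add(digest)
            unique_ids.append(doc_id)
    return unique_ids


collection = [
    ("msg-001", "Please review the attached contract."),
    ("msg-002", "Please review the attached contract."),  # exact copy
    ("msg-003", "Lunch on Friday?"),
]
print(dedupe(collection))  # ['msg-001', 'msg-003']
```

Exact-match hashing misses near-duplicates (for example, the same e-mail with a different footer), so it is a floor on what de-duplication can remove, not a ceiling.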

  34. “Search Engine” Software • Attenex • Autonomy • Cataphora • Dolphin Search • Engenium • Guidance • Stratify • Syngence

  35. Approaches to Data Organization • Keyword: specific exact words • Clustering: similarity of salient features • Ontology: generalized words or phrases • Social Network Analysis: relationships among relevant people • Relationship Analysis: documents with causal or sequential relationship • [Diagram levels: Content, Concept, Context]

  36. A Simple Ontology • ROYALTY CONCEPT • Royalty • Commission • Honorarium • Usage Fee • Slice of the Pie

  37. A More Realistic Ontology • ROYALTY CONCEPT • royalty • royalties • rty • commission • commissions • comm. • honorarium • honorariums • honoraria • usage fee • usage charge • usg fee • use fee • fee for use • fee for usage • incent* • insent* • charge for use • charged for use • charging for use • charges for use • licence fee • license fee • lisense fee • “take cut”~2 • “takes cut”~2 • “took cut”~2 • “slice pie”~5 • “piece pie”~5 • “piece action”~5 • “slice action”~5 • -king • -queen • -prince • -princess
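One way to read the notation in this ontology is sketched below. The interpretation is an assumption, not something the slides define: `incent*` is treated as a prefix wildcard, `“slice pie”~5` as "both words with at most 5 words between them", and `-king` as "exclude documents containing this word". All function and variable names are hypothetical, and only a few of the slide's terms are included.

```python
import re

# A small subset of the slide's ROYALTY CONCEPT terms.
PHRASES = ["royalty", "royalties", "license fee", "lisense fee", "usage fee"]
PREFIXES = ["incent", "insent"]                       # incent*, insent*
PROXIMITY = [("slice", "pie", 5), ("take", "cut", 2)]  # "slice pie"~5, "take cut"~2
EXCLUDE = {"king", "queen", "prince", "princess"}     # -king, -queen, ...


def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def has_phrase(toks, phrase):
    """True if the words of `phrase` appear consecutively in `toks`."""
    words = phrase.split()
    n = len(words)
    return any(toks[i:i + n] == words for i in range(len(toks) - n + 1))


def within(toks, a, b, max_gap):
    """True if `a` and `b` occur with at most `max_gap` words between them."""
    pos_a = [i for i, t in enumerate(toks) if t == a]
    pos_b = [i for i, t in enumerate(toks) if t == b]
    return any(abs(i - j) - 1 <= max_gap for i in pos_a for j in pos_b)


def matches_royalty_concept(text):
    toks = tokens(text)
    if EXCLUDE & set(toks):  # exclusion terms veto the document
        return False
    return (
        any(has_phrase(toks, p) for p in PHRASES)
        or any(t.startswith(p) for t in toks for p in PREFIXES)
        or any(within(toks, a, b, gap) for a, b, gap in PROXIMITY)
    )


print(matches_royalty_concept("He wants a slice of the pie"))         # True
print(matches_royalty_concept("The queen met royalty at the palace"))  # False
```

The deliberate misspellings in the list (such as "lisense" and "insent*") are kept as search terms because real e-mail contains typos.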

  38. Reviewing the Right Data • Intake Data 100% • Duplicates 25% • Non-Responsive (NR) Junk 20% (Spam/Jokes/etc.) • NR Business 20% • NR Personal 20% • Privileged 3% • Relevant & Responsive 12% • Estimates: These figures vary based upon the data set received

  39. Getting to Responsive Data: Keywords versus Ontologies • Reviewable: 1.575 • “Responsive” to Keywords: 0.842 • Final Ontology Pass Responsive: 0.109 • All numbers in millions of items

  40. Yet for all that breadth, keywords still miss vital documents! • 8,553 responsive documents missed by keyword search • (Almost 8% of responsive documents missed by keyword search)
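The proportions behind slides 39 and 40 can be reproduced with a few lines. This is a sketch under one assumption: that the "almost 8%" is measured against the 0.109 million items in the final ontology pass, which the slides do not state explicitly. The counts are the slides' rounded figures.

```python
reviewable = 1_575_000           # slide 39: 1.575 million reviewable items
keyword_hits = 842_000           # slide 39: "responsive" to keywords
ontology_responsive = 109_000    # slide 39: final ontology pass (rounded)
missed_by_keywords = 8_553       # slide 40

keyword_share = keyword_hits / reviewable
ontology_share = ontology_responsive / reviewable
missed_share = missed_by_keywords / ontology_responsive  # assumed denominator

print(f"Keyword hits:  {keyword_share:.1%} of reviewable items")   # 53.5%
print(f"Ontology pass: {ontology_share:.1%} of reviewable items")  # 6.9%
print(f"Missed by keywords: {missed_share:.1%} of responsive")     # 7.8%
```

In other words, the keyword search flags roughly half the collection for review yet still misses about one responsive document in thirteen.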

  41. Cost and Time Savings • Cost to review “keyword” docs: $2,526,000 • Cost to process, create ontologies and review docs found by them: $1,621,076 • Net cost savings: $904,924 • Keyword review time: Over 11 weeks • Ontology time: 6 weeks or less including both review and processing time
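The net saving on this slide is a simple subtraction, which the sketch below checks. The percentage figure is derived here and does not appear on the slide; variable names are hypothetical.

```python
keyword_review_cost = 2_526_000    # cost to review "keyword" docs
ontology_total_cost = 1_621_076    # process + create ontologies + review

net_savings = keyword_review_cost - ontology_total_cost
savings_share = net_savings / keyword_review_cost

print(f"Net cost savings: ${net_savings:,}")            # $904,924
print(f"Share of keyword review cost: {savings_share:.1%}")  # 35.8%
```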

  42. The end-product of Processing now • Less data • Standardized so each e-mail, each attachment, each free-standing electronic file will have: • A “database record” linked to • The data itself in its native format • and/or with other renderings, a TIFF or PDF image. • So no longer is it “a whole lot of messy electronic data”

  43. The EDD industry abounds with “me-too” newcomers. This guy can’t be your expert witness But do it right, with the right people! EDD done here!

  44. The Evolution of Electronic Discovery Processing • 1995: Print and Review • 1997: Print, Scan, Code, Review • Circa 1999: TIFF and Review • 2001: Simple Filtering • 2002: Keyword Searching • 2004-: Analytical, Defensible, Reliable Reduction, then Review

  45. Discovery Stage 3: Review • 1. Collection: How to find/copy/compile responsive EDD? • 2. Organization: How to process EDD so you can review and utilize it? • 3. Review: How to review EDD (and its substantially greater volume)? • 4. Production: What is the best method for producing EDD?

  46. “But I only trust humans looking at every document -- it’s tried and true” • Full review is rarely as accurate as automated searching. • Humans make errors, get distracted, bored and tired. • Typical human error rate is 25% • And the expense of human review of every document, in dollars and time, is prohibitive.

  47. No manual review of millions of documents is cost-effective or accurate • After culling by whatever means, you’ve still got quite a lot. • Use computing power to enhance review • Grouping data, multiple document decisions at once • Workflow / QA can accelerate and improve quality
