Explore predictive coding technology, its process, reliability, defensibility, and human involvement in E-Discovery. Learn about typical workflows, judicial opinions, precision, recall, and case examples.
E-Discovery and Predictive Coding A Conversation With In-House and Outside Counsel
Agenda What is Predictive Coding? Overview of the Process Reliability and Defensibility Making it Work
Before Predictive Coding Technology-Assisted Review is Not New Post-It® Notes Black markers Highlighters White-out tape Etc.
Before Predictive Coding Technology-Assisted Review is Not New Coding Forms Indexing and Searching Hashing Duplicate and Near-Duplicate Identification Email Threading Clustering Concept Searching Workflow Functionality
Why Predictive Coding? Explosion of Electronically Stored Information Amendments to the Federal Rules (2006) Nearly Impossible to Print and Review Volume Objectives of Leveraging Technology Capture Work Product Save Time Reduce Cost Increase Quality
What is Predictive Coding? Terminology Predictive Coding Machine Learning Technology Assisted Review Software Assisted Review Suggestive Coding
What is Predictive Coding? What Makes the Next Generation Tools Different?
How Does It Work? Typical Process Counsel identifies the data set Counsel categorizes selected documents Computer learns from counsel’s decisions Computer suggests additional documents Counsel categorizes additional documents Repeat until desired level of precision and recall achieved
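The iterative process on this slide can be sketched in a few lines. This is a toy illustration only, not any vendor's algorithm: the term-weight "model", the function names (`train`, `score`, `review_loop`), the `counsel_codes` callback standing in for the attorney's decision, and the sample corpus are all hypothetical. Real tools use far more sophisticated classifiers, and the loop would stop on measured precision and recall rather than a fixed number of rounds.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(labeled):
    """Learn term weights from counsel's coding decisions.

    Each term gains +1 for every relevant document it appears in
    and -1 for every non-relevant document it appears in.
    """
    weights = Counter()
    for text, relevant in labeled:
        for term in set(tokenize(text)):
            weights[term] += 1 if relevant else -1
    return weights


def score(weights, text):
    """Sum the learned weights of the distinct terms in a document."""
    return sum(weights[term] for term in set(tokenize(text)))


def review_loop(corpus, counsel_codes, seed_ids, batch_size=2, rounds=3):
    """Alternate between machine suggestions and counsel's coding."""
    # Counsel categorizes the selected seed documents.
    labeled = {i: counsel_codes(corpus[i]) for i in seed_ids}
    for _ in range(rounds):
        # The computer learns from counsel's decisions so far.
        weights = train([(corpus[i], rel) for i, rel in labeled.items()])
        unlabeled = [i for i in range(len(corpus)) if i not in labeled]
        if not unlabeled:
            break
        # The computer suggests the documents it scores as most likely relevant.
        unlabeled.sort(key=lambda i: score(weights, corpus[i]), reverse=True)
        # Counsel categorizes the suggested documents; then the loop repeats.
        for i in unlabeled[:batch_size]:
            labeled[i] = counsel_codes(corpus[i])
    return labeled


# Hypothetical six-document corpus; "relevant" here means it mentions pricing.
corpus = [
    "pricing agreement with competitor",
    "lunch menu for friday",
    "competitor pricing call notes",
    "friday parking reminder",
    "agreement on pricing terms",
    "holiday party menu",
]
coded = review_loop(corpus, lambda text: "pricing" in text,
                    seed_ids=[0, 1], rounds=1)
```

After one round with two seed documents, the sketch surfaces the two remaining pricing documents for counsel's review ahead of the unrelated ones, which is the behavior the workflow relies on.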
How Does it Work? What about the humans?
What About the Humans? “…you cannot just dispense with final manual review. . . . we are not going to turn that over to the Borg anytime soon. I’ve asked around and no law firms do that now. No experts advocate that approach either, even the most extreme advocates for automation (of which I’m one). . . only a fool (or con artist trying to get at a producing parties (sic) secrets) trusts coding software today without human verification.” Ralph Losey, “Bottom Line Driven Proportional Review,” Jan. 15, 2012, available at www.e-discoveryteam.com.
What About the Humans? “Discovery cannot be wholly automated, not for the reason that it involves so-called subjective judgment, but because ultimately attorneys and parties in the case have to know what the data are about. They have to formulate and respond to arguments and develop a strategy for winning the case. They have to understand the evidence that they have available and be able to refute contrary evidence. All of this takes knowledge of the case, the law, and much more.” Roitblat, Kershaw and Oot, “Document Categorization in Legal Electronic Discovery: Computer Classification vs. Manual Review,” Journal of the American Society for Information Science and Technology, 61(1):70–80, 2010
What About the Humans? Understanding the corpus Knowledgeable “experts” / “SME(s)” (e.g., to identify small sets of documents representative of each issue to be coded) Human review of tiered data sets Human quality control processes Human evaluation of sampling and testing results Project management
Is it Reliable? What Does It Mean to Be Reliable? Precision: Percentage of Retrieved Documents that are Relevant Recall: Percentage of Relevant Documents that are Retrieved How Do We Confirm Reliability in a Data Set? Review It Statistical Sampling
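The two reliability measures on this slide reduce to simple ratios. The sketch below is a minimal illustration, assuming documents are identified by IDs and that the truly relevant set is known (in practice it is estimated from a reviewed sample); the function name `precision_recall` and the example IDs are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved document IDs.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# The tool retrieved 4 documents; 6 documents are actually relevant;
# 3 of the retrieved documents are among the relevant ones.
precision, recall = precision_recall({1, 2, 3, 4}, {2, 3, 4, 5, 6, 7})
# precision is 3/4 = 0.75, recall is 3/6 = 0.5
```

The example shows why both numbers matter: a production can be mostly on target (75% precision) while still missing half of what is relevant (50% recall).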
Is it Defensible? What Does It Mean to Be Defensible? Judicial opinions Court of public opinion
Is it Defensible? "Depending on the circumstances, a party that uses advanced analytical software applications and linguistic tools in screening for privilege and work product may be found to have taken ‘reasonable steps’ to prevent inadvertent disclosure." Federal Rules of Evidence, Rule 502(b), Explanatory Note.
Disability Rights Council Facts: Information Deleted During Litigation Requesting Party Asks for Backup Tapes Magistrate Judge John Facciola: Suggested Concept Searching In Addition to Traditional Search Terms Disability Rights Council of Greater Wash. v. Wash. Metro. Transit Auth., 242 F.R.D. 139 (D.D.C. 2007).
Victor Stanley Facts: 165 Purportedly Privileged Documents Produced Allegation of Privilege Waiver Magistrate Judge Paul Grimm: Keyword Search Terms are a Useful Tool There is a Danger in Using Search Terms Incorrectly Important to Use Statistical Sampling to Validate Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251 (D. Md. 2008).
Dilley “Court must limit discovery if it determines that the burden or expense of the proposed discovery outweighs its likely benefit.” Dilley v. Metropolitan Life Ins. Co., 256 F.R.D. 643, 644 (N.D. Cal. 2009)
Daugherty Facts: Plaintiff Requests Data Extract Costing $100,000 Defendant Counters With a $36,000 Extract Held: Court Agreed With Defendant’s Cheaper Proposal Plaintiff Could Not Demonstrate Their Proposal Was “Dramatically Different” Daugherty v. Murphy, 2010 WL 4877720 (S.D. Ind. Nov. 2, 2010).
Wood Facts: Parties Engaged in Considerable Discovery Plaintiff Requested Defendant Run Search Terms on 45 Custodians’ Data Result: 1,753,537 Documents Held: Court Relied on Rule 26(b)(2)(C)(iii) in Issuing a Protective Order Based on “questionable relevance” Wood v. Capital One Svcs., LLC, 2011 WL 2154279 (N.D.N.Y. Apr. 15, 2011).
Da Silva Moore v. Publicis Groupe “This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.” “The technology exists and should be used where appropriate, but it is not a case of machine replacing humans: it is the process used and the interaction of man and machine that the court needs to examine.” Da Silva Moore v. Publicis Groupe, Case No. 11 Civ. 1279 (S.D.N.Y. Feb. 24, 2012)
Da Silva Moore Key Takeaways Process Transparency Proportionality Cooperation Competence
Kleen Products v. Packaging Corp. • Can the producing party be required to use “predictive coding”? • Compare Da Silva Moore – parties agreed to use predictive coding but not on the protocol • Is keyword searching “dead,” as plaintiffs appear to contend? • Hearings to continue in coming weeks
Other Cases Ford Motor Co. v. Edgewood Properties, Inc., 2009 WL 1416223 (D.N.J. May 19, 2009) In re Exxon Corp., 208 S.W.3d 70, 2006 Tex. App. LEXIS 8768 (Oct. 12, 2006) Datel Holdings v. Microsoft Corp., 2011 U.S. Dist. LEXIS 30872 (N.D. Cal. Mar. 11, 2011) Multiven, Inc. v. Cisco Systems, 2010 WL 2813618 (N.D. Cal. July 9, 2010)
Making it Work Rule 26(f) Meet & Confer Discuss review methodology Transparency about the process Consider joint sampling review
Making it Work Staffing Understand Pricing Selecting a Vendor / Technology Defining Success Overcoming Objections
Making it Work Prepare for Review Develop Review Plan Coding Form Workflow Functionality Batching Duplicate and Near-Duplicate Identification Clustering Email Threading Concept Searching Finalize Document Coding Guidelines
Making it Work • Quality Control Throughout the Process • “Garbage in, garbage out” – quality of initial attorney coding for system training is critical • Consistency – replication of the “good” and “bad” coding decisions • QC documents “left behind” – is the non-relevant set truly non-relevant? • Thorough understanding of Risk Tolerance in the case • Experienced project management
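One common way to QC the documents "left behind" is to draw a random sample from the set coded non-relevant, have reviewers code that sample, and estimate what fraction of the set is actually relevant. The sketch below assumes a simple random sample and a normal-approximation margin of error at roughly 95% confidence (`z=1.96`); the function name `estimate_left_behind_rate`, the `is_relevant` callback standing in for human review, and the example data are hypothetical. The normal approximation is rough when the observed rate is near zero, so treat the margin as indicative rather than exact.

```python
import math
import random


def estimate_left_behind_rate(null_set_ids, is_relevant, sample_size,
                              z=1.96, seed=0):
    """Estimate the share of relevant documents in the non-relevant set.

    Draws a simple random sample without replacement, counts the documents
    a human reviewer marks relevant, and returns (rate, margin_of_error).
    """
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(list(null_set_ids), sample_size)
    found = sum(1 for doc_id in sample if is_relevant(doc_id))
    rate = found / sample_size
    # Normal-approximation margin of error for a sample proportion.
    margin = z * math.sqrt(rate * (1 - rate) / sample_size)
    return rate, margin


# Hypothetical example: 1,000 documents were coded non-relevant, and a
# reviewer re-checking the sampled ones finds that none are relevant.
rate, margin = estimate_left_behind_rate(range(1000), lambda doc_id: False,
                                         sample_size=200)
```

Recording the seed and sample size makes the check repeatable, which helps when the process has to be explained to opposing counsel or the court.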