
Data Quality: Opportunities, Data, and Examples



Presentation Transcript


  1. Data Quality: Opportunities, Data, and Examples

  2. Better and More Data • Level of analysis • Take a quick look at what/why use data • Linking data from disparate and third party sources • Explore data types • Typical issues & Tricks • Cross validation and sourcing • Reverse Look-up • GIS layering • Backfill from text correlated to codes • Information from operations • Text analytics

  3. General Organizational Overview: An information business focused on risk taking. Make. Sell. Serve. • Producer Segmentation • Market Planning • Revenue Forecasting • Cross sell and Up sell • Retention and Profitability • Sales and Distribution • Underwriting • Claims • Risk Selection and Pricing • Portfolio Management • Premium Adequacy • Billing and Collections Management • Payment Accuracy • Claim Collaboration > Fraud Detection > Subrogation > Risk Transfer > 3rd Party Deductible > Reinsurance Recoverable

  4. Same Problems – Different Lines of Business • Personal – Auto, HO, Umbrella • Small Commercial – BOP, CPP • Middle Market Commercial – CPP w/GL, CP, Crime, CIM, B&M, WC, Auto • Large Commercial Accounts • Commercial Auto • Workers Comp • Umbrella/Excess • Specialty Lines – D&O, EPL, E&O, Farm, FI

  5. Data Types and Forms • Structured data • Semi-structured data • Unstructured data • Text • Spatial • Pictographic • Graphic • Voice • Video

  6. ACTIONS • Identify Data Systems • Get right data from right systems • Overcome internal Organizational Barriers • Bridge to legacy systems and archived data • Augment to create rich data mining environment • Expect the need to negotiate for resources. Multiple Data Systems which must be pulled together for analysis. Great opportunity for cross-validation and sourcing. Diagram labels: Vendors/Partners • Medical Data - Bill Review - PPO - Case Management - Paradigm • Archive, Legacy Systems • Current System • Claim Data • External Data • Policy • Multiple Underwriting Systems • Multiple States • Billing Systems • Finance Systems • CRM Systems, other data

  7. Some typical external data sources and vendors • Dun & Bradstreet • Experian • Bureau of Labor Statistics • Market Stance • AM Best • Equifax • US Census • Claritas • Melissa Data • ISO • GIS vendors • U&C Data sets • Code Sets for ICDs and CPTs • …

  8. Data Glitches – historical and on-going. Systemic changes to data, not process related: • Changes in data layout / data types • Changes in scale / format • Temporary reversion to defaults • Missing and default values • Gaps in time series
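Two of the glitches listed on this slide, gaps in time series and default or missing values, lend themselves to simple automated checks. The sketch below is only an illustration: the sentinel values in `DEFAULT_SENTINELS` and the one-day expected step are assumptions, not something the presentation specifies, and real feeds would need their own lists.

```python
from datetime import date, timedelta

# Hypothetical placeholder values that often signal a reversion to defaults.
DEFAULT_SENTINELS = {"", "UNKNOWN", "N/A", "99999", "1900-01-01"}


def find_date_gaps(dates, max_step_days=1):
    """Return (previous, next) pairs whose spacing exceeds max_step_days."""
    ordered = sorted(set(dates))
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if (nxt - prev) > timedelta(days=max_step_days):
            gaps.append((prev, nxt))
    return gaps


def default_rate(values):
    """Share of values that are None or equal to a known default sentinel."""
    if not values:
        return 0.0
    flagged = sum(
        1 for v in values
        if v is None or str(v).strip().upper() in DEFAULT_SENTINELS
    )
    return flagged / len(values)


# Example: a daily load that skipped 3 and 4 January.
load_dates = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 5), date(2024, 1, 6)]
print(find_date_gaps(load_dates))

# Example: three of four ZIP values are missing or defaulted.
print(default_rate(["12345", "99999", None, "unknown"]))  # 0.75
```

Tracking the default rate per field and per load period is one way to spot a "temporary reversion to defaults" as a spike rather than a steady background level.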

  9. Process Reasons for poor data entry

  10. Defining Issues-sample Source Data 1-Define Issues

  11. MORE ISSUES… Mapping across sources: Same Fact, Different Terms
      Data Element Concept: Name: Country Identifiers • Context: • Definition: • Unique ID: 5769 • Conceptual Domain: • Maintenance Org.: • Steward: • Classification: • Registration Authority: • Others
      Concept values: Algeria, Belgium, China, Denmark, Egypt, France, . . ., Zimbabwe
      Value Domain: Name: • Context: • Definition: • Unique ID: 4572 • Value Domain: • Maintenance Org.: • Steward: • Classification: • Registration Authority: • Others
      Data Elements (ISO 3166):
      English Name   French Name   2-Alpha Code   3-Alpha Code   3-Numeric Code
      Algeria        L'Algérie     DZ             DZA            012
      Belgium        Belgique      BE             BEL            056
      China          Chine         CN             CHN            156
      Denmark        Danemark      DK             DNK            208
      Egypt          Egypte        EG             EGY            818
      France         La France     FR             FRA            250
      . . .          . . .         . . .          . . .          . . .
      Zimbabwe       Zimbabwe      ZW             ZWE            716
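One way to handle "same fact, different terms" when linking sources is to map every known representation onto a single canonical key. The sketch below uses the country rows from this slide and picks the 3-alpha code as the canonical form. That choice, and the function names `build_index` and `to_alpha3`, are assumptions made for illustration.

```python
# Rows from the slide: (English name, French name, 2-alpha, 3-alpha, 3-numeric).
ISO_3166_ROWS = [
    ("Algeria", "L'Algérie", "DZ", "DZA", "012"),
    ("Belgium", "Belgique", "BE", "BEL", "056"),
    ("China", "Chine", "CN", "CHN", "156"),
    ("Denmark", "Danemark", "DK", "DNK", "208"),
    ("Egypt", "Egypte", "EG", "EGY", "818"),
    ("France", "La France", "FR", "FRA", "250"),
    ("Zimbabwe", "Zimbabwe", "ZW", "ZWE", "716"),
]


def build_index(rows):
    """Map every representation (case-insensitive) to the row's 3-alpha code."""
    index = {}
    for row in rows:
        alpha3 = row[3]
        for term in row:
            index[term.casefold()] = alpha3
    return index


def to_alpha3(term, index):
    """Return the canonical 3-alpha code, or None if the term is unknown."""
    return index.get(term.strip().casefold())


index = build_index(ISO_3166_ROWS)
for raw in ["Belgique", "dk", "818", "Atlantis"]:
    print(raw, "->", to_alpha3(raw, index))
```

Returning `None` for unknown terms, rather than guessing, keeps unmapped values visible so they can be reviewed and added to the mapping.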

  12. Data Filling • Manual • Statistical Imputation • Temporal • Spatial • Spatial-temporal
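Two of the filling approaches named on this slide can be sketched in a few lines. The slide does not say which methods the presenters used, so the choices here are assumptions: a "last observation carried forward" rule stands in for temporal filling, and mean substitution stands in for statistical imputation.

```python
from statistics import mean


def fill_forward(series):
    """Temporal fill: replace None with the most recent earlier observation."""
    filled = []
    last = None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def fill_with_mean(series):
    """Simple statistical imputation: replace None with the mean of observed values."""
    observed = [v for v in series if v is not None]
    if not observed:
        return list(series)
    average = mean(observed)
    return [average if v is None else v for v in series]


monthly_wage = [None, 10, None, None, 13]
print(fill_forward(monthly_wage))    # [None, 10, 10, 10, 13]
print(fill_with_mean([10, None, 20]))
```

Leading gaps stay as `None` under forward fill because there is no earlier observation to carry. It is usually worth keeping a flag for filled values so they can be told apart from recorded ones.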

  13. Geographic Hierarchy

  14. Deriving Data = Power • Totals: Household Income • Trends: Rate of Medical Bill Increases • Ratios: Claims/Premium, Target/Median • Friction: Level of inconvenience, ratio of rental to damage • Sequences: Lawyer-Doctor, Auto-Life Policy • Circumstances: Minimal Impact Severe Trauma • Temporal: Loss shortly after adding collision • Spatial: Distance to Service, proximity of stakeholders • Logged: Progress Notes, Diaries • Who did it, When, “Why”

  15. Deriving Data = Power (Cont’d) • Behavioral: Deviation from past usage, spike buying • Experience Profiles: Vendor, Doctor, Premium Audit • Channel: How applied, How reported, Service Chain • Legal Jurisdiction: Venue Disposition, Rules • Demographics: Working, Weekly wage, lost income • Firmographics: Industry Class Code Vs Injuries Claimed • Inflation: Wage, Medical, Goods, Auto, COLA • Gov’t Statistics: Crime Rate, Employment, Traffic • Other Stats: Rents, Occupancy, Zoning, Mgd Care
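Two of the derived fields listed on the preceding slides, the Claims/Premium ratio and the temporal flag "loss shortly after adding collision", can be sketched as follows. The function names and the 30-day window are illustrative assumptions, since the slides do not define what "shortly" means.

```python
from datetime import date


def loss_ratio(incurred_claims, earned_premium):
    """Ratio of claims to premium; None when premium is zero or negative."""
    if earned_premium <= 0:
        return None
    return incurred_claims / earned_premium


def days_from_coverage_to_loss(coverage_added, loss_date):
    """Number of days between adding a coverage and the reported loss."""
    return (loss_date - coverage_added).days


def is_early_loss(coverage_added, loss_date, window_days=30):
    """Flag losses that occur within window_days after the coverage was added."""
    days = days_from_coverage_to_loss(coverage_added, loss_date)
    return 0 <= days <= window_days


print(loss_ratio(650.0, 1000.0))                              # 0.65
print(is_early_loss(date(2024, 3, 1), date(2024, 3, 11)))     # True
print(is_early_loss(date(2024, 3, 1), date(2024, 6, 1)))      # False
```

A flag like this is an input for review or modeling, not a conclusion in itself. The window would normally be tuned against historical claims.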

  16. “Search” versus “Discover”
                                  Search (goal-oriented)    Discover (opportunistic)
      Structured Data             Data Retrieval            Data Mining
      Unstructured Data (Text)    Information Retrieval     Text Mining

  17. Searching • Input Value [Jim] • Word Replacement Lists: Jimmy → JAMES, Jim → JAMES, James → JAMES • Transformed Input Value [JAMES] • Returns “Similar Matches” • All Records Found: Jimmy, Jim, James
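The search on this slide can be sketched as a lookup that normalizes both the input value and the stored records through the same word replacement list before comparing them. The `REPLACEMENTS` dictionary below is a hypothetical two-entry list built from the slide's example. A production list would be much larger and would probably come from a reference data vendor.

```python
# Hypothetical word replacement list: variant -> canonical form.
REPLACEMENTS = {"JIM": "JAMES", "JIMMY": "JAMES"}


def transform(name, replacements=REPLACEMENTS):
    """Upper-case the name and swap in its canonical form if one is listed."""
    key = name.strip().upper()
    return replacements.get(key, key)


def search(input_value, records, replacements=REPLACEMENTS):
    """Return records whose transformed value equals the transformed input."""
    target = transform(input_value, replacements)
    return [r for r in records if transform(r, replacements) == target]


records = ["Jimmy", "Jim", "James", "John"]
print(transform("Jim"))          # JAMES
print(search("Jim", records))    # ['Jimmy', 'Jim', 'James']
```

Applying the same transformation to both sides is what lets an input of "Jim" find "Jimmy" and "James" without any fuzzy matching.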

  18. Motivation for Text Mining • Approximately 90% of the world’s data is held in unstructured formats (source: Oracle Corporation) • Information-intensive business processes demand that we move beyond simple document retrieval to “knowledge” discovery. Chart: Structured Numerical or Coded Information 10% • Unstructured or Semi-structured Information 90%

  19. Convergence of Disciplines Example

  20. Techniques for attacking text data: • Rules-based • Statistical Text Analysis and Clustering • Linguistic and Semantic Clustering • Support Vector Machines • Pattern Matching or other statistical algorithms • Neural Networks • Combination of methods from above. Text is like a data iceberg.

  21. Claims processing – Progress notes and Diaries • Service • CLAIMS ADJUSTER • Home Office Staff • Field Office Claim Staff • Insured Risk Manager • Agent or Broker • Medical Management Staff • Special Investigation Unit • NICB • Vendor Management • Consulting Engineers • Hearing Representative • Structured Settlement Unit • Recovery Staff • Legal Staff • Diary forward – “call Dr Jones next week” • Business Rule – large loss review • System Reminder – update case reserves • Correspondence Tracking – legal letter sent

  22. Semantic processing: Named Entity Extraction • Identify and type language features • Examples: • People names • Company names • Geographic location names • Dates • Monetary amount • Phone #, zipcodes, SSN, FEIN • Others… (domain specific)
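For the more regular entity types on this slide (dates, monetary amounts, phone numbers, ZIP codes, SSNs), a rules-based pass with regular expressions is a common starting point. The patterns below are deliberately narrow assumptions (US-style formats only) and are not taken from the presentation. People and company names generally need linguistic or statistical models and are not covered here.

```python
import re

# Illustrative patterns for US-style formats; real notes will need broader rules.
PATTERNS = {
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "money": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "zipcode": re.compile(r"\b\d{5}(?:-\d{4})?\b"),
}


def extract_entities(text):
    """Return a dict mapping each entity type to the list of matches found."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}


note = (
    "Spoke with Dr Jones on 03/14/2023; bill of $1,250.00 received. "
    "Call 555-123-4567. Claimant ZIP 06103."
)
for label, matches in extract_entities(note).items():
    print(label, matches)
```

On free-text progress notes, patterns like these tend to have good precision but miss variants such as "March 14" or "(555) 123-4567", which is one reason the techniques slide lists combinations of methods.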

  23. Feedback to UW

  24. Data Quality: Opportunities, Data, and Examples
