Actuarial IQ (Information Quality) • CAS Data Management Educational Materials Working Party
Actuarial IQ Introduction • “IQ” stands for “Information Quality” • Introduction to Data Quality and Data Management being written by the CAS Data Management Educational Materials Working Party • Directed at actuarial analysts as much as actuarial data managers: • what every actuary should know about data quality and data management
Objectives • Introduce Working Party Paper “Actuarial IQ” • Justify our care about information quality: • industry horror stories • Explain that information quality is not just about input data quality: • depends on all stages: input process, analysis process, output process • Suggest what can be done to improve information quality: • (it starts with YOU!) • Encourage actuaries to become data quality advocates
Why do we care? • Information quality horror stories: • Insolvencies • Run-offs • Mis-pricing • Over-reserving • (Loss of bonuses) • GIRO working party on data quality survey: • On average, any consultant, company or reinsurance actuary: • spends 26% of their time on data quality issues • has 32% of projects affected by data quality problems
Actuarial IQ Key Ideas • Information Quality concerns not only bad data: • information which is poorly processed or poorly presented is also of low quality • Cleansing data helps: • but is just a band aid • There are plenty of things actuarial analysts can do to improve their IQ • There are even more things actuarial data managers can do • Actuaries are uniquely positioned to become information quality advocates
Why Actuaries? • Both information consumers and information providers • Both have knowledge of data and high stakes in quality • Both skillful and influential • No one else is better
Working Party Publications • Book reviews of data management and data quality texts in the Actuarial Review, starting with the August 2006 edition • These reviews are combined and compared in “Survey of Data Management and Data Quality Texts,” CAS Forum, Winter 2007, www.casact.org • This presentation is based on our upcoming paper, “Actuarial IQ (Information Quality),” to be published in the Winter 2008 edition of the CAS Forum
Data Quality vs Information Quality Information quality takes into account • not just data quality • but processing quality • (and reporting quality) DATA + ANALYSIS = RESULTS (Dasu and Johnson)
What is Data Quality? • Quality data is data that is appropriate for its purpose. • Quality is a relative, not an absolute, concept. • Data for an annual rate study may not be appropriate for a class relativity analysis. • Promising predictor variables in Predictive Modeling may not have been coded or processed with that purpose in mind.
Data Flow • Information Quality involves all steps: • Step 0: Data Requirements • Step 1: Data Collection • Step 2: Transformations & Aggregations • Step 3: Actuarial Analysis • Step 4: Presentation of Results • To improve the Final Step: • Making Decisions
Principles on Data Quality: Perspectives • ASB – ASOP 23 – “Data Quality” • CAS Management Data and Information Committee: “White Paper on Data Quality” • Richard T. Watson: “Data Management: Databases and Organization”
ASOP No. 23 • Due consideration to the following: • Appropriateness for intended purpose … • Reasonableness and comprehensiveness … • Any known, material limitations … • The cost and feasibility of obtaining alternative data … • The benefit to be gained from an alternative data set … • Sampling methods …
White Paper on Data Quality • Evaluating data quality consists of examining data for: • Validity • Accuracy • Reasonableness • Completeness
Watson • 18 Dimensions of Data Quality: • Many overlap with previously mentioned principles. • Others describe ways of storing data, e.g., Representational consistency, Precision • Others go beyond data characteristics to processing and management, e.g., Stewardship, Sharing, Timeliness, Interpretation
Data Requirements • Redman: “Manage Information Chain” • establish management responsibilities • describe the information chain • understand customer needs • establish measurement system • establish control and check performance • identify improvement opportunities • make improvements
Data Requirements • Data Quality Measurement: • Quantify traditional aspects of quality data such as accuracy, consistency, uniqueness, timeliness and completeness using a score assigned by an expert • Measure the consequences of data quality problems: • the number of times in a sample that data quality errors cause errors in analyses, and • the severity of those errors • Use measurement to motivate improvement
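The slide describes scores assigned by an expert. As a rough illustration of the kind of evidence such a score might start from, the sketch below computes two simple ratios, completeness and uniqueness, on a handful of made-up policy records. The record layout, field names and the two helper functions are assumptions made for this example and do not come from the paper.

```python
# Hypothetical policy records; None marks a missing value.
records = [
    {"policy_id": "P1", "gender": "F", "premium": 1200.0},
    {"policy_id": "P2", "gender": None, "premium": 950.0},
    {"policy_id": "P2", "gender": "M", "premium": 950.0},  # duplicate key
    {"policy_id": "P4", "gender": "M", "premium": None},
]


def completeness(rows, field):
    """Share of rows in which `field` is populated."""
    filled = sum(1 for r in rows if r.get(field) is not None)
    return filled / len(rows)


def uniqueness(rows, key):
    """Number of distinct key values divided by the number of rows."""
    values = [r[key] for r in rows]
    return len(set(values)) / len(values)


scores = {
    "gender completeness": completeness(records, "gender"),
    "premium completeness": completeness(records, "premium"),
    "policy_id uniqueness": uniqueness(records, "policy_id"),
}
for name, score in scores.items():
    print(f"{name}: {score:.0%}")
```

Ratios like these cover only the first bullet. Measuring the consequences of errors would additionally require tracking which analyses a bad record actually affected.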
Metadata • Big help in describing Data Requirements – Metadata! • Data that Describes the Data • Key Data Management Tool
Example – Marital Status • What is in the Marital Status Variable? • Single? • Married? • Polygamist?
Example: What is the Marital Status Variable?
What Is In It? • Business Rules • Data Processing Rules • Report Compilation and Extraction Process • Other
What Is In It? • Business Rules • Data Elements • Definition of Field, e.g., • How Claims are Defined • How Exposure is Calculated • Format of Field • Valid Values and Interdependencies
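One way to picture a metadata entry of this kind is as a small record holding a field's definition and its valid values, which can then be used to check incoming data. The sketch below is only an illustration: the `FieldMetadata` class, the marital status codes and the `invalid_rows` helper are all hypothetical, and interdependencies between fields are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    name: str
    definition: str
    valid_values: frozenset


# Hypothetical metadata entry; the codes are illustrative only.
MARITAL_STATUS = FieldMetadata(
    name="marital_status",
    definition="Marital status of the named insured at policy inception",
    valid_values=frozenset({"S", "M", "D", "W"}),
)


def invalid_rows(rows, meta):
    """Return the indices of rows whose value is not a documented code."""
    return [
        i for i, r in enumerate(rows)
        if r.get(meta.name) not in meta.valid_values
    ]


rows = [
    {"marital_status": "S"},
    {"marital_status": "Married"},  # undocumented spelling
    {"marital_status": None},       # missing
]
print(invalid_rows(rows, MARITAL_STATUS))
```

Keeping the definition next to the valid values means the same entry serves both the analyst reading the data and the program checking it.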
What Is In It? • Data Processing Rules • How Database is Populated • Sources of Data • Handling of Missing Data
What Is In It? • Report Compilation and Extraction Process • How Data is Selected or Bypassed • Fiscal Period • Accounting Date for Transactions • Actuarial Evaluation Date • Calculations • Mappings
What Is In It? • Other • Process Flow Documentation • Versioning
Why Actuaries Need Metadata? • Better Analysis • Avoid Being Mis-Informed about a Data Variable and What it Represents • Did Anything Change During the Experience Period?
Example of Metadata • Statistical Plans in P/C Industry • General Reporting Requirements • Data Element Definitions
Data Collection • Data supplier management • Let suppliers know what you want • Provide feedback to suppliers • Balance the following: • Known issues with supplier • Importance to the business • Supplier willingness to experiment together • Ease of meeting face to face
Transformations and Aggregations • In this step data is put into standardized structures and then combined into larger, more centralized data sets • “Actuarial IQ” introduces two ways to improve IQ in this step: • Exploratory Data Analysis (EDA) • Data Audits
EDA: Data Preprocessing
EDA: Overview • Typically the first step in analyzing data • Purpose: • Find outliers and errors • Explore structure of the data • Uses simple statistics and graphical techniques • Examples for numeric data include histograms, descriptive statistics and frequency tables
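As a small illustration of these techniques, the sketch below computes descriptive statistics and a text histogram for a made-up list of claim amounts. The data, the bucket width and the `bucket` helper are assumptions chosen for the example.

```python
import statistics
from collections import Counter

# Hypothetical claim amounts; the last value is a deliberate outlier.
claims = [1200, 1500, 900, 1100, 1300, 1250, 980, 250000]

summary = {
    "count": len(claims),
    "min": min(claims),
    "median": statistics.median(claims),
    "mean": statistics.mean(claims),
    "max": max(claims),
}


def bucket(x, width=1000, cap=5000):
    """Label the histogram bin for x; values at or above `cap` share one bin."""
    lower = min(x // width * width, cap)
    return f"{lower}+" if lower == cap else f"{lower}-{lower + width - 1}"


histogram = Counter(bucket(c) for c in claims)

for name, value in summary.items():
    print(f"{name:>6}: {value}")
for label, count in sorted(histogram.items(), key=lambda kv: int(kv[0].split("-")[0].rstrip("+"))):
    print(f"{label:>10} {'#' * count}")
```

A mean far above the median, together with a lone entry in the top bin, is the kind of signal that would prompt a closer look at the largest claim.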
EDA: Histograms
EDA: Descriptive Statistics
EDA: Categorical Data
EDA: Data Cubes • Usually frequency tables • Example 1: search for missing gender values
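The table from the original slide is not available here. A one-way frequency table of this kind might be built as in the sketch below, where the gender codes and the `frequency_table` helper are hypothetical.

```python
from collections import Counter

# Hypothetical gender codes as they might arrive from a source system.
gender = ["M", "F", "F", "", "M", None, "U", "F"]


def frequency_table(values):
    """Count each value, grouping None and blank strings as '<missing>'."""
    return Counter(
        "<missing>" if v is None or str(v).strip() == "" else v
        for v in values
    )


table = frequency_table(gender)
for value, count in table.most_common():
    print(f"{value:>10} {count}")
```

Grouping blanks and nulls under one label makes the missing share visible at a glance, while an unexpected code such as "U" shows up as its own row.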
EDA: Data Cubes • Example 2: identify inconsistent coding of marital status
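Inconsistent coding often becomes visible when a field is tabulated against a second dimension. The sketch below crosses marital status with a source system. Both the `system_a` and `system_b` names and the codes are invented for illustration.

```python
from collections import Counter

# Hypothetical (source system, marital status) pairs.
rows = [
    ("system_a", "M"), ("system_a", "S"), ("system_a", "M"),
    ("system_b", "Married"), ("system_b", "Single"), ("system_b", "1"),
]

# Two-way frequency table: count of records per (source, code) cell.
cube = Counter(rows)

# Set of codes each source uses; disjoint sets suggest inconsistent coding.
codes_by_source = {}
for source, code in rows:
    codes_by_source.setdefault(source, set()).add(code)

for (source, code), count in sorted(cube.items()):
    print(f"{source:>9} {code:>8} {count}")
```

Here the two systems share no codes at all, so the values would need to be mapped to a common scheme before the data sets are combined.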
EDA: Missing Data
EDA: Summary • Before data is analyzed, it must be gathered, cleaned and integrated • Techniques from Exploratory Data Analysis can be used to explore the data, and to detect missing values, invalid values and outliers • We have briefly covered histograms, descriptive statistics and frequency tables
Data Audits • ASOP No. 23 does not require actuaries to audit data, but audits are good to understand • Main Idea: compare the data intended for use to its original source, e.g., policy applications or notices of loss • Top-Down: check that totals from one source match the totals from a reliable source • Bottom-Up: follow a sample of input records through all the processing to the final report
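A top-down check can be sketched as a reconciliation of summed detail records against control totals from a trusted source. The figures, the accident-year grouping, the tolerance and the `reconcile` function below are all assumptions made for illustration.

```python
# Hypothetical paid-loss detail rows (accident year, amount) and
# ledger control totals by accident year.
detail = [(2005, 1200.0), (2005, 800.0), (2006, 500.0), (2006, 450.0)]
control_totals = {2005: 2000.0, 2006: 1000.0}


def reconcile(detail_rows, control, tolerance=0.01):
    """Return {year: detail total minus control total} for years out of balance."""
    totals = {}
    for year, amount in detail_rows:
        totals[year] = totals.get(year, 0.0) + amount
    differences = {}
    for year in sorted(set(totals) | set(control)):
        diff = totals.get(year, 0.0) - control.get(year, 0.0)
        if abs(diff) > tolerance:
            differences[year] = diff
    return differences


print(reconcile(detail, control_totals))
```

A bottom-up audit would go the other way, picking individual records and tracing each one through every processing step.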
Analysis Quality (diagram labels: Data, Models, Results) • On its way to results, data can be: • Rejected: wrong format • Underutilized: wrong model • Distorted: wrong settings • Analysis is a crucial component of overall process quality
Model Quality • Model Design quality • Implementation quality • Testing and Documentation
Model Quality • Model Design quality • Model Selection and Validation • Parameter Estimation • Verification • Model Performance • Did I use the right model? Did I use the model right?
Model Quality • Model Performance • Models predict observable events. • Outcomes can be compared to predictions, leading to: • Model improvement • Model recalibration • Model rejection • and in turn to higher process quality.
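Comparing outcomes to predictions can be as simple as an actual-to-expected ratio by segment. In the sketch below the segments, the counts and the 10% tolerance are hypothetical. Whether a flagged segment calls for improvement, recalibration or rejection remains a matter of judgment.

```python
# Hypothetical predicted and observed claim counts by rating segment.
predicted = {"urban": 120.0, "suburban": 80.0, "rural": 40.0}
observed = {"urban": 150, "suburban": 82, "rural": 39}


def actual_to_expected(obs, pred):
    """Ratio of observed to predicted outcomes for each segment."""
    return {segment: obs[segment] / pred[segment] for segment in pred}


def flag(ratios, tolerance=0.10):
    """Segments whose ratio differs from 1.0 by more than `tolerance`."""
    return sorted(s for s, r in ratios.items() if abs(r - 1.0) > tolerance)


ratios = actual_to_expected(observed, predicted)
for segment, ratio in ratios.items():
    print(f"{segment:>9}: {ratio:.3f}")
print("review:", flag(ratios))
```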
Model Quality • Model Design quality • Implementation quality • Testing and Documentation
Model Quality • Implementation quality • Programming languages (C++, VBA, SQL): many books on good design patterns • Formulae in a spreadsheet are also programming: no books on good design patterns • Need good software design to simplify: • Usage • Testing • Modifications / Improvements • Recovery (side benefit)
Model Quality • Implementation quality • Separation of data and algorithms (diagram labels: data, algorithm(s), results) • Results do not belong to the template either
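In code, separating data from algorithms usually means that the calculation is a function taking data as an argument and returning results, with neither stored inside it. The sketch below uses a volume-weighted development-factor calculation purely as a stand-in algorithm. The triangles and the function name are hypothetical.

```python
def age_to_age_factors(triangle):
    """Volume-weighted development factors from a cumulative loss triangle.

    `triangle` is a list of rows (one per accident year), each a list of
    cumulative losses by development age; later years have fewer entries.
    """
    n_ages = len(triangle[0])
    factors = []
    for age in range(n_ages - 1):
        # Use only the years that have been observed at the next age.
        rows = [r for r in triangle if len(r) > age + 1]
        factors.append(sum(r[age + 1] for r in rows) / sum(r[age] for r in rows))
    return factors


# Data lives outside the algorithm: two hypothetical triangles, one function.
auto = [[100.0, 150.0, 165.0], [110.0, 165.0], [120.0]]
homeowners = [[200.0, 220.0, 220.0], [210.0, 231.0], [190.0]]

# Results are returned to the caller, not written back into the "template".
auto_factors = age_to_age_factors(auto)
homeowners_factors = age_to_age_factors(homeowners)
print(auto_factors, homeowners_factors)
```

The same function serves both data sets unchanged, which is the property a spreadsheet template loses once data and results are pasted into it.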
Model Quality • Implementation quality • Layering (each step on its own tab): • simplifies navigation • optimizes workflow • shortens learning curve
Model Quality • Model Design quality • Implementation quality • Testing and Documentation
Model Quality • Testing and Documentation • Validation: black-box treatment, comparing results with correct ones… • Verification: inside-the-box treatment, checking formulae… • Should be an integral part of development • Should be performed by outsiders • Should be well-documented
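The two kinds of test can be illustrated on a toy calculation. The `pure_premium` function and the test figures below are hypothetical and serve only to show one possible reading of the distinction.

```python
def pure_premium(claim_count, total_loss, exposure):
    """Frequency times severity per unit of exposure."""
    frequency = claim_count / exposure
    severity = total_loss / claim_count
    return frequency * severity


# Validation (black box): compare the result with an independently known
# answer, here a hand-calculated 50,000 / 1,000 = 50.
assert abs(pure_premium(20, 50_000.0, 1_000.0) - 50.0) < 1e-9

# Verification (inside the box): check the formula's internal logic, here
# the identity that frequency * severity must equal total loss / exposure.
for n, loss, exposure in [(5, 10_000.0, 400.0), (12, 3_600.0, 90.0)]:
    assert abs(pure_premium(n, loss, exposure) - loss / exposure) < 1e-9

print("all checks passed")
```

Keeping such checks in a file next to the model is one way to make testing part of development and to leave a record for an outside reviewer.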
Model Quality • Testing and Documentation • Self-documenting features: • Database “Documenter” or “Diagram Builder” • Excel’s named ranges and expressions (Excel’s “CTRL ~” displays formulae texts) • External Data Range definitions (External Data Source definition) • Structured comments (comments as structured attributes)
Model Quality • Testing and Documentation • Version management • One can link Document Properties to Spreadsheet Cells