250 likes | 475 Views
Optimizing CATI Call Scheduling. International Total Survey Error Workshop Hidiroglou, M.A., with Choudhry, G.H., Laflamme, F. Statistics Canada. Outline. Introduction Current CATI workload allocation Optimizing CATI workload Application and results Conclusions Future Work.
E N D
Optimizing CATI Call Scheduling International Total Survey Error Workshop Hidiroglou, M.A., with Choudhry, G.H., Laflamme, F. Statistics Canada Statistics Canada • Statistique Canada
Outline • Introduction • Current CATI workload allocation • Optimizing CATI workload • Application and results • Conclusions • Future Work Statistics Canada • Statistique Canada
1. Introduction Background • Paradata research focussed at improving the current data collection process and practices for CATI surveys • Identified several opportunities for improvement • Implemented some of them (e.g. responsive collection design, time slice, cap on calls, etc.) • Active management used to monitor data collection suggested that resources were not always optimally used throughout collection period Statistics Canada • Statistique Canada
1. Introduction (cont’d) Paradata Sources • Blaise Transaction History (BTH) file: Record created each time an open case is closed • Survey, cycle, Regional Office (RO), interviewer ID • Date, start time and end times of the call • Duration of the call and associated time slice (morning, afternoon, early and late evening shifts) • Outcome code (e.g. complete, appointment, no contact) • Interviewer payroll information: data collection costs • Total payroll hours represents the hours charged to the survey • Historical information since 2003 for all surveys • Updated on a daily basis Statistics Canada • Statistique Canada
1. Introduction (cont’d) Lessons Learned • Significant effort (calls and time spent) on cases for which an interview is not conducted at the first contact • Substantive efforts spent close to the end of data collection yield relatively small marginal returns • Interviewer staffing levels not always optimally allocated with respect to workload and expected productivity • Develop a draft framework to improve the cost-efficiency of data collection Statistics Canada • Statistique Canada
2. Current CATI workload allocation • Many input sources (Excel spreadsheets) used for scheduling interviewers: whole operation is manual • Annual workload: Annual staffing needs are planned for each regional office according to its capacity and the number of interviewers required for all surveys • Monthly scheduling: Determine number of hours for each interviewer according to • type of survey (e.g.: social, business or agricultural) • duration of survey • current period (end of survey, Labour Force Survey week , etc..) • number of interviewers required per day Statistics Canada • Statistique Canada
2. Current CATI workload allocation (cont’d) • Constraints for each interviewer • Time of day: week-day and weekend • Interviewer training • Vacation , sick leave; preferred working hours • Union constraints: minimum and maximum work hours per week • Monthly spreadsheet updated on a daily basis Statistics Canada • Statistique Canada
3. Optimizing CATI workload • Use existing BTH record for given survey • Compute probability of completing a questionnaire for each time slice • Predict these probabilities using linear or logistic regression • Output estimated model parameters • Estimated parameters from linear or logistic regression model • Predicted probabilities of completing a questionnaire • Input predicted probabilities into optimization • Determine optimal number of calls by time slice subject to cost and / or operational constraints Statistics Canada • Statistique Canada
3. Optimizing CATI workload (cont’d) • Regression Models • Linear: • Logistic: • = Average cumulative number of calls up to and including time slice s • xts (time of call: morning, afternoon, early and late evening • = probability of a completed questionnaire within time slice s Statistics Canada • Statistique Canada
3. Optimizing CATI workload (cont’d) • Optimization • Total data collection cost: • t1and t2: unit costs for productive / non-productive calls • : predicted probabilities from the model • Call vector minimizes g subject to • Number of calls for each time slice s is greater than or equal to 0 • Expected response rate equal to a pre-specified response rate R. Statistics Canada • Statistique Canada
4. Application and results • Survey of Labour and Income Dynamics (SLID) • Longitudinal survey, interviews the same people from one year to the next for six consecutive years • Sample size ~ 34,000 ; • Number of calls ~ 400,000 • Response rate: approximately 71% • Measures: Employment and Unemployment Dynamics; Life Cycle Labour Market Transitions; Job Quality; Family Economic Mobility; Dynamics of Low Income; Life Events and Family Changes • Data obtained via CATI over 28 collection days (112 time slices) Statistics Canada • Statistique Canada
4. Application and results (cont’d) Statistics Canada • Statistique Canada
4. Application and results (cont’d) • Regression • Probability of completing questionnaire decreases over time • Intercept and continuous variable significant for all regional offices and regression type (linear /logistic) • Logistic regression provides the better fit: • Best time period for calls depends on regional office: • Late evening • Morning, early and late evening • All time periods good 13
4. Application and results (cont’d) Table 1: Average Absolute relative deviation ( ) defined as average of over times slices with 50 or more calls Statistics Canada • Statistique Canada
4. Application and results (cont’d) Number of calls made- Edmonton
4. Application and results (cont’d) Time spent (minutes) -Edmonton
4. Application and results (cont’d) Number of questionnaires completed -Edmonton
4. Application and results (cont’d) • Optimization Table 2: Savings in terms of time spent Statistics Canada • Statistique Canada
5. Conclusions Model predicted quite well the probabilities for completed questionnaires by time slice Using the predicted probabilities, the optimum schedule showed cost savings without considering the operational constraints Additional constraints will reduce cost savings
5. Conclusions (cont’d) • Workload should be spread uniformly throughout the collection period • Optimum number of calls may vary by time of day • Logistic model slightly better than linear model Statistics Canada • Statistique Canada
6. Future Work (cont’d) • Simulate and optimize collection with multiple surveys and interviewers carried out concurrently • Individual assignment of interviewers to time shifts within the day needs to reflect: • Legal, ergonomic, and operating constraints • Minimum and maximum number of days that interviewers work within the week • Shift duration per day (no more or less than a fixed number of hours), including starting time range of each interviewer • Number of shifts within a day should be reasonable Statistics Canada • Statistique Canada
6. Future Work How do we do above? Extend optimization to include mix of interviewers and surveys Translate the number of calls within each shift and survey into number of required interviewers Use commercial software such as XIMES to account for constraints, and schedule each interviewer by time shift (Gartner, Musliu, and Slany 2001) Gartner, J. Musliu, N., and Slany W. (2001). Rota: a research project on algorithms for workforce scheduling and shift design optimization, AI Communications, 14, 83-92
For more information, please contact • Mike Hidiroglou • mike.hidiroglou@statcan.gc.ca • G. Hussain Choudhry • GhulamHussain.Choudhry@statcan.gc.ca • François Laflamme • francois.laflamme@statcan.gc.ca Statistics Canada • Statistique Canada