
The Impact of Sample Bias on Consumer Credit Scoring Performance and Profitability


Presentation Transcript


    3. Overview Introduction Methodology Data Description Empirical study Conclusion Further research

    4. Application scoring as opposed to behavioral/profit scoring. Sample bias: ‘population drainage’ and biased estimates. Reject inference. Application scoring: binary classification of good versus bad classes; comparison of the score with a threshold, followed by discretization.

    5. Calibration set. Creating a ‘proportional’ sample. 3 research questions: Q1: Does sample bias occur? Q2: What is the impact of sample bias? Q3: Proportionality versus sample size. Calibration set: a set of orders rejected by the credit score but accepted on a judgmental basis (“low-side overrides”). Useful through inclusion in: the model-building process; the holdout sample, which can be designed to be more representative of the ‘through-the-door’ applicant population. => We will refer to this as a ‘proportional’ sample, i.e. a sample with the same ratio of orders accepted versus rejected according to the existing credit score, but where all outcomes are known. Q1: a) a model trained on accepted orders only and tested on accepted orders versus the calibration set; b) does a variable-selection procedure select different variables when a proportional sample is used? Q2: a) quantify the gain in predictive performance and profitability if the outcome of all rejected orders were available; b) the most important part of the study => a sensitivity analysis to check the external validity of our results. Q3: If there is a gain in performance, would it then be optimal to sample a small part of the rejected orders and use a proportional sample? In practice, creating a proportional sample often means reducing the sample size. In this last research question we investigate whether a reduction in sample size (in order to gain proportionality) is beneficial. PS: the sample size will be reduced whenever alpha > sigma (acceptance rate > percentage of rejects for which the outcome is available); calculations can be found in the paper.
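The alpha > sigma condition above can be illustrated with a small helper. This is a hypothetical sketch of the trade-off (the paper's exact derivation is not reproduced); the 2,007 figure is simply 36.7% of the 5,471 rejects mentioned on slide 11.

```python
def proportional_sample_size(n_accepted, n_rejects_known, alpha):
    """Largest 'proportional' training sample (accepted share = alpha)
    that can be built from n_accepted accepted orders and
    n_rejects_known rejected orders with a known outcome."""
    # whichever side runs out first caps the total sample size
    total_if_rejects_bind = n_rejects_known / (1.0 - alpha)
    total_if_accepts_bind = n_accepted / alpha
    return int(min(total_if_rejects_bind, total_if_accepts_bind))

# illustrative figures from the presentation: 36,039 accepts, 86.8% acceptance,
# and roughly 36.7% of 5,471 rejects known through overrides (about 2,007)
size = proportional_sample_size(36039, 2007, 0.868)   # far below 36,039 + 2,007
```

With a high acceptance rate and few known rejects, the reject side binds, so enforcing proportionality discards most of the accepted orders.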

    6. Logistic regression. Performance measurement: PCC, AUC and profitability. Why a logit model: it is one of the most frequently used techniques in research and industry, and traditional statistical models such as logistic regression perform very well for credit scoring when compared to machine-learning techniques (Baesens et al., JORS, June 2003). Another technique often used in credit scoring, discriminant analysis, has been shown to introduce bias when used for extrapolation beyond the reject region (Hand and Henley, IMA Journal of Mathematics Applied in Business and Industry, 1994; Feelders, International Journal of Intelligent Systems in Accounting and Finance, 2000; Eisenbeis, Journal of Finance, 1977). Performance measurement: PCC, AUC, profit. PCC = (TP+TN)/(TP+TN+FP+FN), but it tacitly assumes equal misclassification costs for false positive versus false negative predictions, and class distributions are presumed constant and relatively balanced. It evaluates one specific cutoff, not a range of cutoffs, whereas AUC summarizes the classifier’s performance over all cutoff values. ROC chart: Y-axis = sensitivity (TP/(TP+FN), the fraction of actual positives correctly identified); X-axis = 1 − specificity, where specificity = TN/(FP+TN). A two-dimensional graphical illustration of the sensitivity versus the specificity for various values of the classification threshold. The area under this curve provides a simple figure of merit for the performance of the constructed classifier (cf. the Gini coefficient: 2 × the area between the curve and the diagonal).
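The two performance measures can be sketched in a few lines of numpy (a minimal illustration, not the paper's code; the toy scores and labels are invented):

```python
import numpy as np

def pcc(y, p, cutoff=0.5):
    """Fraction correctly classified at one fixed cutoff: (TP+TN)/(TP+TN+FP+FN)."""
    return np.mean((p > cutoff).astype(int) == y)

def auc(y, p):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the fraction of (positive, negative) pairs ranked correctly.
    Assumes no ties among the scores."""
    y, p = np.asarray(y), np.asarray(p)
    pos, neg = p[y == 1], p[y == 0]
    return np.mean(pos[:, None] > neg[None, :])

y = np.array([0, 0, 0, 1, 0, 1])                  # 1 = defaulter
p = np.array([0.1, 0.2, 0.35, 0.4, 0.6, 0.9])     # predicted default probability
print(pcc(y, p))   # 4 of 6 correct at cutoff 0.5
print(auc(y, p))   # 0.875; Gini = 2*auc - 1 = 0.75
```

Note that PCC changes when the cutoff moves, while AUC depends only on the ranking of the scores.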

    7. Logistic regression. Performance measurement: PCC, AUC and profitability. Resampling procedure: stratified resampling. Sensitivity analysis: to what degree does the extent of truncation influence the results? Resampling procedure: assess the variance of the performance indicators by splitting the data into a training and a validation set (stratified sampling allocating an equal percentage of defaulters to the training and holdout samples). This procedure is performed 100 times. Sensitivity analysis: the value of reject inference is driven by the extent of truncation, i.e. the size of the reject region. Since we have historical scores, we can simulate the situation in which only the best 70% of the orders (instead of 86.8%) would have been accepted, considering Hand and Henley’s (Journal of the Royal Statistical Society, 1997) observation that 70% is not unusual in mail-order consumer credit.
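The stratified resampling step can be sketched as follows (a self-contained illustration on synthetic labels; the split helper is hypothetical, not the paper's implementation). Stratification guarantees the defaulter share is the same in every training and holdout sample:

```python
import numpy as np

def stratified_split(y, frac, rng):
    """Train/holdout indices allocating an equal percentage of each class
    (defaulters and non-defaulters) to both parts."""
    train = []
    for cls in np.unique(y):
        idx = rng.permutation(np.where(y == cls)[0])
        train.extend(idx[:int(frac * len(idx))])
    train = np.array(sorted(train))
    hold = np.setdiff1d(np.arange(len(y)), train)
    return train, hold

rng = np.random.default_rng(0)
y = (rng.random(2000) < 0.05).astype(int)   # skewed classes, as in credit data

# repeat the partitioning 100 times; a model would be fit per split and its
# PCC/AUC collected, so the variance of the indicators can be assessed
rates = []
for _ in range(100):
    train, hold = stratified_split(y, 0.5, rng)
    rates.append(y[hold].mean())            # defaulter share in the holdout
```

Because the per-class counts are fixed by the stratification, the holdout defaulter share is identical across all 100 repetitions; only the composition of the samples varies.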

    8. Furnival & Wilson (1974): leaps-and-bounds algorithm. DeLong, DeLong & Clarke-Pearson (1988): comparing AUCs. We used the leaps-and-bounds algorithm of Furnival and Wilson (Technometrics, 1974), implemented in the SELECTION=SCORE option of the SAS LOGISTIC procedure, to detect the best model for every possible model size. The algorithm requires a minimum of arithmetic and finds the best subsets without examining all possible subsets. However, it generates a likelihood score (chi-square) statistic without significance tests, so we used the algorithm proposed by DeLong, DeLong and Clarke-Pearson (Biometrics, 1988) to investigate whether a model of a given size differs significantly in terms of AUC from the full model. We then selected the model with the lowest number of variables that does not differ significantly from the model using all characteristics at a 5% significance level.
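The AUC-comparison step can be approximated as follows. This sketch deliberately substitutes a paired bootstrap for the DeLong, DeLong & Clarke-Pearson variance estimate (which is not reproduced here); both test whether two correlated AUCs on the same holdout sample differ:

```python
import numpy as np

def auc(y, s):
    """AUC as the fraction of correctly ranked (defaulter, non-defaulter) pairs."""
    pos, neg = s[y == 1], s[y == 0]
    return np.mean(pos[:, None] > neg[None, :])

def bootstrap_auc_test(y, score_a, score_b, n_boot=1000, seed=0):
    """Two-sided p-value for equal AUCs of two scorecards on the same sample.
    A paired-bootstrap stand-in for the DeLong et al. (1988) test."""
    rng = np.random.default_rng(seed)
    y, score_a, score_b = map(np.asarray, (y, score_a, score_b))
    observed = auc(y, score_a) - auc(y, score_b)
    diffs, n = [], len(y)
    while len(diffs) < n_boot:
        idx = rng.integers(0, n, n)               # resample cases with replacement
        if y[idx].min() == y[idx].max():          # resample must hold both classes
            continue
        diffs.append(auc(y[idx], score_a[idx]) - auc(y[idx], score_b[idx]))
    diffs = np.asarray(diffs)
    # how often does the recentred bootstrap difference exceed the observed one?
    p = np.mean(np.abs(diffs - diffs.mean()) >= abs(observed))
    return observed, p
```

In the paper's procedure, such a test would be run per model size, stopping at the smallest model whose AUC does not differ significantly from the full 45-variable model.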

    9. Belgian catalog retailer. Orders between 1/7/2000 and 1/2/2002. Variable creation: demographics, occupation, financial information, and default information. Scoring process. The catalog retailer offers consumer credit to its customers, for articles as diverse as furniture, electronics, gardening & DIY equipment and jewelry. This analysis was performed at a time when the previous score (constructed by an international company specialized in consumer credit scoring) was due to be updated, since it had been in use for 6 years. Orders between July 1st 2000 and February 1st 2002, with follow-up until February 1st 2003, so the outcome could be tracked for all orders. Variables were inspired by the ideas of the company’s managers as well as by previous research. Dependent variable: the third reminder, because (i) the customer is then charged for the delay, (ii) this reminder really urges the customer to pay, and (iii) it has historically been used by the company. Profitability of the order was not used as the dependent variable, since the class distribution then became even more skewed, which severely degraded the performance of all models. Variables: it is a strategic decision to limit the information required upon application. Nevertheless, we computed 45 variables for this study, which can be found in the appendix of the full paper.

    10. An automatic scoring procedure and an independent manual selection procedure. A rather large set of orders was investigated ‘manually’, regardless of their score. Manual acceptance overrules the automatic decision.

    11. We coded defaulting orders as 1 and non-defaulting orders as 0, so the predicted probability is a default probability: the higher, the riskier. R3: of no importance; these orders were rejected for strategic and/or legal reasons. Since these rules are long-term rules, these orders are of no importance for future credit scoring in the company. Accepted by the score: 36,039; rejected by the score: 5,471 => an acceptance rate of 86.8%. Of the rejected orders, 36.7% are overrides. More than 95% of the orders rejected by the score were handled by a judgmental process.

    12. Does sample bias occur? 1A: Does a classifier trained on accepted orders only prove to be more erroneous on the calibration sample? Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups).

    13. Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups).

    14. Average AUC difference: 0.0812 points; t value: 48.02; p < 0.0001. Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups).

    15. Does sample bias occur? 1A: Does a classifier trained on accepted orders only prove to be more erroneous on the calibration sample? 1B: Does a classifier trained on accepted orders only lead to the selection of other variables? Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups).

    16. In order to detect whether sample bias influences the variable-selection process, we compared variable selection on a sample of 50% of the accept region (2) with variable selection on a proportional sample (1 and 3).

    17. Model sizes: all variables, 45; selected variables, 31; overlap, 24; difference, 7. Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups). By coincidence, in both cases the model with 31 variables was selected. Now the interesting part follows: the degree to which the different variables influence credit scoring performance and profitability will now be investigated.

    18. What is the impact of sample bias on credit scoring performance and profitability for a given sample size? Actual setting (86.8% accepted); sensitivity analysis (70% accepted). Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups). To ease the comparison between the actual setting and the sensitivity analysis, we always present the results side by side.

    19. Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups). The holdout sample is proportional, so we test the real-life situation (4 and 5). Comparisons with a given sample size: 1+2 versus 1+3, i.e. only accepted orders compared to accepted and rejected orders.

    20. Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups). As a sensitivity analysis, we recreate the situation in which the acceptance rate would have been 70%. Hence, a sample of the 6,833 orders with the highest default probabilities was appended to the calibration sample, whereby the weight of the previous calibration sample is reduced to 22% of the current sample and the influence of the manual selection procedure is drastically reduced. The holdout sample is again proportional to the through-the-door population, ensuring the 70% proportionality.
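Because historical scores are available for every order, the stricter 70% cutoff can be simulated by re-truncating on the score. A minimal sketch (the helper name and toy scores are illustrative, not the paper's code):

```python
import numpy as np

def simulate_acceptance(default_scores, rate):
    """Boolean mask keeping the `rate` share of orders with the lowest
    historical default score, mimicking a stricter (e.g. 70%) cutoff."""
    cutoff = np.quantile(default_scores, rate)
    return default_scores <= cutoff

scores = np.arange(100)               # stand-in historical default scores
accepted = simulate_acceptance(scores, 0.70)
# the 30 highest-risk orders move into the simulated reject region
```

The orders that fall between the 70% and 86.8% cutoffs have known outcomes, which is what makes the sensitivity analysis possible.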

    21. p · a + (1 − p) · (1 − a) (Morrison, 1969). The accuracy of a random model is given by this expression, where p = the true proportion of refunded orders and a = the proportion of applicants that will be accepted for credit (Morrison, JMR, 1969).
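This baseline is a one-liner; plugging in the dataset's figures from slides 11 and 32 (1.94% defaults, 86.8% acceptance) shows how high the random-model accuracy already is on such skewed data:

```python
def random_model_accuracy(p, a):
    """Expected PCC of a random acceptance rule (Morrison, 1969):
    p = true proportion of good (refunded) orders,
    a = proportion of applicants accepted."""
    return p * a + (1 - p) * (1 - a)

# with 1.94% defaults and an 86.8% acceptance rate:
baseline = random_model_accuracy(1 - 0.0194, 0.868)   # about 0.854
```

Any reported PCC should therefore be judged against this baseline, not against 0.5.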

    22. The second model clearly outperforms the other models in terms of PCC, but the gain in PCC that can be achieved by including the calibration sample in a proportional way seems to be low (0.0003), especially when compared to the difference resulting from the update of the model (0.0042).

    23. The results are completely analogous to the PCC results, which strongly confirms the earlier observations: 1) the second model is the best model in terms of AUC, and the third performs significantly worse; 2) the performance improvement between models 2 and 1 seems relatively small when compared to the difference resulting from the update of the model.

    24. For confidentiality reasons, we do not reveal absolute profit information, but we present the relative profit changes. In terms of profitability, it is clear that, both in the actual setting and in the sensitivity analysis, model 3 outperforms the other models. However, it must clearly be stated that the maximal improvement that can be made assuming perfect reject inference is only 1 to 3% in this setting. Additionally, this does not take into account the costs that would be incurred in determining the outcome of a sample of the orders that would normally be rejected, or the time cost of applying a reject-inference procedure. Again, it is confirmed that profit results differ from classification-performance results, and it would be up to management to decide upon the model that optimally meets the business objectives. The profit gain from including the calibration sample rises as the reject region becomes more important; this was not tested, but it seems logical that the impact of sample bias grows when the bias itself grows.

    25. Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups). In the previous sample composition, a large part of the data (21,800 orders) was deliberately left unused, in order to compare situations with equal sample sizes. Here, however, we enlarge the homogeneous sample with this part to investigate whether increasing the sample size of the homogeneous sample reduces the benefits of proportionality. PS: the sample size will be reduced whenever alpha > sigma (acceptance rate > percentage of rejects for which the outcome is available); calculations can be found in the paper.

    26. Does a trade-off exist between proportionality and sample size? Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups).

    27. Data partitioning repeated 100 times with a stratified sampling procedure (allocating an equal percentage of defaulters to both groups). This group of 21,800 orders was not used as such, but split randomly into 50 samples of 436 orders each, with an average of 7.6 defaulters per sample. These samples were added successively.
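The successive-addition experiment can be sketched end to end on synthetic data. Everything here is a stand-in: invented data, smaller sample sizes, and a minimal gradient-descent logit instead of the SAS LOGISTIC fits used in the paper:

```python
import numpy as np

def fit_logit(X, y, lr=0.1, steps=300):
    """Minimal logistic regression by gradient descent (sketch only)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w += lr * X.T @ (y - p) / len(y)
    return w

def auc(y, s):
    """AUC as the fraction of correctly ranked (defaulter, good) pairs."""
    pos, neg = s[y == 1], s[y == 0]
    return np.mean(pos[:, None] > neg[None, :])

rng = np.random.default_rng(3)

def make_data(n):
    X = rng.normal(size=(n, 3))
    y = (X @ np.array([1.0, -0.7, 0.4]) + rng.normal(size=n) > 1.0).astype(int)
    return X, y

base_X, base_y = make_data(500)      # stand-in for the accepted-orders sample
hold_X, hold_y = make_data(1000)     # stand-in for the proportional holdout
extra_X, extra_y = make_data(2500)   # stand-in for the 21,800 unused orders

# add the unused data in 50 successive chunks and track holdout AUC
curve = []
for k in range(1, 51):
    m = 50 * k                       # 50 chunks of 50 extra orders each
    w = fit_logit(np.vstack([base_X, extra_X[:m]]),
                  np.concatenate([base_y, extra_y[:m]]))
    curve.append(auc(hold_y, hold_X @ w))
```

Plotting `curve` against the number of added chunks reproduces the kind of learning curve discussed on the next slides: performance as a function of the homogeneous sample's size.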

    28. Comparison of the model with accepted orders only against a proportional model (but with variables selected on the basis of the accepted orders only), as we add the 50 data samples to the sample of accepted orders. The positive slope means that, as the size of the homogeneous sample rises, the quality of the model increases. The point at the intersection with the vertical axis (leftmost) is the situation of equal sample sizes discussed before: there, the model with accepted orders only performed significantly worse than the proportional model (indicated by a black mark, as opposed to a non-significant white mark). Yet as the sample size of the homogeneous sample increases, this difference quickly becomes insignificant, and when all data are added, the quality of the model with only accepted orders is clearly higher than that of the proportional model. The second graph: we saw earlier that, with equal sample sizes, PCC performance was not significantly different if we used the variables selected on the proportional sample. Here too, the quality of the model built on accepted orders only rises, and becomes significantly higher than that of the proportional sample as the sample size of the homogeneous sample increases.

    29. Again, the AUC results are very comparable and a little more stable than the PCC graphs. This confirms that, in terms of classification accuracy, sample size is more important than sample bias.

    30. Again, the profitability results show a different picture. The graph on the right is important, because this was the best model in the previous analysis in terms of profitability. Adding more data to the homogeneous sample of accepted orders does improve profitability slightly, but never up to the point where it exceeds the profitability of the proportional sample. Again, this seems to indicate that the impact of sample bias is somewhat higher in terms of profitability than in terms of predictive accuracy, although the impact remains modest (around 1% of profits).

    31. Effect of sample bias: significant yet modest improvements. Predictive performance differs from profitability. Impact of the inclusion of the calibration set in the training sample and in the variable-selection procedure. Proportionality vs sample size. In contrast with other studies, we have not proposed a reject-inference technique; instead, we estimate the maximal improvement that could be reached if a reject-inference procedure were flawless. The improvements are modest, especially when compared to the improvements achieved by updating the model and creating new variables, and once the cost of obtaining such a sample is accounted for (they are an upper limit for improvement). It seems at least counter-intuitive that a mere expansion of the homogeneous sample can compensate, at least in terms of predictive accuracy, for the lack of a calibration sample. To conclude: the effect of proportionality prevails, and enhancing proportionality can lead to improvements in classification accuracy and profitability. However, at least in this mail-order credit setting, the resulting benefits of any possible reject-inference technique are low.

    32. Direct-mail company: acceptance rate 86.8%; default percentage 1.94%; misclassification-cost ratio 2.58. Methodology of the sensitivity analysis. While we tried to improve the external validity of this study through the sensitivity analysis, the study was executed on the data of a single direct-mail company, where the cost of a defaulter was only 2.58 times higher than the profit gained from a non-defaulter. Further research depends largely on the availability of other credit scoring datasets.
