Redoing ratio analysis with clustering techniques

Redoing ratio analysis with clustering techniques Kexing Ding Xuan Peng Miklos Vasarhelyi Yunsen Wang

Peerselectionwithclustering • Weapplymachinelearningtechniquesonfinancialratiostoidentifyfinancialreportingpeerfirm. • Financialreportingpeerfirms:firmsthathavesimilarfinancialreportingqualities. • Observation:SICcodesarecommonlyused • Obsolete • Infrequentupdate • Focusonproductionprocess

Financialreportingqualitymodels • Beneish(1997,1999): • Financialreportingqualityisassociatedwitheightratios • Eitherthe accounting distortion or the preconditions that encourage managers to conduct misreporting. • Widelyusedinacademicandinpracticeeversince • Sales in receivables (Receivables/Sales), • Gross Margin (Sales-Costs of Goods Sold/Sales), • Asset Quality (1 - (Current Assets + PPE)/Total Assets), • Sales Growth (Salest/Salest-1), • Depreciation Rate (Depreciation/NetPPE), • SGA Rate (Sales, general, and administrative expenses/Sales), • Leverage (Total Debt/Total Assets), • Accruals (Total Accruals/ Total Assets).

Clustering method • Clustering analysis is one of the data mining methodologies that groups a set of objects in such a way that objects in a same cluster is more similar to each other than to those in the other clusters. • The K-medians algorithm operates on a set (X) of n points. It chooses k centers { from X and form k clusters { that the sum of the distances from each to the center of its clusters ci () is minimized. • An observation with eight financial ratios can be viewed as a point () in an eight-dimensional space. • Everyyear,theclusteringprocedureassignseachpointinoneofthe300clusters.

Results(1) • Clusteringsuccessfullyfindsfirmswithsimilarfinancialratios. • Lowwithin-groupdeviation • Highinter-groupdeviations

Results(2) • Enhanceaccountingfrauddetection model • Dechow(2011)F-scoremodel: • F-score = The predicted probability / the unconditional expectation of accounting fraud • The higher the F-score, the more suspicious • Revised model:

RankthefirmsbasedonF-score: Incorporating firms’ deviation from financial reporting peer firms enhances the ability of F-score to detect fraud firms. Model 1: Dechow (2011) Model Model 2a: Revised Model

1.WerankthefirmsbasedonF-score: The ability of F-score derived from the SIC 4-digit (NAIC 4-digit) classification to detect fraud is similar to the original Dechow et al. (2011) model. . Model 2a: Revised Model with clustering Model 2b and 2c are Revised models with SIC 3-digit and NAIC 4-digit

Type I error is calculated as the percentage of non-fraud observations that are incorrectly predicted as fraud. • Type II error is calculated as the percentage of fraud observations that are incorrectly predicted as non-fraud by the model. • Type1errorreducesfrom40.8%to34.7% • Type2 error reduces from32.4% to28.8%.

Thankyou!

Redoing ratio analysis with clustering techniques

Redoing ratio analysis with clustering techniques

Presentation Transcript

Ratio Analysis

Ratio Analysis

Ratio Analysis

RATIO ANALYSIS

Ratio Analysis

Ratio Analysis

Clustering Techniques

RATIO ANALYSIS

RATIO ANALYSIS

RATIO ANALYSIS

Ratio Analysis

Ratio Analysis

RATIO ANALYSIS

Ratio Analysis

RATIO ANALYSIS

Ratio Analysis

Ratio Analysis

Ratio Analysis

RATIO ANALYSIS

RATIO ANALYSIS

Ratio Analysis