Continuity Equations: Analytical Monitoring of Business Processes in Continuous Auditing. Michael G. Alles Alexander Kogan Miklos A. Vasarhelyi Jia Wu 12th World Continuous Auditing Symposium Nov 3-4, 2006. IT-enabled Business Processes (BPs).
IT-enabled Business Processes (BPs) • A business organization consists of a variety of business processes. • A business process is “a set of logically related tasks performed to achieve a defined business outcome” (Davenport and Short, 1990). • Modern information technology makes it possible to measure and monitor business processes at an unprecedented level of detail (disaggregation) and in real time. Currently, however, BP controls are rarely monitored. • Continuous auditing (CA) methodology can use this IT capability to capture BP data at the source, in disaggregated and unfiltered form, to achieve a more efficient, effective and timely audit.
Comparison between Conventional Analytical Procedures and CA Analytical Monitoring
Conventional Analytical Procedures: • Focus on financial data. • Audit data are summarized and aggregated. • Analytical modeling is based on the relationships between financial accounts. • Ratio analysis, trend analysis, reasonableness tests.
CA Analytical Monitoring: • Focus on business process data. • Audit data are unfiltered and disaggregated. • Analytical modeling is based on the relationships between business processes. • Continuity equation models.
Reengineering of Substantive Testing in CA • Analytical procedures (AP) can be used in the planning, substantive testing, and reviewing stages of an audit. We focus on AP in substantive testing. • Conventional auditing: • First, apply analytical procedures to identify potential problems. • Then, focus detailed transaction testing on the identified problem areas. • CA: the sequence is reversed: • First, apply automated general transaction tests to all the transactions and screen out identified exceptions for resolution. • Then, apply automated analytical procedures to the transaction stream to identify unforeseen problems. • Finally, alert humans to investigate anomalies (targeted transaction tests).
[Figure: Data-oriented Continuous Auditing System. Components shown: Enterprise System Landscape (Materials Management, Sales, Ordering, Accounts Receivable, Human Resources, Accounts Payable); Business Data Warehouse; Automatic Transaction Verification, which produces Exception Alarms; Automatic Analytical Monitoring (Continuity Equations), which produces Anomaly Alarms; Responsible Enterprise Personnel.]
Data-oriented CA: Automation of Substantive Testing • Automation of Transaction Testing: • Formalization of BP rules as transaction integrity and validity constraints. • Verification of transaction integrity and validity → detection of exceptions → generation of alarms. • Automation of Analytical Procedures: • Selection of critical BP metrics and development of stable business flow (continuity) equations. • Monitoring of continuity equation residuals → detection of anomalies → generation of alarms. • This presentation focuses on the automation of APs.
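The transaction-testing half of this slide can be illustrated with a minimal sketch. The presentation does not list concrete rules, so the field names (`po_id`, `amount`, `qty_ordered`, `qty_received`) and the three constraints below are hypothetical, chosen only to show how BP rules might be formalized as checks whose violations become exception alarms.

```python
from typing import Callable, Dict, List, Tuple

Transaction = Dict[str, object]
Rule = Tuple[str, Callable[[Transaction], bool]]

# Hypothetical integrity and validity constraints (not taken from the presentation).
RULES: List[Rule] = [
    ("has purchase order reference", lambda t: bool(t.get("po_id"))),
    ("amount is positive",
     lambda t: isinstance(t.get("amount"), (int, float)) and t["amount"] > 0),
    ("received quantity does not exceed ordered quantity",
     lambda t: t.get("qty_received", 0) <= t.get("qty_ordered", 0)),
]


def find_exceptions(transactions, rules=RULES):
    """Return (transaction, [names of violated rules]) for each failing transaction."""
    exceptions = []
    for t in transactions:
        violated = [name for name, check in rules if not check(t)]
        if violated:
            exceptions.append((t, violated))
    return exceptions


sample = [
    {"po_id": "PO-1", "amount": 120.0, "qty_ordered": 10, "qty_received": 10},
    {"po_id": "", "amount": 50.0, "qty_ordered": 5, "qty_received": 7},
]

for txn, violated in find_exceptions(sample):
    print("EXCEPTION ALARM:", txn, violated)
```

Keeping each rule as a named predicate means the alarm can say which constraint failed, which is what a person resolving the exception would need.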
Advanced Analytics in CA: BP Modeling Using Continuity Equations • Continuity equations: • Statistical models capturing relationships between various business processes rather than financial accounts. • Can be used as expectation models in the analytical procedures of continuous auditing. • Originated in physical sciences (various conservation laws: e.g. mass, momentum, charge). • Continuity equations are developed using statistical methodologies of: • Linear regression modeling (LRM); • Simultaneous equation modeling (SEM); • Multivariate time series modeling (MTSM): Vector Autoregressive Model (VAR), Subset-VAR, Bayesian-VAR (BVAR).
Basic Procurement Cycle: P.O.(t1) → Receive(t2) → Voucher(t3), with lag t2-t1 between purchase order and receipt and lag t3-t2 between receipt and voucher.
Inferred Analytical Model (Subset-VAR) of Procurement
P.O.(t) = 0.24*P.O.(t-4) + 0.25*P.O.(t-14) + 0.56*Receive(t-15) + ε_PO
Receive(t) = 0.26*P.O.(t-4) + 0.21*P.O.(t-6) + 0.60*Voucher(t-10) + ε_R
Voucher(t) = 0.54*Receive(t-1) - 0.17*P.O.(t-9) + 0.22*P.O.(t-17) + 0.24*Receive(t-17) + ε_V
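To show how such a model serves as an expectation model, here is a small sketch that evaluates the three equations on lagged observations and returns the residual. The coefficients and lags are copied from the slide; everything else is an assumption for illustration: the dictionary layout, the function names, and the choice to apply the equations directly to the raw series (the slide does not say whether the series were mean-adjusted or otherwise transformed).

```python
# Coefficients and lags from the slide: {target: [(source series, lag, coefficient), ...]}
MODEL = {
    "PO": [("PO", 4, 0.24), ("PO", 14, 0.25), ("Receive", 15, 0.56)],
    "Receive": [("PO", 4, 0.26), ("PO", 6, 0.21), ("Voucher", 10, 0.60)],
    "Voucher": [("Receive", 1, 0.54), ("PO", 9, -0.17),
                ("PO", 17, 0.22), ("Receive", 17, 0.24)],
}


def predict(history, target, t):
    """Expected value of `target` at index t, computed from lagged observations.

    `history` maps each series name to a list of observations (hypothetical layout).
    """
    terms = MODEL[target]
    max_lag = max(lag for _, lag, _ in terms)
    if t < max_lag:
        raise ValueError(f"need at least {max_lag} earlier observations, got {t}")
    return sum(coef * history[src][t - lag] for src, lag, coef in terms)


def residual(history, target, t):
    """Observed value minus the model's expectation at index t."""
    return history[target][t] - predict(history, target, t)


# Illustrative data only: 20 periods of constant activity.
history = {name: [100.0] * 20 for name in MODEL}
print(predict(history, "Voucher", 17))
print(residual(history, "Voucher", 17))
```

The residual is the quantity that gets compared against an acceptable threshold when the model is used for monitoring.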
Steps of Analytical Modeling and Monitoring Using Continuity Equations • Choose essential business processes to model (purchasing, payments, etc.). • Define (physical, financial, etc.) metrics to represent each process, e.g., dollar amount of purchase orders, quantity of items received, number of payment vouchers processed. • Choose the levels of aggregation of metrics: • By time (hourly, daily, weekly), by business unit, by customer or vendor, by type of products or services, etc.
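Choosing an aggregation level amounts to choosing a grouping key for the same transaction records. The sketch below assumes a hypothetical record layout (`date`, `vendor`, `amount`) and shows the same dollar-amount metric aggregated by day, by ISO week, and by vendor.

```python
from collections import defaultdict
from datetime import date


def aggregate(records, key):
    """Sum the `amount` field of the records, grouped by key(record)."""
    totals = defaultdict(float)
    for r in records:
        totals[key(r)] += r["amount"]
    return dict(totals)


def by_day(r):
    return r["date"]


def by_iso_week(r):
    iso = r["date"].isocalendar()  # (ISO year, ISO week number, ISO weekday)
    return (iso[0], iso[1])


def by_vendor(r):
    return r["vendor"]


# Illustrative purchase-order records (hypothetical field names and values).
records = [
    {"date": date(2006, 11, 1), "vendor": "A", "amount": 100.0},
    {"date": date(2006, 11, 3), "vendor": "B", "amount": 50.0},
    {"date": date(2006, 11, 6), "vendor": "A", "amount": 25.0},
]

daily = aggregate(records, by_day)
weekly = aggregate(records, by_iso_week)
per_vendor = aggregate(records, by_vendor)
print(daily, weekly, per_vendor, sep="\n")
```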
Steps of Analytical Modeling and Monitoring Using Continuity Equations (II) • Identify and estimate stable statistical relationships between business process metrics: the Continuity Equations (CEs). • Define acceptable thresholds of variance from the expected relationships. • If the variances (residuals) exceed the acceptable levels, alert human auditors to investigate the anomaly (i.e., the relevant sub-population of transactions).
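The last two steps can be sketched as a residual check against an acceptable band. How the threshold is set is not specified in the slides; the version below assumes, purely for illustration, a band of the mean plus or minus `k` standard deviations of the residuals observed on a training sample, with `k` a hypothetical tuning parameter.

```python
from statistics import mean, stdev


def fit_band(training_residuals, k=2.0):
    """Acceptable residual band: mean +/- k * sample std. dev. (k is an assumption)."""
    m = mean(training_residuals)
    s = stdev(training_residuals)
    return m - k * s, m + k * s


def flag_anomalies(actual, expected, band):
    """Indices of observations whose residual (actual - expected) leaves the band."""
    low, high = band
    return [
        i for i, (a, e) in enumerate(zip(actual, expected))
        if not low <= a - e <= high
    ]


# Illustrative numbers only.
band = fit_band([-2.0, 1.0, 0.0, 2.0, -1.0])
alarms = flag_anomalies(
    actual=[100.0, 103.0, 120.0],
    expected=[101.0, 101.0, 101.0],
    band=band,
)
print(band, alarms)
```

A wider band (larger `k`) produces fewer alarms but lets more anomalies through, which is the trade-off the later slides discuss.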
How Do We Evaluate CE Models? • Linear Regression Model is the classical benchmark for comparison. • Models are compared on two aspects: • Prediction Accuracy, and • Anomaly Detection Capability.
Prediction Accuracy Comparison: Results Analysis • Mean Absolute Percentage Error (MAPE) is used to measure prediction accuracy. • Prediction accuracy comparison results: • Multivariate Time Series (best). • Linear regression (middle). • Simultaneous Equations (worst). • The difference is small (<2%). • Noise in our data sets may distort the results. • Prediction accuracy is relatively good for all continuity equation models: by comparison, there are studies in which MAPE exceeds 100%.
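For reference, MAPE is the mean of the absolute errors expressed as a percentage of the actual values. A minimal sketch follows; skipping periods whose actual value is zero is an assumption made here to avoid division by zero, not something the slides specify.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Periods with an actual value of zero are skipped (an assumption of this sketch).
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to compare against")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Illustrative numbers only.
print(mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0]))
print(mape([10.0], [25.0]))
```

Because the error is scaled by the actual value, a prediction that misses by more than the actual amount yields a MAPE above 100%, as in the second call.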
Simulating Error Stream: The Ultimate Test of CA Analytics • Seed errors of various magnitudes into a randomly chosen subset of the holdout sample. • Identify anomalies as those observations in the holdout sample for which the variance (residual) exceeds the acceptable threshold. • Test whether the anomalies are the observations with seeded errors, and count the number of false positives (Type I errors) and false negatives (Type II errors). • Repeat this simulation several times, choosing a different random subset to seed errors into each time.
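The simulation loop can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: errors are seeded by inflating a value by a fixed fraction, the expectation model is a constant, and the threshold is an arbitrary absolute deviation. All names and numbers are hypothetical.

```python
import random


def seed_errors(values, n_errors, magnitude, rng):
    """Inflate n_errors randomly chosen values by `magnitude` (0.5 means +50%).

    Returns the corrupted copy and the set of seeded indices.
    """
    seeded = set(rng.sample(range(len(values)), n_errors))
    corrupted = [v * (1 + magnitude) if i in seeded else v
                 for i, v in enumerate(values)]
    return corrupted, seeded


def run_trial(holdout, expected, threshold, n_errors, magnitude, rng):
    """Seed errors, flag large residuals, and count (false positives, false negatives)."""
    corrupted, seeded = seed_errors(holdout, n_errors, magnitude, rng)
    flagged = {i for i, (a, e) in enumerate(zip(corrupted, expected))
               if abs(a - e) > threshold}
    return len(flagged - seeded), len(seeded - flagged)


rng = random.Random(0)
holdout = [100.0 + (i % 5 - 2) for i in range(50)]  # values between 98 and 102
expected = [100.0] * 50                             # stand-in expectation model

large = [run_trial(holdout, expected, 10.0, 5, 0.50, rng) for _ in range(20)]
small = [run_trial(holdout, expected, 10.0, 5, 0.05, rng) for _ in range(20)]
print(large[:3], small[:3])
```

With these illustrative settings the large seeded errors are always caught and the small ones are always missed, which shows why the simulation has to be run across a range of error magnitudes.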
Measuring Anomaly Detection • False positive error (false alarm, Type I error): a non-anomaly that the model mistakenly flags as an anomaly. Decreases efficiency. • False negative error (Type II error): an anomaly that the model fails to detect. Decreases effectiveness. • A good analytical model is expected to have good anomaly detection capability: a low false negative error rate and a low false positive error rate.
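As a worked illustration of the two rates, the sketch below computes them from parallel lists of booleans. The slides do not state the denominators; this version assumes the false positive rate is taken over the non-anomalous observations and the false negative rate over the truly anomalous ones.

```python
def error_rates(is_anomaly, is_flagged):
    """Return (false positive rate, false negative rate).

    Assumed denominators: non-anomalies for the false positive rate,
    true anomalies for the false negative rate.
    """
    pairs = list(zip(is_anomaly, is_flagged))
    false_pos = sum(1 for truth, flag in pairs if flag and not truth)
    false_neg = sum(1 for truth, flag in pairs if truth and not flag)
    n_anomalies = sum(1 for truth, _ in pairs if truth)
    n_clean = len(pairs) - n_anomalies
    fp_rate = false_pos / n_clean if n_clean else 0.0
    fn_rate = false_neg / n_anomalies if n_anomalies else 0.0
    return fp_rate, fn_rate


# Two true anomalies (one missed) and four clean observations (one false alarm).
rates = error_rates(
    is_anomaly=[True, True, False, False, False, False],
    is_flagged=[True, False, True, False, False, False],
)
print(rates)
```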
Simulated Real-time Error Correction • CA makes it possible to investigate a detected anomaly in (nearly) real-time. • Anomaly investigation can likely correct a detected problem in (nearly) real-time. • Real-time problem correction results in utilizing the actual (not erroneous) values in analytical BP models for future predictions. • Real-time error correction is likely to make subsequent anomaly detection more accurate, and the magnitude of this benefit can be evaluated using simulation.
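The effect of real-time correction can be seen in a toy one-step-ahead monitor. The model here, expected[t] = phi * previous value, is a hypothetical stand-in for a continuity equation, and the first observation is assumed to be error-free. With correction switched on, an alarm causes the true value to replace the erroneous one before the next prediction is made.

```python
def monitor(observed, true_values, phi, threshold, correct):
    """One-step-ahead monitoring with a hypothetical model: expected[t] = phi * used[t-1].

    Returns the indices at which an alarm was raised. When `correct` is True,
    an alarmed observation is replaced by its true value (simulating an
    investigation that fixes the error) before it is used for later predictions.
    """
    used = [observed[0]]  # assumes the first observation is error-free
    alarms = []
    for t in range(1, len(observed)):
        expected = phi * used[t - 1]
        value = observed[t]
        if abs(value - expected) > threshold:
            alarms.append(t)
            if correct:
                value = true_values[t]
        used.append(value)
    return alarms


true_values = [100.0] * 8
observed = list(true_values)
observed[3] = 160.0  # a single seeded error

without_correction = monitor(observed, true_values, 1.0, 20.0, correct=False)
with_correction = monitor(observed, true_values, 1.0, 20.0, correct=True)
print(without_correction, with_correction)
```

Without correction the erroneous value feeds the next prediction and triggers a second, false alarm on a clean observation; with correction only the seeded error is flagged.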
Error Detection: Aggregated Data vs. Disaggregated Data • In CA the disaggregated data are available. Can the disaggregated data boost anomaly detection performance? • Dimensions for aggregation and disaggregation: • temporal and geographic. • A comparative simulation study of error detection vs. BP metric aggregation has to examine different aggregation patterns of seeded errors: • Best case – aggregated error (e.g., total weekly error seeded in a single day) • Worst case – disaggregated error (e.g., total weekly error is equally partitioned between every day of the week) • Intermediate case – somewhat disaggregated error
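The three error patterns can be made concrete with a small sketch in which the same weekly error total is spread over the first `days_hit` days of the week: one day is the best case, all seven days is the worst case, and anything in between is intermediate. The expected values and both thresholds are hypothetical numbers chosen only to make the contrast visible.

```python
def spread_error(total_error, days, days_hit):
    """Split total_error equally across the first days_hit days of a period."""
    if not 1 <= days_hit <= days:
        raise ValueError("days_hit must be between 1 and days")
    share = total_error / days_hit
    return [share] * days_hit + [0.0] * (days - days_hit)


def daily_alarms(actual, expected, threshold):
    """Indices of days whose deviation from the expectation exceeds the threshold."""
    return [i for i, (a, e) in enumerate(zip(actual, expected))
            if abs(a - e) > threshold]


def weekly_alarm(actual, expected, threshold):
    """True if the weekly total deviates from its expectation by more than the threshold."""
    return abs(sum(actual) - sum(expected)) > threshold


expected = [100.0] * 7  # hypothetical daily expectation
results = {}
for days_hit in (1, 2, 7):
    errors = spread_error(70.0, 7, days_hit)
    actual = [e + err for e, err in zip(expected, errors)]
    results[days_hit] = (
        daily_alarms(actual, expected, threshold=15.0),
        weekly_alarm(actual, expected, threshold=40.0),
    )
print(results)
```

In this toy setting the fully disaggregated error slips under the daily threshold while still being visible in the weekly total, which is why the comparison has to consider how the error itself is distributed.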
Results and Conclusions from Simulation Studies • Various statistical methods can be used to derive expectation models of acceptable quality: • Linear regression is often adequate; • Multivariate time series methodology can provide somewhat more accurate models. • Real-time error correction significantly improves the error detection capabilities of all models. • More disaggregated models are not always better: weekly data can be more stable than daily data. • Alarms have to be managed: there is a trade-off between Type I and Type II errors.
Concluding Remarks • New CA-enabled analytical audit methodology: simultaneous relationships between highly disaggregated BP metrics. • How to automate the inference and estimation of numerous CE models? • How to identify and remove outliers from the historical data to estimate statistically valid CEs (step-wise re-estimation of CEs)? • How to choose the confidence level for generating alarms (trade-off between Type I and Type II errors: efficiency vs. effectiveness)? • How to make it worthwhile (is it worth the cost)?