
Peiman Pazhoheshfar Young Researchers Club, Azad University of Tafresh,Iran P.Pazhohesh@gmail

Penalized Trimmed Squares and Quadratic Mixed Integer Programming for Deleting Outliers in Fuzzy Linear Regression. Peiman Pazhoheshfar Young Researchers Club, Azad University of Tafresh, Iran P.Pazhohesh@gmail.com Eleventh International Conference on Fuzzy Set Theory and Applications


Presentation Transcript


  1. Penalized Trimmed Squares and Quadratic Mixed Integer Programming for Deleting Outliers in Fuzzy Linear Regression Peiman Pazhoheshfar Young Researchers Club, Azad University of Tafresh, Iran P.Pazhohesh@gmail.com Eleventh International Conference on Fuzzy Set Theory and Applications (FSTA 2012)

  2. Outline • Introduction • Fuzzy regression models • A mathematical programming approach • Quadratic mixed integer programming for penalized trimmed squares (PTS) • Numerical example • Conclusion

  3. 1- Introduction • The use of statistical linear regression is restricted by some strict assumptions about the given data. • Fuzzy regression (FR) was introduced as an extension of conventional regression and is used to estimate the relationships among variables.

  4. 1- Introduction • The goal of FR analysis is to find a regression model that fits all observed fuzzy data within a specified fitting criterion. • Two approaches to FR: • 1. Minimizing fuzziness as the optimality criterion: simple to program and compute, but it yields estimation ranges that are too wide to be of much help in applications. • 2. Least squares of errors as the fitting criterion, which minimizes the total squared error of the output.

  5. 1- Introduction In fuzzy linear regression models, data often contain outliers and bad influential observations. If the data are contaminated with only one or a few outliers, identifying such observations is not difficult. Detecting outliers can identify system faults and fraud before they escalate with potentially catastrophic consequences. Sources of outliers: instrument error, mechanical faults, human error, changes in system behavior.

  6. 2- Fuzzy regression models 𝛼𝑗 is the central value of the fuzzy coefficient and 𝑐𝑗 is its spread. [Figure: membership function with axis marks at 𝛼𝑗 − 𝑐𝑗, 𝛼𝑗, and 𝛼𝑗 + 𝑐𝑗]
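The transcript does not show the membership function itself. As a minimal sketch, assuming the symmetric triangular shape that is common in fuzzy linear regression, a coefficient with center `alpha` and spread `c` can be evaluated as follows. The function names are illustrative and do not come from the slides.

```python
def membership(x, alpha, c):
    """Membership degree of x in a symmetric triangular fuzzy number.

    Assumption: the fuzzy number peaks at alpha (degree 1) and falls
    linearly to 0 at alpha - c and alpha + c.
    """
    if c <= 0:
        # Degenerate case: a crisp number with no spread.
        return 1.0 if x == alpha else 0.0
    return max(0.0, 1.0 - abs(x - alpha) / c)


def h_level_interval(alpha, c, h):
    """Interval of values whose membership degree is at least h (0 <= h <= 1)."""
    half_width = (1.0 - h) * c
    return alpha - half_width, alpha + half_width
```

Under this assumption, the interval at level h = 0 is the full support from `alpha - c` to `alpha + c`, and it shrinks to the single point `alpha` at h = 1.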

  7. 2- Fuzzy regression models • [symbol missing in the transcript] are assumed to be non-negative, because the fuzziness in estimated intervals usually increases for larger values of the independent variables. • The results are scale dependent, and many of them may equal zero. • The total vagueness of the given data should be minimized. • To remedy this problem, the sum of spreads of the estimated intervals can be used as the objective function in place of the sum of spreads of the FR model's coefficients.

  8. 2- Fuzzy regression models • Each H-certain estimated interval is required to contain the corresponding H-certain observed interval. • This results in large coefficient spreads 𝑐𝑗 if any dependent variable has a large spread 𝑒𝑗 or if there are outliers.

  9. 3- A mathematical programming approach • Penalized Trimmed Squares (PTS): • PTS is defined by minimizing a convex objective function (loss function), which is the sum of squared residuals and penalty costs for discarding bad observations. • The robust estimate is obtained as the unique optimum solution of the convex mathematical program, a quadratic mixed integer program (QMIP). • Assumptions: • Crisp input • Crisp output • The relation between input and output is a fuzzy function

  10. 3- A mathematical programming approach • The M observations are split into k clean data points and M − k outliers. • The basic idea is to insert fixed penalty costs into the loss function for possible deletion. • Only observations whose deletion reduces the loss by more than their penalty cost are deleted from the data set. • The proposed PTS estimator minimizes the sum of: • the k squared residuals in the clean data • the penalties for deleting the remaining observations.
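As a hedged sketch of this loss (not necessarily the exact formulation used in the talk), the crisp PTS objective is commonly written with one binary deletion indicator δ_i per observation. Here r_i(β) is the residual of observation i, the penalty is written as (cσ)², and the symbol β for the coefficient vector is an assumption introduced for illustration.

```latex
\min_{\beta,\ \delta \in \{0,1\}^{M}} \;
\sum_{i=1}^{M} \Big[ (1-\delta_i)\, r_i(\beta)^2 + \delta_i\, (c\sigma)^2 \Big],
\qquad k = M - \sum_{i=1}^{M} \delta_i
```

For a fixed β, the minimizing choice is δ_i = 1 exactly when r_i(β)² > (cσ)², which matches the rule that an observation is deleted only if removing it saves more than its penalty.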

  11. 3- A mathematical programming approach [Figure: membership function with axis marks at 𝛼𝑗 − 𝑐𝑗, 𝛼𝑗, and 𝛼𝑗 + 𝑐𝑗]

  12. 3- A mathematical programming approach The above analysis leads to the following quadratic programming problem: [formulation not included in the transcript] The value 𝑐σ can be interpreted as a threshold for the allowable size of the residuals.

  13. 3- A mathematical programming approach • The constant c is the cut-off parameter well known from robust statistics; here it acts as a cut-off between outlying data and prediction vagueness. • 2.5σ or 3σ is a reasonable threshold under Gaussian conditions. • The penalty cost is defined a priori, and the estimator's performance is very sensitive to this penalty, which regulates the robustness and the efficiency of the estimator. • The term (𝑐σ)² can be interpreted as the penalty cost for deleting any observation, where σ is a robust residual scale and c is a cut-off parameter.
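The role of the penalty (cσ)² can be illustrated with a short sketch for crisp residuals. The slides do not say how the robust scale σ is computed, so the normalized median absolute deviation used here is an assumption, and the function names `robust_scale` and `pts_loss` are hypothetical.

```python
import numpy as np


def robust_scale(residuals):
    """Robust residual scale via the normalized MAD (an assumed choice of sigma).

    The factor 1.4826 makes the MAD consistent with the standard
    deviation under Gaussian noise.
    """
    r = np.asarray(residuals, dtype=float)
    return 1.4826 * np.median(np.abs(r - np.median(r)))


def pts_loss(residuals, c=2.5, sigma=None):
    """PTS loss for fixed residuals, with the deletion mask that attains it.

    An observation is deleted when its squared residual exceeds the
    penalty (c * sigma) ** 2; otherwise it contributes its squared residual.
    """
    r = np.asarray(residuals, dtype=float)
    if sigma is None:
        sigma = robust_scale(r)
    penalty = (c * sigma) ** 2
    delete = r ** 2 > penalty
    loss = float(np.sum(np.where(delete, penalty, r ** 2)))
    return loss, delete
```

With c = 2.5 and σ = 1, for example, a residual of 5.0 has squared size 25, which exceeds the penalty 6.25, so that observation would be deleted, while residuals near 0.1 are kept.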

  14. 3- A mathematical programming approach • The aim is to construct a regression estimator that combines a high breakdown point with good efficiency. • For this purpose, appropriate penalties for high-leverage observations are developed, in order to: • Unmask multiple outliers • Delete bad high-leverage outliers while keeping all good high-leverage points

  15. 4- Quadratic mixed integer programming for PTS • Observations: (𝑋0, 𝑌0), (𝑋1, 𝑌1)… (𝑋𝑀, 𝑌𝑀) • Robust penalties = (𝑞𝜎)² • If 𝛿𝑖 = 1, the residual is reduced to zero and the loss function is penalized with (𝑞𝜎)².
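The effect of the binary variables 𝛿𝑖 can be seen on a toy problem. The sketch below is not the QMIP solver from the talk and covers only crisp (non-fuzzy) data: it enumerates every deletion set and fits ordinary least squares to the remaining points, which solves the same mixed integer problem exactly for a very small M. The function name `pts_exhaustive` and the single shared `penalty` argument are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def pts_exhaustive(X, y, penalty):
    """Exact crisp PTS fit by brute-force enumeration of deletion sets.

    Returns (loss, beta, deleted), where deleted[i] is True when
    observation i is removed (delta_i = 1). Only practical for small M,
    since the number of subsets grows exponentially.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    m, p = X.shape
    best = (np.inf, None, None)
    # Keep at least p observations so the least-squares fit is determined.
    for n_deleted in range(m - p + 1):
        for deleted in combinations(range(m), n_deleted):
            keep = np.ones(m, dtype=bool)
            keep[np.array(deleted, dtype=int)] = False
            beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
            rss = float(np.sum((y[keep] - X[keep] @ beta) ** 2))
            # Deleted points contribute the penalty instead of a residual.
            loss = rss + n_deleted * penalty
            if loss < best[0]:
                best = (loss, beta, ~keep)
    return best
```

On data lying exactly on a line apart from one grossly shifted point, the enumeration should delete only that point, because paying one penalty is cheaper than absorbing its large squared residual. A real QMIP solver reaches the same optimum without visiting every subset.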

  16. 5- Numerical example • How does the proposed method perform in fuzzy regression analysis in comparison with other methods? • This example has fuzzy observations only for the dependent variable. • Example • Tanaka et al. (1989) designed an example to illustrate their regression model for the case of a crisp independent variable and a fuzzy dependent variable. • Diamond (1988) • Kim and Bishu (1998) • Savic and Pedrycz (1991) • Kao and Chyu (2003) • Nasrabadi et al. (2003)

  17. 5- Numerical example In the study of Tanaka et al. (1989), three types of fuzzy regression models were discussed: the Min problem, the Max problem, and the Conjunction problem. For the sake of simplicity, the results of the Min problem at h = 0 are used for comparison.

  18. 6- Conclusion A new methodology for deleting outliers in fuzzy linear regression is presented, which reduces the problem to a quadratic mixed integer program. The approach is shown to perform well when compared with other models in the fuzzy regression literature.

  19. Thanks for your attention
