
Metamorphic Testing for Classifiers in Machine Learning

Explore the application of Metamorphic Testing to supervised classifiers in machine learning to tackle non-testable programs. Utilize metamorphic properties to guide transformation functions for testing without having a complete test oracle.


Presentation Transcript


  1. Application of Metamorphic Testing to Supervised Classifiers • Xiaoyuan Xie, Tsong Yueh Chen (Swinburne University of Technology) • Christian Murphy, Gail Kaiser (Columbia University) • Joshua Ho (University of Sydney) • Baowen Xu (Nanjing University)

  2. Background • Many applications in the field of scientific computing depend on machine learning (ML) algorithms • ML applications often do not have test oracles that indicate whether the output is correct for arbitrary input • Applications without test oracles are called “non-testable programs”

  3. Problem Statement • Oracles may exist for a limited subset of the input domain, and gross errors (e.g. crashes) can be detected with certain inputs or techniques • However, it is difficult to detect subtle (computational) errors for arbitrary inputs

  4. Testing ML Applications • There has been much research into applying ML techniques to software testing, but not the other way around • Reusable real-world data sets and frameworks are available for checking that an ML algorithm predicts well, but not for checking that an implementation works correctly

  5. Observation • If there is no oracle in the general case, we cannot know the expected relationship between a particular input and its output • However, it may be possible to know relationships between a set of inputs and the corresponding set of outputs • “Metamorphic Testing” [Chen et al. ’98] is such an approach

  6. Metamorphic Testing • An approach for creating follow-on test cases based on previous test cases • If input x produces output f(x), then the function’s “metamorphic properties” are used to guide a transformation function t, which is applied to produce a new test case input, t(x) • We can then predict the expected value of f(t(x)) based on the value of f(x) obtained from the actual execution

  7. Metamorphic Testing without an Oracle • When a test oracle exists, we can know whether f(t(x)) is correct • Because we have an oracle for f(x) • So if f(t(x)) is as expected, then it is correct • When there is no test oracle, f(x) acts as a “pseudo-oracle” for f(t(x)) • If f(t(x)) is as expected, it is not necessarily correct • However, if f(t(x)) is not as expected, either f(x) or f(t(x)) (or both) is wrong
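
The pseudo-oracle idea from slides 6 and 7 can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: the function under test `f` is stood in for by `math.sin`, the transformation `t` uses the identity sin(π − x) = sin(x), and the helper name `check_relation` and the tolerance are assumptions made for the example.

```python
import math

def f(x):
    # Program under test; a stand-in for a function with no convenient oracle.
    return math.sin(x)

def t(x):
    # Transformation guided by the metamorphic property sin(pi - x) = sin(x).
    return math.pi - x

def check_relation(x, tol=1e-9):
    # f(x) acts as the pseudo-oracle for f(t(x)): we never ask whether
    # either value is correct, only whether the two agree as predicted.
    return math.isclose(f(x), f(t(x)), abs_tol=tol)

inputs = [0.0, 0.3, 1.2, 2.5, -4.0]
violations = [x for x in inputs if not check_relation(x)]
print(violations)
```

An empty `violations` list does not prove that `f` is correct; a non-empty one shows that `f(x)`, `f(t(x))`, or both are wrong.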

  8. Metamorphic Testing Example • Consider a program that reads a text file of test scores for students in a class, and computes the averages and the standard deviation of the averages • If we permute the values in the text file, the results should stay the same • If we multiply each score by 10, the final results should all be multiplied by 10 as well • These metamorphic properties can be used to create a “pseudo-oracle” for the application
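
The two properties on slide 8 can be checked mechanically. The sketch below assumes the scores are already parsed into one list per student (file reading is omitted), and the function name `summarize` and the sample data are hypothetical.

```python
import math
import random
import statistics

def summarize(rows):
    # Hypothetical program under test: one average per student,
    # then the standard deviation of those averages.
    averages = [sum(r) / len(r) for r in rows]
    return averages, statistics.pstdev(averages)

rows = [[70, 85, 90], [55, 60, 65], [88, 92, 79], [100, 45, 73]]
avg, sd = summarize(rows)

# Property 1: permuting students and the scores within each student
# leaves the set of averages and the standard deviation unchanged.
shuffled = [random.sample(r, len(r)) for r in random.sample(rows, len(rows))]
avg_p, sd_p = summarize(shuffled)
perm_ok = (
    all(math.isclose(a, b) for a, b in zip(sorted(avg), sorted(avg_p)))
    and math.isclose(sd, sd_p)
)

# Property 2: multiplying every score by 10 multiplies every result by 10.
scaled = [[10 * s for s in r] for r in rows]
avg_s, sd_s = summarize(scaled)
scale_ok = (
    all(math.isclose(10 * a, b) for a, b in zip(avg, avg_s))
    and math.isclose(10 * sd, sd_s)
)

print(perm_ok, scale_ok)
```

Comparisons use `math.isclose` rather than `==` because floating-point results may differ in the last digits even when the property holds.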

  9. Approach • To apply Metamorphic Testing to such ML applications, we first enumerate the metamorphic relations based on the expected behaviors of a given machine learning algorithm • We then utilize these relations to conduct metamorphic testing on the implementation

  10. Verification & Validation • Which metamorphic properties are necessary may differ between problems in the domain • Properties that are necessary can be used for verification: “Is the implementation of the algorithm correct?” • Other properties can be used for validation: “Is the algorithm appropriate for solving this problem?”

  11. Research Questions • What are the metamorphic properties of supervised ML classification algorithms? • Which can be used for verification? • Which can be used for validation? • Can metamorphic testing detect defects in real-world ML applications?

  12. Machine Learning Fundamentals • Data sets consist of a number of samples, each of which has attributes and a label • In the first phase (“training”), a model is generated that attempts to generalize how attributes relate to the label • In the second phase, the model is applied to a previously-unseen data set (“testing” data) with unknown labels to produce a classification of each sample

  13. Algorithms Investigated • k-Nearest Neighbors (kNN) • Samples in the testing data are classified by using Euclidean distance to find the k nearest samples in the training data • Classification is then done by majority rule • Naïve Bayes Classifier (NBC) • For a given sample in the testing data, computes the probability of that sample belonging to each class, assuming conditional independence between the attributes • Chooses the class that is most likely
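
The kNN description above translates into a short sketch. This is a generic illustration, not Weka's implementation; the function name `knn_classify`, the default `k=3`, and the toy data are assumptions.

```python
import math
from collections import Counter

def knn_classify(train, labels, sample, k=3):
    # Rank training samples by Euclidean distance to the test sample.
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], sample))
    # Majority rule over the k nearest neighbors. On a tie, Counter returns
    # the label it encountered first, so tie-breaking here is arbitrary.
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

train = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_classify(train, labels, (1.1, 1.0)))
print(knn_classify(train, labels, (5.1, 5.0)))
```

`math.dist` requires Python 3.8 or later.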

  14. Metamorphic Relations • We identified 11 properties that we would expect all classification algorithms to have • Affine transformation of attributes • Permutation of labels or attributes • Addition of informative or uninformative attributes • Addition of classes by duplicating or re-labeling samples • Removal of classes or samples
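
Two of the relation families listed above, permutation of attributes and permutation of labels, can be turned into follow-on test cases as sketched below. The slide does not give exact definitions, so the formulations here are one plausible reading; the classifier is a generic kNN rather than the Weka implementation, and the data shape (four attributes, 30 random training samples) only loosely follows the setup on the next slide.

```python
import math
import random
from collections import Counter

def knn_predict(train, labels, tests, k=3):
    # Generic kNN: Euclidean distance, majority rule over k neighbors.
    preds = []
    for s in tests:
        nearest = sorted(range(len(train)), key=lambda i: math.dist(train[i], s))[:k]
        preds.append(Counter(labels[i] for i in nearest).most_common(1)[0][0])
    return preds

rng = random.Random(0)  # fixed seed so the run is repeatable
train = [[rng.random() for _ in range(4)] for _ in range(30)]
labels = [rng.choice("ab") for _ in range(30)]
tests = [[rng.random() for _ in range(4)] for _ in range(10)]
base = knn_predict(train, labels, tests)

def reorder(rows, cols=(2, 0, 3, 1)):
    # Apply the same column permutation to every sample.
    return [[r[c] for c in cols] for r in rows]

# Permutation of attributes: reordering columns consistently in the
# training and testing data should not change any prediction.
attr_ok = knn_predict(reorder(train), labels, reorder(tests)) == base

# Permutation of labels: renaming labels one-to-one should rename the
# predictions in the same way.
swap = {"a": "b", "b": "a"}
swapped = knn_predict(train, [swap[l] for l in labels], tests)
label_ok = swapped == [swap[p] for p in base]

print(attr_ok, label_ok)
```

With two classes and `k=3` there are no voting ties, which keeps the label relation free of tie-breaking effects.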

  15. Experimental Setup • Applied the approach to implementations in the Weka 3.5.7 toolkit • Initial test cases: • Randomly generated values • Four attributes (“columns”) • 20-50 samples (“rows”) • Metamorphic relations were applied to create 20-300 follow-on test cases

  16. Results • [Results tables: k-Nearest Neighbors; Naïve Bayes Classifier]

  17. Analysis: kNN • No necessary properties were violated • Issues related to validation: • Labels that are non-existent in the training data have a non-zero chance of being selected in classification • If two labels are equally likely, the “first” one that is listed is chosen

  18. Analysis: Naïve Bayes • Four necessary properties were violated, indicating defects in the implementation • Loss of precision related to use of the “double” datatype in Java • Laplace Accuracy used to determine probabilities; thus, labels that did not appear in training data have non-zero probability
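
The non-zero probability for unseen labels can be reproduced with a small sketch. It assumes that "Laplace Accuracy" refers to an add-one (Laplace) style estimate applied to label counts; this is generic smoothing written for illustration, not the Weka code, and `smoothed_priors` is a hypothetical helper.

```python
from collections import Counter
from fractions import Fraction

def smoothed_priors(train_labels, all_labels):
    # Add-one estimate: every declared label starts with a count of 1,
    # so a label with no training samples still gets a positive prior.
    counts = Counter(train_labels)
    total = len(train_labels) + len(all_labels)
    return {c: Fraction(counts[c] + 1, total) for c in all_labels}

# Label "c" is declared but never appears in the training data.
priors = smoothed_priors(["a", "a", "b"], ["a", "b", "c"])
print(priors["c"])
```

Under a relation that expects labels absent from the training data never to be predicted, this positive prior is what makes a violation possible.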

  19. Suggestions • We suggest using the “BigDecimal” class instead of the “double” datatype • Laplace Accuracy is appropriate for the attributes but not for the labels • Use of Laplace Accuracy should be set as an option
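
The precision concern can be illustrated outside Java. Python's `float` is the same IEEE 754 double-precision type as Java's `double`, and `decimal.Decimal` plays a role comparable to `BigDecimal`. The sketch shows only one generic way precision loss can surface (results that depend on the order of operations, as could happen when attributes are permuted); it is not a reproduction of the defect found in the study.

```python
from decimal import Decimal

values = [0.1, 0.2, 0.3]
# The same three doubles, added in two different groupings.
float_forward = (values[0] + values[1]) + values[2]
float_backward = values[0] + (values[1] + values[2])

exact = [Decimal("0.1"), Decimal("0.2"), Decimal("0.3")]
dec_forward = (exact[0] + exact[1]) + exact[2]
dec_backward = exact[0] + (exact[1] + exact[2])

print(float_forward == float_backward)  # False: grouping changes the double result
print(dec_forward == dec_backward)      # True: decimal arithmetic is exact here
```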

  20. Future Work • Apply the testing approach to other domains that depend on ML, such as scientific computing • Further investigation of testing “non-testable programs” • Measure the effectiveness of the approach in empirical studies

  21. Summary • Metamorphic testing is easy to implement and automate • We were able to devise fault-revealing properties even with just a basic understanding of the ML algorithms • Metamorphic testing can be used for both verification and validation

  22. Application of Metamorphic Testing to Supervised Classifiers • Xiaoyuan Xie, Tsong Yueh Chen (Swinburne University of Technology) • Christian Murphy, Gail Kaiser (Columbia University) • Joshua Ho (University of Sydney) • Baowen Xu (Nanjing University)

  23. Related Work • Applying MT to non-testable programs in other domains • General properties for use in MT
