
3D Geological Modeling: Solving as Classification Problem with Support Vector Machine

Earth Sciences Sector. 3D Geological Modeling: Solving as Classification Problem with Support Vector Machine. Groundwater. A. Smirnoff, E. Boisvert, S. J. Paradis. Objectives: find an algorithm for automating the 3D modeling procedure from sparse data; test the algorithm on available data.


Presentation Transcript


  1. Earth Sciences Sector • 3D Geological Modeling: Solving as Classification Problem with Support Vector Machine • Groundwater • A. Smirnoff, E. Boisvert, S. J. Paradis

  2. Objectives • Find an algorithm for automating the 3D modeling procedure from sparse data • Test the algorithm on available data • Draw conclusions about its applicability

  3. Possible Input Data • Well data • Surface geology maps • Cross-section data • Can be used alone or in combination

  4. Algorithms Currently in Use and Their Limitations • Voronoi diagrams • Potential fields • Normally require too much information and/or additional procedures • What if we only have a few sections to start with?

  5. 3D Reconstruction as a Classification Problem • Given a set of points in 3D with known geological information • For the rest of the points in the reconstruction space, information is not available • Based on the known points, classify the rest into a known number of units (classes) [Figure: reconstruction space divided into Unit 1 and Unit 2]

  6. Available Classification Methods • Bayesian classification: requires a priori knowledge of probabilities • Nearest-neighbor classifiers: extremely sensitive to parameter choice and scaling • Decision trees: not flexible with many samples • Neural networks: slow and difficult to use • Support Vector Machine (SVM): relatively new method, becoming more and more popular

  7. SVM Algorithm • Input: Take a set of training samples with known features and classes • Model: Build a model (boundary) separating the training samples • Output: Classify any new (unclassified) or test samples using the model

  8. Binary Reconstruction [Figure: three panels (1. Original, 2. Training set, 3. Output) with axes X, Y, Z]

  9. Input Data and Results • Input data: total points: 389235; training set: 17452 (4.48%), 2 units on 11 sections; points to be classified: 371783 • Results: total classified: 371783; success: 361909 (97.34%); failure: 9874 (2.66%)
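The percentages on this slide follow directly from the point counts. A minimal sketch that recomputes them from the numbers quoted above (the variable names are ours, not the authors'):

```python
# Counts quoted on the slide.
total_points = 389235
training_points = 17452
classified = 371783
success = 361909
failure = 9874

# The points to classify are everything outside the training set,
# and every classified point is either a success or a failure.
assert total_points - training_points == classified
assert success + failure == classified

training_share = 100 * training_points / total_points
success_rate = 100 * success / classified
failure_rate = 100 * failure / classified

print(f"training share: {training_share:.2f}%")
print(f"success rate:   {success_rate:.2f}%")
print(f"failure rate:   {failure_rate:.2f}%")
```

The success rate is measured only on the points that were not used for training, so it is an out-of-sample figure.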

  10. Detailed Analysis (Class 1) [Figure: success rate (%) for all model sections (0 to 240), with the positions of training sections 1 to 11 marked]

  11. Peeking into the SVM Black Box • A simple case: two classes and two features (e.g., length of petal and sepal in flowers) • Training set: known data vectors x_i, where i = 1, …, l

  12. Maximum Margin Separating Hyperplane (MMSH) • Linearly separable data • Which linear separator is the best? • V. Vapnik (1995) suggested the maximum margin [Figure: Feature 1 vs. Feature 2 plots of classes +1 and -1 showing candidate separators 1, 2, and 3, their margins, the maximum margin, and the support vectors]

  13. Hard Margin Classification (HMSH) • If w^T x + b = 0 is the separating hyperplane: • Decision function: f(x) = sign(w^T x + b), where x is a test sample [Figure: Feature 1 vs. Feature 2 plot of training samples x_1, …, x_l; class +1 lies where w^T x_i + b > 0, class -1 where w^T x_i + b < 0, and the HMSH is the line w^T x + b = 0]
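The decision function f(x) = sign(w^T x + b) can be sketched in a few lines. The hyperplane below is a hypothetical one chosen for illustration in a two-feature space, not a model from the presentation, and a score of exactly zero is assigned to class +1 by assumption:

```python
def decision_function(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane w.x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Assumption: a sample lying exactly on the hyperplane goes to class +1.
    return 1 if score >= 0 else -1


# Hypothetical separating hyperplane: feature1 + feature2 - 10 = 0.
w = [1.0, 1.0]
b = -10.0

upper = decision_function(w, b, [8.0, 7.0])  # score = 8 + 7 - 10 = 5
lower = decision_function(w, b, [2.0, 3.0])  # score = 2 + 3 - 10 = -5
print(upper, lower)
```

Only w and b are needed at prediction time; the training samples themselves are no longer required once the hyperplane is known.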

  14. How to Maximize the Margin? • For w^T x + b = 0, consider a pipe defined by: w^T x + b = ±1 • Then: w^T x_i + b ≥ +1 for class +1 and w^T x_i + b ≤ -1 for class -1, or y_i (w^T x_i + b) ≥ 1 • Maximize the distance between: w^T x + b = ±1 [Figure: Feature 1 vs. Feature 2 plot of training samples x_1, …, x_l with the lines w^T x + b = +1, w^T x + b = 0, and w^T x + b = -1 between classes +1 and -1]

  15. Problem Formulation • The distance between w^T x + b = ±1 is given as: 2 / ||w|| • Then: maximize 2 / ||w|| subject to y_i (w^T x_i + b) ≥ 1 • Or: minimize ||w||^2 / 2 subject to y_i (w^T x_i + b) ≥ 1, i = 1, …, l • Quadratic optimization problem • Solution exists [Figure: Feature 1 vs. Feature 2 plot of training samples x_1, …, x_l with the lines w^T x + b = +1, w^T x + b = 0, and w^T x + b = -1]
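A small numeric check can make the formulation concrete. The sketch below uses two hand-picked samples and two hypothetical hyperplanes (none of them from the presentation); both hyperplanes satisfy the constraints y_i (w^T x_i + b) ≥ 1, but the one with the smaller ||w|| has the wider margin 2 / ||w||, which is why the problem is posed as minimizing ||w||^2 / 2:

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def margin_width(w):
    """Distance between the planes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(dot(w, w))


def satisfies_hard_margin(w, b, samples):
    """Check y_i * (w.x_i + b) >= 1 for every training sample."""
    return all(y * (dot(w, x) + b) >= 1 for x, y in samples)


# Two illustrative training samples, one per class.
samples = [((4.0, 4.0), -1), ((6.0, 6.0), +1)]

# Two hypothetical hyperplanes with the same orientation but different scale.
w_tight, b_tight = (0.5, 0.5), -5.0    # constraints hold with equality
w_loose, b_loose = (1.0, 1.0), -10.0   # constraints hold with slack

for w, b in ((w_tight, b_tight), (w_loose, b_loose)):
    print(w, satisfies_hard_margin(w, b, samples), round(margin_width(w), 3))
```

For `w_tight` the margin equals the distance between the two samples, which is the largest margin any separator of these two points can have.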

  16. Soft Margin Classification (SMSH) • Data are noisy, not easily separable • Allow classification errors by introducing slack variables ξ_i ≥ 0: y_i (w^T x_i + b) ≥ 1 - ξ_i • Support vectors: samples at a distance of half the margin from the SMSH, plus the misclassified ones • Thus: minimize ||w||^2 / 2 + C Σ ξ_i • where C is the cost (penalty) parameter [Figure: Feature 1 vs. Feature 2 plot of training samples x_1, …, x_l comparing the HMSH and the SMSH, with the support vectors and the slack distances ξ_i marked]
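The role of the penalty C is easiest to see by evaluating the soft-margin objective for a fixed hyperplane. The sketch below assumes the standard form, ||w||^2 / 2 + C Σ ξ_i with ξ_i = max(0, 1 - y_i (w^T x_i + b)); the hyperplane and the samples are hypothetical values chosen for illustration:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def slack(w, b, x, y):
    """Slack xi_i: zero when the sample is on or beyond its margin plane."""
    return max(0.0, 1.0 - y * (dot(w, x) + b))


def soft_margin_objective(w, b, samples, C):
    """Value of ||w||^2 / 2 + C * sum(xi_i) for a fixed hyperplane."""
    return 0.5 * dot(w, w) + C * sum(slack(w, b, x, y) for x, y in samples)


# Hypothetical hyperplane and noisy samples.
w, b = (0.5, 0.5), -5.0
samples = [
    ((6.0, 6.0), +1),   # on the +1 margin plane, slack 0
    ((4.0, 4.0), -1),   # on the -1 margin plane, slack 0
    ((5.5, 5.5), +1),   # inside the margin, slack 0.5
    ((5.0, 5.5), -1),   # misclassified, slack 1.25
]

slacks = [slack(w, b, x, y) for x, y in samples]
low_penalty = soft_margin_objective(w, b, samples, C=1.0)
high_penalty = soft_margin_objective(w, b, samples, C=10.0)
print(slacks, low_penalty, high_penalty)
```

With a larger C the same violations cost more, so the optimizer is pushed toward boundaries that fit the training samples more closely.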

  17. Non-Separable Data • Data are separable, or separable with some noise: no problem (HMSH or SMSH) • What if the data are not linearly separable in data space? • Find a function to re-map the data into a higher-dimensional space (feature space) where they are separable, e.g., x ∈ R^1 → φ(x) ∈ R^2 [Figure: classes +1 and -1 plotted along the x axis]

  18. Non-Linear SVM [Figure: three panels. 1. Problem and 3. Solution are drawn in the data (input) space R^1, with f(x) plotted over x; 2. Solution is drawn in the feature space R^2, φ(x) = (x, x^2), with axes x and x^2. Classes +1 and -1 are marked in each panel]
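The re-mapping φ(x) = (x, x^2) can be tried on a toy data set. The sketch below assumes, for illustration only, that class -1 sits near zero with class +1 on both sides; the sample values and the separating line x^2 = 2 are our own choices, not data from the slides:

```python
def phi(x):
    """Map a 1D sample into the 2D feature space (x, x^2)."""
    return (x, x * x)


def separable_by_threshold(samples):
    """Try every single threshold on the x axis, in both orientations."""
    xs = sorted(x for x, _ in samples)
    thresholds = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    for t in thresholds:
        for side in (1, -1):
            if all((1 if side * (x - t) > 0 else -1) == y for x, y in samples):
                return True
    return False


# Illustrative 1D data: class -1 in the middle, class +1 on both sides.
samples = [(-3.0, 1), (-2.0, 1), (-0.5, -1), (0.0, -1), (0.5, -1), (2.0, 1), (3.0, 1)]

# In feature space the line x^2 = 2 is a linear separator: w = (0, 1), b = -2.
w, b = (0.0, 1.0), -2.0
predictions = [
    1 if w[0] * phi(x)[0] + w[1] * phi(x)[1] + b >= 0 else -1
    for x, _ in samples
]

print(separable_by_threshold(samples))
print(predictions == [y for _, y in samples])
```

No single threshold works in the input space, but one straight line does in the feature space, and mapped back to the input space it corresponds to the two boundaries x = ±√2.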

  19. Kernel Trick • How to find the function in a more complicated situation? • We do not need to know the function explicitly! • The formulation and solution of the optimization problem use only inner products of vectors • Kernel function = inner product of some function in its feature space: K(x_i, x) = φ(x_i)^T φ(x) • Thus the final decision function is: f(x) = Σ α_i y_i x_i^T x + b (α_i are weighting factors; α_i > 0 only for support vectors), which becomes f(x) = Σ α_i y_i K(x_i, x) + b
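The identity K(x_i, x) = φ(x_i)^T φ(x) can be checked numerically for a kernel whose feature map is known. The sketch below uses the homogeneous degree-2 polynomial kernel on two features as an example of our own choosing (the presentation itself uses the RBF kernel), with φ(x) = (x1^2, √2 x1 x2, x2^2):

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit feature map of the degree-2 polynomial kernel in 2D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly2_kernel(x, z):
    """Same inner product, computed without ever building phi."""
    return dot(x, z) ** 2


x = (1.0, 2.0)
z = (3.0, 4.0)

explicit = dot(phi(x), phi(z))   # inner product in feature space
implicit = poly2_kernel(x, z)    # kernel evaluated in input space
print(explicit, implicit)
```

Both routes give the same number up to floating-point rounding, but the kernel never leaves the input space, which is what makes very high-dimensional feature spaces affordable.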

  20. Kernel Functions • Known kernel functions: linear, polynomial, radial-basis function (RBF), etc. • The RBF is the most general form of kernel: K(x_i, x_j) = exp(-γ ||x_i - x_j||^2) • The decision function is then: f(x) = Σ α_i y_i exp(-γ ||x_i - x||^2) + b • The only adjustable kernel parameter is γ
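A minimal sketch of an RBF decision function, assuming the usual form K(x_i, x_j) = exp(-γ ||x_i - x_j||^2). The support vectors, weights α_i, offset b, and γ below are hypothetical values for illustration; in practice they come out of training:

```python
import math


def rbf_kernel(xi, xj, gamma):
    """K(xi, xj) = exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, gamma, b):
    """f(x) = sum(alpha_i * y_i * K(x_i, x)) + b over the support vectors."""
    return sum(
        alpha * y * rbf_kernel(xi, x, gamma) for xi, y, alpha in support_vectors
    ) + b


# Hypothetical support vectors as (coordinates, class label, alpha).
support_vectors = [
    ((0.0, 0.0, 0.0), +1, 1.0),
    ((1.0, 1.0, 1.0), -1, 1.0),
]
gamma, b = 1.0, 0.0

near_first = decision_value((0.1, 0.0, 0.0), support_vectors, gamma, b)
near_second = decision_value((0.9, 1.0, 1.0), support_vectors, gamma, b)
print(near_first > 0, near_second < 0)
```

Each support vector contributes most to points close to it, and γ controls how quickly that influence decays with distance.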

  21. How Did We Use SVM? • Using geological units as classes • Using X, Y, Z coordinates as features • Using a non-linear soft-margin (SM) SVM with RBF kernel • Using LIBSVM from National Taiwan University • Only two parameters to control: C and γ • Selecting parameters is a black art, done on a trial-and-error basis • A simple grid search with validation is recommended, e.g., C = 2^-8, 2^-7, …, 2^15; γ = 2^-15, 2^-14, …, 2^12
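The grid quoted on this slide is small enough to enumerate directly. The sketch below builds it and runs a generic search loop; `toy_score` is a made-up stand-in for a real validation score (training with LIBSVM and measuring accuracy on held-out points), and its peak location is arbitrary:

```python
import math
from itertools import product

# Exponent ranges quoted on the slide.
C_values = [2.0 ** k for k in range(-8, 16)]        # 2^-8 ... 2^15
gamma_values = [2.0 ** k for k in range(-15, 13)]   # 2^-15 ... 2^12
grid = list(product(C_values, gamma_values))


def grid_search(grid, score):
    """Return the (C, gamma) pair with the highest score."""
    return max(grid, key=lambda pair: score(*pair))


def toy_score(C, gamma):
    # Hypothetical placeholder, NOT a real validation accuracy:
    # a smooth bump peaking at C = 2^3, gamma = 2^-2.
    return -((math.log2(C) - 3) ** 2 + (math.log2(gamma) + 2) ** 2)


best_C, best_gamma = grid_search(grid, toy_score)
print(len(grid), best_C, best_gamma)
```

Searching over exponents rather than raw values keeps the grid evenly spaced on the logarithmic scale, where both parameters have their effect.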

  22. C and γ Grid Search [Figure: two grids of experiments plotted as lg(γ) against lg(C). All experiments: lg(C) from -8 to 15, lg(γ) from -15 to 12. Proposed range: C = 2^-3 to 2^15, γ = 2^4 to 2^9. Marked points: best binary result (97.79% at C = 2^1, γ = 2^6) and the previous example (97.34%)]

  23. Influence of C and γ [Figure: five panels: low C, high γ; high C, high γ; average C, average γ; low C, low γ; high C, low γ]

  24. Multi-Class Classification • Units: 1 - Organic, 2 - Littoral, 3 - Clay, 4 - Esker, 5 - Till, 6 - Bedrock [Figure: three panels (1. Original, 2. Training set, 3. Output) with axes X, Y, Z]

  25. Data Statistics and Results

  26. Success per Class [Figure: success (%) plotted against training points per class (%), on a logarithmic axis from 0.01 to 10, for Bedrock, Esker, Clay, Till, Littoral, and Organic]

  27. Area and Volume Comparison [Figure: two log-log plots of reconstructed vs. original values per unit (Bedrock, Clay, Till, Esker, Organic, Littoral): area, with axes from 1.00E+07 to 1.00E+09, and volume, with axes from 1.00E+07 to 1.00E+11]

  28. Conclusions • The SVM can successfully be used in single- and multi-unit 3D geological reconstructions: • Reasonable results are obtained with just a few training sections • Parameters must be picked from the range C = 2^-3 to 2^15, γ = 2^4 to 2^9 • Low C values: less detail, more generalized model • High C values: more detail, less generalized model • Additional experiments demonstrated: • The number of units can vary (all units must be represented in the training set) • Sections can be arbitrarily located • Other types of information (well data, surface geology maps) can be used

  29. References • Abe, S., 2005. Support Vector Machines for Pattern Classification. Springer-Verlag, London, 343 pp. • Cristianini, N., Shawe-Taylor, J., 2000. Support Vector Machines. Cambridge University Press, 189 pp. • Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer-Verlag, New York, 311 pp.
