
Distributed Genetic Algorithm for feature selection in Gaia RVS spectra

Application to ANN parameterization. D. Fustes, D. Ordóñez, C. Dafonte, M. Manteiga and B. Arcay.


Presentation Transcript


  1. Distributed Genetic Algorithm for feature selection in Gaia RVS spectra. Application to ANN parameterization. D. Fustes, D. Ordóñez, C. Dafonte, M. Manteiga and B. Arcay

  2. Introduction
     • GGG (Galician Group for Gaia): part of CU8 in DPAC, involved in classification and parameterization tasks using AI techniques
     • Work with simulated data from the RVS instrument:
       • Estimation of physical parameters:
         • Effective temperatures
         • Surface gravities
         • Metallicities
         • Abundances of alpha elements

  3. Gaia RVS simulated data
     • Library compiled by A. Recio, P. de Laverny and B. Plez
     • 971 points per spectrum
     • Different SNR levels: 5, 10, 50, 200, ...
     • 70% of the data is used to train the network and 30% to test the model
     • ANNs are used to perform the parameterization
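The 70/30 partition on this slide can be sketched as follows. This is a minimal illustration only: the slide does not say how the spectra were assigned to each subset, so the random shuffle, the `seed` parameter and the function name `split_train_test` are assumptions.

```python
import random


def split_train_test(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into a training and a test subset.

    Hypothetical helper: a random split is assumed, the slide only gives
    the 70% / 30% proportions.
    """
    rng = random.Random(seed)      # local generator, so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# Usage: 100 placeholder spectrum identifiers.
train_ids, test_ids = split_train_test(range(100))
```

Fixing the seed keeps the two subsets identical between runs, which matters when different feature subsets are later compared on the same test data.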

  4. Discrete Wavelet Transform
     • Redundant filtering process:
       • High-pass filters generate the Details
       • Low-pass filters generate the Approximations
     • Level 3 DWT is used: A3 + D3 + D2 + D1, 997 points
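The step from 971 spectral points to 997 wavelet coefficients can be checked with a little arithmetic. The sketch below assumes the common convention for a DWT with signal extension at the borders, where one filtering-and-downsampling step turns n samples into floor((n + F - 1) / 2) coefficients for a filter of length F. The slide does not name the wavelet, so the filter lengths tried here are assumptions; under this convention a length-10 filter happens to reproduce the 997 points, which is a consistency check and not a statement about what the authors used.

```python
def dwt_output_length(n, filter_length):
    """Coefficients produced by one DWT step under the assumed convention."""
    return (n + filter_length - 1) // 2


def decomposition_sizes(n, filter_length, levels):
    """Sizes of [A_L, D_L, ..., D_1] for an L-level decomposition."""
    details = []
    for _ in range(levels):
        n = dwt_output_length(n, filter_length)
        details.append(n)          # D_k has the same size as A_k
    # After the loop, n is the size of the final approximation A_L.
    return [n] + details[::-1]


# 971-point spectrum, 3 levels, assumed filter length 10.
sizes_len10 = decomposition_sizes(971, 10, 3)   # [129, 129, 249, 490]
total_len10 = sum(sizes_len10)                  # 997

# For comparison, a length-2 (Haar) filter gives a smaller total.
total_len2 = sum(decomposition_sizes(971, 2, 3))  # 973
```

The total exceeds the input length because each level pads the signal at its borders before filtering.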

  5. Feature selection
     • Reduce the dimensionality of the spectra
     • Reduce the complexity of the models
     • Reduce the computational needs
     • Variability-based methods reduce the dimensionality of a set while capturing most of its variability (e.g. PCA)
       • They cannot be specialized to capture the features relevant to the estimation of each parameter
     • A genetic algorithm selects the relevant areas for each parameter
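One common way to represent a selection of "relevant areas" is a binary mask with one bit per spectral point (or wavelet coefficient). The slides do not describe the encoding that was actually used, so the mask representation and the helper names `mask_from_regions` and `apply_mask` below are assumptions made for illustration.

```python
def mask_from_regions(n_points, regions):
    """Build a 0/1 mask that keeps the half-open index ranges in `regions`."""
    mask = [0] * n_points
    for start, stop in regions:
        for i in range(start, stop):
            mask[i] = 1
    return mask


def apply_mask(spectrum, mask):
    """Return only the values of `spectrum` whose mask bit is 1."""
    if len(spectrum) != len(mask):
        raise ValueError("spectrum and mask must have the same length")
    return [value for value, keep in zip(spectrum, mask) if keep]


# Usage: a 10-point toy spectrum with two selected areas.
toy_spectrum = list(range(10))
toy_mask = mask_from_regions(10, [(2, 4), (7, 9)])
reduced = apply_mask(toy_spectrum, toy_mask)
```

Unlike a PCA projection, a separate mask can be kept for each parameter (temperature, gravity, metallicity, alpha abundance), so that each network only receives the inputs selected for it.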

  6. Genetic algorithm
     • Based on the theory of evolution
     • The best individuals reproduce and pass to the next generation
     • Fitness function: train the ANN, test it and take the inverse of the mean error. Computationally expensive!
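The loop described on this slide can be sketched as below. Only the definition of fitness as the inverse of the mean error comes from the slide. Everything else is an assumption chosen for a compact example: the binary-mask individuals, tournament selection, one-point crossover, bit-flip mutation, two-individual elitism, and above all `toy_mean_error`, which is a cheap hypothetical stand-in for the expensive "train the ANN and test it" step.

```python
import random


def fitness_from_error(mean_error, eps=1e-9):
    """Fitness is the inverse of the mean test error (eps avoids division by zero)."""
    return 1.0 / (mean_error + eps)


def toy_mean_error(mask, relevant):
    """Hypothetical stand-in for training and testing an ANN on the masked inputs.

    The error grows with each relevant point that is dropped and, more
    mildly, with each irrelevant point that is kept.
    """
    missed = sum(1 for m, r in zip(mask, relevant) if r and not m)
    extra = sum(1 for m, r in zip(mask, relevant) if m and not r)
    return (missed + 0.1 * extra) / len(mask)


def evolve(relevant, pop_size=20, generations=30, p_mut=0.02, seed=0):
    """Evolve binary masks; return the best mask and the best fitness per generation."""
    rng = random.Random(seed)
    n = len(relevant)

    def score(individual):
        return fitness_from_error(toy_mean_error(individual, relevant))

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        history.append(score(population[0]))
        # Elitism: the two best individuals pass unchanged to the next generation.
        next_population = [population[0][:], population[1][:]]
        while len(next_population) < pop_size:
            # Tournament selection of two parents (best of 3 random individuals).
            parent_a = max(rng.sample(population, 3), key=score)
            parent_b = max(rng.sample(population, 3), key=score)
            cut = rng.randrange(1, n)                       # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            next_population.append(child)
        population = next_population
    return max(population, key=score), history


# Usage: 30 points, of which the middle 10 are "relevant" in this toy problem.
toy_relevant = [0] * 10 + [1] * 10 + [0] * 10
best_mask, best_history = evolve(toy_relevant)
```

In the real setting every call to `score` would mean a full network training run, which is why the slide flags the fitness function as computationally expensive.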

  7. Distributed computation
     • Huge computation needs call for scalable solutions
     • Multicomputers are cheaper than supercomputers
     • Ways to distribute the algorithm:
       • Low level: distribute the ANN computation
         • It should be performed in hardware
       • Medium level: distribute the ANN learning
         • Possible with batch learning
         • Online learning performs better in this case
       • High level: distribute the fitness computation
         • It was implemented in C++ with MPI and OpenMP
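The high-level option, distributing the fitness computation, amounts to a master that hands individuals to workers and collects one score per individual. The sketch below illustrates that pattern with Python's standard `concurrent.futures` module. It is not the authors' C++/MPI/OpenMP implementation, and `evaluate_population` and `toy_fitness` are hypothetical names introduced for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_population(population, fitness, executor):
    """Master side: send each individual to a worker, collect scores in order."""
    return list(executor.map(fitness, population))


def toy_fitness(mask):
    """Hypothetical placeholder for the expensive train-and-test step."""
    return 1.0 / (1.0 + sum(mask))


# Usage: three individuals evaluated by two workers.
toy_population = [[1, 0, 1], [0, 0, 0], [1, 1, 1]]
with ThreadPoolExecutor(max_workers=2) as pool:
    scores = evaluate_population(toy_population, toy_fitness, pool)
```

A thread pool is used here only to keep the example self-contained. In CPython, threads do not speed up CPU-bound work such as network training, so in practice the workers would be separate processes or MPI ranks. The pattern parallelizes well because each fitness evaluation is independent of the others, so the only data exchanged per individual are the mask and its score.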

  8. Results (1)
     • SNR 200
     • Original spectra

  9. Results (2)
     • SNR 200
     • Wavelet domain

  10. Thank you for your attention! Any questions?
