Computational Intelligence Dating of the Iron Age Glass • Karol Grudziński (Bydgoszcz Academy, Poland) • Maciej Karwowski (University of Rzeszów, Poland) • Włodzisław Duch (Nanyang Institute of Technology, Singapore; Nicolaus Copernicus University, Poland)
Donors of the Data • Interdisciplinary Project ‘Celtic Glass Characterization’ Prof. G. Trnka (Institute of Prehistory, University of Vienna); Prof. P. Wobrauschek (Atomic Institute of the Austrian Universities in Vienna)
Data Description (Original Database) • Measurements of chemical compound concentrations using Energy Dispersive X-ray Fluorescence Spectroscopy (26 compounds) • Class – chronological period of manufacture • LT C1 (La Tène C1, 260 – 170 B.C.) • LT C2 (La Tène C2, 170 – 110 B.C.) • LT D1 (La Tène D1, 110 – 50 B.C.) • 555 glass measurements, usually taken at 4 points on a single glass object.
Questions to be answered using CI analysis • A system capable of automatically dating glass artifacts from chemical compound concentrations is needed, because few experts can do it. • Exploration of hidden patterns in the data through rule extraction analysis, with possible implications for archeology (expert archeologists are unable to formalize the knowledge required to predict the date). • The corrosion layer is on the surface; broken parts are less corroded. The influence of corrosion on the measurements and on the prediction of the class is unknown.
Data Preprocessing • Challenge to classification methods: several vectors for one object, small dataset. In this case, for one glass artifact usually two measurements on each side on the surface and two on the broken parts are included. • The database contains cases with a missing class or belonging to other chronological periods, as well as measurements on decorations; these were excluded.
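The exclusion step described on this slide can be sketched as a simple filter. This is only an illustration: the record layout (`artifact_id`, `location`, `period`, `concentrations`) and the label strings are assumptions made for the example, not the structure of the original database.

```python
from dataclasses import dataclass
from typing import Optional

# The three chronological classes kept for the experiments.
PERIODS = {"LT C1", "LT C2", "LT D1"}


@dataclass
class Measurement:
    artifact_id: str        # hypothetical identifier of the glass object
    location: str           # hypothetical: "surface", "broken" or "decoration"
    period: Optional[str]   # chronological class, None when it is missing
    concentrations: dict    # compound name -> measured concentration


def keep(m: Measurement) -> bool:
    """True if the measurement has one of the three classes and is not on a decoration."""
    return m.period in PERIODS and m.location != "decoration"


def preprocess(records):
    """Drop cases with a missing class, other periods, and decoration measurements."""
    return [m for m in records if keep(m)]
```

Keeping `artifact_id` on every record matters later, because several vectors describe the same object.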
Numerical Experiments • Three different experiments: • 1st: Both Surface and Broken Side Data • 2nd: Surface Data • 3rd: Broken Side Data • Many algorithms implemented in WEKA, NETLAB, SBL and the GhostMiner packages were used for calculations.
Surface and Broken Side Data • Experiment on the whole preprocessed dataset divided into training and test sets. • Class Distribution (whole set): • 1) LT C1, 29.68% (84 cases) • 2) LT C2, 33.57% (95 cases) • 3) LT D1, 36.75% (104 cases) • 283 cases total, 143 training, 140 test; on average 1 surface and 1 side measurement belonging to the same glass object appear in both the training and the test set!
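The warning on this slide is that measurements of one glass object end up in both partitions. One common way to avoid that, shown here only as a sketch and not as the procedure used in the experiments, is to split by artifact instead of by measurement. The function name `group_split` and the artifact identifiers are hypothetical.

```python
import random
from collections import defaultdict


def group_split(artifact_ids, test_fraction=0.5, seed=0):
    """Split measurement indices so that no artifact appears in both partitions.

    artifact_ids[i] is the (hypothetical) identifier of the object that
    measurement i was taken from.
    """
    groups = defaultdict(list)
    for index, artifact in enumerate(artifact_ids):
        groups[artifact].append(index)

    # Shuffle whole artifacts, not single measurements.
    ids = sorted(groups)
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)

    test = sorted(i for a in ids[:n_test] for i in groups[a])
    train = sorted(i for a in ids[n_test:] for i in groups[a])
    return train, test
```

With such a split, a test measurement can never be matched to a training measurement of the same object.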
Summary of the Results of the First Experiment • LT C1 well separated. • Naive Bayes works very well. • SBM methods may be very misleading for such data: measurements on the same artifact both in train and test; uncontrolled bootstrap learning.
Logical Rules & Attribute Selection • 1R Tree Rules predict correctly 100/143 (69.9%) training and 93/140 (66.4%) test samples: • 1. IF MnO < 2185.205 THEN C1 • 2. IF MnO ∈ [2185.205, 9317.315) THEN C2 • 3. IF MnO ≥ 9317.315 THEN D1 • Important attributes: MnO + • TiO2, Fe2O3, NiO, Sb2O3, ZnO (LT C1), • Fe2O3, TiO2, NiO, PbO (LT C2) • TiO2, Sb2O3, Fe2O3, PbO, ZnO (LT D1)
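The 1R rules above amount to a lookup of the MnO value in two thresholds. A minimal sketch follows; it assumes that the third rule covers everything at or above the upper threshold, which the half-open interval of the second rule implies. The function names are illustrative.

```python
from bisect import bisect_right

# Thresholds and labels taken from the 1R rules on this slide.
THRESHOLDS = [2185.205, 9317.315]
LABELS = ["C1", "C2", "D1"]


def one_r(mno: float) -> str:
    """Return the period predicted from the MnO value alone."""
    # bisect_right sends a value equal to a threshold to the upper interval,
    # matching the half-open interval [2185.205, 9317.315).
    return LABELS[bisect_right(THRESHOLDS, mno)]


def accuracy_percent(correct: int, total: int) -> float:
    """Accuracy in percent, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)
```

For the counts reported on the slide, `accuracy_percent(100, 143)` and `accuracy_percent(93, 140)` give the training and test accuracy of these rules.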
Surface data experiment • Experiment on a surface measurement dataset divided into training and test sets. • Class Distribution (whole set): • 1) LT C1, 26.36% (34 cases) • 2) LT C2, 37.98% (49 cases) • 3) LT D1, 35.66% (46 cases) • 129 cases total, 61 training, 68 test; measurements belonging to the same artifact are distributed across the training and test partitions.
Logical rules from 1R • 1R rules predict correctly 42/61 (68.9%) training and 43/68 (63.2%) test samples. • 1. IF MnO < 187.34 THEN C1 • 2. IF MnO ∈ [3821.99, 9489.09) THEN C2 • 3. IF MnO ∈ [187.34, 3821.99) or MnO ≥ 9489.09 THEN D1
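Unlike the rules for the combined data, the surface 1R rules assign D1 to two separate MnO ranges. A small sketch can make that explicit by writing the rules as an interval table and checking that the intervals cover every possible value exactly once. It assumes the last D1 range means MnO at or above 9489.09; the names `SURFACE_1R`, `covers_real_line` and `classify` are illustrative.

```python
import math

# (low, high, label): each interval is half-open, low <= MnO < high.
SURFACE_1R = [
    (-math.inf, 187.34, "C1"),
    (187.34, 3821.99, "D1"),
    (3821.99, 9489.09, "C2"),
    (9489.09, math.inf, "D1"),
]


def covers_real_line(intervals):
    """True if the intervals tile the real line with no gaps and no overlaps."""
    ordered = sorted(intervals)
    if ordered[0][0] != -math.inf or ordered[-1][1] != math.inf:
        return False
    return all(a[1] == b[0] for a, b in zip(ordered, ordered[1:]))


def classify(mno, intervals=SURFACE_1R):
    """Return the label of the interval that contains the MnO value."""
    for low, high, label in intervals:
        if low <= mno < high:
            return label
    raise ValueError("value not covered by any interval")
```

The coverage check is a cheap guard against transcription slips when rules are copied by hand.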
Logical Rules from C4.5 • C4.5 rules predict correctly 54/68 (79.4%) test samples • 1. IF ZrO2 > 296.1 THEN C1 • 2. IF Na2O 36472.22 THEN C1 • 3. IF Sb2O3 > 2078.76 THEN C2 • 4. IF CdO = 0 & Na2O 27414.98 THEN C2 • 5. IF Na2O > 27414.98 & NiO 58.42 THEN D1 • 6. IF NiO > 48.45 & CdO = 0 & BaO = 0 & Br2O7 < 53.6 & Fe2O3 12003.35 & ZnO 149.31 THEN D1 • 7. Default: D1
Logical rules from SSV • 1. IF MnO < 1668.47 & ZrO2 > 303.34 THEN C1 • 2. IF MnO < 1668.47 & ZrO2 < 303.34 & TiO2 < 76.235, or MnO > 1668.47 & Sb2O3 > 986.19, or MnO > 1668.47 & Sb2O3 < 986.19 & CaO < 79370 THEN C2 • 3. IF MnO > 1668.47 & Sb2O3 < 986.19 & CaO > 79370, or MnO < 1668.47 & ZrO2 < 303.34 & TiO2 > 76.235 THEN D1
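Read together, the three SSV rules describe a small decision tree that splits first on MnO. The sketch below writes that tree out as nested conditions. The slide uses strict inequalities on both sides of every threshold, so the treatment of exact ties is an assumption here: a value equal to a threshold falls into the "otherwise" branch. The function name is illustrative.

```python
def ssv_rules(mno, zro2, tio2, sb2o3, cao):
    """Return the period predicted by the SSV rules listed on this slide."""
    if mno < 1668.47:
        if zro2 > 303.34:
            return "C1"
        # Low ZrO2: TiO2 separates C2 from D1.
        return "C2" if tio2 < 76.235 else "D1"
    # High MnO: antimony first, then calcium.
    if sb2o3 > 986.19:
        return "C2"
    return "C2" if cao < 79370 else "D1"
```

Written this way, each of the six conjunctions on the slide corresponds to one path from the root to a leaf, with two of the C2 paths sharing the high-MnO, low-Sb2O3 branch.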
Broken Side Data • Experiment on the broken side data divided into training and test partitions • Class Distribution (whole set): • 1) LT C1, 32.47% (50 cases) • 2) LT C2, 29.87% (46 cases) • 3) LT D1, 37.66% (58 cases) • 154 cases total, 78 training, 76 test, cases separated
Logical rules from 1R • 1R Rules predict correctly 58/78 (74.4%) training and 57/76 (75.0%) test samples. • 1. IF MnO < 2134.61 THEN C1 • 2. IF MnO ∈ [2134.61, 9078.525) THEN C2 • 3. IF MnO ≥ 9078.525 THEN D1 • Similar rules were found by the SSV tree with strong pruning.
Rules from C4.5 • 1. IF ZrO2 > 199.38 & CdO = 0 THEN C1 • 2. IF NiO 62.23 & CaO 114121.35 THEN C1 • 3. IF CuO 5105.37 & MnO > 2546.77 & ZnO 126.29 THEN C2 • 4. IF SnO2 > 61.98 & Br2O7 64.08 THEN D1 • 5. IF Sb2O3 8246.11 & CuO 2042.19 & Al2O3 > 11525.69 THEN D1 • 6. Default: C2
Conclusions • CI methods may help to assign samples of uncertain chronology to one of the chronological periods, providing rough logical rules to archeologists. • Important chemical compounds useful for dating have been identified. • Separate tests on the surface and broken side data lead to similar classification accuracies, confirming the hypothesis that corrosion on the surface has minor or no influence on results of the analysis.
Further Work • A larger database is needed. • Detailed predictions by the CI methods should be compared with the assessments of archeologists. • There is a significant proportion of unlabeled samples in the original database; unsupervised methods should be applied to them using a reduced feature space.
Acknowledgments • The research on chemical analysis of the archeological glass was funded by the Austrian Science Foundation, project No. P12526-SPR. • We are very grateful to our colleagues from the Atomic Institute in Vienna for making this data available to us.