Weather Mining Hayato Akatsuka
Objective • Cluster regions that share a similar climate.
Input • Each weather station in the United States is one input. • Each station reports more than 50 parameters, • e.g. Latitude, Longitude, Elevation, Minimum Temperature, Maximum Temperature, and so on…
Stations • 6000 ~ 19000 Stations
Overview • Input (text file): • Station1 2005/01/01 MaxTemp MinTemp Latitude Longitude Elevation … • Station2 2005/01/01 MaxTemp MinTemp Latitude Longitude Elevation … • Station3 2005/01/01 MaxTemp MinTemp Latitude Longitude Elevation … • Clustering • Output (image)
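A minimal sketch of reading one input record, assuming the fields are whitespace-separated and appear in the order shown on the slide. The `Observation` type and `parse_line` function are hypothetical names for illustration; the parameters elided by "…" on the slide are simply ignored here.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    """One station record for one day (field order assumed from the slide)."""
    station: str
    date: str
    max_temp: float
    min_temp: float
    latitude: float
    longitude: float
    elevation: float


def parse_line(line: str) -> Observation:
    """Parse a whitespace-separated record; extra trailing fields are ignored."""
    fields = line.split()
    if len(fields) < 7:
        raise ValueError("expected at least 7 fields, got %d" % len(fields))
    station, date = fields[0], fields[1]
    # The five numeric parameters named on the slide follow the date.
    values = [float(v) for v in fields[2:7]]
    return Observation(station, date, *values)
```

For example, `parse_line("Station1 2005/01/01 50.0 30.0 40.7 -74.0 10.0")` yields a record whose `min_temp` field can be fed to the clustering step.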
Distance Measure • Euclidean distance • If you are interested in particular parameters, adjust k accordingly.
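The distance formula itself did not survive on this slide. The sketch below assumes that k is a per-parameter weight applied inside a Euclidean distance, so that a larger k makes a parameter count more. This reading of k, and the function name `weighted_euclidean`, are assumptions, not the author's confirmed definition.

```python
import math


def weighted_euclidean(a, b, k=None):
    """Euclidean distance between parameter vectors a and b.

    k is an optional list of per-parameter weights (assumed meaning of
    the slide's "k"); with k omitted, every parameter has weight 1.
    """
    if k is None:
        k = [1.0] * len(a)
    if not (len(a) == len(b) == len(k)):
        raise ValueError("a, b and k must have the same length")
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(k, a, b)))
```

Setting a weight to 0 drops that parameter from the comparison, which is one way to focus on a single parameter such as TMIN.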
About Clustering • Day 1 (Hierarchical Clustering) • This is the initialization stage. • Pick the number of clusters. • Then perform hierarchical clustering. • Day 2 (Clustering variant) • Assign each input to the cluster with the nearest centroid obtained from the previous day (Day 1 in this case). • Do not update the centroids while assigning. • Repeat until all inputs for Day 2 are clustered. • Recalculate the centroids. • Day 3 • Repeat the Day 2 procedure, and so on.
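The Day 2 step can be sketched as follows, taking the previous day's centroids as given (the Day 1 hierarchical clustering is not shown). Two points are assumptions: the new centroid is taken as the plain mean of the stations assigned that day, and a cluster that receives no stations keeps its old centroid. The function name `cluster_day` is hypothetical.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two parameter vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_day(stations, prev_centroids):
    """Cluster one day's stations against the previous day's centroids.

    stations: list of parameter vectors, one per station.
    prev_centroids: list of centroid vectors from the previous day.
    Returns (labels, new_centroids).
    """
    # Assignment pass: centroids stay fixed until every station is labelled.
    labels = []
    for s in stations:
        dists = [euclidean(s, c) for c in prev_centroids]
        labels.append(dists.index(min(dists)))

    # Recalculation pass: mean of the members of each cluster (assumption).
    new_centroids = []
    for j, old in enumerate(prev_centroids):
        members = [s for s, lab in zip(stations, labels) if lab == j]
        if members:
            dim = len(old)
            new_centroids.append(
                [sum(m[d] for m in members) / len(members) for d in range(dim)]
            )
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(list(old))
    return labels, new_centroids
```

Day 3 and later days would call `cluster_day` again, passing in the centroids returned for the day before.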
Centroid Calculation • For same cluster 2nd Day: 3rd Day: 4th Day:
Quick Animation Day2 Day1
Result • For simplicity, only one parameter (TMIN) is used. • Number of clusters = 5
Comparison Output Hardiness Zone
Conclusion • Well… there is not much difference between the map obtained for January and the one for December. • Simply making a map out of annual data, instead of daily data, might be better.
Reference • Hardiness Map: http://www.arborday.org/treeinfo/zonelookup.cfm