Access to and Add Value of Archived Data - Methodology of Data Integration and Mining for 1:1M Land Type Mapping of China. Prof. Liu Chuang Prof. Shen Yuancen Global Change Information and Research Center IGSNRR/Chinese Academy of Sciences PPF-WSIS Phase II, 14 November 2005, Tunis.
1 China’s Scientific Data Sharing Program
2 Opportunities and Challenges: Access to and Add Value of the Archived Data
3 Methodology of Adding Value of Archived Data
Example: 1:1M Land Type Mapping of China
China’s Scientific Data Sharing Program
• China has a national long-term (2005-2020) program to enhance open access to scientific data: the Scientific Data Sharing Program (SDSP), initiated in 2003.
• About 40 data centers and 300 major databases, covering almost all of the basic sciences, will receive long-term support. A series of data policies and data standards will be established to meet the needs of open access to the archived data.
In addition, e-Government programs in Chinese agencies and the e-Science program at CAS will greatly promote the Scientific Data Sharing Program. One example is the quick-response system for water resources management.
About 250 TB of data had been archived in China in a standard or near-standard manner (June 2005).
2 Opportunities and Challenges: Access to and Add Value of the Archived Data
This progress creates great opportunities for scientists in research:
• knowing where the data are located
• knowing how to access them
• access that is free or low in cost
Two Major Challenges in China:
• Preservation and open access: services need to become more stable, more open, faster, easier to use, and lower in cost, and there is still a long way to go.
• Add value: new methodology in data integration and mining, which still has to be created.
3 Methodology of Adding Value of Archived Data
The value of scientific data can be divided into:
• value for scientific research
• value for social benefit
• value for economic income
[Figure: Relationship between data value and data integration/mining value. Labels: Dataset 1, Dataset 2, Dataset 3; axis: time.]
[Figure: Reference Hierarchical Model for Data Integration and Data Mining. Levels: data, model, knowledge. Stages: Data Selection, Data Integration, Object Simulating, Cal/Val. Other labels: Distributed Information Infrastructure, Computational Process, Innovated Ideas/Society Needs.]
Data Selection: two important issues at this stage
(1) how to select the necessary data from the distributed data holders to meet the modeling needs of a specific objective
(2) how to determine the weight of each selected dataset
Data Integration: one very difficult issue has to be solved at this stage
- making the selected datasets compatible, including data standard, terminology, definition, format, unit, resolution, time period, method of data capture ….
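One of the compatibility items listed above is resolution. The slides do not say how resolutions were reconciled, so the following is only a minimal sketch of one common option: aggregating a finer grid (for example 250 m cells) to a coarser one (for example 1 km cells) by block averaging. The function name `aggregate_mean` and the sample values are hypothetical.

```python
def aggregate_mean(grid, factor):
    """Aggregate a 2-D grid (list of rows) to a coarser grid by block mean.

    `factor` is the number of fine cells per coarse cell along each axis,
    e.g. 4 when going from 250 m cells to 1 km cells (an assumed use case).
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid size must be a multiple of factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            # Collect every fine cell that falls inside this coarse cell.
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Made-up 2 x 4 grid aggregated by a factor of 2 -> 1 x 2 grid.
fine = [[1.0, 2.0, 10.0, 20.0],
        [3.0, 4.0, 30.0, 40.0]]
coarse = aggregate_mean(fine, 2)
```

Block averaging suits continuous variables such as NDVI; categorical layers such as soil type would need a different rule (for example the majority class).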
Object Simulating: two critical issues need to be solved at this stage
- establishing a relationship between the selected datasets (the model)
- determining the parameters of the model
Cal/Val for the new dataset: how good can the quality of the new dataset be?
- What is its quality, and under what conditions can the new dataset or knowledge be of high quality?
- Is there any way to help bring the dataset to sufficient quality?
The new knowledge or new dataset created then goes to the publication and data archiving process.
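As an illustration of "establish a relationship, then determine its parameters", the sketch below fits a straight line between two variables by ordinary least squares. This is an assumed, deliberately simple model form; the slides do not state which models were used, and the function name `fit_line` and the sample numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with >= 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample: a predictor from one dataset, a response from another.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

The fitted parameters would then be checked in the Cal/Val stage against independent observations such as ground truth surveys.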
Example: Data Integration and Mining for 1:1M Land Type Mapping of China
Land type research and 1:1M mapping in China
China has a long history of land type studies. An early record, from 170 BC, divided the land of China into 9 types. The most recent land type studies for 1:1M mapping started in 1987, and the first land type classification system for 1:1M mapping of China was created in 1990, led by Prof. Zhao Songqiao.
landtypeclaSytemChina.doc
Current status: the completed part of the 1:1M Land Type Map of China
Datasets:
The datasets used in this paper include:
• Climate datasets from more than 600 climate stations, from CMA
• Soil map at 1:1M, from CAS
• MODIS-NDVI/EVI, 250 m and 1 km resolution, 16-day and 10-day composites, 2002, from NASA and CAS
• MODIS-NDSI, 1 km resolution, 10-day and monthly composites, 2002, from CAS
• SRTM at 90 m from USGS, and DEM at 1:250k from the Geomatic Center of China
• Ground truth survey datasets in Northeast China, Inner Mongolia, Tibet, Gansu, Zhejiang, Guizhou …
• Historical records, including documentation and maps, from CAS
• Yearbooks of agriculture and land use from the Statistic Bureau of China
MODIS-NDVI 16-day composite datasets, 2002, 1 km
• Field sites
NDVI = (MODIS2 - MODIS1) / (MODIS2 + MODIS1)
EVI = 2.5 * (MODIS2 - MODIS1) / (MODIS2 + 6 * MODIS1 - 7.5 * MODIS3 + 1)
NDSI = (MODIS4 - MODIS6) / (MODIS4 + MODIS6)
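The three index formulas can be written as small functions. This is a minimal sketch for single-pixel surface reflectance values; the parameter names follow the standard MODIS band designations (band 1 red, band 2 near-infrared, band 3 blue, band 4 green, band 6 shortwave infrared), which the slide itself does not spell out, and the sample reflectances are made up.

```python
def ndvi(band1, band2):
    """NDVI = (MODIS2 - MODIS1) / (MODIS2 + MODIS1)."""
    denom = band2 + band1
    return (band2 - band1) / denom if denom != 0 else float("nan")


def evi(band1, band2, band3):
    """EVI = 2.5 * (MODIS2 - MODIS1) / (MODIS2 + 6*MODIS1 - 7.5*MODIS3 + 1)."""
    denom = band2 + 6.0 * band1 - 7.5 * band3 + 1.0
    return 2.5 * (band2 - band1) / denom if denom != 0 else float("nan")


def ndsi(band4, band6):
    """NDSI = (MODIS4 - MODIS6) / (MODIS4 + MODIS6)."""
    denom = band4 + band6
    return (band4 - band6) / denom if denom != 0 else float("nan")


# Made-up reflectances for a densely vegetated pixel and a bright, snow-like pixel.
veg_ndvi = ndvi(band1=0.05, band2=0.45)
veg_evi = evi(band1=0.05, band2=0.45, band3=0.03)
snow_ndsi = ndsi(band4=0.80, band6=0.10)
```

Returning NaN for a zero denominator keeps fill pixels from raising an error; a production pipeline would more likely mask such pixels beforehand.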
Forest (Betula)
NDVI range: 0 to 0.83; single peak
Location: Far East Russia and the Daxing'an Mountains in Heilongjiang Province
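The "single peak" description refers to the shape of the annual NDVI profile. As a rough illustration only, the sketch below counts local maxima in a series of composite values; the slides do not describe how profile shape was assessed, so the function `count_peaks`, the `min_height` threshold, and both sample series are hypothetical.

```python
def count_peaks(series, min_height=0.3):
    """Count local maxima at or above min_height in an NDVI time series.

    A point counts as a peak if it is strictly higher than its left
    neighbour and at least as high as its right neighbour, so a flat-topped
    plateau is counted once.
    """
    peaks = 0
    for i in range(1, len(series) - 1):
        if (series[i] >= min_height
                and series[i] > series[i - 1]
                and series[i] >= series[i + 1]):
            peaks += 1
    return peaks


# Made-up profiles: one growing season versus two growing seasons in a year.
single_season = [0.1, 0.1, 0.2, 0.4, 0.7, 0.83, 0.8, 0.6, 0.3, 0.1]
double_season = [0.2, 0.4, 0.7, 0.4, 0.2, 0.5, 0.8, 0.5, 0.2]
n_single = count_peaks(single_season)
n_double = count_peaks(double_season)
```

Real composites are noisy, so a series would normally be smoothed before counting peaks; this sketch skips that step.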
Location: Great Hinggan Mountains. Upper panel: forest (Larix + Betula). Lower panel: meadow steppe.
Location: Huang-Huai-Hai Plain. Rotation cropland with winter wheat and maize.
Location: North Korea. Forest (purple), rice (white).
Wetland (reed). NDVI range: 0 to 0.53; EVI range: 0 to 0.42. Location: Yellow River Delta.
Location: Xilingol, Inner Mongolia
• Temperate Meadow: NDVI range 0 to 0.8
• Temperate Steppe: NDVI range 0 to 0.6
• Temperate Meadow: NDVI range 0 to 0.6
• Temperate Steppe: NDVI range 0 to 0.4
Location: Xilingol, Inner Mongolia
• Sand Steppe: NDVI range 0 to 0.45
• Temperate Desert: NDVI range 0 to 0.25
• Temperate Desert Steppe: NDVI range 0 to 0.2
• Sand Steppe: NDVI range 0 to 0.35
Location: Coastal area in northern Jiangsu Province. Wetland. NDVI range: 0 to 0.52; EVI range: 0 to 0.35.
Location: Qinghai Province. Alpine Meadow.
Gobi in the arid region of northwestern China. Location: Minqin County, Gansu Province.
Location: Minqin County (oasis), Gansu Province. Spring wheat cropland.
Location: Gongbujiangda area, eastern Tibet. Panels: June 2001, August 2001, April 2001.
Location: Nyainqêntanglha Mountains. Criterion: NDSI > 0.4 and MODIS2 > 0.11. Upper left: Feb. 2002. Upper right: June 2002. Lower left: Sep. 2002.
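The criterion on this slide (NDSI > 0.4 and MODIS2 > 0.11) resembles the widely used MODIS snow test, in which the band 2 threshold keeps dark surfaces such as water from being flagged. Assuming that is how it was applied here, a per-pixel check could be sketched as follows; the function name `is_snow` and the sample reflectances are hypothetical.

```python
def is_snow(band2, band4, band6, ndsi_min=0.4, band2_min=0.11):
    """Return True when NDSI > ndsi_min and band 2 reflectance > band2_min.

    NDSI = (MODIS4 - MODIS6) / (MODIS4 + MODIS6), as defined earlier
    in the slides.
    """
    denom = band4 + band6
    if denom == 0:
        return False  # no usable signal in either band
    ndsi = (band4 - band6) / denom
    return ndsi > ndsi_min and band2 > band2_min


# Made-up pixels: bright snow, dark water with a high NDSI, and bare ground.
snow_pixel = is_snow(band2=0.70, band4=0.80, band6=0.10)
water_pixel = is_snow(band2=0.05, band4=0.08, band6=0.01)
ground_pixel = is_snow(band2=0.40, band4=0.30, band6=0.30)
```

The two thresholds are exposed as keyword arguments so that the values from the slide are the defaults but can be tuned for other regions.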
Conclusion: The reference hierarchical model of data integration and mining is very important for developing innovative knowledge, and computational science plays a critical role in the new methodology. The new methodology in data integration and mining will bring China's land type studies to a new milestone.