Proposal replicates for spatially clustered processes. Rafal Wojcik, Dennis McLaughlin, Hamed Almohammad and Dara Entekhabi, MIT. Spatially clustered processes are pervasive in nature. How can we incorporate their intermittent structure into ensemble data assimilation?
Proposal replicates for spatially clustered processes. Rafal Wojcik, Dennis McLaughlin, Hamed Almohammad and Dara Entekhabi, MIT • Spatially clustered processes are pervasive in nature. • How can we incorporate their intermittent structure into ensemble data assimilation? • Can we do more to ensure that our estimates are physically realistic? [Images: forest fire, Colorado; Midwest thunderstorms (2D space, 1D time); algae bloom, Washington]
Rainfall Data Assimilation – Merging Diverse Observations • Develop Bayesian (ensemble) data assimilation procedures that can efficiently merge remote sensing and ground-based measurements of spatially clustered processes (e.g. rainfall). • These procedures will be feature-based versions of particle filtering/importance sampling or MCMC.
Bayesian Perspective • Extend the Bayesian formalism to accommodate geometric features and integrate prior information with new measurements (terms labelled on the slide: posterior, likelihood, prior, feature, measurement). • Use an ensemble representation drawn from a proposal. • The relationship between true and measured images gives a likelihood expression in terms of the observation error PDF.
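The slide's equations did not survive extraction. As a hedged sketch only, the standard importance-sampling form of Bayes' rule that the text appears to describe can be written as below. All symbols are assumptions introduced here: x is the true feature (image), y the measured image, q the proposal density, x^i the i-th proposal replicate, and e the observation error. The additive error model in the third line is also an assumption; the slide only says that a relationship between true and measured images yields the likelihood.

```latex
\begin{align}
  p(x \mid y) &\propto p(y \mid x)\, p(x)
     && \text{posterior} \propto \text{likelihood} \times \text{prior} \\
  p(x \mid y) &\approx \sum_{i=1}^{N} w_i\, \delta(x - x^{i}),
     \quad w_i \propto \frac{p(y \mid x^{i})\, p(x^{i})}{q(x^{i})},
     \quad \sum_{i=1}^{N} w_i = 1
     && \text{ensemble representation} \\
  y &= x + e
     && \text{assumed additive error} \\
  p(y \mid x) &= p_e(y - x)
     && \text{likelihood from the error PDF}
\end{align}
```

When the proposal equals the prior, q = p, the weights reduce to the normalized likelihoods.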
Requirements for a feature-based Bayesian formulation • Generate realistic clustered proposal images. • Define an observation error probability measure over the set of possible error images. Is a relevant measure of similarity between observations and proposal replicates?
How can we define a measurement error norm? • The norm should preserve spatially intermittent features of the real process (e.g. rainfall). • Metrics used to compare replicates and measurements should be sensitive to clustering. How similar are these images?
Euclidean metric [Figure: two replicate/measurement comparisons, each with Euclidean dist = 4. Legend: rain replicate (=1); measured rain (=1); no rain (=0)]
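The figure's point, that visibly different replicates can sit at the same Euclidean distance from the measurement, can be checked on a toy example. The sketch below is not taken from the slides: the 4x4 images and the function name are illustrative assumptions, and it uses the squared Euclidean distance (for binary images, the number of differing pixels), since the slide does not say whether its value of 4 is squared.

```python
# Toy 4x4 binary rain images (1 = rain, 0 = no rain); all values are illustrative.
measurement = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
# Replicate with the same 2x2 rain cluster shifted one pixel to the right.
shifted = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
# Replicate with no rain at all.
empty = [[0] * 4 for _ in range(4)]


def squared_euclidean(a, b):
    """Sum of squared pixel differences between two equally sized images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


d_shifted = squared_euclidean(measurement, shifted)
d_empty = squared_euclidean(measurement, empty)
print(d_shifted, d_empty)  # both equal 4
```

A replicate that nearly matches the measured cluster and a replicate with no rain at all are equally far from the measurement under this metric.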
Image characterization: cluster-based image compression • Start from initial cluster centers and scattered rain pixels. • Neural gas finds “best” locations for cluster centers. • The image is concisely characterized by the cluster centers’ coordinates (xi, yi). [Figure labels: center of rain pixel; cluster center]
Image characterization: cluster-based image compression • The neural gas (NG) algorithm identifies a 10-D feature vector characterizing each image replicate.
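A minimal sketch of a basic neural gas update may help make the compression step concrete. It assumes 5 cluster centers with 2 coordinates each, which would give the 10-D feature vector mentioned on the slide. The annealing schedules, the parameter values, the toy rain pixels and every name below are illustrative assumptions, not the authors' settings.

```python
import math
import random


def neural_gas(points, n_centers=5, n_epochs=50, seed=0):
    """Fit cluster centers to 2-D points with a basic neural gas update."""
    rng = random.Random(seed)
    # Initialise the centers at randomly chosen data points.
    centers = [list(p) for p in rng.sample(points, n_centers)]
    n_steps = n_epochs * len(points)
    # Assumed annealing schedules: exponential decay from initial to final value.
    eps_i, eps_f = 0.5, 0.01              # learning rate
    lam_i, lam_f = n_centers / 2.0, 0.01  # neighbourhood range
    step = 0
    for _ in range(n_epochs):
        for x in rng.sample(points, len(points)):  # random presentation order
            frac = step / n_steps
            eps = eps_i * (eps_f / eps_i) ** frac
            lam = lam_i * (lam_f / lam_i) ** frac
            # Rank the centers by distance to the sample (rank 0 = closest).
            order = sorted(range(n_centers),
                           key=lambda i: math.dist(centers[i], x))
            for rank, i in enumerate(order):
                h = math.exp(-rank / lam)  # closer ranks move more
                centers[i][0] += eps * h * (x[0] - centers[i][0])
                centers[i][1] += eps * h * (x[1] - centers[i][1])
            step += 1
    return centers


# Toy rain pixels: two rectangular clusters given as (x, y) pixel centers.
points = [(x, y) for x in range(1, 4) for y in range(1, 4)]
points += [(x, y) for x in range(8, 12) for y in range(6, 8)]

centers = neural_gas(points)
# Concatenate the (xi, yi) coordinates into one feature vector per image.
feature_vector = [coord for center in centers for coord in center]
print(len(feature_vector))  # 5 centers x 2 coordinates
```

Each update moves a center part of the way toward a rain pixel, so the fitted centers stay inside the bounding box of the rain pixels.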
Image characterization: cluster-based image compression [Figure: two images with neural gas centers numbered 1 to 5 in different orders] POOR RESULTS: the numbering of neural gas centers has a strong impact on the aggregate distance measure.
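The numbering problem can be illustrated with a few lines of code. In this sketch the center coordinates are made up for illustration. The same five centers are listed in two different orders, and the Euclidean distance between the resulting 10-D feature vectors is far from zero even though the two sets of centers are identical.

```python
import math

# Hypothetical cluster centers (x, y) for one image.
centers = [(2.0, 3.0), (10.0, 4.0), (6.0, 12.0), (14.0, 14.0), (3.0, 15.0)]
# The same centers, numbered differently.
relabelled = [centers[3], centers[0], centers[4], centers[1], centers[2]]


def feature_vector(cs):
    """Concatenate center coordinates in their listed order."""
    return [value for center in cs for value in center]


d_same = math.dist(feature_vector(centers), feature_vector(centers))
d_perm = math.dist(feature_vector(centers), feature_vector(relabelled))
print(d_same, d_perm)
```

Because neural gas does not assign a canonical order to its centers, a distance computed on concatenated coordinates mixes differences in geometry with arbitrary differences in labelling.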
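Image characterization: Jaccard metric • For two binary vectors (images) A and B, the Jaccard similarity is defined as $J(A,B) = \frac{A \cdot B}{A \cdot A + B \cdot B - A \cdot B}$, where $A \cdot B$ counts the pixels set in both images, $A \cdot A - A \cdot B$ the pixels set only in A, and $B \cdot B - A \cdot B$ the pixels set only in B. • The Jaccard metric is defined as $d_J(A,B) = 1 - J(A,B)$. • This can be generalized for real positive vectors.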
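A short sketch of the Jaccard distance follows, written in the inner-product (Tanimoto) form, which is one common way to extend the binary definition to real non-negative vectors. The toy images and function names are illustrative assumptions, and returning a similarity of 1 for two all-zero images is a convention chosen here rather than something stated on the slide.

```python
def jaccard_similarity(a, b):
    """Inner-product form A.B / (A.A + B.B - A.B) for non-negative vectors."""
    ab = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    denom = aa + bb - ab
    # Assumed convention: two all-zero images are treated as identical.
    return 1.0 if denom == 0 else ab / denom


def jaccard_distance(a, b):
    """Jaccard metric, 1 - similarity."""
    return 1.0 - jaccard_similarity(a, b)


def flatten(image):
    """Turn a 2-D image (list of rows) into a flat pixel vector."""
    return [pixel for row in image for pixel in row]


# Toy 4x4 binary images (1 = rain, 0 = no rain); values are illustrative.
measurement = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
shifted = [[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
empty = [[0] * 4 for _ in range(4)]

d_shifted = jaccard_distance(flatten(measurement), flatten(shifted))
d_empty = jaccard_distance(flatten(measurement), flatten(empty))
print(round(d_shifted, 3), d_empty)  # 0.667 1.0
```

The shifted cluster shares 2 of 6 rain pixels with the measurement and scores about 0.667, while the empty image scores 1.0, so the two replicates are no longer equidistant.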
Image characterization: Jaccard metric [Figure: two replicate/measurement comparisons, with Jaccard dist = 0.7 and Jaccard dist = 0.8. Legend: rain replicate (=1); measured rain (=1); no rain (=0)]
Feature Ensembles – Training Images & Priors • A multipoint technique identifies patterns within a moving template that scans the training image. • The number of times each template pattern occurs gives the pattern probability, which drives the replicate generator. [Figure panels: training image; template; template patterns]
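The pattern-counting step can be sketched as follows. This is a simplified illustration under assumptions made here: a rectangular template, a binary training image, and probabilities taken as plain relative frequencies. It does not reproduce the authors' replicate generator, and the names are hypothetical.

```python
from collections import Counter


def template_pattern_stats(image, t_rows, t_cols):
    """Scan a template over the image and count each pattern it sees."""
    counts = Counter()
    n_rows, n_cols = len(image), len(image[0])
    for r in range(n_rows - t_rows + 1):
        for c in range(n_cols - t_cols + 1):
            # A pattern is the block of pixel values under the template.
            pattern = tuple(
                tuple(image[r + i][c:c + t_cols]) for i in range(t_rows)
            )
            counts[pattern] += 1
    total = sum(counts.values())
    probs = {pattern: n / total for pattern, n in counts.items()}
    return counts, probs


# Toy 5x5 binary training image (1 = rain, 0 = no rain).
training_image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

counts, probs = template_pattern_stats(training_image, 2, 2)
all_rain = ((1, 1), (1, 1))
print(counts[all_rain], probs[all_rain])  # 2 0.125
```

A 2x2 template fits at 16 positions in the 5x5 image, and the all-rain pattern occurs at 2 of them, giving a probability of 0.125.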
Replicate generation -- Unconditional simulation • Rain/no-rain probabilities and the cluster size distribution are preserved. [Figure panels: training image; measurement; replicates]
Conditional simulation • The conditional ensembles approach is analogous to “nudging” (van Leeuwen, 2010). [Figure panels: training image; measurement; replicates]
Constructing ensembles of proposal replicates for Bayesian estimation • How do we generate a moderate-sized proposal (or prior) that properly represents uncertainty in the measurement while including a reasonable number of replicates that are "close" to the true image? [Figure panels: measurement; truth]
Constructing ensembles of proposal replicates for Bayesian estimation [Figure: 500 replicates per ensemble. Panels: conditional (1% of pixels); conditional (5% of pixels); conditional (20% of pixels)]
Conditional ensemble (1% of pixels) – sorted using Jaccard metric Measurement JACCARD DISTANCE BEST WORST
Conditional ensemble (5% of pixels) – sorted using Jaccard metric Measurement JACCARD DISTANCE BEST WORST
Conditional ensemble (20% of pixels) – sorted using Jaccard metric Measurement JACCARD DISTANCE BEST WORST
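The sorting shown in these three slides can be sketched in a few lines. The measurement and the four replicates below are made-up 8-pixel binary vectors, not data from the presentation, and the replicate names are hypothetical.

```python
def jaccard_distance(a, b):
    """Binary Jaccard distance: 1 - |A and B| / |A or B|."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    # Assumed convention: two all-zero images are at distance 0.
    return 0.0 if union == 0 else 1.0 - inter / union


measurement = [0, 1, 1, 0, 0, 1, 1, 0]
ensemble = {
    "rep_a": [0, 0, 0, 0, 0, 0, 0, 1],  # no overlap with the measurement
    "rep_b": [0, 1, 1, 0, 0, 1, 1, 0],  # identical to the measurement
    "rep_c": [0, 0, 1, 1, 0, 0, 1, 1],  # cluster shifted by one pixel
    "rep_d": [0, 1, 1, 0, 0, 1, 0, 0],  # one rain pixel missing
}

# Rank replicates from best (smallest distance) to worst (largest distance).
ranked = sorted(
    ensemble, key=lambda name: jaccard_distance(measurement, ensemble[name])
)
print(ranked)  # ['rep_b', 'rep_d', 'rep_c', 'rep_a']
```

The ordering runs from the exact match, through the replicate missing one pixel and the shifted cluster, to the replicate with no overlap.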
Conclusions • Clustered processes require a feature-based approach to Bayesian estimation which does not rely on Gaussian assumptions. • One option is to use importance sampling over the space of possible features. This requires that we 1) generate appropriate proposal images and 2) define an observation error probability measure based on an appropriate norm. • The Jaccard metric is a promising choice for this norm that orders differing images in an intuitive fashion. • Conditional multi-point random field generators can be used to produce realistic clustered proposal replicates. • Future work will combine these ideas to obtain a feature-based procedure for rainfall data assimilation.
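One way the two ingredients might be combined is sketched below. This is speculative and not the authors' procedure. It assumes the proposal equals the prior and that the observation error PDF decays as exp(-d^2 / (2 sigma^2)) in the Jaccard distance d. The functional form, the value of the hypothetical parameter `sigma`, and the example distances are all illustrative assumptions.

```python
import math


def importance_weights(distances, sigma=0.2):
    """Normalized weights from distances, assuming exp(-d^2 / (2 sigma^2))."""
    log_w = [-(d * d) / (2.0 * sigma * sigma) for d in distances]
    # Subtract the maximum log-weight before exponentiating, for stability.
    max_log_w = max(log_w)
    w = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]


# Illustrative Jaccard distances between the measurement and four replicates.
distances = [0.0, 0.25, 2.0 / 3.0, 1.0]
weights = importance_weights(distances)
print([round(w, 4) for w in weights])
```

Replicates closer to the measurement receive larger weights. A small `sigma` concentrates nearly all the weight on the best few replicates, which is why an ensemble containing enough replicates "close" to the true image matters.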
Long-term objectives • Feature-preserving data assimilation scheme. [Diagram: a proposal ensemble generator and measurements feed an update step that produces a MAP estimate of the truth. Measurement sources: microwave LEO satellite (e.g. NOAA, TRMM, SSMI); radar (e.g. NEXRAD); geostationary satellite (e.g. GOES); rain gage]
Short-term objective • Identify ways to characterize and generate random ensembles of realistic spatially clustered replicates (images) for ensemble-based data assimilation. • These procedures will be feature-based versions of particle filtering/importance sampling or MCMC. [Figure: possible alternatives – summer rain storms. Panels: Replicate 1; Replicate 2; Replicate 3; Replicate 4; …..]
Common assumption in particle filters: Is a relevant measure of similarity between observations and proposal replicates?
Image characterization • How do we describe a feature? -- Discretize over an n-pixel grid; the feature is represented as a vector of pixel values. • Feature support only: 2^n possible features. • Feature support + texture: ∞ possible features. [Figure: geometric aspects of a typical NEXRAD summer rainstorm. Labels: boundary of feature support; texture (rain intensity) within support; boundary of clouds; rain; no rain]