Fuzzy parameterization for analysis of natural phenomena and use in other geophysical problems

Pritwiraj Moulik
Research Associate & Visiting Science Student, Dept. of Earth Sciences, Univ. of Western Ontario, CANADA
Undergraduate Student, Birla Institute of Technology & Science-Pilani, INDIA

Stanford Exploration Project Seminar, Stanford University, CA
12th December, 2008
Topics
• California fault system
  • Parkfield earthquakes: waveform modeling, GIS
  • Earthquake nucleation: Pattern Informatics
• Taiwan landslides: neuro-fuzzy framework
• Well log analysis
  • Prydz Bay: fuzzy inference system
  • Costa Rica convergent margin: neuro-fuzzy framework
• Climate modeling
  • Monsoon prediction
  • Paleo-climatic nonlinear time series analysis
Earthquakes: the unsolved questions
• Location of earthquakes: nucleation
• Magnitude
• Self-organization & ergodicity?
• Precursory phenomena?
California fault system: Parkfield region
Aims:
• Characterize similarities in waveforms: GIS
• Model the waveforms: fuzzy membership functions
Lessons learnt:
• Similar magnitude and rupture extent: fault segmentation
• Long-term non-randomness of earthquakes: remarkably similar in size and location of rupture, albeit not in epicentre or rupture propagation direction
Geospatial analysis: overview
Why Parkfield? The 1934, 1966 and 2004 Parkfield earthquakes used to arrive at this model are remarkably similar in size and location of rupture, albeit not in epicenter or rupture propagation direction (Bakun & McEvilly, 1979; Bakun et al., 2005).
• Filter and cluster the voluminous seismic data
• System constraints:
  • Earthquake parameters: similar faulting mechanism, magnitude and rupture direction, with events occurring on the same fault segment or sharing the same epicenter. Lower variability may be achieved if events are further constrained to have the same rupture time history and distribution of slip.
  • Source and station characteristics: the geological setting of the station, the source and the path of propagation are also major considerations.
Details
• Earthquake data used: 1934, 1966 and 2004 Parkfield earthquakes [COSMOS]
• Conversion to Excel, then used in ArcMap 9.1
• Soil layer data: NRCS; DEM data: CGIAR-CSI
• System constraints used: hypocenter parameters, station/event parameters and sensor description
Geospatial analysis: results
• The City Recreation Building, 864 Santa Rosa St., San Luis Obispo, had records of both earthquakes.
• Comparative analysis of both earthquakes: average S.D. = 0.00693728 cm/s², visual similarity
Algorithm: Process I (Clustering)
• Input: seismic data identified by the geospatial analysis
• Aim: to model the general waveform pattern from an active seismic zone
• Three processes:
  • Clustering
  • Membership function development
  • Evolutionary algorithm
Details
• A graph, specific to an instant from the onset of P waves, is plotted between acceleration and the magnitude of the corresponding earthquake.
• The clustering algorithm used in the process is Ward's method.
• The process is repeated to find curves for every instant in the P-S interval, as in the sketch below.
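A minimal sketch of this clustering stage, using SciPy's implementation of Ward's method; the (acceleration, magnitude) values below are hypothetical stand-ins for records at one instant after P-wave onset:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Each row is one record at a fixed instant t after P-wave onset:
# records[:, 0] = acceleration (cm/s^2), records[:, 1] = magnitude
records = np.array([
    [0.12, 4.9], [0.14, 5.0], [0.45, 6.0],
    [0.47, 6.1], [0.13, 5.0], [0.44, 6.0],
])

# Ward linkage merges the pair of clusters that least increases
# within-cluster variance at each step
Z = linkage(records, method="ward")

# Cut the dendrogram into a chosen number of clusters (here 2)
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # cluster index per record at this time instant
```

Repeating this for every sample in the P-S interval yields the family of per-instant curves described above.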
Algorithm: Process I (Clustering), continued
• Input: the incoming value of acceleration at the station is fed as input
• Output: earthquake magnitudes and their corresponding membership grades
  • Only magnitudes whose membership grade is greater than 0.8 are reported
  • Grades are accumulated from t1 to t2
  • An event is flagged when the cumulative grade is above a threshold of 0.73, tested against the studied earthquakes
Limitations
• Data dependent
• Membership function development is computationally intensive
A hedged sketch of this inference step follows.
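The slides do not spell out the membership functions themselves, so the sketch below assumes simple trapezoidal memberships and a mean-grade aggregation over the [t1, t2] window; the breakpoints, candidate magnitudes and aggregation rule are all hypothetical:

```python
import numpy as np

def trapmf(x, a, b, c, d):
    """Trapezoidal membership grade of x for corners a < b <= c < d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical membership functions mapping acceleration (cm/s^2)
# to candidate magnitudes
mf_params = {5.0: (0.05, 0.10, 0.18, 0.25),
             6.0: (0.30, 0.40, 0.50, 0.60)}

def grades(accel):
    return {m: trapmf(accel, *p) for m, p in mf_params.items()}

# Report only magnitudes whose instantaneous grade exceeds 0.8,
# then accumulate mean grades over the [t1, t2] samples and flag
# an event when the tested threshold of 0.73 is cleared.
accels = [0.12, 0.14, 0.15, 0.13]            # samples between t1 and t2
reported = {m for a in accels for m, g in grades(a).items() if g > 0.8}
mean_grade = {m: np.mean([grades(a)[m] for a in accels]) for m in reported}
alarm = {m: g for m, g in mean_grade.items() if g > 0.73}
print(alarm)
```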
The Pattern Informatics method
• The PI index is an analytical method for quantifying spatiotemporal seismicity-rate changes in historic seismicity (Tiampo et al., 2002).
• The observed seismicity rate ψ_obs(x_i, t), a proxy for energy release, counts earthquakes per unit time (M > M_cutoff) within the box centred at x_i at time t.
• The average seismicity function S(x_i, t_0, t) over the time interval (t - t_0) is defined as:

    S(x_i, t_0, t) = \frac{1}{t - t_0} \int_{t_0}^{t} \psi_{obs}(x_i, t') \, dt'

• The mean-zero, unit-norm function, obtained by deducting the spatial average and dividing by the spatial standard deviation, is defined thereafter as:

    \hat{s}(x_i, t_0, t) = \frac{S(x_i, t_0, t) - \langle S(t_0, t) \rangle}{\sigma(t_0, t)}

• Physically, the important changes in seismicity are given by the change in \hat{s} between the start and end of the change interval, t_1 < t_2:

    \Delta\hat{s}(x_i, t_0, t_1, t_2) = \hat{s}(x_i, t_0, t_2) - \hat{s}(x_i, t_0, t_1)

• The final calculation involves averaging over all the base years t_0 to reduce the effects of noise, giving \overline{\Delta\hat{s}}(x_i, t_1, t_2).
• The PI index, the squared mean change measured relative to the time-independent background (its spatial average), is then:

    \Delta P(x_i, t_1, t_2) = \overline{\Delta\hat{s}}(x_i, t_1, t_2)^2 - \big\langle \overline{\Delta\hat{s}}^2 \big\rangle

A minimal numerical sketch of this calculation follows.
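A sketch of the calculation above on a synthetic gridded catalog; the function and variable names are mine, not from the PI literature:

```python
import numpy as np

def pi_index(psi, t0_idx, t1_idx, t2_idx):
    """Pattern Informatics sketch (names are illustrative).
    psi[k, t] = events per year with M > Mc in box k during year t."""
    def s_hat(t0, t):
        # time-average the rate from t0 to t, then normalize to
        # mean zero / unit standard deviation across the boxes
        S = psi[:, t0:t].mean(axis=1)
        return (S - S.mean()) / S.std()

    # change in normalized rate between t1 and t2, averaged over
    # all admissible base years t0 to suppress noise
    ds = np.mean([s_hat(t0, t2_idx) - s_hat(t0, t1_idx)
                  for t0 in range(t0_idx, t1_idx)], axis=0)

    # squared change minus its spatial mean: PI relative to background
    return ds ** 2 - np.mean(ds ** 2)

rng = np.random.default_rng(0)
psi = rng.poisson(2.0, size=(50, 72)).astype(float)  # 50 boxes, 72 years
dP = pi_index(psi, t0_idx=0, t1_idx=36, t2_idx=54)
hotspots = np.where(dP > 0)[0]  # boxes flagged by a zero PI threshold
```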
Pertinent questions
• Optimal temporal regions?
• Magnitude of the forecasted earthquake?
• Cutoff magnitude to filter the catalog?
• Threshold PI for hotspots?
Target magnitude & cutoff magnitude
The success of a forecast is based on maximizing the fraction of earthquakes that occur in alarm cells and minimizing the fraction of alarm cells that do not result in earthquakes.
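These two fractions are straightforward to score once alarm cells and earthquake cells are known; a small sketch, with hypothetical cell indices:

```python
def forecast_scores(alarm_cells, quake_cells):
    """Fraction of target earthquakes falling in alarm cells, and
    fraction of alarm cells that hosted no target earthquake."""
    alarm, quakes = set(alarm_cells), set(quake_cells)
    hit_rate = len(quakes & alarm) / len(quakes)
    false_alarm = len(alarm - quakes) / len(alarm)
    return hit_rate, false_alarm

print(forecast_scores(alarm_cells=[3, 7, 9, 12], quake_cells=[7, 9, 20]))
# -> (0.666..., 0.5): two of three quakes hit, two of four alarms false
```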
Identify optimal temporal regions in a catalog
• The TM fluctuation metric measures effective ergodicity, i.e. the difference between the time average of a quantity and its ensemble average over the entire system (Thirumalai et al., 1989).
• Identify the regions of parameter space which exhibit stationary behaviour and thereby give an optimal forecast (Tiampo et al., 2003, 2007).
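A sketch of the Thirumalai-Mountain metric as described above, computed on a synthetic rate field; effective ergodicity shows up as Omega(t) decaying toward zero:

```python
import numpy as np

def tm_metric(f):
    """Thirumalai-Mountain fluctuation metric (sketch).
    f[k, t] = quantity (e.g. seismicity rate) in region k at time t.
    Omega(t) measures how far each region's running time average sits
    from the ensemble average; decay toward zero signals effective
    ergodicity, with 1/Omega growing roughly linearly in t."""
    cum = np.cumsum(f, axis=1) / np.arange(1, f.shape[1] + 1)  # time averages
    ens = cum.mean(axis=0)                                     # ensemble average
    return ((cum - ens) ** 2).mean(axis=0)                     # Omega(t)

rng = np.random.default_rng(1)
omega = tm_metric(rng.poisson(3.0, size=(40, 200)).astype(float))
```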
Optimal forecast for California
• Bin size, dX = 0.1
• Target forecasting magnitude, M_target = 5.1
• Threshold PI for the binary forecast = 0 for the used bin size
• Catalog magnitude cutoff, M_c = 3.1
• t_b = 1932, t_1 = 1968, t_2 = 1986, t_3 = 2004, where t_2-t_3 is the forecasting interval
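For illustration, these parameters could be wired into the pi_index sketch given earlier; the year-to-index mapping and the binned catalog psi are assumptions:

```python
# Hypothetical wiring of these parameters into the pi_index sketch above;
# psi would be built from an ANSS catalog binned at dX = 0.1 with events
# below Mc = 3.1 removed.
years = list(range(1932, 2005))              # tb = 1932 ... t3 = 2004
t1, t2 = years.index(1968), years.index(1986)
# dP = pi_index(psi, t0_idx=0, t1_idx=t1, t2_idx=t2)
# alarm = dP > 0                             # binary forecast for 1986-2004
```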
Ongoing work
• Inversion model for forecasting magnitudes of future earthquakes, using:
  • rupture area from PI (Tiampo, 2007)
  • fault segmentation
  • PI value of the hotspot
Landslide prediction
• Aim: to formulate and validate a neuro-fuzzy framework and compare it with other empirical approaches
• Study area:
  • Taiwan: circum-Pacific seismic belt
  • Fractured rock mass along highways
  • Heavy rainfall
• Previous work (Lee et al., 1996; Lu, 2001; Chang, 2005)
  • Failures occur typically in weathered soils at low elevations
  • They have happened at different slope grades, slope heights, slope shapes and geological formations
Framework synopsis
• Parameters
  • Topographic: grade, height, aspect, shape
  • Geological: formation, thickness of soil layer
A minimal neuro-fuzzy sketch over two of these parameters follows.
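The sketch below is a zero-order Takagi-Sugeno neuro-fuzzy layer: Gaussian memberships are held fixed and the rule consequents are fit from labeled data by least squares. All membership centres, widths and training values are hypothetical, and the original framework's architecture may well differ:

```python
import numpy as np

def gaussmf(x, c, s):
    return np.exp(-0.5 * ((x - c) / s) ** 2)

# Hypothetical memberships for two inputs: slope grade (deg) and
# slope height (m), each covered by a "low" and a "high" Gaussian
grade_mf  = [(20.0, 10.0), (50.0, 10.0)]   # (centre, width)
height_mf = [(10.0, 8.0), (40.0, 8.0)]

def firing_strengths(grade, height):
    # one rule per membership pair, AND realized as a product
    w = np.array([gaussmf(grade, c1, s1) * gaussmf(height, c2, s2)
                  for (c1, s1) in grade_mf
                  for (c2, s2) in height_mf])
    return w / w.sum()

# Toy labels: (grade, height) -> 1 landslide / 0 no landslide
X = np.array([[15, 8], [25, 12], [45, 35], [55, 42], [30, 20], [50, 38]])
y = np.array([0, 0, 1, 1, 0, 1])

# Zero-order TSK output = sum_i w_i * c_i, so the consequents c can be
# fit by least squares: the "learning" half of the neuro-fuzzy hybrid
W = np.array([firing_strengths(g, h) for g, h in X])
c, *_ = np.linalg.lstsq(W, y, rcond=None)

predict = lambda g, h: float(firing_strengths(g, h) @ c)
print(round(predict(52, 40), 2))  # high value -> landslide-prone
```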
Results
• Prediction accuracy: MSA 75.29%; ANN 80.62%; neuro-fuzzy 86.47%
[Figure: frequency distribution of output (1 = landslide, 0 = no landslide)]
Objective and the studied region
• The identification of groundwater, oil and gas formation lithology from well log data largely depends on expert experience and subjective rules such as: "if the natural gamma ray reading is high and the separation between shallow formation resistivity and deep formation resistivity is small, then the formation lithology is probably shale" (Chapellier, 1992).
• The well logging data from ODP Leg 188 borehole sites 1166A and 1165C were taken as the case study for the present work (O'Brien et al., 2001).
Modeling parameters
• Input variables used:
  • Porosity
  • Gamma ray
  • Bulk density
  • Transit time interval
  • Resistivity difference
• Linguistic terms: very low (VL), low (L), medium (M), high (H) and very high (VH)
• Output variables: sand (%), gravel (%) and major soil component size (MSCS), where H → clay, M → silt and L → sand
• Characterization of diamicts, gravels/conglomerates and breccias modified after Moncrieff (1989)
Input & output trapezoidal membership functions
[Figure: trapezoidal membership functions for each input log and output variable]
Abbreviations: POR: porosity log; GR: gamma ray log; DEN: bulk density; ΔT: compressional transit time interval; ΔR: separation between phasor deep induction and spherically focused resistivity logs; MSCS: major soil component's size; (N/A): rule did not use this component after system training.
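A sketch of how the VL-VH linguistic terms could be realized as trapezoidal memberships, here over a gamma-ray log, together with one rule in the spirit of the quoted heuristic; all breakpoints are hypothetical, not the trained values from the study:

```python
import numpy as np

def trapmf(x, abcd):
    """Vectorized trapezoidal membership with corners a <= b <= c <= d."""
    a, b, c, d = abcd
    x = np.asarray(x, dtype=float)
    rise = (x - a) / (b - a) if b > a else (x >= a).astype(float)
    fall = (d - x) / (d - c) if d > c else (x <= d).astype(float)
    return np.clip(np.minimum(rise, fall), 0.0, 1.0)

# Hypothetical breakpoints (API units) for the gamma-ray linguistic terms
GR_TERMS = {"VL": (0, 0, 20, 40),    "L": (20, 40, 55, 70),
            "M": (55, 70, 85, 100),  "H": (85, 100, 115, 130),
            "VH": (115, 130, 200, 200)}

gr = np.array([25.0, 72.0, 128.0])              # sample GR readings
grades = {t: trapmf(gr, p) for t, p in GR_TERMS.items()}

# One rule in the spirit of the quoted heuristic:
# IF GR is H AND dR is L THEN MSCS is H (clay); AND realized as min
dR_low = trapmf([0.3, 1.2, 0.1], (0.0, 0.0, 0.5, 1.0))  # hypothetical dR "L"
clay_grade = np.minimum(grades["H"], dR_low)
print(clay_grade)   # rule firing strength per sample depth
```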
[Figure: comparison between true lithology and fuzzy lithology for holes 1165C and 1166B; 1 = diamictite, 2 = clay/silt, 3 = sand]
Performance analysis
• 80% training data; 20% testing data
• Borehole site 1166A:
  • Training performance: 214 of the 258 training data sets were identified correctly, a success rate of 82.95%
  • Testing performance: 57 of the 65 testing data sets were predicted correctly (Fig. 7), an accuracy of 87.69%
• The technique can also provide significant lithology information where core recovery is incomplete.
• Core analysis provides a more subjective interpretation, but well log analysis may easily:
  • define a permeable sand formation
  • distinguish between silts and sands
  • determine grain size variation in sands
• Errors are due to:
  • heterogeneous and/or anisotropic conditions existing at this depth between the two wells, which resulted in wrong predictions, and
  • factors that were not considered in this study, such as the photoelectric log, which may provide another perspective.
Conclusions
• Natural systems show evidence of imprecise parameters which may be modeled using fuzzy parameterization.
• Earthquake fault systems show nucleation and ergodicity, which may enable better forecasts using fuzzy logic.
• Landslide prediction parameters are inherently imprecise and are best modeled using fuzzy parameterization.
• Well log analysis can incorporate the subjective expertise of the analyst using a fuzzy inference engine.
• Each application has limitations which should be considered before adopting the paradigm.
Thank you!
• Stanford Exploration Project & Stanford Geophysics
• Mentors, collaborators and supervisors:
  • Kristy Tiampo – University of Western Ontario
  • Gerhard Pratt – Queen's / UWO
  • J. Srinivasan – Indian Institute of Science
  • Der-Har Lee – NCKU, Taiwan
  • K. Srinivasa Raju – BITS-Pilani
  • Upendra K. Singh – Indian School of Mines, Dhanbad
• Data sources: COSMOS, CGIAR-CSI, NRCS, ANSS, ODP, NCKU
Questions?