10 likes | 134 Views
European Geosciences Union, General Assembly 201 4 Vienna, Austria, 2 8 April – 2 Ma y 201 4. A Spatially Dependent Normalization of Kernels in Adjoint Tomography Kubina 1 , F ., Moczo 1,2 , P ., Kristek 1,2 , J.
E N D
European Geosciences Union, General Assembly 2014 Vienna, Austria, 28 April – 2 May2014 A Spatially Dependent Normalization of Kernels in Adjoint Tomography Kubina 1, F., Moczo1,2, P., Kristek1,2, J. 1 - Comenius University in Bratislava, Slovakia 2 - Slovak Academy of Sciences, Bratislava, Slovakia Introduction Kernels are computed as time integrals of the product of a forward-field variable v(r,t) and adjoint-field variable w*(r,t). Singularities of field variables in continuum (e.g., strain at a point source) are approximated by very large values of the field variables near the source (forward or adjoint). Consequently, the absolute values of kernel near the source are very large too. A model change is obtained as a product of a kernel and some appropriate step. If at some point, say P, the absolute value of kernel is very large, the step must be correspondingly verysmall if a reasonable model change at that point should be obtained. Small step lengthimplies very small model changes at points with absolute values of the kernel considerably smaller than that at P. Consequently the misfit will decrease, although the model itself is changed only very little. We may reach some local minimum, while a major part of resulting model is still far from the true model. One way to avoid this problem is truncation of the large kernel values. Although the new (model change) direction after truncation is not a direction of the misfit gradient anymore, it is still sufficiently close to the original direction and thus applicable in the gradient methods. After truncation, the maximum absolute value of the kernel becomes closer to its average absolute value. Consequently, a reasonable model change can be obtained in the whole medium. Another way of avoiding the problem with very large values is the smoothing of the kernel. The smoothing can be combined with truncation. Smoothing also suppresses small-scale heterogeneities of the medium, which usually appear in the process of inversion. These heterogeneities are often physically meaningless (e.g., they correspond to frequencies beyond the resolution limit). An adjoint tomography is an iterative inversion process using an adjoint method to compute a spatially dependent gradient of misfit (kernel) on model parameters changes. The gradient therefore provides information, how to modify the trial model to get a new one with smaller misfit. Hopefully, after several iterations, the model with smallest misfit is the one most similar to real medium. This method has been successfully applied in many inversion problems at global or regional scales in recent years. However, using adjoint method is not straightforward. Computed kernels are usually complicated and come with some problematic properties. We may mention singularities at the point sources and the receivers, high values at the free surface, numerical artefacts, rapidly spatially varying values and other common properties. We introduce additional means of gradient preconditioning, which can make the kernel values more evenly distributed and can get rid of the problems we mentioned before. These can lead to considerably faster convergence and to convergence to more meaningful model with a little additional work. We do this by assigning some specific spatially depending weighting function (norm) computed from values of forward and adjoint field. Abstract An application of adjoint tomography on local scales enhances these problematic kernel properties and brings new problems to solve. Usually, the medium at local scale is more complicated with stronger heterogeneities and larger velocity contrasts. Also, positions of sources and receivers coverage are more uneven. This may lead to unsatisfactory results of the inversion. Therefore, it is important to process the computed kernel before using it in inversion. Usually, the processing of kernel - called gradient preconditioning - consists of truncating excessive kernel values and then smoothing resulting kernel. A usefulness of these kernel modifications is based on the fact, that stepping in the direction of gradient (raw kernel) may not be an optimal direction to minimum in longer term. An example of the effect of kernel normalization Methods of gradient pre-conditioning for removing the effect of geometrical spreading Normalization by maximal absolute values Images show a partial kernel corresponding to the wave reflected at the free surface in a homogeneous half-space our method examples A kernel is in general computed as where v(r,t) is some forward field variable and w*(r,t) adjoint field variable. Let us define a norm and a normalized kernel as correction by geometrical spreading dividing an argument in time integration by a product of the geometrical spreadings of the forward and adjoint fields correction by the ray density dividing kernel values by a product of the total ray density of the forward and adjoint fields at each position normalization by maximal absolute values dividing kernel values by maximal absolute values (maximal in time, different at each position) of the corresponding forward and adjoint variables effect of normalization at the free surface a triangular area of a double arrival original kernel after normalization effect of normalization near source and receiver an artefact of computation Relative errors of VP in the inverted 2D models Wavefield numerical simulations homog. layer over half-space (velocity contrast = 2) Layout of inverted models: All examples are computed for 2D elastic models. In case of model with the free surface, the surface is at the top of the picture. Explosive point sources with Gabor source-time function are used. The velocity-stress staggered-grid FD scheme has been used for computations. Time window is sufficiently long for all relevant reflections to arrive. Non-reflecting boundaries are applied. homogeneous space after 1st iteration after 5th iteration using raw kernel using raw kernel number of iterations using normalized kernel using normalized kernel homogeneous half-space number of iterations number of iterations sources: receivers: misfit Conclusions Models: Starting and inverted models have the true value of VS. The starting value of VP is 1.25% smaller than the true value (smaller and greater in case of the checkerboard model). • We have developed a method of gradient preconditioning - normalization of kernel by maximal absolute values. • We demonstrate that our method is more efficient in reducing anomalous values of kernel than two other methods, correction by the geometrical spreading and correction by the total ray density. • Also, computation of norm takes almost no additional computational costs in case of the FD simulation, which is not the case with the two other methods of gradient preconditioning. • The misfit decrease in the iterative process using our method is the same as that without gradient preconditioning despite the fact that the model changes are not in the negative gradient direction (raw kernel). However, obtained models with the same misfit are closer to the true model. homogeneous space homogeneous half-space misfit 2.5% 1.0% 0.0% -1.0% -2.5% misfit checkerboard test relative errors of VP with respect to the true value homog. layer over half-space (velocity contrast = 2) number of iterations number of iterations References Igel, H., Djikpesse, H., and Tarantola, A. 1996. Waveform inversion of marine reflection seismograms for P-impedance and Poisson’s ratio. Geophys. J. Int. 124 (2), 363–371. Tromp, J. et al. 2011. Sensiblechoicesfor adjoint tomography: Misfits, model parameterization, preconditioning, smoothing and iterating. Tech. rep. QUEST workshop. number of iterations number of iterations misfit misfit misfit misfit AcknowledgementsThis work was supported in part by the Slovak Research and Development Agency under the contract No. APVV–0271–11 (project MYGDONEMOTION).