Nigmatullin’s new methods related to quantitative ‘reading’ of arbitrary random sequences. Hu Sheng, Center for Self-Organizing Intelligent Systems (CSOIS), Dept. of Electrical and Computer Engineering, Utah State University, Logan, UT, USA.
Why the author suggests these new methods
• To find the optimal smoothed trend function and separate it from the relative fluctuations.
• To transform any random sequence into a deterministic generalized mean value (GMV) curve and to express the reduced characteristics of the sequence quantitatively in terms of a ‘universal’ set of fitting parameters defined by the GMV function.
• To find the universal distribution function of the relative fluctuations for different detrended sequences.
• To describe any random sequence quantitatively, in order to construct statistically homogeneous clusters or to detect unusual properties.
Three main methods
• Procedure of optimal linear smoothing (POLS)
• Fractional moments and eigen-coordinates (EC)
• Universal distribution function of the relative fluctuations (UDFRF)
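Optimal linear smoothing
• This method finds the optimal smoothed trend (pseudo-fitting) function and separates it from the relative fluctuations.
• The smoothing procedure is a Gaussian kernel-weighted average of the noisy data: $\tilde{y}_i(w) = \sum_{j=1}^{N} K\left(\frac{x_i - x_j}{w}\right) y_j \Big/ \sum_{j=1}^{N} K\left(\frac{x_i - x_j}{w}\right)$, where the function $K(t)$ defines the Gaussian kernel, the value $w$ defines the fixed width of the smoothing window, and the set $\{y_j\}$ ($j = 1, \dots, N$) defines the initial noisy sequence.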
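Optimal linear smoothing
• The optimal window width $w_{opt}$ is chosen from the condition that the relative error of the smoothing is minimal [equation not preserved].
• The desired trend, which minimizes the relative error, is the smoothed sequence computed with the optimal window, $\tilde{y}(w_{opt})$.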
Optimal linear smoothing
• Once the optimal trend has been calculated, the initial random sequence can be split into two parts:
  (a) the optimal trend [equation not preserved];
  (b) the detrended sequence, which represents the values of the relative fluctuations [equation not preserved].
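The smoothing and detrending steps above can be sketched as follows. This is a minimal illustration, not the author's implementation: the kernel is assumed to be $K(t) = \exp(-t^2)$, the window is picked from a small grid by a leave-one-out squared error (a stand-in for the relative-error criterion, whose exact form is not given here), and the fluctuations are taken as the plain difference between the data and the trend. The names `gaussian_smooth`, `loo_error` and `pols` are hypothetical.

```python
import math
import random

def gaussian_smooth(x, y, w):
    """Kernel-weighted average of y at every x_i (assumed kernel exp(-t^2))."""
    trend = []
    for xi in x:
        weights = [math.exp(-((xi - xj) / w) ** 2) for xj in x]
        trend.append(sum(k * yj for k, yj in zip(weights, y)) / sum(weights))
    return trend

def loo_error(x, y, w):
    """Mean squared leave-one-out prediction error (stand-in criterion)."""
    total = 0.0
    for i, xi in enumerate(x):
        num = den = 0.0
        for j, xj in enumerate(x):
            if j != i:
                k = math.exp(-((xi - xj) / w) ** 2)
                num += k * y[j]
                den += k
        if den == 0.0:  # window too narrow to reach any neighbour
            return math.inf
        total += (y[i] - num / den) ** 2
    return total / len(x)

def pols(x, y, widths):
    """Pick the window from `widths`, then split y into trend + fluctuations."""
    w_opt = min(widths, key=lambda w: loo_error(x, y, w))
    trend = gaussian_smooth(x, y, w_opt)
    fluct = [yi - ti for yi, ti in zip(y, trend)]  # assumed: plain difference
    return w_opt, trend, fluct

# Synthetic data: a smooth curve plus Gaussian noise.
random.seed(0)
x = [0.1 * i for i in range(100)]
clean = [math.sin(xi) for xi in x]
y = [c + random.gauss(0.0, 0.2) for c in clean]
widths = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
w_opt, trend, fluct = pols(x, y, widths)
```

The leave-one-out criterion is used here only because it penalizes both too-narrow windows (which follow the noise) and too-wide windows (which flatten the trend); any other error measure with a minimum in $w$ could be substituted.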
Kernel smoother
• The essence of the author’s optimal linear smoothing is the kernel smoother.
• A kernel smoother is a statistical technique for estimating a real-valued function from its noisy observations when no parametric model for the function is known.
• The Nadaraya-Watson kernel-weighted average (a smooth estimate of $Y(X)$) is defined by $\hat{Y}(X) = \sum_{i=1}^{N} K_h(X, X_i)\, Y(X_i) \Big/ \sum_{i=1}^{N} K_h(X, X_i)$, where $N$ is the number of observed points.
Kernel smoother
• Here $K_h(X, X_i) = D\left(\|X - X_i\| / h\right)$ is the kernel, $\|\cdot\|$ is the Euclidean norm, $h$ is a parameter (the kernel radius), and $D(t)$ is typically a positive real-valued function.
• Little or no training is required to operate the kernel smoother. In effect, it represents a set of irregular data points as a smooth line or surface.
Eigen-coordinates (EC)
• Basic idea of the eigen-coordinates method:
• Suppose that a composite hypothesis $y = F(x; \mathbf{A})$, where $\mathbf{A}$ is an $s$-dimensional fitting vector and $x$ is the current (measured) variable located in the given (admissible) interval $[x_0, x_N]$, satisfies a differential equation (DE) of order $n$ written with respect to the variable $y$, and that the set of constants $C_k$ ($k = 1, \dots, s$) enters this DE linearly.
Eigen-coordinates (EC)
• After the initial DE is integrated $n$ times between the given limits ($[x_0, x]$ or $[x, x_N]$), and the constants related to the unknown derivative values at the initial ($x_0$) or final ($x_N$) point are eliminated, it can be transformed into the basic linear relationship $Y(x) = \sum_{k=1}^{s} C_k X_k(x)$.
• Example: a second-order DE [equation not preserved]. Integrating this equation twice over the interval $[x_0, x]$ gives [equation not preserved].
Eigen-coordinates (EC)
• Here $y'(x_0)$ is the value of the first derivative at the point $x_0$. The integrated equation can be transformed into the linear relationship [equation not preserved].
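As an illustration of the double integration, consider the oscillator equation $y'' = -\omega^2 y$. This example is chosen here for clarity and is not necessarily the one shown on the original slide. Integrating twice over $[x_0, x]$ and collapsing the repeated integral with the Cauchy formula gives a relationship that is linear in the unknown constants:

```latex
\begin{align*}
y''(x) &= -\omega^{2}\, y(x), \\
y'(u)  &= y'(x_0) - \omega^{2} \int_{x_0}^{u} y(v)\, dv, \\
y(x)   &= y(x_0) + y'(x_0)\,(x - x_0) - \omega^{2} \int_{x_0}^{x} (x - u)\, y(u)\, du .
\end{align*}
```

In the form $Y = C_1 X_1 + C_2 X_2 + C_3 X_3$ this reads $Y = y(x)$, $X_1 = \int_{x_0}^{x} (x - u)\, y(u)\, du$ with $C_1 = -\omega^2$, $X_2 = x - x_0$ with $C_2 = y'(x_0)$, and $X_3 = 1$ with $C_3 = y(x_0)$. The unknown initial values are thus fitted together with $\omega^2$ instead of being measured separately.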
Eigen-coordinates (EC)
• Using the method of least squares (MLS), one can find the set of fitting parameters $C_k$. After this calculation, the ECs of the initial (zero) level can be obtained [equations not preserved].
• However, the sensitivity of the initial ECs is not sufficient for reliable identification of another admissible function, so the variables are transformed further. As a result of this transformation, a relationship of the following type is obtained [equation not preserved].
Eigen-coordinates (EC)
• The new variables are related to the initial ones by [equations not preserved].
• The symbol $\langle \dots \rangle$ denotes the arithmetic mean over the given sample of $N$ points: $\langle f \rangle = \frac{1}{N} \sum_{j=1}^{N} f_j$.
Eigen-coordinates (EC)
• In the same way we can obtain [equation not preserved].
• The EC method can be applied as a general approach in many experimental situations where a researcher is forced to analyze measured data with large deviations.
Advantages of the eigen-coordinates
• After integration, the distribution has a smaller error, and the calculated slope practically coincides with the mean value of the initial distribution. The idea of replacing the initial function by the corresponding integral of the initial data, which contains small deviations (and thereby has a small variance), can be applied to functions whose fitting parameters enter the initial hypothesis linearly.
• Is it possible to reduce the nonlinear fitting problem, which exists for a wide class of functions, to a linear one? The EC method, which reduces the problem to the MLS, gives a definitely positive answer.
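The reduction of a nonlinear fit to the MLS can be illustrated on a simple case chosen here for illustration (it is not taken from the slides). The decay $y = A e^{-\lambda x}$ satisfies $y' = -\lambda y$, and one integration over $[x_0, x]$ gives $y(x) = y(x_0) - \lambda \int_{x_0}^{x} y(u)\, du$, which is linear in the constants $y(x_0)$ and $\lambda$. The sketch below assumes the integral is approximated by the trapezoidal rule; all function names are hypothetical.

```python
import math
import random

def cumulative_trapezoid(x, y):
    """Running integral of y from x[0] to each x[i] (trapezoidal rule)."""
    out = [0.0]
    for i in range(1, len(x)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (x[i] - x[i - 1]))
    return out

def linear_fit(X, Y):
    """Ordinary least squares for Y ~ slope * X + intercept."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    sxx = sum((a - mx) ** 2 for a in X)
    sxy = sum((a - mx) * (b - my) for a, b in zip(X, Y))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_exponential_ec(x, y):
    """Estimate (A, lambda) of y = A*exp(-lambda*x) from a linear fit."""
    X1 = cumulative_trapezoid(x, y)          # eigen-coordinate: integral of y
    slope, intercept = linear_fit(X1, y)     # y = intercept + slope * X1
    lam = -slope                             # slope plays the role of -lambda
    amp = intercept * math.exp(lam * x[0])   # intercept estimates y(x0)
    return amp, lam

# Synthetic noisy decay with known parameters.
random.seed(1)
true_amp, true_lam = 2.0, 0.7
x = [0.05 * i for i in range(101)]
y = [true_amp * math.exp(-true_lam * xi) + random.gauss(0.0, 0.01) for xi in x]
amp, lam = fit_exponential_ec(x, y)
```

No starting guess and no iterative optimizer is needed: the rate $\lambda$ comes out of a single linear regression on the integrated data.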
Fractional moments
• The method of fractional moments is based on a generalization of the concept of integer moments. It defines generalized moments of any order and operates with a function that depends on the index of the moment, $p$. To express all moments in dimensionless units, one can define the generalized mean value (GMV).
• The GMV is defined for a system of $N$ points $y_j$ located in the band $y_{min} \le y_j \le y_{max}$ by the expression $G_N^{(p)} = \left(\frac{1}{N} \sum_{j=1}^{N} y_j^{\,p}\right)^{1/p}$.
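Basic properties of GMV
• For $p = -1$, the GMV coincides with the harmonic mean: $G_N^{(-1)} = N \Big/ \sum_{j=1}^{N} \frac{1}{y_j}$.
• For $p \to 0$, the GMV coincides with the geometric mean: $G_N^{(0)} = \left(\prod_{j=1}^{N} y_j\right)^{1/N}$.
• For $p = 1$, the GMV coincides with the arithmetic mean: $G_N^{(1)} = \frac{1}{N} \sum_{j=1}^{N} y_j$.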
Basic properties of GMV
• The limiting values are $\lim_{p \to -\infty} G_N^{(p)} = y_{min}$ and $\lim_{p \to +\infty} G_N^{(p)} = y_{max}$.
• For positive values of $p$, the function $G_N^{(p)}$ is always monotonic.
• If all $y_j$ are positive, the GMV can be easily generalized to any continuous value of $p$, based on the concept of fractional moments for non-integer $p$.
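The listed properties can be checked numerically. The sketch below assumes the GMV has the power-mean form $\left(\frac{1}{N}\sum_j y_j^{\,p}\right)^{1/p}$ for positive $y_j$, with the geometric mean taken as the $p \to 0$ limit; the function name `gmv` and the sample data are hypothetical.

```python
import math

def gmv(values, p):
    """Generalized mean value of positive numbers for any real order p."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    n = len(values)
    if abs(p) < 1e-12:
        # p -> 0 limit: geometric mean, computed through logarithms
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)

data = [1.0, 2.0, 4.0]

harmonic = gmv(data, -1)      # p = -1
geometric = gmv(data, 0)      # p -> 0
arithmetic = gmv(data, 1)     # p = 1
fractional = gmv(data, 0.5)   # a non-integer (fractional) order

# GMV evaluated on a grid of orders, from strongly negative to strongly positive
orders = [-200, -5, -1, -0.5, 0, 0.5, 1, 5, 200]
curve = [gmv(data, p) for p in orders]
```

Evaluating `gmv` on a dense grid of orders yields the GMV curve of a sequence, which rises from values near the smallest amplitude to values near the largest one.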
GMV
• An analytical expression: [equation not preserved]
• The approximate analytical expression for the GMV function: [equation not preserved]
• The EC method can be used to facilitate the calculation of the desired fitting parameters.
Advantages of GMV
• The approximate analytical expression provides a ‘universal’ quantitative reduction of any random sequence to a set of parameters.
• These fitting parameters make it possible to separate the values (amplitudes) of a random sequence into optimal statistical groups (clusters), with parameters that correspond to the reduced description of the random sequence considered.
• This reduced representation can be more informative with respect to an external factor, so the desired range of parameters corresponding to optimal operation can be found efficiently.
Universal distribution function of the relative fluctuations (UDFRF)
• The author’s method is still in preparation. It concerns proving the existence of a universal distribution function of the relative fluctuations for different detrended sequences.
• The author tries to find and justify a new class of universal distributions that follows from the linear principle of strongly correlated variables (LPSCV). In other words, there is a new class of distributions that can describe the envelopes of sequences of ranged amplitudes (SRA) satisfying the LPSCV.
UDFRF
• An SRA is a random sequence located in the band [bounds not preserved] whose amplitudes satisfy the ordering condition [equation not preserved].
  (a) The ranging (sorting) procedure applied to a random sequence does not introduce any additional errors.
  (b) The function $R(N)$ that is meant to describe the envelope of a given SRA cannot have an analytical expression for the inverse function $N(R)$. This conclusion follows from careful analysis of [expression not preserved] and of the formulas obtained below for more general cases.
UDFRF
• If two SRAs, $y_1$ and $y_2$ (having the same number of points $N$), are plotted against each other and the resulting dependence is approximately described by a straight line $y_2 = a y_1 + b$ ($a$ and $b$ are constants), such sequences are called strongly correlated.
• For many sequences measured over some period of time, the dependence between consecutive SRAs is close to a straight line. This leads to the simplest functional equation $F(t + T) = a F(t) + b$, where $T$ is the period of time corresponding to one measurement cycle.
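The test for strong correlation can be sketched numerically. The example below assumes that "ranged" means sorted in descending order and uses two independent synthetic Gaussian sequences whose distributions differ only by a shift and a scale; the helper names `ranged` and `line_fit` are hypothetical.

```python
import random

def ranged(seq):
    """Sequence of ranged amplitudes: values sorted in descending order."""
    return sorted(seq, reverse=True)

def line_fit(u, v):
    """Least-squares line v ~ a*u + b and the Pearson correlation r."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suu = sum((p - mu) ** 2 for p in u)
    svv = sum((q - mv) ** 2 for q in v)
    suv = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    a = suv / suu
    return a, mv - a * mu, suv / (suu * svv) ** 0.5

random.seed(2)
n = 500
seq1 = [random.gauss(0.0, 1.0) for _ in range(n)]
# Independent draw from the same family, shifted by 1 and scaled by 2.
seq2 = [1.0 + 2.0 * random.gauss(0.0, 1.0) for _ in range(n)]

# Raw sequences plotted against each other: no linear dependence expected.
a_raw, b_raw, r_raw = line_fit(seq1, seq2)

# Ranged sequences plotted against each other: close to a straight line.
sra1, sra2 = ranged(seq1), ranged(seq2)
a, b, r = line_fit(sra1, sra2)
```

The raw sequences are statistically independent, yet their SRAs fall close to a straight line whose slope and intercept reflect the scale and shift between the two distributions. Sorting removes the time ordering and compares only the amplitude distributions.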
UDFRF
• For non-integer values of T, the above equation has the solution [equation not preserved], which contains an unknown periodic function with period $T$. The exponential parameter determines the effect of the previous sequence on the subsequent one.
• A generalized functional equation of the type [equation not preserved] can be written as [equation not preserved], where the new functions form a set of periodic functions with period $T$. From this we can easily obtain [equation not preserved].
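The structure of such a solution can be verified directly. The derivation below assumes the simplest functional equation has the form $F(t + T) = a F(t) + b$ with $a > 0$ and $a \neq 1$, which is consistent with a straight-line relation between consecutive sequences but is not confirmed by the slide text; the symbols $\lambda$, $c_0$ and $\mathrm{Pr}(t)$ are chosen here for illustration.

```latex
\begin{align*}
F(t + T) &= a\,F(t) + b, \qquad a > 0,\ a \neq 1, \\
F(t) &= e^{\lambda t}\,\mathrm{Pr}(t) + c_0, \qquad \mathrm{Pr}(t + T) = \mathrm{Pr}(t), \\
F(t + T) &= e^{\lambda T} e^{\lambda t}\,\mathrm{Pr}(t) + c_0
          = e^{\lambda T}\bigl(F(t) - c_0\bigr) + c_0, \\
\lambda &= \frac{\ln a}{T}, \qquad c_0 = \frac{b}{1 - a}.
\end{align*}
```

With these values the third line becomes $a F(t) + c_0 (1 - a) = a F(t) + b$, so the trial form satisfies the equation for any periodic $\mathrm{Pr}(t)$. In the special case $a = 1$ the same check shows that $F(t) = \mathrm{Pr}(t) + b\,t/T$ is a solution.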
Advantages of UDFRF
• The scaling properties of these relative fluctuations (presented in the form of SRAs) lead to relationships that are very important for the analysis of different kinds of randomness.
• From the above it can be proved that in many cases these strongly correlated sequences follow new statistics.
• These statistics can replace the conventional ones, according to which high-frequency fluctuations are described by a Gaussian distribution. The UDFRF offers a new way of isolating a trend from its noise and analyzing the high-frequency fluctuations separately from the corresponding trend.
Properties of the new methods
• These methods are noninvasive, i.e. they contain only controllable errors related to transformations of the random sequences considered.
• With the new methods, any random sequence can be read 'quantitatively' and, if necessary, compared with another sequence by means of a 'universal' set of reduced (fitting) parameters.
• The suggested methods are completely free from any a priori (model) assumption about the statistical nature of the random sequence analyzed.
Conclusion and expectation
• Combined, these three methods are expected to be extremely effective for the quantitative description of any random sequence, in order to construct a statistically homogeneous cluster (containing a set of the reduced parameters) or to detect unusual (marginal) properties.
Problems that can be solved with the help of the new methods
• Comparison with reference ("pattern") equipment and self-verification of the readiness of complex equipment for measurements
• Diagnosis of diseases based on quantitative "reading" of noises registered inside the human body
• Increasing the sensitivity of existing spectrometers and creating supersensitive gas sensors and chromatographs
• Applications to different nanotechnologies
References
• [1] R.R. Nigmatullin. "Eigen-coordinates: new method of identification of analytical functions in experimental measurements." Applied Magnetic Resonance, vol. 14 (1998), pp. 601-633.
• [2] R.R. Nigmatullin. "The statistics of the fractional moments: new method of quantitative reading of random sequence." Scientific Notes of KSU, vol. 147 (book 2) (2005), pp. 129-161 (in Russian).
• [3] R.R. Nigmatullin. "Strongly correlated variables and existence of the universal distribution function for relative fluctuations." J. of Wave Phenomena, vol. 16, no. 2 (2008).