Outline

azuka

Presentation Transcript


  1. Outline • Time series prediction • Find k-nearest neighbors • Lag selection • Weighted LS-SVM

  2. Time series prediction • Suppose we have a univariate time series x(t) for t = 1, 2, …, N. We want to predict the value of x(N + p). • If p = 1, this is called one-step prediction. • If p > 1, this is called multi-step prediction.

  3. Flowchart

  4. Find k-nearest neighbors • Assume the current time index is 20. • First we reconstruct the query • Then the distance between the query and historical data is

  5. Find k-nearest neighbors • If k = 3 and the k closest neighbors are t14, t15, and t16, then we can construct the smaller data set from them.
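The neighbor search on these slides can be sketched in Python. This is a minimal illustration, not the presenter's code: it assumes the query is the lag vector of the last d observations, that distances are Euclidean, and that only historical lag vectors with a known target x(t + p) are candidates. The names `embed` and `knn_training_set` are hypothetical.

```python
import numpy as np

def embed(x, d):
    """Rows are lag vectors [x(t-d+1), ..., x(t)] for every t with enough history."""
    x = np.asarray(x, dtype=float)
    return np.array([x[t - d + 1 : t + 1] for t in range(d - 1, len(x))])

def knn_training_set(x, d, p, k):
    """Return the k historical lag vectors closest to the current query,
    together with their p-step-ahead targets and their distances."""
    x = np.asarray(x, dtype=float)
    vectors = embed(x, d)            # vectors[i] ends at time index i + d - 1
    query = vectors[-1]              # lag vector ending at the current time
    n_hist = len(vectors) - p        # keep only vectors whose target is known
    hist = vectors[:n_hist]
    targets = x[d - 1 + p : d - 1 + p + n_hist]
    dist = np.linalg.norm(hist - query, axis=1)   # assumed Euclidean distance
    idx = np.argsort(dist)[:k]
    return hist[idx], targets[idx], dist[idx]

if __name__ == "__main__":
    series = [0, 1, 2, 3] * 5        # 20 observations, as in "current time index is 20"
    neighbors, targets, dist = knn_training_set(series, d=2, p=1, k=3)
    print(neighbors, targets, dist)
```

The returned neighbors and targets form the smaller data set that is passed on to lag selection and the local model.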

  6. Flowchart

  7. Lag selection • Lag selection is the process of selecting a subset of relevant features for use in model construction. • Why do we need lag selection? • Lag selection is like feature selection, not feature extraction.

  8. Lag selection • Lag selection methods usually fall into two broad classes: filter methods and wrapper methods. • The lag subset is chosen by an evaluation criterion, which measures the relationship of each subset of lags with the target or output.

  9. Wrapper method • The best lag subset is selected according to the model. • The lag selection is a part of the learning.

  10. Filter method • In this method, we need a criterion that can measure correlation or dependence. • For example, correlation, mutual information, … .
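As an illustration of a filter criterion, the sketch below ranks candidate lags by the absolute Pearson correlation between x(t − lag) and the target x(t + p). It is only one possible criterion, chosen here because it is the simplest of the examples named on the slide; the function name `rank_lags_by_correlation` and the parameter `max_lag` are hypothetical.

```python
import numpy as np

def rank_lags_by_correlation(x, max_lag, p=1):
    """Score lags 0 .. max_lag-1 by |corr(x(t - lag), x(t + p))|.

    Returns the lags sorted from most to least correlated, plus the scores.
    """
    x = np.asarray(x, dtype=float)
    target = x[max_lag - 1 + p:]             # x(t + p) for every usable t
    n = len(target)
    scores = {}
    for lag in range(max_lag):               # lag 0 is x(t) itself
        start = max_lag - 1 - lag
        feature = x[start : start + n]       # x(t - lag), aligned with target
        scores[lag] = abs(np.corrcoef(feature, target)[0, 1])
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked, scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=500)
    for t in range(3, 500):                  # synthetic series: x(t) depends on x(t - 3)
        x[t] = 0.9 * x[t - 3] + 0.1 * x[t]
    print(rank_lags_by_correlation(x, max_lag=4, p=1))
```

Correlation only captures linear dependence, which is one reason to consider mutual information as the criterion instead.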

  11. Lag selection • Which is better? • The wrapper method solves the real problem, but needs more time. • The filter method is faster, but the lag subset it provides tends to give worse results. • We use the filter method because of the architecture.

  12. Entropy • The entropy is a measure of the uncertainty of a random variable. • The entropy of a discrete random variable X is defined by $H(X) = -\sum_{x} p(x) \log p(x)$ • By convention, 0 log 0 = 0.
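A short Python sketch of the discrete entropy, using the convention 0 log 0 = 0 by skipping zero-probability outcomes. The base-2 logarithm (bits) is an assumption; the slide does not state the base.

```python
import math

def entropy(probs, base=2.0):
    """Entropy of a discrete distribution given as a list of probabilities.

    Outcomes with probability 0 are skipped, which implements 0 log 0 = 0.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

if __name__ == "__main__":
    print(entropy([0.5, 0.5]))      # a fair coin
    print(entropy([1.0, 0.0]))      # a certain outcome
    print(entropy([0.25] * 4))      # uniform over four outcomes
```

A uniform distribution maximizes the entropy for a given number of outcomes, and a deterministic variable has zero entropy.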

  13. Entropy • Example, let • Then

  14. Entropy

  15. Entropy • Example, let • Then

  16. Joint entropy • Definition: The joint entropy of a pair of discrete random variables (X, Y) is defined as $H(X, Y) = -\sum_{x} \sum_{y} p(x, y) \log p(x, y)$

  17. Conditional entropy • Definition: The conditional entropy is defined as • And

  18. Proof

  19. Mutual information • The mutual information is a measure of the amount of information one random variable contains about another. • It extends the notion of entropy. • Definition: The mutual information of two discrete random variables X and Y is $I(X; Y) = \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}$

  20. Proof

  21. The relationship between entropy and mutual information
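The standard identities linking these quantities, I(X; Y) = H(X) + H(Y) − H(X, Y) and H(X, Y) = H(X) + H(Y|X), can be checked numerically. The sketch below assumes these textbook identities are the ones the slide refers to; the joint probability table is a made-up example, not data from the presentation.

```python
import numpy as np

def H(p):
    """Entropy in bits of a probability array of any shape (0 log 0 = 0)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def conditional_entropy(pxy):
    """H(Y|X) = -sum_{x,y} p(x,y) log p(y|x), rows indexed by x."""
    pxy = np.asarray(pxy, dtype=float)
    px = pxy.sum(axis=1, keepdims=True)
    cond = pxy / np.where(px > 0, px, 1.0)    # p(y|x); rows with p(x)=0 stay 0
    mask = pxy > 0
    return float(-np.sum(pxy[mask] * np.log2(cond[mask])))

def mutual_information(pxy):
    """I(X;Y) = sum_{x,y} p(x,y) log [p(x,y) / (p(x) p(y))]."""
    pxy = np.asarray(pxy, dtype=float)
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])))

if __name__ == "__main__":
    pxy = np.array([[0.25, 0.25],             # hypothetical joint table p(x, y)
                    [0.50, 0.00]])
    hx, hy, hxy = H(pxy.sum(axis=1)), H(pxy.sum(axis=0)), H(pxy)
    print("I(X;Y)          =", mutual_information(pxy))
    print("H(X)+H(Y)-H(X,Y) =", hx + hy - hxy)
    print("H(X)+H(Y|X)      =", hx + conditional_entropy(pxy), "vs H(X,Y) =", hxy)
```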

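  22. Mutual information • Definition: The mutual information of two continuous random variables X and Y with joint density $f(x, y)$ and marginal densities $f_X(x)$, $f_Y(y)$ is $I(X; Y) = \iint f(x, y) \log \frac{f(x, y)}{f_X(x)\,f_Y(y)} \, dx \, dy$ • The problem is that the joint probability density function of X and Y is hard to estimate.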

  23. Binned Mutual information • The most straightforward and widespread approach for estimating MI consists in partitioning the supports of X and Y into bins of finite size

  24. Binned Mutual information • For example, consider a set of 5 bivariate measurements, zi = (xi, yi), where i = 1, 2, …, 5. The values of these points are (0, 1), (0.5, 5), (1, 3), (3, 4), and (4, 1).
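The binning approach can be sketched with a two-dimensional histogram. The five points are the ones the later slides list as p1 to p5; the equal-width bins and the choice of two bins per axis are assumptions, since the partition used on the slides is not recoverable from the transcript.

```python
import numpy as np

# The five measurements listed as p1..p5 on the later slides.
X_PTS = np.array([0.0, 0.5, 1.0, 3.0, 4.0])
Y_PTS = np.array([1.0, 5.0, 3.0, 4.0, 1.0])

def binned_mi(x, y, bins):
    """Plug-in MI estimate (in nats) from an equal-width 2-D histogram."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()               # relative frequency per cell
    px = pxy.sum(axis=1, keepdims=True)       # marginal over the x bins
    py = pxy.sum(axis=0, keepdims=True)       # marginal over the y bins
    mask = pxy > 0                            # empty cells contribute nothing
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

if __name__ == "__main__":
    print(binned_mi(X_PTS, Y_PTS, bins=2))
    print(binned_mi(X_PTS, X_PTS, bins=2))    # MI of x with itself = entropy of binned x
```

The estimate depends strongly on the bin size, especially with so few points, which motivates the nearest-neighbor estimator discussed next.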

  25. Binned Mutual information

  26. Binned Mutual information

  27. Binned Mutual information

  28. Estimating Mutual information • Another approach estimates mutual information from nearest-neighbor distances. Consider the case with two variables. The two-dimensional space Z is spanned by X and Y. We can then compute the distance between each pair of points.

  29. Estimating Mutual information • Let us denote by ε(i)/2 the distance from zi to its k-th nearest neighbor, and by εx(i)/2 and εy(i)/2 the distances between the same points projected into the X and Y subspaces. • Then we can count the number nx(i) of points xj whose distance from xi is strictly less than ε(i)/2, and similarly for y instead of x.

  30. Estimating Mutual information

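  31. Estimating Mutual information • The estimate for MI is then $I^{(1)}(X, Y) = \psi(k) - \langle \psi(n_x + 1) + \psi(n_y + 1) \rangle + \psi(N)$, where $\psi$ is the digamma function and $\langle \cdot \rangle$ denotes the average over all points. • Alternatively, in the second algorithm, we replace nx(i) and ny(i) by the number of points with $\|x_i - x_j\| \le \epsilon_x(i)/2$ and $\|y_i - y_j\| \le \epsilon_y(i)/2$.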

  32. Estimating Mutual information

  33. Estimating Mutual information

  34. Estimating Mutual information • Then

  35. Estimating Mutual information • For the same example, k = 2 • For the point p1(0, 1) • For the point p2(0.5,5)

  36. Estimating Mutual information • For the point p3(1,3) • For the point p4(3,4)

  37. Estimating Mutual information • For the point p5(4,1) • Then

  38. Estimating Mutual information • Example • a=rand(1,100) • b=rand(1,100) • c=a*2 • Then

  39. Estimating Mutual information • Example • a=rand(1,100) • b=rand(1,100) • d=2*a + 3*b • Then
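The nearest-neighbor estimator described on the preceding slides (first algorithm, with the strict "less than" count) can be sketched as follows. This is an illustrative implementation under stated assumptions: the maximum norm is used in the joint space, k = 3 is an arbitrary choice, and the result is in nats. The helper `digamma_int` evaluates the digamma function only at positive integers, which is all the estimator needs. The variables mirror the `rand` examples above, but the numbers printed here are not the presenter's results.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329

def digamma_int(n):
    """psi(n) for a positive integer n: -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, int(n)))

def knn_mi(x, y, k=3):
    """k-nearest-neighbor MI estimate: psi(k) - <psi(nx+1) + psi(ny+1)> + psi(N)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    dx = np.abs(x[:, None] - x[None, :])      # pairwise distances in X
    dy = np.abs(y[:, None] - y[None, :])      # pairwise distances in Y
    dz = np.maximum(dx, dy)                   # assumed max-norm in the joint space
    total = 0.0
    for i in range(n):
        others = np.arange(n) != i            # exclude the point itself
        eps_half = np.sort(dz[i][others])[k - 1]    # distance to the k-th neighbor
        nx = int(np.sum(dx[i][others] < eps_half))  # strictly closer in X
        ny = int(np.sum(dy[i][others] < eps_half))  # strictly closer in Y
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.random(100)
    b = rng.random(100)
    print("MI(a, b)       =", knn_mi(a, b))             # independent variables
    print("MI(a, 2a)      =", knn_mi(a, 2 * a))         # deterministic relation
    print("MI(a, 2a + 3b) =", knn_mi(a, 2 * a + 3 * b)) # partial dependence
```

Independent variables should give an estimate near zero, while a deterministic relation gives a large value that grows with the sample size.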

  40. Flowchart

  41. Model • We now have a training data set containing k records, so we need a model to make the prediction.

  42. Instance-based learning • The points that are close to the query have large weights, and the points far from the query have small weights. • Locally weighted regression • General Regression Neural Network(GRNN)
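The weighting idea can be illustrated with a GRNN-style prediction, which is a kernel-weighted average of the training targets. The Gaussian kernel and the bandwidth `sigma` are assumptions for this sketch; the transcript does not say which kernel or bandwidth the presentation used.

```python
import numpy as np

def grnn_predict(X, y, query, sigma=1.0):
    """Kernel-weighted average of targets: close points get large weights."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d2 = np.sum((X - np.asarray(query, dtype=float)) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))      # assumed Gaussian weighting
    return float(np.sum(w * y) / np.sum(w)), w

if __name__ == "__main__":
    X = [[0.0], [1.0], [10.0]]                # hypothetical training inputs
    y = [0.0, 1.0, 100.0]
    prediction, weights = grnn_predict(X, y, query=[0.5], sigma=1.0)
    print(prediction, weights)
```

In this toy example the distant point at 10 receives a negligible weight, so the prediction is driven almost entirely by the two points near the query.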

  43. Property of the local frame

  44. Property of the local frame

  45. Weighted LS-SVM • The goal of the standard LS-SVM is to minimize the risk function: • where γ is the regularization parameter.

  46. Weighted LS-SVM • The modified risk function of the weighted LS-SVM is • And

  47. Weighted LS-SVM • The weight is designed as
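A weighted LS-SVM regression can be sketched with NumPy. The sketch assumes the commonly used formulation that minimizes (1/2) wᵀw + (γ/2) Σ vᵢ eᵢ² subject to yᵢ = wᵀφ(xᵢ) + b + eᵢ, whose dual is a single linear system. The RBF kernel, the parameter values, and the distance-based weights vᵢ in the usage example are all assumptions; the weight design used in the presentation is not recoverable from the transcript.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def fit_weighted_lssvm(X, y, v, gamma=100.0, sigma=0.5):
    """Solve [[0, 1^T], [1, K + diag(1/(gamma*v))]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.diag(1.0 / (gamma * v))
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]                    # bias b, dual coefficients alpha

def predict(X_train, b, alpha, X_new, sigma=0.5):
    """Model output: sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

if __name__ == "__main__":
    X = np.linspace(0.0, 3.0, 20).reshape(-1, 1)
    y = np.sin(X).ravel()
    query = np.array([[1.5]])
    # Hypothetical weights: neighbors closer to the query count more.
    v = np.exp(-np.sum((X - query) ** 2, axis=1))
    b, alpha = fit_weighted_lssvm(X, y, v)
    print(predict(X, b, alpha, query))
```

A small weight vᵢ inflates the i-th diagonal entry, so the fit is allowed a larger residual at that point. This is how training points far from the query are made to matter less.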
