The Kalman filter - and other methods. Anders Ringgaard Kristensen
Outline • Filtering techniques applied to monitoring of daily gain in slaughter pigs: • Introduction • Basic monitoring • Shewhart control charts • DLM and the Kalman filter • Simple case • Seasonality • Online monitoring • Used as input to decision support
”E-kontrol”, slaughter pigs • Quarterly calculated production results • Presented as a table • A result for each of the most recent quarters and aggregated • Sometimes comparison with expected (target) values • Offered by two companies: • Dansk Landbrugsrådgivning, Landscentret (as shown) • AgroSoft A/S • One of the most important key figures: Average daily gain
Average daily gain, slaughter pigs • We have: • 4 quarterly results • 1 annual result • 1 target value • How do we interpret the results? • Question 1: How is the figure calculated?
How is the figure calculated? • The basic principles are: • Total (live) weight of pigs delivered: xxxx (*) • Total weight of piglets inserted: −xxxx (**) • Valuation weight at end of the quarter: +xxxx (***) • Valuation weight at beginning of the quarter: −xxxx (***) • Total gain during the quarter: yyyy • Daily gain = (Total gain)/(Days in feed) • Registration sources? • * Slaughterhouse – rather precise • ** Scale – very precise • *** ??? – anything from very precise to very uncertain
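The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `daily_gain`, the units (kg in, g per day out), the reading of "days in feed" as total pig-days, and every number in the example are assumptions, not values from the E-kontrol report.

```python
def daily_gain(delivered_kg, inserted_kg, valuation_end_kg,
               valuation_start_kg, days_in_feed):
    """Average daily gain in grams per pig per day for one quarter."""
    total_gain_kg = (delivered_kg - inserted_kg
                     + valuation_end_kg - valuation_start_kg)
    return 1000.0 * total_gain_kg / days_in_feed


# Hypothetical quarter; all figures are invented for illustration.
example = daily_gain(
    delivered_kg=130_000,
    inserted_kg=39_000,
    valuation_end_kg=72_000,
    valuation_start_kg=70_000,
    days_in_feed=124_000,   # assumed to be total pig-days in the quarter
)
print(f"daily gain: {example:.0f} g")

# Effect of a 1000 kg error in the end-of-quarter valuation weight.
shifted = daily_gain(130_000, 39_000, 73_000, 70_000, 124_000)
print(f"shift caused by the valuation error: {shifted - example:.1f} g")
```

With these invented figures, a valuation error of 1000 kg moves the result by about 8 g, which is why the precision of the valuation weights matters so much.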
First finding: Observation error • All measurements are subject to uncertainty (error), but the error is largest for the valuation weights. • We define a (very simple) model: • $\kappa = \tau + e_o$, where: • $\kappa$ is the calculated daily gain (as it appears in the report) • $\tau$ is the true daily gain (which we wish to estimate) • $e_o$ is the observation error, which we assume is normally distributed, $e_o \sim N(0, \sigma_o^2)$ • The structure of the model (qualitative knowledge) is the equation. • The parameter (quantitative knowledge) is the value of $\sigma_o$ (the standard deviation of the observation error). It depends on the observation method.
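Observation error • $\kappa = \tau + e_o$, $e_o \sim N(0, \sigma_o^2)$ • What we measure is $\kappa$, the calculated daily gain • What we wish to know is $\tau$, the true daily gain • The difference between the two variables is undesired noise • We wish to filter the noise away, i.e. we wish to estimate $\tau$ from $\kappa$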
Second finding: Randomness • The true daily gains vary at random. • Even if we produce under exactly the same conditions in two successive quarters, the results will differ. We shall refer to this phenomenon as the “sample error”. • We have $\tau = \theta + e_s$, where • $\tau$ is the true daily gain in the quarter • $e_s$ is the sample error expressing random variation. We assume $e_s \sim N(0, \sigma_s^2)$ • $\theta$ is the underlying permanent (and true) value • This supplementary qualitative knowledge should be reflected in the structure of the model, with $\kappa$ the calculated daily gain and $e_o$ the observation error: $\kappa = \tau + e_o = \theta + e_s + e_o$ • The parameters of the model are now $\sigma_s$ and $\sigma_o$
Sample error and measurement error • What we measure is $\kappa$, the calculated daily gain • What we wish to know is $\theta$, the underlying permanent value • The difference between the two variables is undesired noise: • Sample noise • Observation noise • We wish to filter the noise away, i.e. we wish to estimate $\theta$ from $\kappa$
The model in practice: Preconditions • The model is necessary for any meaningful interpretation of calculated production results. • The standard deviation of the sample error, $\sigma_s$, depends on the natural individual variation between pigs in a herd and on the herd size. • The standard deviation of the observation error, $\sigma_o$, depends on the method used to measure the valuation weights. • For the interpretation of the calculated results, it is the total uncertainty, $\sigma$, that matters ($\sigma^2 = \sigma_s^2 + \sigma_o^2$) • Competent guesses of the value of $\sigma$ using different observation methods (1250 pigs): • Weighing of all pigs: $\sigma = 3$ g • Stratified sample: $\sigma = 7$ g • Random sample: $\sigma = 20$ g • Visual assessment: $\sigma = 29$ g
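To see how the two error sources combine, the variance relation can be turned around to back out the observation error implied by each total. The sketch below rests on one loud assumption that the slides do not state: that weighing all pigs leaves essentially no observation error, so that the sample error alone is about 3 g.

```python
import math

# ASSUMPTION (not stated in the slides): with all pigs weighed, the
# observation error is negligible, so sigma_s is roughly the total of 3 g.
SIGMA_S = 3.0  # g

# Total standard deviations quoted on the slide (g).
total_sigma = {
    "stratified sample": 7.0,
    "random sample": 20.0,
    "visual assessment": 29.0,
}

# sigma^2 = sigma_s^2 + sigma_o^2  =>  sigma_o = sqrt(sigma^2 - sigma_s^2)
implied_sigma_o = {
    method: math.sqrt(sigma ** 2 - SIGMA_S ** 2)
    for method, sigma in total_sigma.items()
}

for method, sigma_o in implied_sigma_o.items():
    print(f"{method}: implied sigma_o of about {sigma_o:.1f} g")
```

Under that assumption, the observation error dominates the total as soon as the valuation weights are not measured directly.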
Different observation methods • Figure panels: $\sigma = 3$ g, $\sigma = 7$ g, $\sigma = 20$ g, $\sigma = 29$ g
The model in practice: Interpretation • The calculated daily gain in a herd was 750 g, whereas the expected target value was 775 g. • Should we be worried? • It depends on the observation method! • A lower control limit (LCL) is the target minus 2 times the standard deviation, i.e. $775 - 2\sigma$ • Using each of the 4 observation methods, we obtain the following LCLs: • Weighing of all pigs: 775 g − 2 × 3 g = 769 g • Stratified sample: 775 g − 2 × 7 g = 761 g • Random sample: 775 g − 2 × 20 g = 735 g • Visual assessment: 775 g − 2 × 29 g = 717 g
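The comparison can be written out as a short Python check. The numbers are taken from the slide; the function and variable names are illustrative only.

```python
TARGET = 775.0    # g, expected daily gain (from the slide)
OBSERVED = 750.0  # g, calculated daily gain (from the slide)

# Total standard deviation per observation method (g), from the slide.
SIGMA = {
    "weighing of all pigs": 3.0,
    "stratified sample": 7.0,
    "random sample": 20.0,
    "visual assessment": 29.0,
}


def lower_control_limit(target, sigma, k=2.0):
    """Target minus k standard deviations."""
    return target - k * sigma


# True means the observed value falls below the lower control limit.
alarms = {}
for method, sigma in SIGMA.items():
    lcl = lower_control_limit(TARGET, sigma)
    alarms[method] = OBSERVED < lcl
    status = "below LCL" if alarms[method] else "within limits"
    print(f"{method}: LCL = {lcl:.0f} g, observed 750 g is {status}")
```

The same 750 g is a warning signal when all pigs are weighed or a stratified sample is used, but lies within the limits for a random sample or a visual assessment.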
Third finding: Dynamics, time • Daily gain in a herd over 4 years. • Is this good or bad?
Modeling dynamics • We extend our model to include time. • At time $n$ we model the calculated result as follows: • $\kappa_n = \tau_n + e_{on} = \theta + e_{sn} + e_{on}$ • The only change from before is that we know we have a new result each quarter. • We can calculate control limits for each quarter and plot everything in a diagram: a Shewhart control chart • Figure: Shewhart control chart with quarters 1 to 4 of successive years on the time axis
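A minimal sketch of the bookkeeping behind such a chart follows, assuming symmetric limits at the target plus or minus two standard deviations. The quarterly results are invented; only the target of 775 g and the value of 7 g for a stratified sample come from the slides.

```python
def shewhart_flags(results, target, sigma, k=2.0):
    """Classify each result against the limits target +/- k * sigma."""
    lcl, ucl = target - k * sigma, target + k * sigma
    flags = []
    for n, x in enumerate(results, start=1):
        if x < lcl:
            status = "below LCL"
        elif x > ucl:
            status = "above UCL"
        else:
            status = "in control"
        flags.append((n, x, status))
    return flags


# Hypothetical quarterly daily gains (g); invented for illustration.
quarterly_results = [770.0, 792.0, 741.0, 760.0, 781.0, 765.0, 738.0, 779.0]

flags = shewhart_flags(quarterly_results, target=775.0, sigma=7.0)
for n, x, status in flags:
    print(f"quarter {n}: {x:.0f} g -> {status}")
```

If many points fall outside the limits, either the production really fluctuates or the model (its structure or its standard deviations) is wrong.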
Interpretation: Conclusion • Something is wrong! • Possible explanations: • The pig farmer has serious problems with fluctuating daily gains. • Something is wrong with the model: • Structure – our qualitative knowledge • Parameters – the quantitative knowledge (standard deviations).
More findings: $\kappa_n = \theta + e_{sn} + e_{on}$ • The true underlying daily gain in the herd, $\theta$, may change over time: • Trend • Seasonal variation • The sample error $e_{sn}$ may be autocorrelated: • Temporary influences • The observation error $e_{on}$ is obviously autocorrelated: • The valuation weight at the end of Quarter $n$ is the same as the valuation weight at the start of Quarter $n+1$
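The autocorrelation of the observation error can be illustrated with a small simulation. It assumes, purely for illustration, that each valuation weight carries an independent error with constant variance and that nothing else contributes to the observation error. The error of quarter n is then the end-of-quarter valuation error minus the start-of-quarter one, and successive quarters share one term with opposite sign, which gives a lag-1 correlation of -0.5 in theory.

```python
import random


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


random.seed(1)
N_QUARTERS = 20_000  # far more than any herd has; only to get a stable estimate

# Independent errors on the valuation weights (arbitrary units, an assumption).
valuation_errors = [random.gauss(0.0, 1.0) for _ in range(N_QUARTERS + 1)]

# Quarter n: error at its end (+) minus error at its start (-).
obs_errors = [valuation_errors[n + 1] - valuation_errors[n]
              for n in range(N_QUARTERS)]

rho = lag1_autocorrelation(obs_errors)
print(f"estimated lag-1 autocorrelation: {rho:.2f}")
```

An overestimated valuation weight inflates the gain of one quarter and deflates the next by the same amount, so a model with independent errors will see more fluctuation than it expects.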
”Dynamisk e-kontrol” • Developed and described by Madsen & Ruby (2000). • Principles: • Avoid labor-intensive valuation weighing. • Calculate a new daily gain every time pigs have been sent to slaughter (typically weekly). • Use a simple Dynamic Linear Model to monitor daily gain: • $\kappa_n = \theta_n + e_{sn} + e_{on} = \theta_n + v_n$, where $v_n \sim N(0, \sigma_v^2)$ • $\theta_n = \theta_{n-1} + w_n$, where $w_n \sim N(0, \sigma_w^2)$ • The calculated results are filtered by the Kalman filter in order to remove random noise (sample error + observation error)
”Dynamisk E-kontrol”, results • Raw data to the left – filtered data to the right • Figures from: • Madsen & Ruby (2000). An application for early detection of growth rate changes in the slaughter pig production unit. Computers and Electronics in Agriculture 25, 261-270. • Still: Results only available after slaughter
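The Dynamic Linear Model (DLM) • Example: • Observation equation: $\kappa_n = \theta_n + v_n$, $v_n \sim N(0, \sigma_v^2)$ • System equation: $\theta_n = \theta_{n-1} + w_n$, $w_n \sim N(0, \sigma_w^2)$ • General, first order: • Observation equation: $Y_t = \theta_t + v_t$, $v_t \sim N(0, \sigma_v^2)$ • System equation: $\theta_t = \theta_{t-1} + w_t$, $w_t \sim N(0, \sigma_w^2)$ • Figure: time steps 1 to 4 with observations $Y_1, \dots, Y_4$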
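For the first-order model, the Kalman filter reduces to a few scalar updates per observation. The sketch below uses the standard recursions for this model; the prior mean and variance (`m0`, `c0`), the two variances, and the weekly data are hypothetical values chosen for illustration, not those used by Madsen & Ruby.

```python
def kalman_filter_local_level(observations, m0, c0, var_v, var_w):
    """Filter Y_t = theta_t + v_t, theta_t = theta_{t-1} + w_t.

    Returns one tuple per observation:
    (forecast, forecast_variance, posterior_mean, posterior_variance).
    """
    m, c = m0, c0
    out = []
    for y in observations:
        r = c + var_w      # prior variance of theta_t
        f = m              # one-step forecast of Y_t
        q = r + var_v      # variance of the forecast
        gain = r / q       # weight given to the new observation
        m = m + gain * (y - f)
        c = r - gain * gain * q
        out.append((f, q, m, c))
    return out


# Hypothetical weekly daily gains (g) with a slow decline.
weekly_gain = [752.0, 768.0, 741.0, 760.0, 735.0, 720.0, 715.0, 709.0]

filtered = kalman_filter_local_level(
    weekly_gain, m0=775.0, c0=100.0, var_v=400.0, var_w=25.0
)
for y, (f, q, m, c) in zip(weekly_gain, filtered):
    print(f"observed {y:.0f}  forecast {f:.1f}  filtered {m:.1f}  sd {c ** 0.5:.1f}")
```

The ratio between the system variance and the observation variance controls how quickly the filtered level follows the data: a larger `var_w` makes the filter react faster but smooths less.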
Extending the model • $F_n \theta_n$ is the true level described as a vector product. • A general level, $\theta_{0n}$, and 4 seasonal effects, $\theta_{1n}$, $\theta_{2n}$, $\theta_{3n}$ and $\theta_{4n}$, are included in the model. • From the model we are able to predict the expected daily gain for the next quarter. • As long as the forecast errors are small, production is in control (no large change in the true underlying level)!
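One way to read the vector product is that the design vector picks out the general level plus the seasonal effect of the current quarter. The sketch below follows that reading. The exact form of the design vector, the parameter values, the forecast standard deviation and the two-standard-deviation rule for a "small" forecast error are all assumptions made for illustration.

```python
def design_vector(quarter):
    """Assumed F_n: 1 for the general level and 1 for this quarter's effect."""
    f = [1.0, 0.0, 0.0, 0.0, 0.0]
    f[quarter] = 1.0   # quarter is 1, 2, 3 or 4
    return f


def expected_level(theta, quarter):
    """Vector product F_n * theta_n."""
    return sum(fi * ti for fi, ti in zip(design_vector(quarter), theta))


# Hypothetical state: general level and four seasonal effects (g).
theta = [760.0, 10.0, -5.0, -15.0, 10.0]

forecasts = [expected_level(theta, q) for q in (1, 2, 3, 4)]
print(forecasts)

# Monitoring: compare a new observation with its forecast.
FORECAST_SD = 10.0      # g, hypothetical standard deviation of the forecast
observed_q1 = 748.0     # g, hypothetical new result for quarter 1
error = observed_q1 - forecasts[0]
in_control = abs(error) <= 2.0 * FORECAST_SD
print(f"forecast error {error:.0f} g, in control: {in_control}")
```

In a full DLM the state vector would be updated by the Kalman filter after each quarter, so both the general level and the seasonal pattern can drift over time.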
Observed and predicted • Figure legend: blue = observed, pink = predicted
The last model • Dynamic Linear Model • Structure of the model (qualitative knowledge): • Seasonal variation allowed (no assumption about the size). • The general level as well as the seasonal pattern may change over time. • Are those assumptions correct? • Parameters of the model: • The observation and sample variance and the system variance. • The model learns as observations are made, and adapts to the observations over time. • Seasonal variation may be modeled in a more sophisticated way, as demonstrated by Thomas Nejsum Madsen in FarmWatch™
Moral • If we wish to analyze the daily gain of a herd, we need to: • Know exactly how the observations are made (and know their precision). • Know how the daily gain may naturally develop over time. • Without professional knowledge you may conclude anything. • Without a model you may interpret the results inadequately. • Through the structure of the model we apply our professional knowledge to the problem.
On-line monitoring of slaughter pigs: PigVision • Innovation project led by Danish Pig Production: • Danish Institute of Agricultural Sciences • Videometer (external assistance) • Skov A/S • LIFE, IPH, Production and Health • Continuous monitoring of daily gain while the pigs are still in the herd: • Dynamic Linear Models • Opportunity to intervene during the fattening period • Adaptation of delivery policy
PigVision: Principles • A camera is placed above the pen. • When movement is detected, a series of pictures is recorded and sent to a computer. • The computer automatically identifies the pig (by use of a model) and calculates its area (seen from above). • If the computer does not believe that a pig has been identified, the picture is ignored. • The area is converted to live weight (using a model). • From many pictures, the average weight and the standard deviation are estimated. Figure by Teresia Heiskanen
What is online weight assessment used for? • Continuous monitoring of gain. • Collection of evidence about growth capacity (learning) • Adaptation of delivery policies depending on: • Whether the pigs grow fast or slowly • Whether the uniformity is small or big • Whether a new batch of piglets is ready • Prices • Direct advice about pigs to deliver
The decision support model • Technique: • A hierarchical Markov Decision Process (dynamic programming) with a Dynamic Linear Model (DLM) embedded. • Every week, the average weight and the standard deviation are observed. • After each observation, the parameters of the DLM are updated using Kalman filtering: • Permanent growth capacity of pigs, $L$ • Temporary deviation, $e(t)$ • Within-pen standard deviation, $\sigma(t)$ • Decisions based on (state space): • Number of pigs left • Estimated values of the 3 parameters • Decision: Deliver all pigs with live weight above a threshold • Uncertainty of knowledge is built directly into the model through the DLM
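To give a feel for what a threshold decision implies, the sketch below estimates how many pigs a given threshold would send to slaughter from the estimated mean and within-pen standard deviation. It assumes that live weights in the pen are normally distributed; that assumption, the function name and all numbers are hypothetical and are not the decision model itself, which optimizes the threshold by dynamic programming.

```python
from statistics import NormalDist


def expected_deliveries(n_pigs, mean_kg, sd_kg, threshold_kg):
    """Expected number of pigs heavier than the threshold.

    ASSUMPTION: live weights in the pen follow N(mean_kg, sd_kg^2).
    """
    share_above = 1.0 - NormalDist(mean_kg, sd_kg).cdf(threshold_kg)
    return n_pigs * share_above


# Hypothetical pen: 15 pigs, estimated mean 100 kg, standard deviation 10 kg.
for threshold in (95.0, 100.0, 105.0, 110.0):
    n = expected_deliveries(15, 100.0, 10.0, threshold)
    print(f"threshold {threshold:.0f} kg: about {n:.1f} pigs delivered")
```

Because pigs are not individually identified, a calculation of this kind works on the estimated distribution of weights in the pen rather than on individual animals.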
On-line weight assessment • Pen with n pigs is monitored. • No identification of pigs. • At any time t we have: • The precision 1/s2 is assumed known
Objectives • Given the on-line weight estimates, determine an optimal delivery policy for the pigs in the pen. • Sequential (weekly) decision problem with decisions at two levels: • Slaughtering of individual pigs (the price is highest in a rather narrow weight interval) • Terminating the batch (slaughter all remaining pigs and insert a new batch of weaners)
A dynamic linear weight model, I • Known average herd specific growth curve: • True weights at time t distributed as:
The scaling factor L • In principle unknown and not directly observable • Initial belief: • The belief is updated each time we observe a set of live weights from the pen. • Let $L \sim N(1, \sigma_L^2)$ be the true average weight • Then
Observation & system equation 1 • Full observation equation for mean: • Auto-correlated sample error (system eq.):
Observation & system equation 2 • Far more information available from the observed live weights • Sample variance not normally distributed. • Use the 0.16 sample quantile: • The symbol r(t) is the standard deviation of the observed values. System equation: