This study explores a problem-oriented solution for monitoring bivariate Poisson data in the Greek food industry, specifically focusing on the dairy sector. It investigates the need for new process monitoring techniques to ensure product safety and reduce time delays in detecting pathogenic biological factors.
A Problem Oriented Solution Process Monitoring of Bivariate Poisson Data Sotiris Bersimis, Department of Statistics and Insurance Science, University of Piraeus, Piraeus, Greece, and Petros E. Maravelakis, Department of Statistics and Actuarial-Financial Mathematics, University of the Aegean, Samos, Greece.
Greek industry is composed of 23 sectors, the most important of which is the food and drink sector. • This sector represents about 21% of Greek manufacturing industry, includes more than 1,300 enterprises and creates 70,000 jobs. • In 2002, Greece’s food sector ranked second in the European Union (out of 15 countries) in terms of growth, with a growth rate of 3.3% (Spain held first place in that period). • Within the food sector, dairy products hold first place with a 24% share. • The 5 main sectors of the Greek dairy industry are: milk, yogurt, cheese, ice cream, cream and butter. In the dark of Economic Crisis The Problem is related to Food Industry
The dairy industry has shown great signs of improvement in the last 10 years, mainly because of the high nutritional value of dairy products and their close relationship with the Greek diet (although it is now also caught in the economic crisis). • From the preceding discussion it is clear that the dairy industry is of great importance for the Greek economy, while milk is of great importance for the Greek diet. • Among the different categories, Greeks prefer fresh milk, which holds 47.4% of the total share. • At the same time, all companies invest considerable sums in research and development and in the installation of units to gather and process fresh milk of high quality and safety. In the dairy industry and Especially in Milk Production
In fresh milk production, as in many food processing operations, product safety is controlled by checking only the final product with microbiological and chemical methods (Tokatli et al., 2005). • A major drawback of this approach is time delay: collecting and examining the samples to determine the safety of the product takes so long that the results of the microbiological analysis become available only after the product has been released to the market. • Another drawback is that it can be a high-cost solution if contamination is reported after production is completed, since recalling the defective product and collecting it from retail outlets adds significant extra cost. A Closer Look to the Problem
Thus, it is clear that new process monitoring techniques are needed for this type of problem. • Their significance arises from the fact that these cases are related to public health, since many diseases are associated with low quality milk (or similar food products): • Leptospirosis • Cowpox • Tuberculosis • Brucellosis • Listeria • Johne's Disease A Closer Look to the Problem
A milk pasteurization plant. A continuous pasteurization line. • Our focus was concentrated on the time interval after pasteurization is completed and before the product is released to the consumer, since any pathogenic biological factor contained in the raw material is removed by the pasteurization process. • But what if a pathogenic biological factor appears in the part of production that follows pasteurization? • As already noted, microbiological methods are applied to the final product to ensure that the milk is safe for consumption. • As also noted, there is a time delay, and the exact results of the microbiological analysis usually arrive after the product is released to the market. The Exact Problem
Thus we need a monitoring procedure! • But what are we going to monitor? • Usually, in such cases, quality control departments monitor the percentage of non-conforming products. • There are also a few cases in which quality control departments monitor the number of microorganisms of a specific type found in a sample (microorganisms per milliliter in a suspension created from a sample from the production line)… • In that case, we monitor a Poisson distributed test statistic with an appropriate control chart… • Note here that milk (like almost all food) contains microorganisms that, as long as they do not exceed a threshold, cannot affect human health (in some cases they are even useful). The Exact Problem
But this needs time… • The Poisson based control chart is fed with measurements only after the product is in the hands of the consumer… • Quality managers are instructed to wait a certain amount of time before counting the microorganisms in the plate. • Here we assume that if a contamination factor exists, it affects new products in an increasing way (the effects are a function of time). • In that case, a better solution is to use a CUSUM type or an EWMA control chart. The Exact Problem
But the use of these types of charts does not solve the problem, because the time delay remains, and an extreme event will be identified only when it is too late… • A better solution is to measure the number of microorganisms (of a specific type) that develop in a test plate (created from a sample from the production line) at many time points, • from time zero to the final time point (and not only at the end of the time period given in the microbiological guidelines). • In that way we may be able to observe how fast the number of microorganisms is growing. • The idea is that if a contaminating factor exists in the production line after the pasteurization process is completed, then the number of microorganisms will grow faster. The Exact Problem
Also, if a contaminating factor makes its appearance in the production line, then it evolves itself (since it is a biological factor), causing more and more contamination over time. • Thus, the proposed sampling procedure is the following: • Take one sample from the production line every k time units (say, for example, every 8 hours). • Define the number l of measurements to be made on the microbiological system (say, for example, 6), usually by Optical Density. • If the guidelines instruct that the number of microorganisms of a specific type must be measured at r hours (say 48 hours), then perform the 1st test at the r/l (8th) hour, the 2nd test at the 2r/l (16th) hour, …, and finally the lth test at the r (48th) hour. The Exact Problem
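The sampling procedure above can be sketched in a few lines of Python. This is only an illustration of the timing rule (tests at r/l, 2r/l, …, r hours after each sample); the function names and the convention that the first sample is drawn at hour 0 are assumptions made for the example.

```python
def reading_hours(r, l):
    """Hours after sampling at which the l plate readings are taken:
    r/l, 2r/l, ..., r."""
    if r <= 0 or l <= 0:
        raise ValueError("r and l must be positive")
    return [i * r / l for i in range(1, l + 1)]


def schedule(n_samples, k, r, l):
    """Clock hours of all readings for each sample.

    Assumption (illustrative): sample j (1-based) is drawn at hour (j - 1) * k.
    """
    return {
        j: [(j - 1) * k + h for h in reading_hours(r, l)]
        for j in range(1, n_samples + 1)
    }


# Values used as examples in the text: k = 8 hours, r = 48 hours, l = 6 readings.
plan = schedule(n_samples=3, k=8, r=48, l=6)
for sample, hours in plan.items():
    print(sample, hours)
```

With k equal to r/l, as in the example values, the i-th reading of one sample falls at the same clock hour as the first reading of a later sample, which is what makes the sums across samples used later convenient to compute.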
The null hypothesis is that the process is in control, that there is no time dependence, and that each of the components x(i,j), i=1,2,…,l (here l=6) and j=1,2,…, follows a Poisson distribution with parameter λl. • Thus, at each time point u the sums of the form y(u)=x(1,u)+x(2,u-1)+…+x(l,u-l+1) are also Poisson random variables, with parameter l∙λl, assuming that the terms, which come from different samples, are independent. Sampling Scheme
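A minimal sketch of how the sums y(u) could be formed from recorded counts is given below. The data layout (a list of samples, each holding its l readings in test order) and the function name are assumptions for illustration; the indexing follows y(u)=x(1,u)+x(2,u-1)+…+x(l,u-l+1), with the first index denoting the test and the second the sample.

```python
def diagonal_sums(x):
    """Compute y(u) = x(1,u) + x(2,u-1) + ... + x(l,u-l+1).

    x[j - 1][i - 1] holds x(i, j): the count from the i-th reading of sample j.
    Only time points u >= l are returned, since earlier sums lack some terms.
    """
    n = len(x)
    l = len(x[0])
    sums = {}
    for u in range(l, n + 1):
        # Term i comes from sample u - i + 1 at its i-th reading.
        sums[u] = sum(x[u - i][i - 1] for i in range(1, l + 1))
    return sums


# Hypothetical counts for 4 samples with l = 3 readings each.
counts = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
]
print(diagonal_sums(counts))
```

Each count enters exactly one sum, so successive values of y(u) do not share terms.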
Thus, when we are interested in only one type of bacterium, we may apply a univariate Shewhart type control chart to the statistic y(u)=x(1,u)+x(2,u-1)+…+x(l,u-l+1) for u=1,2,…. Univariate Control Chart
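As a rough sketch of such a chart, the upper control limit can be taken as a high quantile of the in-control Poisson distribution of y(u). The one-sided false alarm probability alpha = 0.0027 and the in-control mean used below are assumptions chosen for illustration, not values from the study; only an upper limit is computed because the concern is an increase in counts.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability P(Y = k) for mean mu > 0, computed on the log scale."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def poisson_ucl(mu, alpha=0.0027, k_max=100000):
    """Smallest integer c such that P(Y <= c) >= 1 - alpha."""
    cdf = 0.0
    for c in range(k_max + 1):
        cdf += poisson_pmf(c, mu)
        if cdf >= 1.0 - alpha:
            return c
    raise ValueError("quantile not reached; increase k_max")


def signals(y_values, ucl):
    """1-based positions of the plotted points that exceed the upper limit."""
    return [u for u, y in enumerate(y_values, start=1) if y > ucl]


# Hypothetical in-control setting: l = 6 readings, each with mean 2.0.
mu = 6 * 2.0
ucl = poisson_ucl(mu)
print(ucl, signals([10, 12, ucl + 1, 9], ucl))
```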
But what happens when we have more than one type of bacterium? Say, for example, 2. • In that case, we may apply the same technique to both types of bacterium. • Thus, we end up with two sums of the form • y(1,u)=x(1,u)+x(2,u-1)+…+x(l,u-l+1) for u=1,2,…, computed from the counts of the first type • y(2,u)=x(1,u)+x(2,u-1)+…+x(l,u-l+1) for u=1,2,…, computed from the counts of the second type • In most cases the two variables will be dependent, since the presence of a contaminating factor will trigger a chain reaction in the evolution of both types of bacterium. • In that case, we define the two dimensional random variable y=(y1,y2), which follows a two dimensional Poisson distribution with parameters λ1, λ2, and λ3. The Bivariate Case
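The two dimensional random variable y=(y1,y2) has the following probability function: $P(Y_1=y_1, Y_2=y_2) = e^{-(\lambda_1+\lambda_2+\lambda_3)} \frac{\lambda_1^{y_1}}{y_1!} \frac{\lambda_2^{y_2}}{y_2!} \sum_{k=0}^{\min(y_1,y_2)} \binom{y_1}{k} \binom{y_2}{k} k! \left(\frac{\lambda_3}{\lambda_1 \lambda_2}\right)^{k}$ • This bivariate setting is based on the joint distribution of the variables Y1, Y2, where in general Y1=Z1+Z3 and Y2=Z2+Z3, and Z1, Z2, Z3 are mutually independent Poisson random variables with means λ1, λ2 and λ3, respectively. The Bivariate Case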
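The construction Y1=Z1+Z3, Y2=Z2+Z3 gives a direct way to evaluate the joint probability function: sum over the possible values k of the common component Z3. The sketch below does exactly that; the function names and the parameter values are illustrative assumptions, and the factorial-based formula is intended for moderate counts only.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability P(Z = k) for mean lam >= 0 (moderate k)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def bivariate_poisson_pmf(y1, y2, lam1, lam2, lam3):
    """P(Y1 = y1, Y2 = y2) with Y1 = Z1 + Z3 and Y2 = Z2 + Z3.

    Conditions on the shared component Z3 = k, which can be at most min(y1, y2).
    """
    return sum(
        poisson_pmf(y1 - k, lam1) * poisson_pmf(y2 - k, lam2) * poisson_pmf(k, lam3)
        for k in range(min(y1, y2) + 1)
    )


# Hypothetical parameters: the shared mean lam3 is also the covariance of Y1 and Y2.
lam1, lam2, lam3 = 2.0, 3.0, 1.0
grid = range(41)
mean1 = sum(a * bivariate_poisson_pmf(a, b, lam1, lam2, lam3) for a in grid for b in grid)
print(round(mean1, 6))  # marginal mean of Y1 is lam1 + lam3
```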
The next step in our methodology is to identify the variable that will be used for monitoring the bivariate process. • A fact that motivates the selection of this variable is that the number of bacteria can only increase. Therefore, we are interested in a variable that can quickly detect this possible increase. • A straightforward selection is the sum of the two random variables Y1 and Y2, which is the sum of two dependent Poisson variables, say Y. • This random variable identifies an increase in the mean of either Y1 or Y2. • The random variable Y follows a Hermite distribution (see Johnson, Kotz and Kemp (1992), pages 357-364) with probability function $P(Y=n) = e^{-(\lambda_1+\lambda_2+\lambda_3)} \sum_{j=0}^{\lfloor n/2 \rfloor} \frac{(\lambda_1+\lambda_2)^{n-2j} \lambda_3^{j}}{(n-2j)!\, j!}$, n=0,1,2,… The Bivariate Case
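Since Y=Y1+Y2=Z1+Z2+2Z3, the probability function of Y can be evaluated by conditioning on Z3 and using the fact that Z1+Z2 is Poisson with mean λ1+λ2. The following sketch implements that sum; the function name and parameter values are assumptions for illustration.

```python
import math


def hermite_pmf(n, lam1, lam2, lam3):
    """P(Y = n) for Y = Z1 + Z2 + 2*Z3 with independent Poisson Z1, Z2, Z3.

    Conditions on Z3 = j (contributing 2j), leaving n - 2j for Z1 + Z2,
    which is Poisson with mean lam1 + lam2. Intended for moderate n.
    """
    a1 = lam1 + lam2
    total = sum(
        a1 ** (n - 2 * j) * lam3 ** j / (math.factorial(n - 2 * j) * math.factorial(j))
        for j in range(n // 2 + 1)
    )
    return math.exp(-(a1 + lam3)) * total


# Hypothetical parameters; the mean of Y is lam1 + lam2 + 2*lam3.
probs = [hermite_pmf(n, 2.0, 3.0, 1.0) for n in range(81)]
print(round(sum(n * p for n, p in enumerate(probs)), 6))
```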
Consequently, for the identification of an out-of-control situation we may construct a Shewhart type control chart with limits calculated using the Hermite distribution (see Montgomery (2008)). • This chart detects a possible increase in the mean of either of the two variables. Based on 1000 repetitions. The Bivariate Case
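One way to obtain such a limit numerically is to tabulate the Hermite probabilities and take a high quantile as the upper control limit. The sketch below uses the standard recurrence (n+1)·P(n+1) = a1·P(n) + 2·a2·P(n-1), with a1=λ1+λ2 and a2=λ3, which follows from the probability generating function of Z1+Z2+2Z3. The false alarm probability alpha = 0.0027 and the parameter values are assumptions for illustration, not values reported by the authors.

```python
import math


def hermite_pmf_table(a1, a2, n_max):
    """Probabilities P(Y = 0), ..., P(Y = n_max) for Y = (Z1 + Z2) + 2*Z3,
    where a1 = lam1 + lam2 and a2 = lam3, via the Hermite recurrence."""
    p = [math.exp(-(a1 + a2))]
    for n in range(n_max):
        previous = p[n - 1] if n >= 1 else 0.0
        p.append((a1 * p[n] + 2.0 * a2 * previous) / (n + 1))
    return p


def hermite_ucl(lam1, lam2, lam3, alpha=0.0027, n_max=500):
    """Smallest integer c such that P(Y <= c) >= 1 - alpha."""
    cdf = 0.0
    for c, prob in enumerate(hermite_pmf_table(lam1 + lam2, lam3, n_max)):
        cdf += prob
        if cdf >= 1.0 - alpha:
            return c
    raise ValueError("quantile not reached; increase n_max")


# Hypothetical in-control parameters.
ucl = hermite_ucl(2.0, 3.0, 1.0)
observed_sums = [6, 9, 7, ucl + 2]
print(ucl, [y > ucl for y in observed_sums])
```

A point is flagged only when the observed sum Y1+Y2 exceeds the limit, in line with the one-sided interest in increases.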
The next step required by the nature of the problem is to see what happens after an out-of-control signal is given. • A method to identify the responsible variable is needed. • In order to identify the responsible variable after a signal we have to properly select a random variable that will help us in this direction. • Such a random variable is the difference of the two random variables Y1 and Y2, say Y’. • From the definition of the bivariate Poisson distribution we deduce that Y’=Y1-Y2=Z1-Z2, is the difference of two independent Poisson r.v. The Bivariate Case
Since we use Y’ after a signal is issued, we expect to see one of the following results: • a positive value of Y’, meaning that we have an increase in Z1. • a negative value of Y’, meaning that we have an increase in Z2. • a value of Y’ close to zero, meaning that both Z1 and Z2 have shifted. • Therefore, the use of Y’ assures us that we will be able to identify the responsible variable in most cases. The probability distribution of Y’ is known; it is given in Johnson, Kotz and Kemp (1992), pages 190-192, and is of the form $P(Y'=k) = e^{-(\lambda_1+\lambda_2)} \left(\frac{\lambda_1}{\lambda_2}\right)^{k/2} I_{|k|}\left(2\sqrt{\lambda_1 \lambda_2}\right)$, k=…,-1,0,1,…, where I denotes the modified Bessel function of the first kind. The Bivariate Case
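Because Y’=Z1-Z2 is the difference of two independent Poisson variables, its probability function can also be evaluated without Bessel functions, by summing over the value of Z2. The sketch below takes this route using only the standard library; the truncation at 200 terms, the function names and the parameter values are assumptions for illustration, and both means are assumed strictly positive.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability for mean lam > 0, computed on the log scale."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def difference_pmf(k, lam1, lam2, terms=200):
    """P(Z1 - Z2 = k) for independent Poisson Z1, Z2 (truncated sum).

    Sums P(Z1 = j + k) * P(Z2 = j) over j, starting where j + k >= 0.
    """
    start = max(0, -k)
    return sum(
        poisson_pmf(j + k, lam1) * poisson_pmf(j, lam2)
        for j in range(start, start + terms)
    )


# Hypothetical means; the expected difference is lam1 - lam2.
support = range(-60, 61)
mean_diff = sum(k * difference_pmf(k, 4.0, 2.5) for k in support)
print(round(mean_diff, 6))
```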
Thus, we may use the distribution of Y’ in order to define a formal procedure for identifying the out-of-control variable. • Specifically, if the value of Y’ is above the 95% percentage point of its theoretical distribution, then the responsible variable is Y1; if the value of Y’ is below the 5% percentage point of its theoretical distribution, then Y2 is the responsible variable; and if the value of Y’ is between the 5% and 95% percentage points of its theoretical distribution, then both variables have shifted. The Bivariate Case
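The identification rule can be sketched as follows: compute the 5% and 95% percentage points of the in-control distribution of Y’=Z1-Z2 and compare the observed difference with them. The function names, the search range of ±100, the handling of values exactly on a percentage point (treated as "between") and the parameter values are all assumptions made for this illustration.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability for mean lam > 0, computed on the log scale."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def difference_pmf(k, lam1, lam2, terms=200):
    """P(Z1 - Z2 = k) for independent Poisson Z1, Z2 (truncated sum)."""
    start = max(0, -k)
    return sum(
        poisson_pmf(j + k, lam1) * poisson_pmf(j, lam2)
        for j in range(start, start + terms)
    )


def difference_quantile(q, lam1, lam2, low=-100, high=100):
    """Smallest k in [low, high] with P(Z1 - Z2 <= k) >= q."""
    cdf = 0.0
    for k in range(low, high + 1):
        cdf += difference_pmf(k, lam1, lam2)
        if cdf >= q:
            return k
    raise ValueError("quantile not reached; widen the search range")


def responsible_variable(y1, y2, lam1, lam2):
    """Apply the 5% / 95% rule to the observed difference y1 - y2."""
    d = y1 - y2
    if d > difference_quantile(0.95, lam1, lam2):
        return "Y1"
    if d < difference_quantile(0.05, lam1, lam2):
        return "Y2"
    return "both"


# Hypothetical in-control means for Z1 and Z2.
print(responsible_variable(20, 5, 5.0, 5.0))
print(responsible_variable(5, 20, 5.0, 5.0))
print(responsible_variable(6, 5, 5.0, 5.0))
```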
Based on 1000 repetitions. Correct Identification Rates
Tokatli, F., Cinar, A. and Schlesser, J.E. (2005). HACCP with multivariate process monitoring and fault diagnosis techniques: application to a food pasteurization process, Food Control, 16, 411–422. • Johnson, N.L., Kotz, S. and Kemp, A.W. (1992). Univariate Discrete Distributions, Wiley, New York. • Montgomery, D.C. (2008). Introduction to Statistical Quality Control, Wiley, New York. References