This topic explores the accuracy aspects in statistical monitoring of fisheries, including population studies, generation of populations, sample-based catch/effort surveys, and estimation processes.
Accuracy aspects in fisheries statistical monitoring By Constantine Stamatopoulos, PhD Senior Advisor – Fisheries Statistics
Topic One About STATISTICS • Statistics is a NATURAL SCIENCE dealing with RISKS in stating conclusions and/or taking actions • It uses mathematics but it is not a mathematical science • There are statistical phenomena that are fundamental but cannot be proved mathematically
Topic One POPULATIONS • Statistics is applied to POPULATIONS that are sets of elements • Elements can be anything: such as results of experiments, people, fish, livestock, etc. • Knowing the population under study is the first and most important factor in statistical operations
Topic One Examples of fishery populations • Landings occurred during a month by a specific boat/gear category • Number of days at sea during a month • State of activity (0=not active, 1=active) of all boats during a month • Sex of fish in a stock (practically infinite) • Length of fish (practically infinite) • Weight of fish (practically infinite)
Topic One Generation of populations • A specific factor (e.g. fishing) generates EVENTS • Events generate POPULATIONS depending on the way we look at them Examples • Fishing during November 2010 is an EVENT • If we observe daily landings then these landings are the population under study • If we observe monthly landings declared by fishermen this is a different population • Irrespective of how we determine our populations our objective is to calculate the same variable (total landings)
Topic One - Example TWO POPULATIONS
EVENT: Fishing during February 2011
Looking at the 10 boats that fished during the month (landings in kg):
Boat   Landings
1      1,500
2      2,000
3      8,000
4      1,300
5      500
6      12,400
7      2,556
8      2,600
9      987
10     8,757
TOT: 40,600 kg
Looking at total daily landings (in kg):
Day    Landings
1      100
2      200
3      300
. . .
28     2,800
TOT: 40,600 kg
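As a quick check of the example, the sketch below adds up both populations. The boat figures are those listed on the slide. The daily figures are an assumption: only days 1, 2, 3 and 28 are shown, and the omitted days are taken to follow the same pattern (100 kg times the day number), which is consistent with the stated total.

```python
# Landings (kg) of the 10 boats, as listed in the example.
boat_landings = [1500, 2000, 8000, 1300, 500, 12400, 2556, 2600, 987, 8757]

# Daily landings (kg). ASSUMPTION: the days not shown on the slide follow
# the visible pattern of 100 kg x day number (days 1 to 28).
daily_landings = [100 * day for day in range(1, 29)]

total_by_boat = sum(boat_landings)
total_by_day = sum(daily_landings)

# Two different populations generated by the same event give the same variable.
print("Total from boat population:", total_by_boat, "kg")
print("Total from daily population:", total_by_day, "kg")
```

Both sums should agree with the 40,600 kg total shown for the event.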
Topic One Reasons for implementing sample-based catch/effort surveys • Small-scale fisheries may involve thousands of fishing units scattered along the coastline • It is generally accepted that complete enumeration of all fishing operations (monitoring through census in space and time) is impractical
Topic One Spatial and temporal considerations - Home ports are distinguished from landing sites. - Fishing effort is measured at HOME PORTS. - Catch details are collected at LANDING SITES. - But a place can be a HOMEPORT as well as a LANDING SITE. - Typical components of an estimation context: A calendar month, a geographical stratum and a homogeneous group of boats and gears and/or fishing methods (Operational Unit – OU).
Topic Two Generic approaches for estimating catch and fishing effort
Topic Two Generic process within an ESTIMATION CONTEXT
MONTH – STRATUM – OPERATIONAL UNIT
Estimated catch = CPUE x Prob. Boat Active (PBA) x Temporal extrapolation factors x Spatial extrapolation factors
APPROACH REPEATED FOR EACH CONTEXT
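The generic process can be sketched as a single multiplication per estimation context. This is only an illustration under stated assumptions: CPUE is taken as catch (kg) per boat-day, the temporal factor as the number of fishing days in the month, and the spatial factor as the number of boats. The function name and all input values are hypothetical.

```python
def estimate_catch(cpue, pba, active_days, boats):
    """Estimated catch for one context (month, stratum, operational unit).

    ASSUMPTIONS (not taken from the slides): cpue is in kg per boat-day,
    active_days is the temporal extrapolation factor and boats is the
    spatial extrapolation factor.
    """
    # Estimated effort in boat-days: probability of activity raised in
    # time (days) and in space (boats).
    effort = pba * active_days * boats
    return cpue * effort


# Hypothetical context: 100 boats, 28 fishing days, half of the boats
# active on an average day, 50 kg landed per boat-day.
catch = estimate_catch(cpue=50.0, pba=0.5, active_days=28, boats=100)
print("Estimated catch (kg):", catch)
```

The same call would be repeated for each month, stratum and operational unit, and the results added up as needed.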
Topic Two Generic process for estimating catch Data acquisition Landings survey (CPUE): Catch details are sampled, together with prices and average fish size by species. These are the BASIC catch variables of a system for traditional fisheries.
Topic Two Generic process for estimating catch Effort survey (Prob. Boat Active, PBA): There are alternative ways for determining the boat activity level. For operational reasons and within the same data collection programme, there may exist different data schemes for different operational units.
Topic Two Two possible ways for estimating the Prob. Boat Active (PBA) • Sum of active boats divided by the sum of examined boats (“boat” or “vertical” method) • Sum of active days divided by the sum of examined days (“day” or “horizontal” method)
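The two ratios can be sketched as follows. The record layout (pairs of examined and active counts) and the sample figures are hypothetical and chosen only to illustrate the arithmetic; they are not the data scheme of any particular system.

```python
def pba_boat_method(visits):
    """'Boat' or 'vertical' method.

    visits: list of (examined_boats, active_boats) pairs, one pair per
    sampling visit (hypothetical layout).
    """
    examined = sum(e for e, _ in visits)
    active = sum(a for _, a in visits)
    return active / examined


def pba_day_method(boats):
    """'Day' or 'horizontal' method.

    boats: list of (examined_days, active_days) pairs, one pair per
    sampled boat (hypothetical layout).
    """
    examined = sum(e for e, _ in boats)
    active = sum(a for _, a in boats)
    return active / examined


# Hypothetical samples.
visits = [(10, 6), (8, 4), (12, 9)]      # 30 boats examined, 19 active
boat_months = [(30, 18), (30, 12)]       # 60 days examined, 30 active

print("PBA, boat method:", round(pba_boat_method(visits), 3))
print("PBA, day method:", round(pba_day_method(boat_months), 3))
```

Both functions divide a sum by a sum, rather than averaging per-visit or per-boat ratios, which is what the two definitions describe.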
Topic Two Adaptation of Prob. Boat Active (PBA) estimation to ANY data scheme only needs a 4-column table
Topic Two Generic process for estimating catch
Temporal extrapolation factors: Usually they express the number of calendar days. Calendar days may be reduced due to bad weather, holidays, closed season, etc.
Spatial extrapolation factors: They express the number of boats by homeport and operational unit. This information is obtained from boat surveys or from vessel databases.
Topic Two Notes on extrapolating the PBA • Spatial extrapolation factors express the number of boats by homeport and operational unit. This information is obtained from boat surveys or from vessel databases. • Seasonal changes of homeports and/or gears should be handled by a special utility (see FLOUCA 1) to dynamically create sampling frames that are synchronized with catch/effort surveys. • A boat may be categorized under more than one OU due to seasonal or concurrent use of different gear. In such a case THE SUM OF ALL FISHING UNITS in OUs will be >= the total number of vessels (“virtual” fleet). This does not create double counting since each estimation is done within its own estimation context.
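The “virtual” fleet idea can be illustrated with a small count. The vessel identifiers and operational unit names below are hypothetical; the point is only that fishing units are counted per OU, so a vessel using two gears contributes one unit to each.

```python
from collections import Counter

# Hypothetical vessel records: vessel id -> operational units (OUs) in
# which the vessel is categorized.
vessel_ous = {
    "V1": {"gillnet"},
    "V2": {"gillnet", "longline"},   # uses two gears
    "V3": {"longline"},
}

# Fishing units per OU: the spatial extrapolation factor of each context.
units_per_ou = Counter(ou for ous in vessel_ous.values() for ou in ous)

virtual_fleet = sum(units_per_ou.values())   # sum of all fishing units
physical_fleet = len(vessel_ous)             # total number of vessels

print(dict(units_per_ou))
print("Virtual fleet:", virtual_fleet, ">= vessels:", physical_fleet)
```

Because each estimation uses only the count of its own OU, the vessel listed under two OUs is never counted twice within a single context.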
Topic Two Typical functions of a sample-based fisheries statistical set of utilities • Maintaining statistical standards and classifications (species, gears, strata, etc.) • Reports on primary data • Reports on estimated data • Integration of data from various fisheries • Data diffusion • Inputting of sample data on production and effort • Editing and storage services for primary sample data • Estimation of catch and effort • Maintaining vessel records or sampling frames
Topic Two This section suggests that: • A computer system or an equivalent integrated set of utilities used in sample-based fishery surveys should be flexible enough to adapt to any scheme employed in the field for collecting the data. • If the sampling frames (i.e. effort extrapolation factors) are to be automatically created from vessel databases, then the generating approach should take into consideration any seasonality of home ports and/or fishing gears.
The concept of estimating population variables by means of sampling By counting the numbers of small and large circles in a limited area, we can estimate their total number by raising to the total area of the rectangle
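The raising step can be written out as simple arithmetic. The areas and counts below are hypothetical, since the slide shows the idea graphically without figures.

```python
def raise_count(sample_count, sample_area, total_area):
    """Raise a count observed in a sampled area to the total area."""
    raising_factor = total_area / sample_area
    return sample_count * raising_factor


# Hypothetical figures: 6 small and 3 large circles counted in a sampled
# area of 4 units, within a rectangle of 40 units.
small_total = raise_count(6, sample_area=4, total_area=40)
large_total = raise_count(3, sample_area=4, total_area=40)

print("Estimated small circles:", small_total)
print("Estimated large circles:", large_total)
```

The estimate is only as good as the sampled area is representative of the whole rectangle.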
The concept of estimating population variables by means of sampling When we sample we cannot see the entire population. We see only a small proportion of it. It is important that we make every effort to ensure that this small part is REPRESENTATIVE. Otherwise we run the risk of biased estimates. (Figure panels: Biased sample, Representative sample, Biased sample)
Biased estimates The term “bias” implies a systematic sampling error, so that estimates are ALWAYS higher or ALWAYS lower than the real population value The problem is that we DO NOT KNOW IT
Topic Three Sampling vs. Gambling It is at times advocated that one or two samples can do the job better than 100 samples and that it is thus just a matter of chance to obtain good estimates. The above statement is only partially true. It omits the fact that the RISK of obtaining bad estimates with few samples is far bigger than that of working with large samples. “Lucky” users who rely on small samples can get away with it once or twice. But in regularly conducted sampling surveys they are, in the end, bound to be losers.
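The difference in risk can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is the set of 28 daily landings from the February 2011 example, with the days not shown assumed to follow the pattern 100 kg times the day number; the sample sizes, the 10% tolerance, the number of trials and the random seed are arbitrary choices.

```python
import random

# ASSUMPTION: daily landings (kg) for days 1..28 follow 100 kg x day number.
population = [100 * day for day in range(1, 29)]
true_total = sum(population)


def share_of_good_estimates(sample_size, trials=2000, tolerance=0.10, seed=1):
    """Share of trials whose raised total falls within the tolerance."""
    rng = random.Random(seed)
    good = 0
    for _ in range(trials):
        # Simple random sample of days, without replacement.
        sample = rng.sample(population, sample_size)
        # Raise the sample mean to the full month.
        estimate = sum(sample) / sample_size * len(population)
        if abs(estimate - true_total) <= tolerance * true_total:
            good += 1
    return good / trials


few = share_of_good_estimates(2)
many = share_of_good_estimates(14)
print("Share of good estimates with 2 sampled days:", few)
print("Share of good estimates with 14 sampled days:", many)
```

A small sample does hit the target now and then, but far less often than the larger one, which is the point made about regularly conducted surveys.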
Topic Three ACCURACY – Why an issue? Latest statistical policies in the EU, FAO and regional bodies require that: ALL ESTIMATES ARE SHOWN ALONG WITH INDICES OF RELIABILITY
Topic Three OVERALL ACCURACY AND ITS COMPONENTS • SPATIAL accuracy, i.e. sufficient sample size for a given level, such as 90%. • TEMPORAL accuracy, i.e. frequency of sampling for a given level, such as 90%. • UNIFORMITY of samples, so that a sufficient number of them is evenly distributed over sufficient days. In other words…
Topic Three OVERALL ACCURACY AND ITS COMPONENTS We want something like this…
Topic Three And nothing like this…
Topic Three Or this…
Topic Three OVERALL ACCURACY AND ITS COMPONENTS There are simple arithmetic methods for computing the SPATIAL accuracy, the TEMPORAL accuracy and the UNIFORMITY of samples. The resulting OVERALL ACCURACY should be part of the reported data, so that both data producers and data users can assess the reliability of estimates. (see following two examples…)
Topic Three ACCURACY IN 0-1 (or NO-YES) POPULATIONS • Variables of the 0-1 type (such as boat activity status, sex, etc.) require about three times more samples for a given level of accuracy. • In large populations an accuracy of 90% requires 96 Y/N samples but only 32 samples of the other types. • Likewise an accuracy of 95% requires 384 Y/N samples and only 128 samples of the other types.
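The slides do not state how these sample sizes are derived. As a hedged illustration, the commonly used normal-approximation formula for a proportion in a large population, n = z² p(1 − p) / e², reproduces the Y/N figures if one assumes a 95% confidence level (z = 1.96), the worst case p = 0.5, and an error margin e equal to one minus the accuracy level. These assumptions are not confirmed by the source. The figure for the other variable types is simply taken as one third, as stated in the bullets above.

```python
def yes_no_sample_size(accuracy, z=1.96, p=0.5):
    """Approximate sample size for a 0-1 variable in a large population.

    ASSUMPTIONS (not from the slides): normal approximation, 95%
    confidence (z = 1.96), worst-case proportion p = 0.5 and an error
    margin of 1 - accuracy.
    """
    margin = 1.0 - accuracy
    return round(z * z * p * (1.0 - p) / margin ** 2)


def other_sample_size(accuracy):
    """About one third of the Y/N requirement, as stated on the slide."""
    return round(yes_no_sample_size(accuracy) / 3)


for level in (0.90, 0.95):
    print(level, yes_no_sample_size(level), other_sample_size(level))
```

Under these assumptions the function returns 96 and 384 Y/N samples for 90% and 95% accuracy, and 32 and 128 for the other types.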
Topic Three ACCURACY IN 0-1 (or NO-YES) POPULATIONS • When questionnaires contain Y/N elements that are mixed with variables of other types then, for a given level of accuracy, the amount of data to be collected is dictated by the Y/N variables, since they require the most samples
Topic Three This section suggests that: An effective connecting link of reliability among different populations is the sampling accuracy. An effective sample-based fishery survey programme should monitor the accuracy and accompany reported estimated variables with accuracy indicators for the benefit of both data producers and data users. Insufficient sample size and/or limited temporal coverage, when exercised regularly, will certainly result in spurious results.
Topic Four Understanding the target populations and statistical units
Topic Four Understanding the target populations and statistical units Target populations are determined by assuming that a hypothetical census that represents the limit of a given sampling scheme has been conducted. Statistical units are the contents of the hypothetical datasets of answers. Alternative sampling schemes for the same variable (e.g. effort PBA) will result in different populations of different sizes. As a result, each alternative sampling scheme has its own sampling requirements for achieving a commonly accepted level of minimum accuracy.
Topic Four Statistical trade-off Assuming that alternative data collection schemes are being examined for the same variable (here the PBA is taken as example), then the more “synthetic” the census questions-answers, the fewer samples are needed for a given accuracy level. Conversely, the simpler the census questions-answers, involving little or zero measurement, the more samples are needed for a given accuracy level.
Topic Four Statistical trade-off for a minimum level of 90% accuracy of PBA • Monthly effort survey: Few samples needed. “Synthetic knowledge” (no. of days worked separately for each gear): risk of measurement error • Daily “boat” survey: Several samples needed. 0-1 or Yes-No answers: little risk of measurement error
Topic Four This section suggests that: • Selection of a data scheme among alternative solutions has direct implications on the type and size of the target population to be surveyed and therefore on the sampling requirements for a given level of accuracy. • Data collection schemes requiring fewer samples are not necessarily preferable to those requiring larger samples, since they might involve “synthetic questioning/answering” that increases the risk of measurement error.