C-POD: How it works
Nick Tregenza, Feb 2011
Clicks made by dolphins are not very distinctive sounds, and even ‘typical’ porpoise clicks can come from something else. Their most distinctive feature is their loudness at source, but sensitive static loggers must be able to work with weak clicks from very distant animals without knowledge of the distance to the animal. C-PODs work by collecting enough information on each click to provide a basis for the detection of more or less regular trains of similar clicks. Trains contain a lot more information than clicks and are a lot more distinctive. The key to the performance of the C-POD is not click detection or click selection, but train detection and classification – the ‘train filter’. Click selection is deliberately broad – large numbers of somewhat dolphin-like clicks are logged and the train filter finds the cetacean click trains. This presentation describes the train filter, says something on species identification, and briefly addresses the confused arguments around ‘black boxes’ that have been raised.
Many thanks to all contributors of C-POD data files. They were essential to the development process. The ‘KERNO’ classifier in CPOD.exe identifies click trains in CP1 files and produces a CP3 file.
It is helpful to keep all files in one, or a few, large directories as you cannot select across directories for this purpose, or for useful export functions that can also process batches of files. Click this button and select the batch of files you want processed. Batches totalling several GB may take many hours and can be safely run overnight. The ‘GENENC’ encounter classifier is also run unless you selected a different encounter classifier … but you probably don’t need either and don’t have to get to know them! If you have files from some old V0 PODs and also V1 files go to the ‘view+’ page and click here. This sets a higher threshold for V1 PODs to reduce the very slight difference that exists. T-POD users: TPOD.exe .pdc files correspond to .CP1, but .pdt files do not correspond to CP3 files, as .pdt files contained all the raw data, while CP3 files do not. So CP1 files must be kept.
Display during train detection:
• Total number of CP1 clicks processed.
• Detections colour coded by frequency. The longer scale marks are 1 day apart. To avoid slowing the process, the display does not include all minutes.
• Frequency distribution of clicks in a sample of minutes.
• Distribution of click frequencies in trains found in the minute on show.
Definition & role
Trains are more or less regular sequences of similar elements. Most cetacean clicks are produced in trains. Recognising trains is a powerful tool for identifying the presence of cetaceans.
Train sources
Odontocetes (toothed cetaceans): all odontocetes use echolocation and produce click trains for this purpose. Sperm whale clicks are too low in pitch for detection by the C-POD, but all other species are likely to be detectable.
Boat sonars: these produce trains with a repeated duty cycle that may contain several different inter-pulse intervals. Shorter cycles are used in shallower water.
Chance trains: unrelated sources, like rain or clicking shrimps, often produce regular trains by chance.
WUTS: weak unknown train sources exist in some water bodies. Their identity is not known, but small crustaceans that may colonise the transducer housing surface are on the list of suspects.
Chance trains
Rejection of ‘chance trains’ is based on a probability model of a train. The model computes a cumulative p-value for a train as a chance event. It uses the prevailing rate of arrival of clicks to derive the probability of there being no click in each successive time slot, where each slot is defined by the current inter-click interval and the train regularity. The product of these p-values is a measure of the improbability of the train having arisen by chance from random sources. This model is dynamically modified to cope with the uncertainty in the ‘rate of arrival of clicks’ and the effects of multi-path propagation. There are serious limitations to this model, but it remains of practical use.
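The idea of the chance-train model can be sketched in a few lines. This is an illustration only, not the KERNO implementation: it assumes background clicks arrive as a Poisson process, and the function name, the `tolerance` parameter and its default value are hypothetical. It also omits the dynamic adjustments for uncertain arrival rate and multi-path propagation mentioned above.

```python
import math

def chance_train_probability(click_times, background_rate, tolerance=0.2):
    """Rough probability that an already-detected train arose by chance.

    click_times     -- sorted click arrival times in seconds
    background_rate -- prevailing arrival rate of clicks (clicks per second)
    tolerance       -- hypothetical regularity parameter: each time slot
                       spans +/- tolerance * the current inter-click interval
    Assumes Poisson arrivals of background clicks (an assumption of this
    sketch, not a documented property of the C-POD train filter).
    """
    if len(click_times) < 3:
        raise ValueError("need at least three clicks to define a train")
    p_train = 1.0
    for i in range(2, len(click_times)):
        # The current inter-click interval sets where the next click is expected.
        ici = click_times[i - 1] - click_times[i - 2]
        slot_width = 2.0 * tolerance * ici
        # Probability that random clicks leave this slot empty ...
        p_no_click = math.exp(-background_rate * slot_width)
        # ... so a random click fills the slot with probability 1 - p_no_click.
        p_train *= 1.0 - p_no_click
    return p_train
```

With these assumptions, a five-click train at 10 clicks/s against a background of 5 clicks/s gives a value of about 0.006. Longer trains and quieter backgrounds give smaller values, which is why trains are so much more distinctive than single clicks.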
Train classification: quality
Trains are first tested to see if they fall within the parameter space occupied by chance trains from noise sources like rain, crustaceans, moving sediment or pebbles, or propeller cavitation. The coherence of the train is a key element in this assessment. The classification threshold is probabilistic, and different levels of confidence are appropriate to different studies. So four levels of confidence (Quality) are provided: Hi, Mod, Lo, and ? (Doubtful). These are the Q values. They correspond to different detection thresholds set on a ROC curve. For most users Hi + Mod is a suitably reliable set and corresponds to ‘Cet All’ in T-POD data. For specifically behavioural studies, e.g. behaviour around fishing gear, it would be worth evaluating the use of lower-quality trains if they cluster in time with higher-quality trains; that is a simple test of whether they have the same source. For studies that depend on ICIs, a filter is provided on the file page that excludes trains in which the ICIs may be inaccurate, even though the train is highly unlikely to be a chance train.
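The clustering test suggested above could be run on exported train times along these lines. This is a minimal sketch: the function name and the 10-minute window are hypothetical choices, and the result should be compared with the fraction expected if the lower-quality trains were scattered at random through the deployment.

```python
from bisect import bisect_left

def fraction_near_high_quality(low_q_times, high_q_times, window_s=600.0):
    """Fraction of lower-quality trains within window_s of a Hi/Mod train.

    low_q_times  -- start times (seconds) of Lo or Doubtful trains
    high_q_times -- start times (seconds) of Hi or Mod trains
    window_s     -- hypothetical clustering window; choose to suit the study
    """
    if not low_q_times:
        return 0.0
    high_sorted = sorted(high_q_times)
    near = 0
    for t in low_q_times:
        i = bisect_left(high_sorted, t)
        # Only the Hi/Mod trains just before and just after t can be nearest.
        neighbours = high_sorted[max(i - 1, 0):i + 1]
        if any(abs(t - h) <= window_s for h in neighbours):
            near += 1
    return near / len(low_q_times)
```

A fraction well above the random expectation suggests that the lower-quality trains share a source with the higher-quality ones.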
Train filter: black boxes and limits
• The train filter is effectively a ‘black box’ and has to be developed as such because of its complexity. Black boxes are systems which respond to an input with an output that cannot be accurately predicted from what is known of the workings. They are characterised instead by their transfer function. Examples include commercial hydrophones, most electronic lab instruments, the human brain, neural networks unless very simple, most or all complex pattern recognition algorithms, anything that requires ‘ground truthing’, etc.
• Black boxes are used where the transfer function shows sufficient accuracy for the task in hand, e.g. hydrophone manufacturers do not disclose the key dimensions and materials but do release detailed transfer functions. These are more accurate than a ‘transparent box’ prediction could have been.
• For many instruments where a ‘transparent box’ prediction is impossible, it is useful to identify, from the kind of mechanism in use, the kinds of deviation from a perfect relationship between input and output that may occur. For example, many ceramic hydrophones will show a locally elevated sensitivity around resonance; voltage meters and oscilloscopes will show increasing errors as source impedances become very high; humans will show learning and fatigue that change over time, and will have individual biases.
• For C-POD click detection there will be limits set by short click duration, high click bandwidth, low click intensity, varying sensitivity across the frequency range, and lowered sensitivity if ambient noise is high.
• For C-POD train detection the model in use (see above) predicts a bias against:
  • slow click-rate trains, a bias that moves to higher click rates as overall ambient click rates increase;
  • trains with rapid changes in timing or click character;
  • trains where the animal is facing away from the POD.
• Trains above 2000 clicks/s or below 1 click/s are not detected.
Train classification: species
After quality filtering, trains are classified into species groups:
NBHF: all species that produce narrow-band high-frequency clicks. This group includes all porpoises, some dolphins and Kogia simus. High frequency means above 107 kHz.
Other cet: trains apparently from cetaceans that are not NBHF. These could be narrow band or high frequency, but not both. Dolphin trains appear in this class. Their clicks are designated BBT (broad band transients).
Sonars: this includes boat sonars (depth finders and fish finders). Trains in the output of some marine pingers are also classified here, because they have very long narrowband clicks.
Unclassed: some non-NBHF trains, and nearly all trains arising by chance from ambient noise, are in this class. Dolphin clicks are not very distinctive, so their identification is less often very confident and they are commonly unclassed.
These two classifications, quality and species, are somewhat independent (orthogonal), so a train can be rated in any one category on either scale. However, the higher quality classes do generally give more accurate species classification, as these trains contain more information.
Species groups ‘NBHF’ and ‘Other’ also have a reliability index in two classes, high and low. The pattern of inter-click intervals found may be inaccurate even where the species is clear, and this also has a reliability index in two classes, high and low.
WUTS: ‘weak unknown train sources’ occur in some seas, and trains are also classed by how far they resemble WUTS. This does not imply that they are WUTS, as all the features of WUTS are also shared with cetaceans.
Train filters
To be shown, a train must fall within all selected filters.
• These are the usual filters for porpoises.
• These are the usual filters for dolphins, and use the GENENC encounter classifier.
• The Quality Hi and Mod settings are not in use because the ‘all Q’ box is checked.
• These additional filter options are on the Files page of the menu.
• The click filters cut across trains and are mainly applicable to raw data.
The next few slides show how filters apply to a file pair: the CP1 raw data and the CP3 train data. To try to understand the data, CP3 files should always be viewed with the CP1 file.
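The rule that a train must pass every selected filter can be pictured as a simple conjunction over the two classification scales. The sketch below is illustrative only: the `Train` record, its field names and `select_trains` are hypothetical and do not reflect the CP3 file format or any CPOD.exe export.

```python
from dataclasses import dataclass
from enum import Enum

class Quality(Enum):
    HI = "Hi"
    MOD = "Mod"
    LO = "Lo"
    DOUBTFUL = "?"

class Species(Enum):
    NBHF = "NBHF"
    OTHER_CET = "Other cet"
    SONAR = "Sonar"
    UNCLASSED = "Unclassed"

@dataclass(frozen=True)
class Train:
    # Hypothetical record layout, for illustration only.
    start_s: float
    duration_s: float
    quality: Quality
    species: Species

def select_trains(trains, qualities, species):
    """Keep only trains that fall within all selected filters."""
    return [t for t in trains
            if t.quality in qualities and t.species in species]

# Example: a porpoise-style selection, Hi + Mod quality and NBHF only.
trains = [
    Train(0.0, 1.0, Quality.HI, Species.NBHF),
    Train(10.0, 1.0, Quality.LO, Species.NBHF),
    Train(20.0, 1.0, Quality.MOD, Species.OTHER_CET),
    Train(30.0, 1.0, Quality.MOD, Species.NBHF),
]
porpoise_like = select_trains(trains, {Quality.HI, Quality.MOD}, {Species.NBHF})
```

Because quality and species are separate scales, each is tested independently, and checking the ‘all Q’ box corresponds to passing every `Quality` value.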
CP3: here about half of the clicks logged are classified into trains.
CP1: raw data, colour coded by frequency.
Filters in force are shown here.
Application of train data
• ‘Detection positive minutes’ (DPM) is widely used as a measure of density of animals or habitat use. A finer-grained measure, ‘time present’ (the sum of the durations of species trains), may have advantages.
• DPM has proved very useful in quantifying the range and time-course of impacts such as ramming wind turbine foundations, evaluating sites, etc.
• Analysis of the click rate within trains may be able to identify feeding or social activity.
• Behavioural analysis has been used in studies of fishery interactions (dolphins stealing from nets) and has thrown light on changing patterns of use of windfarm sites after construction.
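The two measures in the first bullet can be computed from a list of train start times and durations, as in the sketch below. The input layout and function names are assumptions for illustration, and a minute is counted as detection positive if any part of a train falls in it; other conventions are possible.

```python
def detection_positive_minutes(trains):
    """Count distinct minutes that contain any part of a train.

    trains -- iterable of (start_s, duration_s) pairs, times in seconds
              from the start of the deployment (hypothetical layout)
    """
    minutes = set()
    for start_s, duration_s in trains:
        first = int(start_s // 60)
        last = int((start_s + duration_s) // 60)
        minutes.update(range(first, last + 1))
    return len(minutes)

def time_present_s(trains):
    """Sum of train durations in seconds: the finer-grained 'time present'."""
    return sum(duration_s for _, duration_s in trains)

# Four short trains: three near the start (one crossing into minute 1)
# and one in minute 6.
example = [(5.0, 0.5), (20.0, 1.0), (59.8, 0.5), (400.0, 0.2)]
dpm = detection_positive_minutes(example)
present = time_present_s(example)
```

In this example DPM is 3 while time present is only 2.2 s, which shows how much finer-grained the second measure is.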
The end
Thanks to Klaus Lucke for the data shown.