Strategies for the Collection and Use of Quantitative Data

Norma Fowler
Section of Integrative Biology
University of Texas at Austin

Why are quantitative data not used as widely and effectively as they might be?
limited funding...
...so...
How can we obtain and use quantitative data most efficiently, given inevitable funding constraints?

Some reasons why quantitative data are not used, or are used incorrectly or inefficiently (other than funding limitations)
• data not collected
• data collected but not analyzed
• inefficient sampling and experimental designs
• statistically invalid sampling and experimental designs

data not collected: the lack of long-term monitoring
long-term monitoring
• reveals whether a management practice is working
• provides baseline data for future studies and future management
• is often relatively inexpensive
but...
• it is not as ‘glamorous’ as new initiatives
• the need often outlasts the duration of project funding, the employment of a staff member, etc.

data not analyzed
• because bookkeeping and number-crunching aren’t as much fun as field work?
• because the funding has run out?
(image from http://www.sostitle.com/)

The three most common types of important design problems seem to be
• replication
    • no replication
    • no statistically valid replication
    • inefficient replication
• randomization
    • no randomization
    • statistically invalid randomization
• number of statistical units v. number of variables

replication
An example of invalid, valid but inefficient, and more efficient designs
3 habitat types: open savanna, cluster of woody plants, along a drainage
For simplicity, assume we can only make measurements on 36 plants.

replication
statistically invalid design: only 1 plot per habitat type
valid but inefficient design: 2 plots per habitat type, 6 plants per plot

replication
statistically more efficient design: 4 plots per habitat type, 3 plants per plot
alternate approach: no plots, 12 plants per habitat type, stratified random plant locations
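A rough way to compare these designs is to count the error degrees of freedom available for the habitat comparison and to look at the variance of a habitat-type mean. The sketch below is only an illustration: it assumes a simple nested model (plants within plots within habitat types), 12 plants per habitat type in every design, and two made-up variance components (`VAR_AMONG_PLOTS`, `VAR_AMONG_PLANTS`). None of these numbers come from a real data set.

```python
def error_df(n_groups, units_per_group):
    """Error degrees of freedom for comparing groups (here, habitat types)
    when each group has the same number of independent statistical units."""
    return n_groups * (units_per_group - 1)


def variance_of_habitat_mean(plots, plants_per_plot, var_plots, var_plants):
    """Variance of one habitat-type mean under a nested design:
    plot-to-plot variance is divided by the number of plots,
    plant-to-plant variance by the total number of plants."""
    return var_plots / plots + var_plants / (plots * plants_per_plot)


# Hypothetical variance components, chosen only for illustration.
VAR_AMONG_PLOTS = 4.0
VAR_AMONG_PLANTS = 6.0

# (plots per habitat type, plants per plot); every design uses 36 plants in total.
designs = {
    "1 plot x 12 plants (invalid)": (1, 12),
    "2 plots x 6 plants (valid but inefficient)": (2, 6),
    "4 plots x 3 plants (more efficient)": (4, 3),
}

for name, (plots, plants) in designs.items():
    df = error_df(3, plots)  # 3 habitat types, plot = statistical unit
    var = variance_of_habitat_mean(plots, plants, VAR_AMONG_PLOTS, VAR_AMONG_PLANTS)
    print(f"{name}: error df = {df}, variance of habitat mean = {var:.2f}")
```

With one plot per habitat type there are no error degrees of freedom, so habitat differences cannot be separated from plot differences. Going from 2 to 4 plots per habitat type raises the error degrees of freedom and shrinks the plot-to-plot part of the variance without measuring any more plants. In the no-plot alternative the plant is the statistical unit, so the same function gives `error_df(3, 12)`.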

randomization
some really bad ideas
• ignore the whole issue and hope it goes away
• throw hoops with your eyes shut
• sample whenever there is a place to pull off the highway
• sample every 10 m
• wander around and sample at what feels like random intervals to you

randomization
Humans don’t seem to be able to generate random numbers.
Solutions
• random number tables, correctly used
• random number generators in computer packages
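As one possible illustration of the second solution, the sketch below uses Python's standard `random` module to draw sampling locations inside a rectangular area. The function name `random_points`, the area dimensions, and the seed are hypothetical choices for this example, not part of any particular field protocol.

```python
import random


def random_points(n_points, width_m, length_m, seed):
    """Draw n_points (x, y) locations uniformly at random inside a
    width_m x length_m rectangle. Recording the seed lets the same
    locations be regenerated later."""
    rng = random.Random(seed)
    return [
        (round(rng.uniform(0, width_m), 1), round(rng.uniform(0, length_m), 1))
        for _ in range(n_points)
    ]


# Example: 12 sampling locations in a hypothetical 20 m x 50 m area.
for x, y in random_points(12, 20.0, 50.0, seed=2024):
    print(f"x = {x:5.1f} m, y = {y:5.1f} m")
```

Generating the locations before going into the field, and writing down the seed, removes the temptation to adjust "random" points on site.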

number of statistical units v. number of variables
Too many characters on too few statistical units (= plots or plants, depending on the design).
Why?
• Big plants are big all over; usually little is gained by measuring >1 size-related variable.
• Environmental variables are often so highly correlated with each other that little is gained by measuring some of them.
• If plot is the statistical unit of replication, more plants per plot usually add little to the power of the analysis.
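One way to check for redundant variables before a full study is to compute pairwise correlations on pilot measurements. The sketch below assumes a small, entirely made-up pilot data set (`pilot`) and an arbitrary cutoff of |r| >= 0.9 for calling two variables redundant. Both are illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def redundant_pairs(measurements, threshold=0.9):
    """Return (name_a, name_b, r) for every pair of variables whose
    absolute correlation is at or above the threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(measurements.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, round(r, 3)))
    return pairs


# Made-up pilot measurements on six plants, for illustration only.
pilot = {
    "height_cm": [12.0, 18.5, 25.0, 31.0, 40.5, 52.0],
    "stem_diameter_mm": [1.1, 1.8, 2.4, 3.1, 3.9, 5.2],
    "soil_ph": [7.9, 7.4, 8.1, 7.6, 8.0, 7.5],
}

for name_a, name_b, r in redundant_pairs(pilot):
    print(f"{name_a} and {name_b} are highly correlated (r = {r})")
```

Variables flagged this way are candidates for dropping, so that field effort can go into more statistical units instead.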

An example of a statistically valid, relatively efficient design: the effects of deer browsing on Streptanthus bracteatus
2 treatments: fenced exclosure & control
1 cluster of plants = 1 plot
6 plots per treatment in yr 1, 9 plots per treatment in yr 2
(images from http://www.wildflower2.org/NPIN/Gallery/ and http://www.thenorthview.org/recreation/animals/deer.html)

preliminary study of 11 size-related measures of Streptanthus bracteatus:
• stem basal diameter predicted dry biomass very well (R² = 0.95),
• the other 10 variables did not have to be measured in the experiment, and...
• using this estimate of initial biomass greatly increased the power of the analysis
(image from http://www.tpwd.state.tx.us/news/tv/vnr/archive/0307)
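The kind of calibration described here can be sketched as an ordinary least-squares fit of biomass on stem basal diameter, followed by an R² calculation. The slides do not say what form the fitted relationship took, so the straight-line model below is an assumption, and the numbers in `diameter_mm` and `biomass_g` are invented purely to make the example run.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Proportion of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Invented calibration data (harvested plants), for illustration only.
diameter_mm = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
biomass_g = [0.4, 0.9, 1.1, 1.8, 2.0, 2.7, 2.9]

slope, intercept = fit_line(diameter_mm, biomass_g)
print(f"R^2 = {r_squared(diameter_mm, biomass_g, slope, intercept):.2f}")

# Estimate initial biomass of an experimental plant without harvesting it.
new_diameter = 2.8
print(f"estimated biomass = {intercept + slope * new_diameter:.2f} g")
```

Once a relationship like this is established, each experimental plant needs only one non-destructive measurement, and the estimated initial biomass can be used as a covariate in the analysis.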
