This article explains the concept of bootstrap sampling and its application in assessing uncertainty in estimated quantities from data. It also discusses parametric bootstrapping and its use in estimating parameters in phylogenetic trees.
Bootstrapping
Anders Gorm Pedersen
Molecular Evolution Group
Center for Biological Sequence Analysis
Technical University of Denmark (DTU)
Bootstrap sampling: assess uncertainty about a quantity estimated from data
• Starting point: N observations x1, x2, x3, ..., xN
• Construct a “bootstrap sample”:
  • Using sampling with replacement, select N data points from the original observations
  • This means some data points are present more than once, some exactly once, and some not at all
• Repeat many times (e.g., 1000)
• From each bootstrap sample: estimate the quantity of interest
• The distribution of estimates indicates the uncertainty
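The resampling steps above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names `bootstrap_estimates` and `percentile_interval`, the made-up observations, and the choice of the mean as the quantity of interest are all assumptions. The percentile interval is one simple way to summarise the distribution of estimates.

```python
import random
import statistics


def bootstrap_estimates(data, estimator, n_replicates=1000, seed=0):
    """Apply `estimator` to `n_replicates` bootstrap samples of `data`."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_replicates):
        # Draw N points with replacement from the N original observations,
        # so some points appear several times and others not at all.
        sample = rng.choices(data, k=n)
        estimates.append(estimator(sample))
    return estimates


def percentile_interval(estimates, level=0.95):
    """Simple percentile interval read off the sorted bootstrap estimates."""
    ordered = sorted(estimates)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * len(ordered))]
    upper = ordered[int((1.0 - tail) * len(ordered)) - 1]
    return lower, upper


# Made-up observations, used only to exercise the functions.
observations = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 3.9]

estimates = bootstrap_estimates(observations, statistics.mean)
standard_error = statistics.stdev(estimates)  # spread of the estimates
low, high = percentile_interval(estimates)
print(f"bootstrap SE = {standard_error:.3f}, 95% interval = ({low:.2f}, {high:.2f})")
```

Any other estimator (median, variance, a fitted coefficient) can be passed in place of `statistics.mean` without changing the resampling loop.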
Bootstrap sampling: assess uncertainty about a quantity estimated from data
[Figure by Felsenstein]
Bootstrapping phylogenetic trees
[Figure by Felsenstein: consensus tree with branches labelled 1.00 and 0.74]
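For phylogenetic trees, the units that are resampled are usually the columns (sites) of the sequence alignment, and a number on a branch of the consensus tree is usually read as the fraction of bootstrap trees that contain that grouping. The sketch below illustrates only those two steps. The toy alignment, the hand-written clade sets and the function names are hypothetical, and the tree-building step that would normally run on every resampled alignment is left out.

```python
import random


def resample_columns(alignment, rng):
    """Bootstrap an alignment by sampling its columns with replacement.

    `alignment` maps taxon name -> aligned sequence (all of equal length).
    """
    length = len(next(iter(alignment.values())))
    columns = rng.choices(range(length), k=length)
    return {
        name: "".join(sequence[i] for i in columns)
        for name, sequence in alignment.items()
    }


def clade_support(clade, bootstrap_trees):
    """Fraction of bootstrap trees (given as sets of clades) containing `clade`."""
    hits = sum(1 for tree in bootstrap_trees if clade in tree)
    return hits / len(bootstrap_trees)


# Toy alignment, made up for illustration.
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGAAC",
    "C": "ACGAACGTTC",
}
rng = random.Random(0)
replicate = resample_columns(alignment, rng)

# Hypothetical clade sets, standing in for trees built from four replicates.
trees = [
    {frozenset({"A", "B"})},
    {frozenset({"A", "B"})},
    {frozenset({"A", "C"})},
    {frozenset({"A", "B"})},
]
support_ab = clade_support(frozenset({"A", "B"}), trees)
print(support_ab)  # 0.75
```

Every taxon is resampled with the same column indices, so each column of the replicate is an intact column of the original alignment.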
Bootstrap consensus tree
[Figure by Felsenstein]
Parametric bootstrapping
• Starting point: data set with observations
• Fit a model to the data and estimate its parameters (e.g., Θ1, Θ2, Θ3)
• Using the parameter estimates: generate a large number of simulated data sets (e.g., 1000)
• From each simulated data set: estimate the parameters
• The distribution of estimates indicates the uncertainty
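A minimal sketch of the parametric procedure follows, assuming for illustration that the fitted model is a normal distribution with two parameters (mean and standard deviation). The slides do not specify a model, so the model choice, the made-up data and the function names `fit_normal` and `parametric_bootstrap` are all assumptions.

```python
import random
import statistics


def fit_normal(data):
    """Estimate the parameters (mean, standard deviation) of a normal model."""
    return statistics.mean(data), statistics.stdev(data)


def parametric_bootstrap(data, n_replicates=1000, seed=0):
    """Simulate data sets from the fitted model and re-estimate the parameters."""
    rng = random.Random(seed)
    mu_hat, sigma_hat = fit_normal(data)
    replicates = []
    for _ in range(n_replicates):
        # New data come from the fitted model, not from the observed points.
        simulated = [rng.gauss(mu_hat, sigma_hat) for _ in range(len(data))]
        replicates.append(fit_normal(simulated))
    return (mu_hat, sigma_hat), replicates


# Made-up observations, used only to exercise the functions.
observations = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 3.9]

(mu_hat, sigma_hat), replicates = parametric_bootstrap(observations)
mu_se = statistics.stdev(mu for mu, _ in replicates)
sigma_se = statistics.stdev(sigma for _, sigma in replicates)
print(f"mean = {mu_hat:.2f} (SE {mu_se:.2f}), sd = {sigma_hat:.2f} (SE {sigma_se:.2f})")
```

The only difference from the resampling version is where new data come from: the simulated data sets are drawn from the fitted model, so the result depends on how well that model describes the data.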
Parametric bootstrapping: phylogenies
[Figure by Felsenstein]