
CENTER FOR BIOLOGICAL SEQUENCE ANALYSIS

This presentation explains bootstrap sampling and how it is used to assess the uncertainty of quantities estimated from data. It also covers parametric bootstrapping and the application of both approaches to phylogenetic trees.


Presentation Transcript


  1. Bootstrapping. Anders Gorm Pedersen, Molecular Evolution Group, Center for Biological Sequence Analysis, Technical University of Denmark (DTU)

  2. Bootstrap sampling: assess uncertainty about a quantity estimated from data
     • Starting point: N observations x1, x2, x3, ..., xN
     • Construct a “bootstrap sample”:
       • Using sampling with replacement, select N data points from the original observations
       • This means some data points are present more than once, some exactly once, and some not at all in the bootstrap sample
     • Repeat many times (e.g., 1000)
     • From each bootstrap sample: estimate the quantity of interest
     • The distribution of estimates indicates the uncertainty
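The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name `bootstrap_estimates`, the example data, and the choice of the mean as the quantity of interest are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_estimates(observations, estimator, n_replicates=1000, seed=0):
    """Return one estimate per bootstrap sample (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(observations)
    estimates = []
    for _ in range(n_replicates):
        # Sampling with replacement: draw N points from the N originals,
        # so some points appear several times and some not at all.
        sample = rng.choices(observations, k=n)
        estimates.append(estimator(sample))
    return estimates


# Made-up observations, used only to exercise the function.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9]
estimates = bootstrap_estimates(data, statistics.mean)

# The spread of the bootstrap estimates indicates the uncertainty
# of the original estimate.
spread = statistics.stdev(estimates)
print(f"mean = {statistics.mean(data):.2f}, bootstrap SD = {spread:.2f}")
```

Any function that maps a sample to a number can be passed as `estimator`, for example `statistics.median`.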

  3. Bootstrap sampling: assess uncertainty about quantity estimated from data [Figure by Felsenstein]

  4. Bootstrapping phylogenetic trees [Figure by Felsenstein; labels in figure: 1.00, 0.74, consensus tree]

  5. Bootstrap consensus tree [Figure by Felsenstein]

  6. Parametric bootstrapping
     • Starting point: data set with observations
     • Fit model to data, estimate parameters (e.g., Θ1, Θ2, Θ3)
     • Using the parameter estimates: generate a large number of simulated datasets (e.g., 1000)
     • From each simulated dataset: estimate the parameters
     • The distribution of estimates indicates the uncertainty
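A minimal sketch of the parametric procedure above, under a strong simplifying assumption: the fitted model here is a normal distribution with two parameters (mean and standard deviation), chosen only because it is easy to fit and simulate. The slides do not specify a model; in a phylogenetic setting the model would instead be a tree with a substitution model. The names `fit_normal` and `parametric_bootstrap` and the example data are hypothetical.

```python
import random
import statistics


def fit_normal(data):
    """Maximum-likelihood estimates (mean, SD) for a normal model."""
    return statistics.mean(data), statistics.pstdev(data)


def parametric_bootstrap(data, n_replicates=1000, seed=0):
    """Fit the model, simulate datasets from the fit, and refit each one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    mu_hat, sigma_hat = fit_normal(data)
    n = len(data)
    replicates = []
    for _ in range(n_replicates):
        # Simulate a dataset of the original size from the fitted model.
        simulated = [rng.gauss(mu_hat, sigma_hat) for _ in range(n)]
        # Re-estimate the parameters from the simulated dataset.
        replicates.append(fit_normal(simulated))
    return (mu_hat, sigma_hat), replicates


# Made-up observations, used only to exercise the functions.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9]
(mu_hat, sigma_hat), replicates = parametric_bootstrap(data)

# The spread of the re-estimated parameters indicates their uncertainty.
mu_values = [mu for mu, _ in replicates]
print(f"mean estimate = {mu_hat:.2f}, bootstrap SD = {statistics.stdev(mu_values):.2f}")
```

The difference from the nonparametric version is where the replicate datasets come from: they are simulated from the fitted model rather than resampled from the observations.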

  7. Parametric bootstrapping: phylogenies [Figure by Felsenstein]
