Bootstrapping the Right Way
Bootstrap sampling is often touted as a simple technique that any hacker can use to quantify the uncertainty of statistical estimates. Despite its apparent simplicity, however, there are many ways to misuse bootstrapping and thereby draw wrong conclusions about your data and the world. This talk gives a brief overview of bootstrap sampling and discusses how to avoid common pitfalls when bootstrapping your data.
Outline/Structure of the Talk
- An overview of bootstrap sampling, including how and why it works
- When one should and shouldn't use bootstrapping, including warnings about general issues like not drawing enough resamples
- A brief overview of confidence intervals, including the general definition, common misconceptions, and how to use them correctly
- Common bootstrapping methods to obtain confidence intervals
- How to decide whether the obtained confidence intervals are trustworthy
- Issues with bootstrapped confidence intervals and how to avoid them
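To make the outline concrete, here is a minimal sketch of two of the ideas listed above: a percentile bootstrap confidence interval, and a simulation that checks how often such intervals cover a known true value. It is an illustration under stated assumptions, not the material presented in the talk. The function names, the default of 2000 resamples, and the normal test distribution are all hypothetical choices made for this example, and the percentile method is only one of several common bootstrap interval methods.

```python
import random
import statistics


def bootstrap_percentile_ci(data, statistic=statistics.fmean,
                            n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(data).

    Hypothetical helper for illustration; defaults are arbitrary choices.
    """
    rng = random.Random(seed)
    n = len(data)
    # Resample the data with replacement, keeping the original sample size,
    # and recompute the statistic on every resample.
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Read off approximate empirical quantiles of the bootstrap estimates.
    lower_idx = round((alpha / 2) * (n_resamples - 1))
    upper_idx = round((1 - alpha / 2) * (n_resamples - 1))
    return estimates[lower_idx], estimates[upper_idx]


def estimate_coverage(draw_sample, true_value, n_trials=100, seed=1, **ci_kwargs):
    """Fraction of simulated samples whose interval contains true_value.

    draw_sample is an assumed callable that takes a random.Random instance
    and returns one sample (a list of numbers) from a known distribution.
    """
    rng = random.Random(seed)
    hits = 0
    for trial in range(n_trials):
        sample = draw_sample(rng)
        low, high = bootstrap_percentile_ci(sample, seed=trial, **ci_kwargs)
        hits += low <= true_value <= high
    return hits / n_trials


if __name__ == "__main__":
    # Assumed toy setup: 30 draws from a normal distribution with mean 10.
    def draw(rng):
        return [rng.gauss(10.0, 2.0) for _ in range(30)]

    sample = draw(random.Random(42))
    print("95% CI for the mean:", bootstrap_percentile_ci(sample))
    print("Estimated coverage:",
          estimate_coverage(draw, true_value=10.0, n_resamples=500))
```

If the estimated coverage falls well below the nominal level (here 95%), that is a sign the intervals should not be trusted for that statistic and sample size. Such a check is only possible in simulation, where the true value is known.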
Learning Outcome
Attendees will gain better familiarity with bootstrap sampling, confidence intervals, and when/how to use them.
Target Audience
Data scientists and software engineers
Prerequisites for Attendees
A basic understanding of general algorithms and statistics. I won't assume a deep familiarity with bootstrap methods, so I'll give an overview of the basic concepts required to understand the talk.