When the bootstrap doesn’t work

6 September 2024


The bootstrap always works, except sometimes.

By ‘works’ here, I mean in the weakest senses: that the large-sample bootstrap variance correctly estimates the variance of the statistic, or that the large-sample percentile bootstrap intervals have their nominal coverage. I don’t mean the stronger sense that someone like Peter Hall might use, that the bootstrap gives higher-order accurate confidence intervals. So the bootstrap ‘works’ for the median, even though not as well as for smooth functions of the mean.

Here are the reasons I know of why the bootstrap might fail:

0. Correlation. The one that everyone knows about nowadays. If your data have structure, such as a time series, a spatial map, a carefully-structured experimental design, a multistage survey, a network, then you can’t hope to get the right distribution by resampling in a way that doesn’t respect that structure.
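To make the correlation point concrete, here is a minimal sketch comparing the naive resample-the-observations bootstrap with a moving block bootstrap for the mean of a simulated AR(1) series. The block bootstrap is just one common structure-respecting option, chosen here for illustration; the function names, the AR coefficient 0.7, and the block length 25 are all assumptions of this sketch. For an AR(1) process with coefficient $\phi$ the large-sample variance of the mean is inflated by a factor $(1+\phi)/(1-\phi)$ relative to independent data, so the naive bootstrap should come out several times too small.

```python
import random
from statistics import fmean, pvariance


def ar1_series(n, phi, rng):
    """Simulate an AR(1) series with standard normal innovations, started at 0."""
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x


def iid_bootstrap_var(x, n_boot, rng):
    """Bootstrap variance of the mean, resampling single observations."""
    n = len(x)
    means = [fmean(rng.choices(x, k=n)) for _ in range(n_boot)]
    return pvariance(means)


def block_bootstrap_var(x, block_len, n_boot, rng):
    """Bootstrap variance of the mean, resampling overlapping blocks."""
    n = len(x)
    n_blocks = n // block_len
    starts = range(n - block_len + 1)  # every possible block start
    means = []
    for _ in range(n_boot):
        total = 0.0
        for s in rng.choices(starts, k=n_blocks):
            total += sum(x[s:s + block_len])
        means.append(total / (n_blocks * block_len))
    return pvariance(means)


def compare(n=1000, phi=0.7, block_len=25, n_boot=500, seed=2024):
    """Return (naive variance, block variance) for one simulated series."""
    rng = random.Random(seed)
    x = ar1_series(n, phi, rng)
    return iid_bootstrap_var(x, n_boot, rng), block_bootstrap_var(x, block_len, n_boot, rng)


if __name__ == "__main__":
    naive, block = compare()
    print(f"naive bootstrap variance of the mean: {naive:.5f}")
    print(f"block bootstrap variance of the mean: {block:.5f}")
```

The block version is still biased downward for any fixed block length, so it should be read as an improvement on the naive resampling rather than as the right answer.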

1. Constraints. Suppose $X_{n} \sim N(\theta, 1)$ and we know $\theta \geq 0$. The maximum likelihood estimator of $\theta$ is $\hat{\theta} = \max(\bar{X}, 0)$. If $\theta > 0$ there isn’t a problem asymptotically (or, on a more sophisticated analysis, if $\theta \gg 1/\sqrt{n}$ there isn’t). But if $\theta = 0$ the sampling distribution of $\hat{\theta}$ is a 50:50 mixture of a spike at zero and the positive half of a $N(0, n^{-1})$ distribution. The bootstrap distribution is also a mixture of a spike at zero and a half-normal, but the mass on the spike does not converge to 0.5 (or to anything else) as the sample size increases. The problem is that the height of the spike is approximately $\Phi(-\bar{X}\sqrt{n})$, the bootstrap probability that the resampled mean is negative, so the height converges in distribution to $U(0, 1)$.
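A small simulation can show the spike mass wandering from one data set to the next. The sketch below draws several samples with $\theta = 0$, and for each one records the fraction of bootstrap replicates of $\max(\bar{X}^{*}, 0)$ that are exactly zero, alongside a normal approximation to the bootstrap probability that $\bar{X}^{*} \leq 0$. The function names, sample size, and replicate counts are illustrative assumptions, not anything from a particular package.

```python
import math
import random
from statistics import NormalDist, fmean


def bootstrap_spike_mass(x, n_boot, rng):
    """Fraction of bootstrap replicates of max(mean, 0) that equal 0."""
    n = len(x)
    hits = 0
    for _ in range(n_boot):
        if fmean(rng.choices(x, k=n)) <= 0:
            hits += 1
    return hits / n_boot


def simulate(n=200, n_datasets=8, n_boot=400, seed=1):
    """For each simulated data set return (xbar, spike mass, normal approximation)."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_datasets):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true theta = 0
        xbar = fmean(x)
        spike = bootstrap_spike_mass(x, n_boot, rng)
        # Normal approximation to P*(resampled mean <= 0), treating the variance as known (= 1)
        approx = NormalDist().cdf(-xbar * math.sqrt(n))
        results.append((xbar, spike, approx))
    return results


if __name__ == "__main__":
    for xbar, spike, approx in simulate():
        print(f"xbar = {xbar:+.3f}   spike mass = {spike:.3f}   approx = {approx:.3f}")
```

Increasing `n` does not pull the spike masses towards 0.5; they keep scattering across the unit interval, which is the failure described above.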

2. Extrema. Consider $X \sim U(\theta, 1)$, with $\theta$ estimated by the sample minimum $\hat{\theta}$. The bootstrap replicates $\theta^{*}$ have a distribution that, in large samples, puts mass $1 - e^{-1} \approx 0.632$ on the smallest observation, $e^{-1}(1 - e^{-1}) \approx 0.233$ on the second smallest, and so on geometrically. We always have $\theta^{*} \geq \hat{\theta}$, and the bootstrap distribution stays very discrete as the sample size increases.
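The geometric masses can be checked directly. With $n$ distinct observations, the bootstrap minimum equals the $k$-th smallest observation with probability $(1 - (k-1)/n)^{n} - (1 - k/n)^{n}$, which tends to $e^{-(k-1)}(1 - e^{-1})$. The sketch below computes that exact probability and compares it with a simulated bootstrap of the sample minimum; the function names and the choice $\theta = 0.3$ are illustrative assumptions.

```python
import random


def exact_mass(k, n):
    """P(bootstrap minimum is the k-th smallest of n distinct observations)."""
    return (1 - (k - 1) / n) ** n - (1 - k / n) ** n


def simulated_mass(n=200, n_boot=2000, theta=0.3, seed=0):
    """Simulated bootstrap distribution of the sample minimum, indexed by rank."""
    rng = random.Random(seed)
    x = sorted(rng.uniform(theta, 1.0) for _ in range(n))
    rank = {value: i + 1 for i, value in enumerate(x)}  # assumes no ties
    counts = {}
    for _ in range(n_boot):
        r = rank[min(rng.choices(x, k=n))]
        counts[r] = counts.get(r, 0) + 1
    return {r: c / n_boot for r, c in sorted(counts.items())}


if __name__ == "__main__":
    sim = simulated_mass()
    for k in range(1, 5):
        print(f"rank {k}: exact = {exact_mass(k, 200):.3f}   simulated = {sim.get(k, 0.0):.3f}")
```

Because the exact masses barely depend on $n$ once it is moderately large, the replicates pile up on the same handful of order statistics however much data there is.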

3. Lack of smoothness (cube-root asymptotics). Tukey’s shorth, the mean of the shortest half of the data, converges to the mean at $n^{-1/3}$ rate instead of the usual $n^{-1/2}$. The same is true for the least-median-of-squares regression line, the isotonic
