Confidence Intervals “By Hand”

One approach for constructing confidence intervals in R is “by hand”, where the user computes the estimate and the margin of error (moe) directly. This is the only approach available when no function exists for constructing the desired confidence interval (that will not be the case for the confidence intervals we will see, but may be the case for others).

Let’s consider a confidence interval for the population mean when the population standard deviation is known. The estimate is the sample mean, \(\bar{x}\), and the moe is \(z_{\frac{1 - C}{2}}\frac{\sigma}{\sqrt{n}}\), where \(C\) is the confidence level, \(z_{\alpha}\) is the quantile of the standard Normal distribution such that \(P(Z > z_{\alpha}) = 1 - \Phi(z_{\alpha}) = \alpha\), \(\sigma\) is the population standard deviation (presumed known, which is unrealistic), and \(n\) is the sample size. So the two-sided confidence interval is:

\[\bar{x} \pm z_{\frac{1 - C}{2}}\frac{\sigma}{\sqrt{n}}\]
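To make the critical value concrete, here is a minimal sketch of how \(z_{\frac{1 - C}{2}}\) can be obtained with `qnorm()`. The confidence level `C <- 0.95` and the variable names are illustrative assumptions, not part of the formula itself.

```r
# Confidence level (an illustrative choice)
C <- 0.95

# Upper-tail probability (1 - C) / 2 used in the formula above
alpha <- (1 - C) / 2

# Critical value z such that P(Z > z) = alpha
zstar <- qnorm(alpha, lower.tail = FALSE)
zstar

# Check the defining property: the upper-tail area at zstar recovers alpha
pnorm(zstar, lower.tail = FALSE)

# Equivalent form using the lower tail: Phi(z) = 1 - alpha
qnorm(1 - alpha)
```

For a 95% confidence level this gives a critical value of about 1.96, the familiar value from a standard Normal table.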

When the data are not themselves Normally distributed, such an interval relies on the Central Limit Theorem: for large \(n\) the sample mean is approximately Normal, so the interval has approximately the stated confidence level. For large \(n\), it should be safe (with “large” meaning over 30, according to Devore), and otherwise one may want to look at the shape of the distribution of the data to decide whether it is safe to use these procedures (the more “Normal”, the better).
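One possible way to look at the shape of the data is a histogram together with a Normal quantile-quantile plot. The sketch below uses the versicolor sepal lengths from R's built-in `iris` data, the sample analyzed below. The choice of these two plots is an assumption made for illustration, not the only reasonable check.

```r
# Sepal lengths of the versicolor flowers in the built-in iris data
vers <- iris$Sepal.Length[iris$Species == "versicolor"]

# Sample size; compare with the rule of thumb above
n <- length(vers)
n

# Histogram: look for strong skewness or outliers
hist(vers, main = "Versicolor sepal length", xlab = "Sepal length")

# Normal Q-Q plot: points close to the reference line suggest
# the data are roughly Normal
qqnorm(vers)
qqline(vers)
```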

The “by hand” approach finds all involved quantities individually and uses them to construct the confidence intervals.

Let’s construct a confidence interval for the mean sepal length of versicolor iris flowers in the iris data set. There are 50 observations, so it should be safe to use the formula for the CI described above. We will construct a 95% confidence interval, assuming that \(\sigma = 0.5\).

# Get the data
vers <- split(iris$Sepal.Length, iris$Species)$versicolor
xbar <- mean(vers)
xbar
## [1] 5.936
# Get critical value for confidence level
zstar <- qnorm(0.025, lower.tail = FALSE)
sigma <- 0.5
# Compute margin of error
moe <- zstar * sigma/sqrt(length(vers))
moe
## [1] 0.1385904
# The confidence interval
ci <- c(Lower = xbar - moe, Upper = xbar + moe)
ci
##   Lower   Upper 
## 5.79741 6.07459

Of course, this is the long way to compute confidence intervals, and R has built-in functions for computing most of the confidence intervals we want. The “by hand” method relies simply on knowing how to compute the values used in the definition of the desired confidence interval, and combining them to get that interval. Since this uses R as little more than a glorified calculator and an alternative to a table, I will say no more about the “by hand” approach. If you are computing a confidence interval for which there is no R function, though, the “by hand” approach is the only available route. In that case you may save some time, and make a contribution to the R community, by writing your own R function that computes the new confidence interval in general.
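As a rough illustration of that last suggestion, the by-hand steps for the known-\(\sigma\) interval could be wrapped in a small function. The name `z_ci` and its arguments are hypothetical, chosen for this sketch. This is not a function from base R or any package.

```r
# Hypothetical helper: two-sided z confidence interval for a mean
# when the population standard deviation sigma is assumed known
z_ci <- function(x, sigma, conf.level = 0.95) {
  # Basic argument checks
  stopifnot(is.numeric(x), length(x) > 0, sigma > 0,
            conf.level > 0, conf.level < 1)
  # Point estimate: the sample mean
  xbar <- mean(x)
  # Critical value with upper-tail area (1 - conf.level) / 2
  zstar <- qnorm((1 - conf.level) / 2, lower.tail = FALSE)
  # Margin of error
  moe <- zstar * sigma / sqrt(length(x))
  c(Lower = xbar - moe, Upper = xbar + moe)
}

# Usage: the same interval as the by-hand computation above
vers <- split(iris$Sepal.Length, iris$Species)$versicolor
z_ci(vers, sigma = 0.5)
```

Called with the versicolor data and `sigma = 0.5`, this should reproduce the by-hand interval of roughly 5.797 to 6.075.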