Confidence Interval Basics

The sample mean \(\bar{x}\) alone is usually not enough to tell us where the true mean of a population lies. We view it as one instantiation of the random variable \(\overline{X}\), and since it is the result of a random process, we would like to describe the uncertainty we associate with its position relative to the population mean. This is true not only of the sample mean but of any statistic we estimate from a random sample.
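To make the idea of \(\bar{x}\) as one draw of \(\overline{X}\) concrete, here is a minimal simulation sketch in Python; the Normal population and sample size are my own illustrative choices, not values from the lecture. Each new sample produces a different observed sample mean.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 10, 2, 30   # illustrative population parameters and sample size

# Each sample yields a different observed value of the sample mean;
# every xbar below is one instantiation of the random variable Xbar.
sample_means = [rng.normal(mu, sigma, n).mean() for _ in range(5)]
print(sample_means)   # five different estimates of the same, fixed mu
```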

The last lecture was concerned with computational methods for performing statistical inference. In this lecture, we consider procedures for inferential statistics derived from probability models. In particular, I focus on parametric methods, which aim to uncover information regarding the location of a parameter of a probability model assumed to describe the data, such as the mean \(\mu\) in the Normal distribution, or the population proportion \(p\). (The methods explored last lecture, such as bootstrapping, are non-parametric; they do not assume a probability model describing the underlying distribution of the data. MATH 3080 covers other non-parametric methods.)

Why use probabilistic methods when we could use computational methods for many procedures? First, computational power and complexity may limit the effectiveness of computational methods; large or complex data sets may be time-consuming to process using computational methods alone. Second, methods relying on simulation, such as bootstrapping, are imprecise. Often this is fine, but the price of imprecision is large when handling small-probability events and other contexts where precision is called for. Third, while computational methods are robust (deviations from underlying assumptions have little effect on the end result), this robustness comes at the price of power (the ability to detect effects, even very small ones), and this may not be acceptable to a practitioner; much of statistical methodology can be seen as a battle between power and robustness, which are often mutually exclusive properties. Furthermore, many computational methods can be seen as a means of approximating some probabilistic method that is for some reason inaccessible or not quite optimal.

A \(100C\%\) confidence interval is an interval or region constructed in such a way that the probability that the procedure used to construct the interval yields an interval containing the true population parameter of interest is \(C\). (Note that this is not the same as the probability that the parameter of interest lies in the confidence interval; that statement does not even make sense in this context, since the confidence interval computed from the sample is fixed and the parameter of interest is also considered fixed, albeit unknown. This is a common misinterpretation of the meaning of a confidence interval, and I do not intend to spread it.) We call \(C\) the confidence level of the confidence interval. Smaller values of \(C\) produce intervals that capture the true parameter less frequently, while larger values of \(C\) produce intervals that capture it more frequently. There is always a price, though: smaller \(C\) yields intervals that are more precise (narrower) but more likely to miss the parameter, while larger \(C\) yields intervals that are less precise (wider) but less likely to miss it. There is usually only one way to get more precise confidence intervals while maintaining the same confidence level \(C\): collect more data.
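The "probability that the procedure captures the parameter" interpretation can be checked by simulation. Below is a rough Python sketch assuming, purely for illustration, a Normal population with known standard deviation and a z-based interval; none of these specifics come from the lecture. Across many repetitions, roughly a fraction \(C\) of the intervals contain the true mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n = 10, 2, 30              # illustrative "true" population and sample size
C, reps = 0.95, 10_000
z = stats.norm.ppf(1 - (1 - C) / 2)   # critical value for a 100C% two-sided interval

hits = 0
for _ in range(reps):
    x = rng.normal(mu, sigma, n)
    moe = z * sigma / np.sqrt(n)      # known-sigma z interval, kept simple on purpose
    hits += (x.mean() - moe <= mu <= x.mean() + moe)

print(hits / reps)                    # should be close to C = 0.95
```

The quantity printed is the observed coverage of the procedure, not a statement about any single interval; that distinction is exactly the point of the parenthetical above.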

In this lecture we consider only confidence intervals for univariate data (or data that could be univariate). When discussing confidence intervals, the estimate is the estimated value of the parameter of interest, and the margin of error (which I abbreviate with “moe”) is the quantity used to describe the uncertainty associated with the estimate. The most common and familiar type of confidence interval is a two-sided confidence interval of the form:

\[\text{estimate} \pm \text{moe}\]
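As a concrete illustration of the estimate \(\pm\) moe form, the sketch below computes a two-sided interval for a population mean from a single simulated sample, using a t-based margin of error; the data and confidence level are placeholders, not values from the lecture.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(10, 2, size=25)   # stand-in univariate sample
C = 0.95

estimate = x.mean()
se = x.std(ddof=1) / np.sqrt(len(x))                      # standard error of the mean
moe = stats.t.ppf(1 - (1 - C) / 2, df=len(x) - 1) * se    # t-based margin of error

print(estimate - moe, estimate + moe)   # the two-sided 95% interval
```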

For estimates of parameters from symmetric distributions, a two-sided confidence interval of this form is called an equal-tail confidence interval (so called because the interval is equally likely to err on either side of the estimate). For estimates of parameters from non-symmetric distributions, an equal-tail confidence interval may not be symmetric around the estimate. Also, confidence intervals need not be two-sided; one-sided confidence intervals, which effectively give a lower bound or upper bound for the location of the parameter of interest, may also be constructed.
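For comparison, a one-sided bound puts all of the error probability in a single tail. The sketch below, using the same simulated setup as above and again only as an illustration, computes a lower confidence bound and an upper confidence bound for the mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(10, 2, size=25)   # stand-in univariate sample
C = 0.95

se = x.std(ddof=1) / np.sqrt(len(x))
t_crit = stats.t.ppf(C, df=len(x) - 1)   # all error probability in one tail

lower_bound = x.mean() - t_crit * se     # 95% lower confidence bound for mu
upper_bound = x.mean() + t_crit * se     # 95% upper confidence bound for mu
print(lower_bound, upper_bound)
```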