Which Of The Following Is A Biased Estimator

playboxdownload

Mar 17, 2026

    Understanding Biased Estimators: A Key Concept in Statistical Inference

    In the realm of statistics and data science, the quest for accurate parameter estimation is fundamental. When we collect a sample of data, our goal is to make inferences about the larger population from which that sample was drawn. We use statistics—functions of the sample data—as estimators for unknown population parameters. However, not all estimators are created equal. A critical property that distinguishes a good estimator from a poor one is bias. The question "which of the following is a biased estimator?" is a common one in exams and practical analysis, but its true answer lies in understanding the core definition and being able to apply it to any given set of options. This article will demystify the concept of bias, provide clear examples, and equip you with the framework to identify a biased estimator in any context.

    What is an Estimator? Setting the Stage

    Before defining bias, we must clarify what an estimator is. An estimator is a rule or a formula that tells us how to calculate an estimate of a population parameter based on sample data. It is a random variable because it depends on the random sample. For example:

    • The sample mean ($\bar{x}$) is an estimator for the population mean ($\mu$).
    • The sample variance ($s^2$) is an estimator for the population variance ($\sigma^2$).
    • The sample proportion ($\hat{p}$) is an estimator for the population proportion ($p$).

    Each time we draw a new sample, our estimator will yield a slightly different value. The collection of these possible values forms the sampling distribution of the estimator.
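    The sampling distribution can be made concrete with a quick simulation. Below is a minimal standard-library Python sketch (the population parameters, sample size, and replication count are arbitrary choices for illustration) that draws repeated samples and records the sample mean each time:

    ```python
    import random
    import statistics

    # Illustrative sketch: draw many samples of size n from a known
    # population and examine the spread of the sample mean across samples.
    random.seed(0)
    population_mean, population_sd, n = 50.0, 10.0, 30

    sample_means = []
    for _ in range(5000):
        sample = [random.gauss(population_mean, population_sd) for _ in range(n)]
        sample_means.append(statistics.mean(sample))

    # The estimator varies from sample to sample, but the average of all
    # these sample means sits close to the true population mean of 50.
    overall = statistics.mean(sample_means)
    ```

    The collection `sample_means` is an empirical approximation of the estimator's sampling distribution.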

    Defining Bias: The Heart of the Matter

    The bias of an estimator is formally defined as the difference between the expected value (mean) of the estimator's sampling distribution and the true, unknown value of the population parameter it is intended to estimate.

    Bias = E(θ̂) - θ

    Where:

    • θ̂ (theta-hat) represents the estimator.
    • E(θ̂) is the expected value or mean of the estimator's sampling distribution.
    • θ (theta) is the true population parameter.

    An estimator is considered unbiased if its bias is exactly zero. This means that, on average, over an infinite number of samples of the same size from the population, the estimator will hit the true parameter value. It does not mean that any single estimate will be perfect, but there is no systematic tendency to overestimate or underestimate.

    Conversely, an estimator is biased if its expected value is not equal to the true parameter. This indicates a systematic error:

    • If E(θ̂) > θ, the estimator is positively biased (tends to overestimate).
    • If E(θ̂) < θ, the estimator is negatively biased (tends to underestimate).
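    The definition lends itself to a Monte Carlo check: simulate many samples, average the estimates, and subtract the true parameter. The sketch below (standard library only, illustrative numbers) does this for a classic negatively biased estimator, the sample maximum of a Uniform$(0, \theta)$ sample, whose expected value is $\theta \cdot n/(n+1) < \theta$:

    ```python
    import random
    import statistics

    # Approximate Bias = E(theta_hat) - theta by simulation, using the
    # sample maximum of a Uniform(0, theta) sample as theta_hat.
    random.seed(1)
    theta, n, reps = 10.0, 20, 20000

    estimates = [max(random.uniform(0, theta) for _ in range(n))
                 for _ in range(reps)]
    bias = statistics.mean(estimates) - theta
    # E[max] = theta * n/(n+1), so bias is about -theta/(n+1), i.e. negative:
    # the sample maximum can never exceed theta, so it systematically
    # underestimates it.
    ```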

    Classic Examples: Spotting Bias in Common Estimators

    To answer "which of the following is a biased estimator?" you must calculate or recall the expected value of each option and compare it to the parameter it claims to estimate.

    1. The Sample Variance: A Famous Case

    This is the most classic example in introductory statistics. The sample variance is calculated in two common ways:

    • Biased Formula: $s_n^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$
    • Unbiased Formula: $s_{n-1}^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

    The formula using n in the denominator is a biased estimator of the population variance $\sigma^2$. Its expected value is $E(s_n^2) = \frac{n-1}{n} \sigma^2$, which is slightly smaller than $\sigma^2$. It systematically underestimates the true variance because the sample mean $\bar{x}$ is itself estimated from the data, making the deviations $(x_i - \bar{x})$ slightly smaller on average than deviations from the true $\mu$. The formula using n-1 (Bessel's correction) is specifically designed to be unbiased.
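    A short simulation makes Bessel's correction visible. This sketch (illustrative numbers) computes both versions on many samples from a Normal population with $\sigma^2 = 4$ and compares their long-run averages:

    ```python
    import random

    # Verify E(s_n^2) = (n-1)/n * sigma^2 by simulating many samples
    # from a Normal(0, sigma) population.
    random.seed(2)
    sigma, n, reps = 2.0, 5, 50000

    mean_biased = 0.0
    mean_unbiased = 0.0
    for _ in range(reps):
        x = [random.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(x) / n
        ss = sum((xi - xbar) ** 2 for xi in x)
        mean_biased += ss / n          # divides by n (biased)
        mean_unbiased += ss / (n - 1)  # Bessel's correction (unbiased)
    mean_biased /= reps    # close to (n-1)/n * sigma^2 = 3.2
    mean_unbiased /= reps  # close to sigma^2 = 4.0
    ```

    With $n = 5$, the shortfall factor $\frac{n-1}{n} = 0.8$ is large enough to see clearly; for bigger samples the two averages converge.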

    2. The Sample Standard Deviation

    Even the unbiased sample variance $s_{n-1}^2$ leads to a biased estimator for the population standard deviation $\sigma$. The sample standard deviation is $s = \sqrt{s_{n-1}^2}$. Because the square root function is nonlinear, $E(s) \neq \sqrt{E(s_{n-1}^2)} = \sigma$. The bias here is small for large n but persists. There is no simple unbiased formula for the standard deviation.
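    The shrinkage of $E(s)$ below $\sigma$ can be checked numerically. In this sketch (illustrative numbers; for normal data with $n = 5$ the known correction factor is $c_4 \approx 0.94$, so $E(s) \approx 1.88$ when $\sigma = 2$):

    ```python
    import math
    import random

    # Even with Bessel's correction, s = sqrt(s_{n-1}^2) is a biased
    # estimator of sigma, because the square root is nonlinear.
    random.seed(3)
    sigma, n, reps = 2.0, 5, 50000

    s_values = []
    for _ in range(reps):
        x = [random.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(x) / n
        s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)  # unbiased for sigma^2
        s_values.append(math.sqrt(s2))

    mean_s = sum(s_values) / reps
    # mean_s lands noticeably below sigma = 2.0 (around 1.88 here).
    ```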

    3. Maximum Likelihood Estimators (MLEs)

    MLEs are popular due to their desirable asymptotic properties (they are consistent and asymptotically efficient as the sample size grows). However, they are not always unbiased in finite samples.

    • For a Normal distribution, the MLE for $\sigma^2$ is the biased version $\frac{1}{n} \sum (x_i - \bar{x})^2$.
    • For the parameter p of a Bernoulli/Binomial distribution, the MLE is the sample proportion $\hat{p}$, which is unbiased.
    • For the rate parameter λ of an Exponential distribution, the MLE is $1/\bar{x}$, which is biased (though a bias-corrected version exists).
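    The exponential case is easy to verify by simulation. For a sample of size $n$, $E(1/\bar{x}) = \frac{n}{n-1}\lambda$, so multiplying the MLE by $\frac{n-1}{n}$ removes the bias. A sketch with illustrative numbers:

    ```python
    import random

    # The MLE 1/xbar for an Exponential rate lambda is biased upward:
    # E(1/xbar) = n/(n-1) * lambda.
    random.seed(4)
    lam, n, reps = 2.0, 10, 50000

    mle_values = []
    for _ in range(reps):
        x = [random.expovariate(lam) for _ in range(n)]
        mle_values.append(n / sum(x))  # 1/xbar

    mean_mle = sum(mle_values) / reps               # close to (10/9)*2 ~ 2.22
    corrected = [(n - 1) / n * m for m in mle_values]
    mean_corrected = sum(corrected) / reps          # close to lambda = 2.0
    ```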

    4. Estimators from Non-Linear Transformations

    If you apply a non-linear function to an unbiased estimator, the result is generally biased. For example, if $\bar{x}$ is unbiased for $\mu$, then $\bar{x}^2$ is a biased estimator for $\mu^2$.
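    This is easy to quantify: $E(\bar{x}^2) = \mu^2 + \sigma^2/n$, since the variance of $\bar{x}$ "leaks" into the square. A quick simulation with illustrative numbers:

    ```python
    import random

    # xbar is unbiased for mu, but xbar^2 is biased for mu^2:
    # E(xbar^2) = mu^2 + sigma^2/n.
    random.seed(5)
    mu, sigma, n, reps = 3.0, 2.0, 4, 50000

    sq_means = []
    for _ in range(reps):
        x = [random.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(x) / n
        sq_means.append(xbar ** 2)

    mean_sq = sum(sq_means) / reps
    # mu^2 = 9, but E(xbar^2) = 9 + 4/4 = 10, so the average lands near 10.
    ```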

    5. Truncated or Censored Samples

    Any estimator derived from a sample that is not a simple random sample from the full population is prone to selection bias. For instance, consider estimating average income from a survey that unintentionally excludes the wealthiest respondents. Because the sample no longer represents the full distribution, the sample mean $\bar{x}$ will tend to be lower than the true population mean $\mu$; the bias is $E(\bar{x}) - \mu < 0$. Similar issues arise with censored data, such as reliability studies where failures beyond a certain test time are not observed. Estimating the exponential rate $\lambda$ from right-censored lifetimes using the naïve MLE $1/\bar{x}$ (where $\bar{x}$ now averages only the observed, uncensored times) overestimates $\lambda$ because the long-lived units are missing.
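    Truncation bias can be simulated directly. The sketch below (illustrative numbers) samples from a Normal "income" population but discards every value above a cutoff, mimicking a survey that never reaches the top of the distribution:

    ```python
    import random

    # Selection bias sketch: truncate a Normal(mu, sigma) population above
    # a cutoff and watch the sample mean undershoot the true mean.
    random.seed(6)
    mu, sigma, cutoff, n_obs = 100.0, 15.0, 120.0, 100000

    truncated = []
    while len(truncated) < n_obs:
        v = random.gauss(mu, sigma)
        if v <= cutoff:          # values above the cutoff never enter the sample
            truncated.append(v)

    trunc_mean = sum(truncated) / n_obs
    # trunc_mean falls a couple of units below mu = 100; no amount of extra
    # data fixes this, because the sampling mechanism itself is biased.
    ```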

    Beyond sampling defects, bias can be introduced deliberately to improve overall estimation quality. The bias‑variance tradeoff shows that a modest bias may be tolerated if it yields a substantial reduction in variance, leading to a lower mean‑squared error (MSE). Classic examples include:

    • Ridge regression – shrinking ordinary least‑squares coefficients toward zero introduces bias but often reduces prediction error when predictors are collinear.
    • James-Stein estimator – for estimating a multivariate normal mean, the estimator $\hat{\theta}_{JS} = \left(1 - \frac{p-2}{\|\mathbf{x}\|^2}\right)\mathbf{x}$ is biased yet dominates the unbiased sample mean $\mathbf{x}$ in terms of MSE for dimensions $p \ge 3$.
    • Shrinkage estimators for variance – such as the Ledoit‑Wolf estimator, which pulls the sample covariance matrix toward a structured target (e.g., the identity matrix) to obtain a more stable, albeit biased, estimate of the true covariance.
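    A scalar toy version of shrinkage illustrates the tradeoff. In this sketch, the shrinkage factor `c` is an "oracle" choice that uses the true $\mu$ (unknown in practice, so this is purely illustrative), chosen to minimize the MSE of $c\bar{x}$:

    ```python
    import random

    # Bias-variance tradeoff sketch: c * xbar (0 < c < 1) is biased for mu,
    # but a well-chosen c yields a lower mean-squared error than xbar.
    random.seed(7)
    mu, sigma, n, reps = 1.0, 3.0, 5, 50000
    c = mu ** 2 / (mu ** 2 + sigma ** 2 / n)  # oracle factor: requires mu!

    mse_plain, mse_shrunk = 0.0, 0.0
    for _ in range(reps):
        xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
        mse_plain += (xbar - mu) ** 2       # unbiased estimator's error
        mse_shrunk += (c * xbar - mu) ** 2  # biased, shrunk estimator's error
    mse_plain /= reps    # close to sigma^2/n = 1.8
    mse_shrunk /= reps   # substantially smaller, despite the bias
    ```

    Practical shrinkage methods (ridge, James-Stein, Ledoit-Wolf) estimate the shrinkage intensity from the data rather than from the unknown parameter.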

    These techniques illustrate that bias is not inherently undesirable; what matters is the estimator’s expected loss under the criterion of interest (often MSE or predictive error). In practice, analysts should:

    1. Derive or simulate the expectation of the proposed estimator to quantify bias.
    2. Assess variance (or more generally, the estimator’s sampling distribution) to understand the bias‑variance tradeoff.
    3. Consider alternative loss functions—if the goal is prediction rather than parameter recovery, a biased estimator with lower prediction error may be preferred.
    4. Use resampling methods (bootstrap, jackknife) or analytical bias corrections when a nearly unbiased estimator is required.
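    Step 4 can be sketched with a plain bootstrap bias estimate: resample the observed data with replacement, recompute the estimator on each resample, and take the difference between the average bootstrap value and the original estimate as an approximation of the bias. Illustrative numbers throughout:

    ```python
    import random
    import statistics

    # Bootstrap bias estimate for the n-denominator (biased) sample variance.
    random.seed(8)
    data = [random.gauss(0.0, 2.0) for _ in range(15)]

    def biased_var(x):
        m = statistics.mean(x)
        return sum((xi - m) ** 2 for xi in x) / len(x)  # divides by n

    theta_hat = biased_var(data)
    boot = []
    for _ in range(5000):
        resample = [random.choice(data) for _ in data]  # sample with replacement
        boot.append(biased_var(resample))

    bias_est = statistics.mean(boot) - theta_hat  # negative for this estimator
    corrected = theta_hat - bias_est              # simple bias-corrected estimate
    ```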

    In summary, while the sample variance with denominator $n$ and the sample standard deviation are textbook examples of biased estimators, bias appears in many guises: from finite-sample MLEs and non-linear transformations of unbiased estimators, to estimators built on truncated or censored samples, to purposefully shrunk estimators that sacrifice unbiasedness for improved overall performance. Recognizing the source and magnitude of bias, and weighing it against variance, enables statisticians to choose the most appropriate tool for their specific inferential or predictive task.
