Sums of Independent Variables & the CLT

Sums of independent variables add their means and variances and multiply their MGFs. However skewed each piece is, as long as its variance is finite, the standardized sum drifts toward the bell curve as n grows.

By the end you'll be able to find the mean, variance, and normal approximation for a sum of independent variables, and explain why the CLT works no matter how skewed the underlying distribution is.

Predict: before you draw, guess what happens to the spread of the histogram below if you push n from 30 up toward 100 — does it get wider, narrower, or stay the same? Then move the slider and check the SE readout.

Pick a base distribution (deliberately non-normal), choose a sample size n, then draw 2,000 independent trials. Each trial averages n draws from the base distribution; the histogram shows how those trial means pile up. The dashed orange curve is the theoretical Normal(μ, σ²/n) — watch the blue histogram hug it more tightly as n grows, regardless of how lumpy or skewed the base distribution is.

[Interactive figure: distribution of the sample mean across 2,000 trials, each averaging n draws. Blue histogram: sample means (2,000 trials). Dashed orange curve: theoretical Normal(μ, σ²/n). Readouts: base mean μ, base σ, and sampling-distribution SE = σ/√n.]
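
The experiment described above can be reproduced offline with a small simulation. This is a minimal sketch, not the code behind the interactive figure: it assumes an Exponential(1) base distribution (so μ = σ = 1), and the function name `simulate_sample_means` and the fixed seed are choices made here for illustration.

```python
import math
import random
import statistics

def simulate_sample_means(n, trials=2000, rate=1.0, seed=0):
    """Return `trials` sample means, each averaging n draws from Exponential(rate)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.expovariate(rate) for _ in range(n)) / n
        for _ in range(trials)
    ]

# Assumed base distribution: Exponential(rate=1) has mean 1 and standard deviation 1.
mu = sigma = 1.0

for n in (30, 100):
    means = simulate_sample_means(n)
    observed_se = statistics.stdev(means)   # spread of the 2,000 trial means
    theoretical_se = sigma / math.sqrt(n)   # the SE readout: sigma / sqrt(n)
    print(f"n={n:3d}  observed SE={observed_se:.4f}  theoretical SE={theoretical_se:.4f}")
```

The observed spread of the trial means should sit close to σ/√n for each n, which is the quantity the SE readout reports.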

For independent variables, means and variances add and MGFs multiply; the Central Limit Theorem says the standardized sum of n i.i.d. variables with finite variance converges in distribution to the standard normal, which justifies a normal approximation for large n.

Intuitive

Add up enough i.i.d. "noisy" contributions, whatever their individual shape (provided the variance is finite), and the bumps and skew wash out, leaving a smooth bell curve. This is why so many real-world totals (heights, measurement errors, aggregate claims) look approximately normal.
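
One way to see the skew washing out is to measure it directly. The sketch below assumes Exponential(1) contributions, chosen here only because they are strongly right-skewed; the helper names `sample_skewness` and `sums_of_exponentials` are illustrative. A sum of n such variables follows a Gamma distribution with skewness 2/√n, so the sample skewness should shrink toward 0 as n grows.

```python
import random
import statistics

def sample_skewness(xs):
    """Third standardized moment: roughly 0 for a symmetric distribution."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)

def sums_of_exponentials(n, trials=5000, seed=1):
    """Return `trials` sums, each adding n independent Exponential(1) draws."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)]

for n in (1, 10, 100):
    skew = sample_skewness(sums_of_exponentials(n))
    print(f"n={n:3d}  sample skewness={skew:.2f}  theory 2/sqrt(n)={2 / n ** 0.5:.2f}")
```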

Formal

For any \(X_i\) with finite means, \(E[S_n] = \sum E[X_i]\) (linearity of expectation needs no independence). For independent \(X_i\), \(\operatorname{Var}(S_n) = \sum \operatorname{Var}(X_i)\) and \(M_{S_n}(t) = \prod M_{X_i}(t)\), where \(M_X(t) = E[e^{tX}]\) is X's moment generating function (MGF). The Central Limit Theorem: if the \(X_i\) are i.i.d. with mean \(\mu\) and finite variance \(\sigma^2 > 0\), the standardized sum \(Z_n = (S_n - n\mu)/(\sigma\sqrt{n})\) converges in distribution to \(N(0,1)\) as \(n \to \infty\). This is a statement about the limiting shape of the distribution, not a claim that any finite sum is exactly normal.
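
A short derivation may help connect the MGF rule to the CLT. It is a sketch under an extra assumption the lesson does not state, namely that the MGFs exist in a neighbourhood of \(t = 0\). For independent \(X\) and \(Y\), the expectation of a product of functions of each factorizes:

\[
M_{X+Y}(t) = E\big[e^{t(X+Y)}\big] = E\big[e^{tX}e^{tY}\big] = E\big[e^{tX}\big]\,E\big[e^{tY}\big] = M_X(t)\,M_Y(t).
\]

Now write \(Y_i = (X_i - \mu)/\sigma\) (a standardized variable introduced here for illustration, with mean 0 and variance 1), so that \(Z_n = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} Y_i\). Applying the product rule n times and expanding \(M_Y\) to second order:

\[
M_{Z_n}(t) = \left[M_Y\!\left(\tfrac{t}{\sqrt{n}}\right)\right]^n = \left[1 + \frac{t^2}{2n} + o\!\left(\tfrac{1}{n}\right)\right]^n \longrightarrow e^{t^2/2} \quad \text{as } n \to \infty,
\]

and \(e^{t^2/2}\) is the MGF of \(N(0,1)\). Only the mean and variance of the base distribution survive in the limit, which is why its skew does not matter.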

Applied

You're pricing a book of n = 100 independent policies, each with mean 500 and variance 40,000 (so \(\sigma = 200\)). The CLT lets you model total claims S as approximately Normal(50,000, 4,000,000), so you can compute \(P(S > \text{threshold})\) with a normal table instead of wrestling with the exact distribution of the sum, which is typically unknown and messy.

Worked example

n = 100 i.i.d. claims, \(\mu = 500\), \(\sigma = 200\). \(E[S] = 100(500) = 50{,}000\), \(\operatorname{Var}(S) = 100(200^2) = 4{,}000{,}000\), \(\sigma_S = 2{,}000\). Standardize: \(Z = (53{,}000 - 50{,}000)/2{,}000 = 1.5\), so \(P(S > 53{,}000) \approx P(Z > 1.5) \approx\) 0.0668.
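
The same calculation can be checked without a normal table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (3.8+); the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

n = 100
mu, sigma = 500.0, 200.0            # per-claim mean and standard deviation

total_mean = n * mu                 # E[S] = n * mu
total_sd = math.sqrt(n * sigma**2)  # sigma_S = sqrt(n * sigma^2)

threshold = 53_000
z = (threshold - total_mean) / total_sd
# Upper-tail probability under the normal approximation to S
tail = 1 - NormalDist(total_mean, total_sd).cdf(threshold)

print(f"z = {z:.2f}, P(S > {threshold:,}) is approximately {tail:.4f}")
```

Changing `threshold` gives other tail probabilities under the same normal approximation.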

Your turn

Same 100 claims, but now find \(P(S < 47{,}000)\). You already have \(\mu_S = 50{,}000\) and \(\sigma_S = 2{,}000\) from above. Standardize:

\(Z = (47{,}000 - 50{,}000)/\) ____ \(=\) ____

Reveal the answer

\(Z = (47{,}000 - 50{,}000)/2{,}000 = -1.5\), so \(P(S < 47{,}000) \approx P(Z < -1.5) \approx\) 0.0668 — the same tail probability as the worked example above, just mirrored on the low side.

More info — why independence is doing all the work

Both rules here — variances adding and MGFs multiplying — lean entirely on independence. Recall from the covariance lesson that \(\operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y)\) in general. Independent variables have \(\operatorname{Cov}(X,Y) = 0\), so that last term vanishes and variances simply add. If your claims were correlated (a hurricane hitting many policies at once, say), you'd need the full covariance formula, and the plain CLT statement here wouldn't apply without adjustment. The 3Blue1Brown link in Dive deeper below animates why the sum's shape converges to Gaussian regardless of the base distribution.
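
To see what the covariance term does when independence fails, here is a small simulation. It is a sketch under assumed numbers: a hypothetical "common shock" model in which X and Y each equal a shared component (variance 4) plus their own independent noise (variance 1), so Var(X) = Var(Y) = 5, Cov(X, Y) = 4, and Var(X + Y) = 18 rather than the naive 10.

```python
import random
import statistics

rng = random.Random(42)
trials = 20_000

# Hypothetical common-shock model: both variables share `shock`.
shock = [rng.gauss(0, 2) for _ in range(trials)]  # shared component, variance 4
x = [c + rng.gauss(0, 1) for c in shock]          # Var(X) = 4 + 1 = 5
y = [c + rng.gauss(0, 1) for c in shock]          # Var(Y) = 4 + 1 = 5

def covariance(a, b):
    """Population-style sample covariance (divides by len(a))."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

var_sum = statistics.pvariance([a + b for a, b in zip(x, y)])
naive = statistics.pvariance(x) + statistics.pvariance(y)  # ignores correlation
full = naive + 2 * covariance(x, y)                        # full formula

print(f"Var(X+Y) simulated       = {var_sum:.2f}")
print(f"Var(X) + Var(Y)          = {naive:.2f}")
print(f"... + 2 Cov(X, Y)        = {full:.2f}")
```

The naive sum understates the variance of the total, while the full formula matches the simulated value up to floating-point error.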

Check your understanding

Question 1 of 4

An insurer sums n=100 i.i.d. claim amounts, each with mean 500 and variance 40,000. By the Central Limit Theorem, the total claims S is approximately:

Question 2 of 4

You sum n=100 i.i.d. claims, each highly right-skewed (most claims small, a few huge). Why is it still valid to approximate the total S as normal?

Question 3 of 4

X and Y are independent with Var(X)=4 and Var(Y)=9. Using Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y), what is Var(X+Y), and why does the covariance term disappear?

Question 4 of 4

For independent random variables X and Y, how do their moment generating functions combine for the sum X+Y, and why does this technique work?

Recap

  • For independent \(X_i\): means add, \(E[S_n] = \sum E[X_i]\); variances add, \(\operatorname{Var}(S_n) = \sum \operatorname{Var}(X_i)\); and MGFs multiply, \(M_{S_n}(t) = \prod M_{X_i}(t)\).
  • The Central Limit Theorem: the standardized sum \(Z_n = (S_n - n\mu)/(\sigma\sqrt{n})\) converges in distribution to \(N(0,1)\) as \(n \to \infty\), whatever shape the individual i.i.d. \(X_i\) have, provided their variance is finite.
  • This justifies treating \(S_n\) as approximately Normal(\(n\mu\), \(n\sigma^2\)) for large n — an approximation, not an exact statement for finite n.
  • Variances only add cleanly like this because independence makes the covariance term zero — with correlated variables you'd need the full \(\operatorname{Var}(X+Y) = \operatorname{Var}(X)+\operatorname{Var}(Y)+2\operatorname{Cov}(X,Y)\) formula instead.

Dive deeper
