Expectation, Variance & Linearity

By the end of this page you'll be able to compute E[X] and Var(X) for a discrete or continuous distribution, and use the rules for the transform aX + b to find E[aX+b] and Var(aX+b) without redoing the sum.

Predict: if you drag more weight onto x = 4, which way does the balance point E[X] move — and does the ±1 SD band get wider or narrower? Then drag a slider below to check. Weights auto-normalize to sum to 1; hover a bar for its exact value and probability.

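[Interactive: an "Original distribution" panel with readouts for E[X], E[X²], Var(X), and SD(X), and a "Linear transform aX + b" panel with sliders for a and b and readouts for E[aX+b], Var(aX+b), and SD(aX+b).]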

You'll use two numbers to summarize almost any distribution: the expectation $E[X]$ and the variance $\mathrm{Var}(X) = E[X^2] - (E[X])^2$. Both follow simple rules under the transform $aX+b$, so you never need to redo the sum from scratch.

Intuitive

Picture $E[X]$ as the long-run average you'd see if you repeated the experiment forever. It's the balance point of the distribution, exactly like the fulcrum in the interactive above. The standard deviation, $\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)}$, tells you how far outcomes typically stray from that balance point: a small value means results cluster near the mean, a large one means they're spread out and unpredictable. That's the green ±1 SD band around the fulcrum.
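To make the "long-run average" picture concrete, here is a minimal simulation sketch. It assumes a fair six-sided die and a fixed random seed; the helper name `running_average` is made up for illustration.

```python
import random

def running_average(num_rolls, seed=0):
    """Average of num_rolls simulated fair-die rolls (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(rng.randint(1, 6) for _ in range(num_rolls))
    return total / num_rolls

# The sample average should settle near E[X] = 3.5 as the number of rolls grows.
for n in (10, 1_000, 100_000):
    print(n, running_average(n))
```

With only 10 rolls the average can land well away from 3.5; with 100,000 it should sit very close to it.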

Formal

$E[X] = \sum_x x\,p(x)$ (discrete) or $\int x f(x)\,dx$ (continuous). For any function $g$, $E[g(X)] = \sum_x g(x)p(x)$ or $\int g(x)f(x)\,dx$. This is the law of the unconscious statistician (LOTUS), and it's how you'll get $E[X^2]$ without finding $X^2$'s own distribution first. Variance is defined as $\mathrm{Var}(X) = E[(X-E[X])^2]$; expanding the square, with $E[X]$ treated as a constant, gives $E[X^2] - 2E[X]\,E[X] + (E[X])^2 = E[X^2] - (E[X])^2$. Linearity: $E[aX+b] = aE[X]+b$ for any constants $a, b$. Variance is not linear: $\mathrm{Var}(aX+b) = a^2\mathrm{Var}(X)$. Shifting by $b$ doesn't change spread, while scaling by $a$ stretches the spread (the SD) by $|a|$, hence the $a^2$ in variance.
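The discrete formulas above can be sketched in a few lines of Python. This is an illustration, not a library API: the helpers `expectation` and `variance`, the example pmf, and the choice $a=3$, $b=5$ are all assumptions made here. Exact fractions are used so the two rules can be checked without rounding error.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E[g(X)] by LOTUS: sum g(x) * p(x) over the support."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[X^2] - (E[X])^2."""
    mean = expectation(pmf)
    return expectation(pmf, lambda x: x * x) - mean ** 2

# Hypothetical pmf chosen for illustration; probabilities sum to 1.
pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}
a, b = 3, 5

# Distribution of Y = aX + b: same probabilities, transformed values.
pmf_y = {a * x + b: p for x, p in pmf.items()}

print(expectation(pmf), variance(pmf))      # 5/3 5/9
print(expectation(pmf_y), variance(pmf_y))  # 10 5
```

Recomputing from the transformed pmf agrees with the shortcuts: $3 \cdot \tfrac{5}{3} + 5 = 10$ and $3^2 \cdot \tfrac{5}{9} = 5$.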

Applied

Say a claim severity $X$ has $E[X] = \$2{,}000$, and you apply a 10% expense loading plus a flat \$50 fee. Your loaded premium is $Y = 1.1X + 50$, so $E[Y] = 1.1(2000) + 50 = \$2{,}250$. If $\mathrm{Var}(X) = 1{,}000{,}000$ (in dollars squared), then $\mathrm{Var}(Y) = 1.1^2 \times 1{,}000{,}000 = 1{,}210{,}000$, also in dollars squared, so $\mathrm{SD}(Y) = 1.1 \times \$1{,}000 = \$1{,}100$. Notice the \$50 fee never touches variance; only the 1.1 multiplier does.
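The same premium calculation, as a small sketch. The function name `transform_moments` is hypothetical, and the loading is stored as an exact fraction only to keep the arithmetic free of floating-point noise.

```python
from fractions import Fraction
import math

def transform_moments(mean_x, var_x, a, b):
    """Return (E[aX+b], Var(aX+b)) from E[X] and Var(X)."""
    return a * mean_x + b, a ** 2 * var_x

loading = Fraction("1.1")  # 10% expense loading, kept exact
mean_y, var_y = transform_moments(2000, 1_000_000, loading, 50)
sd_y = math.sqrt(var_y)    # SD is back in dollars

print(mean_y, var_y, sd_y)
```

Changing the flat fee from 50 to any other value moves `mean_y` but leaves `var_y` and `sd_y` unchanged.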

Worked example

$X$ uniform on $\{1,\dots,6\}$ (fair die). $E[X] = \dfrac{1+2+\dots+6}{6} = 21/6 = 3.5$. $E[X^2] = \dfrac{1+4+9+16+25+36}{6} = 91/6$. $\mathrm{Var}(X) = 91/6 - 3.5^2 = 182/12 - 147/12 = 35/12 \approx 2.9167$.

Now you try — faded example

$X$ is uniform on $\{1,2,3,4\}$ (a fair 4-sided spinner). Find $E[X]$ and $\mathrm{Var}(X)$.

Step 1 — $E[X] = \dfrac{1+2+3+4}{4} = \dfrac{10}{4} = 2.5$.

Step 2 — $E[X^2] = \dfrac{1^2+2^2+3^2+4^2}{4} = \dfrac{30}{4} = 7.5$.

Step 3 — finish it: $\mathrm{Var}(X) = E[X^2] - (E[X])^2 = 7.5 - 2.5^2 = $

Reveal the answer

$7.5 - 6.25 = 1.25$, so $\mathrm{SD}(X) = \sqrt{1.25} \approx 1.118$.
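If you want to check an answer like this by machine, one option is Python's standard `statistics` module. This shortcut works here only because the four outcomes are equally likely, so the list of outcomes can stand in for the distribution itself.

```python
import statistics

outcomes = [1, 2, 3, 4]  # equally likely values of the spinner

mean = statistics.fmean(outcomes)
var = statistics.pvariance(outcomes)  # population variance: divides by n
sd = statistics.pstdev(outcomes)      # square root of the population variance

print(mean, var, round(sd, 3))  # 2.5 1.25 1.118
```

Use `pvariance`, not `variance`: the latter divides by $n-1$ (the sample variance) and would return about 1.667 here, which is not $\mathrm{Var}(X)$.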

More info — why Var(aX+b) picks up a² but not b

Here's another way to see it: shifting every outcome by $b$ (adding a constant) slides the whole distribution sideways without changing how spread out it is around its new mean. The balance point moves by $b$, but the distances from it don't change, so variance is unaffected. Scaling by $a$, though, stretches every distance from the mean by a factor of $|a|$; since variance averages squared distances, that stretch gets squared too, giving $a^2$. Try it yourself: set $a=2, b=0$ in the interactive above and watch $\mathrm{Var}(aX+b)$ jump to 4× the original, then set $a=1$ and slide $b$ anywhere to see it stay untouched. The StatQuest and Harvard Stat 110 links under Dive deeper below both re-derive this more slowly.
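The same argument can be written out algebraically. This sketch writes $\mu$ for $E[X]$ as shorthand and uses only the definition of variance together with $E[aX+b] = a\mu + b$:

$$
\begin{aligned}
\mathrm{Var}(aX+b) &= E\big[(aX + b - E[aX+b])^2\big] \\
&= E\big[(aX + b - a\mu - b)^2\big] \\
&= E\big[a^2 (X-\mu)^2\big] \\
&= a^2\,E\big[(X-\mu)^2\big] = a^2\,\mathrm{Var}(X).
\end{aligned}
$$

The $b$ cancels in the second line, which is why the shift never reaches the final answer.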

Check your understanding

Question 1 of 4

X is uniform on {1,2,3,4,5,6} (a fair die). What are E[X] and Var(X)?

Question 2 of 4

If Var(X) = 4, what is Var(3X + 5)?

Question 3 of 4

Which statement about Var(X) is always true?

Question 4 of 4

X takes values 1, 2, 3 with P(X=1)=p, P(X=2)=2p, P(X=3)=3p. Using the fact that probabilities over the sample space must sum to 1, find E[X].

Recap

Expectation $E[X] = \sum_x x\,p(x)$ (or $\int x f(x)\,dx$) — the probability-weighted balance point.
Variance $\mathrm{Var}(X) = E[X^2] - (E[X])^2$ — average squared distance from that balance point; $\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)}$.
LOTUS: $E[g(X)] = \sum_x g(x)p(x)$ lets you compute $E[X^2]$ directly, no need for $X^2$'s own distribution.
Linearity: $E[aX+b] = aE[X]+b$, but $\mathrm{Var}(aX+b) = a^2\mathrm{Var}(X)$ — shifts ($b$) don't change spread, scaling ($a$) does, and it's squared.

Dive deeper

StatQuest: The Mean, Variance and Standard Deviation. See how variance and standard deviation measure spread.
Seeing Theory: Basic Probability (Expectation). Play with expected value on a rollable die.
Harvard Stat 110: Expectation (Blitzstein). Rigorous coverage of expectation and its linearity.
