Moment Generating Functions
\(M_X(t) = E[e^{tX}]\) packs every moment of X into one function — differentiate at 0 to pull them out, and multiply MGFs to handle sums of independent variables.
By the end you'll be able to pull \(E[X]\) and \(\text{Var}(X)\) out of an MGF by differentiating, and use the product rule to find the MGF of a sum of independent variables.
Predict: if you drag λ up — a faster-decaying exponential with a smaller mean — does the tangent's slope at t = 0 get steeper or flatter? Drag the slider below to check.
This is \(M_X(t) = \lambda/(\lambda-t)\) for the Exponential(λ). Drag λ to reshape the curve and watch it blow up as t approaches λ — that's the edge of where the MGF converges. The dashed tangent line at t = 0 has slope \(M_X'(0) = E[X]\) — read its value right next to the point on the diagram. Hover the curve or either point for exact readouts.
The moment generating function \(M_X(t) = E[e^{tX}]\) generates moments via \(M_X^{(n)}(0) = E[X^n]\) and, when it is finite on an interval around 0, uniquely determines the distribution, making it a standard tool for identifying sums of independent random variables.
\(M_X(t) = E[e^{tX}]\), defined wherever the sum or integral converges; it is useful when it is finite on an open interval around \(t=0\). Differentiate n times and set \(t=0\) to recover raw moments: \(M_X'(0) = E[X]\) (the tangent slope at t = 0 you saw in the diagram above) and \(M_X''(0) = E[X^2]\), so \(\text{Var}(X) = M_X''(0) - [M_X'(0)]^2\). If two variables have MGFs that agree on an open interval around 0, they have the same distribution. That uniqueness property is what lets you identify a sum's distribution just by multiplying MGFs: for independent \(X, Y\), \(M_{X+Y}(t) = M_X(t)\cdot M_Y(t)\).
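To see the differentiation rule in action without doing any calculus by hand, here is a minimal sketch that approximates \(M_X'(0)\) and \(M_X''(0)\) with central finite differences. The function names, the step size `h`, and the choice λ = 2 are illustrative assumptions, not part of the lesson.

```python
def exponential_mgf(t, lam):
    """MGF of Exponential(lam): lam / (lam - t), valid only for t < lam."""
    if t >= lam:
        raise ValueError("the MGF diverges for t >= lam")
    return lam / (lam - t)


def moments_from_mgf(mgf, h=1e-4):
    """Approximate E[X] = M'(0) and E[X^2] = M''(0) by central differences."""
    first = (mgf(h) - mgf(-h)) / (2 * h)                 # ~ M'(0)
    second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2  # ~ M''(0)
    return first, second


lam = 2.0  # assumed rate, chosen for illustration
mean, second_moment = moments_from_mgf(lambda t: exponential_mgf(t, lam))
variance = second_moment - mean ** 2

# For lam = 2 these should land close to 1/lam, 2/lam^2 and 1/lam^2.
print(mean, second_moment, variance)
```

The results are approximations, so expect small numerical error rather than exact values; shrinking `h` too far makes the second difference noisier, not better.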
Picture an insurer modeling total claims as the sum of many independent claim amounts. Convolving densities directly is hard, so instead you multiply their MGFs: if each claim is Exponential(λ), the sum of \(n\) independent claims has MGF \([\lambda/(\lambda-t)]^n\) for \(t < \lambda\). That is exactly the MGF of a Gamma distribution with shape \(n\) and rate λ, so by uniqueness the total is Gamma(\(n\), λ).
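If you'd like to check the product rule empirically, the sketch below simulates sums of independent Exponential(λ) claims and compares a Monte Carlo estimate of \(E[e^{tS}]\) with the product \([\lambda/(\lambda-t)]^n\). The values of `n`, `lam`, `t`, the trial count, and the seed are assumptions picked for illustration; `t` is kept below λ/2 so the estimate has finite variance.

```python
import math
import random


def empirical_mgf_of_sum(t, n, lam, trials=200_000, seed=0):
    """Monte Carlo estimate of E[exp(t * S)], S = sum of n Exponential(lam) draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        s = sum(rng.expovariate(lam) for _ in range(n))
        total += math.exp(t * s)
    return total / trials


def product_mgf(t, n, lam):
    """Product of n identical Exponential(lam) MGFs, valid for t < lam."""
    return (lam / (lam - t)) ** n


n, lam, t = 3, 2.0, 0.5  # illustrative choices
estimate = empirical_mgf_of_sum(t, n, lam)
exact = product_mgf(t, n, lam)

# The two numbers should agree up to Monte Carlo noise.
print(estimate, exact)
```

This only checks the MGF at a single value of t; the identification of the sum as Gamma rests on the two MGFs agreeing on a whole interval around 0.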
\(X \sim\) Exponential(λ): \(M_X(t) = \lambda/(\lambda - t)\), \(t < \lambda\). \(M_X'(0) = 1/\lambda = E[X]\). \(M_X''(0) = 2/\lambda^2 = E[X^2]\). \(\text{Var}(X) = 2/\lambda^2 - 1/\lambda^2 = 1/\lambda^2\).
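The values at 0 quoted for the Exponential(λ) come from differentiating with the power rule. Writing the MGF as \(\lambda(\lambda - t)^{-1}\), the intermediate steps are:

\[
M_X'(t) = \frac{\lambda}{(\lambda - t)^2}, \qquad M_X''(t) = \frac{2\lambda}{(\lambda - t)^3}
\]

\[
M_X'(0) = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}, \qquad M_X''(0) = \frac{2\lambda}{\lambda^3} = \frac{2}{\lambda^2}
\]

Each derivative picks up a factor of \(-1\) from the exponent and another \(-1\) from the chain rule on \((\lambda - t)\), so the signs cancel and every derivative stays positive for \(t < \lambda\).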
Same family, a different rate: \(X \sim\) Exponential(λ=2). \(M_X(t) = 2/(2-t)\), so \(M_X'(t) = 2/(2-t)^2\) and \(M_X'(0) = 2/4 = 0.5 = E[X]\). \(M_X''(t) = 4/(2-t)^3\), so \(M_X''(0) = 4/8 = 0.5 = E[X^2]\). Now finish it: \(\text{Var}(X) = E[X^2] - (E[X])^2 = 0.5 - \) ____
Reveal the answer
\(\text{Var}(X) = 0.5 - 0.5^2 = 0.5 - 0.25 = 0.25\), matching the general formula \(1/\lambda^2 = 1/2^2 = 0.25\). Set λ = 2 on the slider above and confirm that the tangent point's tooltip shows a slope of 0.5, the \(E[X]\) used here.
More info — why differentiating pulls out moments
Expand \(e^{tX}\) as a Taylor series: \(e^{tX} = 1 + tX + \dfrac{(tX)^2}{2!} + \dfrac{(tX)^3}{3!} + \cdots\). Take the expectation term by term (linearity, which is justified when the MGF is finite on an interval around 0): \(M_X(t) = 1 + tE[X] + \dfrac{t^2}{2!}E[X^2] + \cdots\). That's a power series in t whose n-th coefficient is \(E[X^n]/n!\), so differentiating n times and setting t = 0 kills the lower-order terms, cancels the \(n!\), and leaves exactly \(M_X^{(n)}(0) = E[X^n]\). See the Harvard Stat 110 link in Dive deeper below for the full derivation.
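You can watch the power series rebuild the MGF numerically. The sketch below assumes the standard fact that an Exponential(λ) variable has raw moments \(E[X^n] = n!/\lambda^n\) (the lesson itself only states the n = 1 and n = 2 cases), plugs them into the series, and compares the partial sum with \(\lambda/(\lambda - t)\). The function name and the chosen t, λ, and number of terms are illustrative.

```python
import math


def mgf_partial_sum(t, lam, terms):
    """Sum of (t^n / n!) * E[X^n] for n = 0 .. terms - 1.

    Assumes Exponential(lam) moments E[X^n] = n! / lam^n.
    """
    total = 0.0
    for n in range(terms):
        moment = math.factorial(n) / lam ** n
        total += t ** n / math.factorial(n) * moment
    return total


t, lam = 0.5, 2.0  # illustrative values with |t| < lam
closed_form = lam / (lam - t)
for terms in (2, 5, 10, 40):
    # Partial sums should move toward the closed form as terms grows.
    print(terms, mgf_partial_sum(t, lam, terms), closed_form)
```

Each term simplifies to \((t/\lambda)^n\), so the series is geometric and converges only for \(|t| < \lambda\), the same boundary where the curve in the diagram blows up.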
Check your understanding
Which property makes the moment generating function useful for sums of independent random variables?
If \(M_X(t) = \lambda/(\lambda-t)\), how do you find \(E[X]\)?
For ANY valid moment generating function \(M_X(t)\), what must \(M_X(0)\) equal?
\(X\) has MGF \(M_X(t) = \lambda/(\lambda-t)\), with \(M_X'(0) = 1/\lambda\) and \(M_X''(0) = 2/\lambda^2\). Using \(\text{Var}(X) = E[X^2] - (E[X])^2\), what is \(\text{Var}(X)\)?
Recap
- \(M_X(t) = E[e^{tX}]\); differentiate n times and set t = 0 to get the n-th raw moment: \(M_X^{(n)}(0) = E[X^n]\).
- \(E[X] = M_X'(0)\), and \(\text{Var}(X) = M_X''(0) - [M_X'(0)]^2\).
- Two distributions with the same MGF near t = 0 are the same distribution (uniqueness) — the standard way to identify a sum's distribution.
- For independent X and Y: \(M_{X+Y}(t) = M_X(t)\cdot M_Y(t)\).
Dive deeper
- StatQuest: Moment Generating Functions. An intuitive first look at what an MGF does.
- Harvard Stat 110, Lecture 17: Moment Generating Functions. See moments extracted from an MGF rigorously.
- jbstatistics: The Moment Generating Function. Work MGF examples for the named distributions.