Moment Generating Functions

\(M_X(t) = E[e^{tX}]\) packs every moment of X into one function — differentiate at 0 to pull them out, and multiply MGFs to handle sums of independent variables.

By the end you'll be able to pull \(E[X]\) and \(\text{Var}(X)\) out of an MGF by differentiating, and use the product rule to find the MGF of a sum of independent variables.

Predict: if you drag λ up — a faster-decaying exponential with a smaller mean — does the tangent's slope at t = 0 get steeper or flatter? Drag the slider below to check.

This is \(M_X(t) = \lambda/(\lambda-t)\) for the Exponential(λ). Drag λ to reshape the curve and watch it blow up as t approaches λ — that's the edge of where the MGF converges. The dashed tangent line at t = 0 has slope \(M_X'(0) = E[X]\) — read its value right next to the point on the diagram. Hover the curve or either point for exact readouts.

[Interactive figure: \(M_X(t) = \lambda/(\lambda - t)\) for Exponential(λ), showing the curve, the tangent at t = 0, and the asymptote at t = λ. At the default rate λ = 1.00, the tangent slope is \(M_X'(0) = 1/\lambda = 1.000 = E[X]\).]

The moment generating function \(M_X(t) = E[e^{tX}]\) generates moments via \(M_X^{(n)}(0) = E[X^n]\) and, when it is finite on an open interval around 0, uniquely determines the distribution. That makes it the standard tool for identifying the distribution of a sum of independent random variables.

Formal

\(M_X(t) = E[e^{tX}]\), defined wherever the sum or integral converges; the results below need it to be finite on an open interval around \(t=0\). Differentiate n times and set \(t=0\) to recover the raw moments: \(M_X'(0) = E[X]\) (the tangent slope at t = 0 in the diagram above) and \(M_X''(0) = E[X^2]\), so \(\text{Var}(X) = M_X''(0) - [M_X'(0)]^2\). If two variables share the same MGF on an interval around 0, they share the same distribution. That is uniqueness, and it is what lets you identify a sum's distribution by multiplying MGFs and recognizing the result. For independent \(X, Y\): \(M_{X+Y}(t) = M_X(t)\cdot M_Y(t)\).
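To see the differentiate-at-zero recipe in action, here is a minimal numerical sketch. It approximates \(M_X'(0)\) and \(M_X''(0)\) with central differences instead of differentiating by hand. The function names `mgf_exponential` and `moments_from_mgf`, and the step size `h`, are illustrative choices, not part of the lesson.

```python
def mgf_exponential(t: float, lam: float) -> float:
    """MGF of Exponential(lam); only finite for t < lam."""
    if t >= lam:
        raise ValueError("MGF diverges for t >= lam")
    return lam / (lam - t)


def moments_from_mgf(mgf, h: float = 1e-4):
    """Approximate M'(0) and M''(0) with central finite differences.

    h is an assumed step size: small enough for accuracy, large enough
    to avoid floating-point cancellation.
    """
    first = (mgf(h) - mgf(-h)) / (2 * h)                     # ~ E[X]
    second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / (h * h)     # ~ E[X^2]
    return first, second


lam = 1.0  # same rate as the diagram's default
m1, m2 = moments_from_mgf(lambda t: mgf_exponential(t, lam))
variance = m2 - m1 ** 2  # Var(X) = M''(0) - [M'(0)]^2
print(f"E[X] ~ {m1:.4f}, E[X^2] ~ {m2:.4f}, Var(X) ~ {variance:.4f}")
```

With λ = 1 the three values should land close to 1, 2, and 1, in line with \(1/\lambda\), \(2/\lambda^2\), and \(1/\lambda^2\). Any MGF that is finite near 0 can be passed in place of the exponential one.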

Applied

Picture an insurer modeling total claims as the sum of many independent claim amounts. Convolving densities directly is hard, so instead you multiply their MGFs. If each claim is Exponential(λ), the sum of \(n\) independent claims has MGF \([\lambda/(\lambda-t)]^n\) for \(t < \lambda\). That is the MGF of a Gamma distribution with shape \(n\) and rate λ, so by uniqueness the total is Gamma(\(n\), λ).
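The product rule can be checked by simulation. The sketch below draws sums of independent Exponential(λ) claims, estimates \(E[e^{tS}]\) by Monte Carlo, and compares it with \([\lambda/(\lambda-t)]^n\). The values of `n`, `lam`, `t`, the trial count, and the seed are arbitrary assumptions for illustration.

```python
import math
import random


def product_mgf(t: float, n: int, lam: float) -> float:
    """[lam / (lam - t)]^n: product of n Exponential(lam) MGFs, for t < lam."""
    return (lam / (lam - t)) ** n


def empirical_mgf_of_sum(t, n, lam, trials, rng):
    """Monte Carlo estimate of E[exp(t * S)], where S is the sum of
    n independent Exponential(lam) draws."""
    total = 0.0
    for _ in range(trials):
        s = sum(rng.expovariate(lam) for _ in range(n))
        total += math.exp(t * s)
    return total / trials


rng = random.Random(0)      # fixed seed so the run is repeatable
n, lam, t = 3, 2.0, 0.5     # t < lam/2 keeps the estimator's variance finite
estimate = empirical_mgf_of_sum(t, n, lam, trials=100_000, rng=rng)
exact = product_mgf(t, n, lam)
print(f"simulated {estimate:.3f} vs product of MGFs {exact:.3f}")
```

The two numbers should agree to within sampling noise. Keeping `t` well below λ matters: as `t` approaches λ the estimate becomes dominated by rare large sums and converges slowly.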

Worked example

\(X \sim\) Exponential(λ): \(M_X(t) = \lambda/(\lambda - t)\), \(t < \lambda\). \(M_X'(0) = 1/\lambda = E[X]\). \(M_X''(0) = 2/\lambda^2 = E[X^2]\). \(\text{Var}(X) = 2/\lambda^2 - 1/\lambda^2 = 1/\lambda^2\).
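For reference, the two derivatives behind those values come from the chain rule applied to \(\lambda(\lambda - t)^{-1}\). Written out step by step:

```latex
\begin{aligned}
M_X(t)   &= \lambda(\lambda - t)^{-1}, \\
M_X'(t)  &= \lambda(\lambda - t)^{-2}  = \frac{\lambda}{(\lambda - t)^2}
          &&\Rightarrow\; M_X'(0)  = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}, \\
M_X''(t) &= 2\lambda(\lambda - t)^{-3} = \frac{2\lambda}{(\lambda - t)^3}
          &&\Rightarrow\; M_X''(0) = \frac{2\lambda}{\lambda^3} = \frac{2}{\lambda^2}.
\end{aligned}
```

Each differentiation brings down the current exponent and a factor of \(-1\) from the inner \(-t\), and the two signs cancel, which is why every derivative stays positive for \(t < \lambda\).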

Your turn

Same family, a different rate: \(X \sim\) Exponential(λ=2). \(M_X(t) = 2/(2-t)\), so \(M_X'(t) = 2/(2-t)^2\) and \(M_X'(0) = 2/4 = 0.5 = E[X]\). \(M_X''(t) = 4/(2-t)^3\), so \(M_X''(0) = 4/8 = 0.5 = E[X^2]\). Now finish it: \(\text{Var}(X) = E[X^2] - (E[X])^2 = 0.5 - \) ____

Reveal the answer

\(\text{Var}(X) = 0.5 - 0.5^2 = 0.5 - 0.25 = \) 0.25 — matching the general formula \(1/\lambda^2 = 1/2^2 = 0.25\). Set λ = 2 on the slider above and check it against the tangent point's tooltip.

More info — why differentiating pulls out moments

Expand \(e^{tX}\) as a Taylor series: \(e^{tX} = 1 + tX + \dfrac{(tX)^2}{2!} + \dfrac{(tX)^3}{3!} + \cdots\). Take the expectation term by term (linearity, plus an interchange of sum and expectation that is valid when the MGF is finite near 0): \(M_X(t) = 1 + tE[X] + \dfrac{t^2}{2!}E[X^2] + \cdots\). That is a power series in t whose \(t^n\) coefficient is \(E[X^n]/n!\). Differentiating n times removes every lower-order term and turns \(t^n/n!\) into 1, and setting t = 0 removes every higher-order term, leaving exactly \(M_X^{(n)}(0) = E[X^n]\). See the Harvard Stat 110 link in Dive deeper below for the full derivation.
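The series view can also be checked numerically. The sketch below rebuilds the Exponential(λ) MGF from its moments, using the standard result \(E[X^n] = n!/\lambda^n\) (consistent with \(1/\lambda\) and \(2/\lambda^2\) above, but taken as given here). The helper names and the values of `lam` and `t` are illustrative assumptions.

```python
import math


def raw_moment_exponential(n: int, lam: float) -> float:
    """E[X^n] = n! / lam^n for X ~ Exponential(lam) (assumed, not derived here)."""
    return math.factorial(n) / lam ** n


def mgf_partial_sum(t: float, lam: float, terms: int) -> float:
    """Sum of (t^n / n!) * E[X^n] for n = 0 .. terms - 1."""
    return sum(
        t ** n / math.factorial(n) * raw_moment_exponential(n, lam)
        for n in range(terms)
    )


lam, t = 2.0, 0.5            # the series only converges for |t| < lam
closed_form = lam / (lam - t)
for terms in (2, 5, 10, 30):
    approx = mgf_partial_sum(t, lam, terms)
    print(f"{terms:>2} terms: {approx:.10f}  (closed form {closed_form:.10f})")
```

Because the factorials cancel, each term is just \((t/\lambda)^n\), so the partial sums are a geometric series converging to \(\lambda/(\lambda - t)\). That also shows why the MGF stops existing at t = λ.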

Check your understanding

Question 1 of 4

Which property makes the moment generating function useful for sums of independent random variables?

Question 2 of 4

If \(M_X(t) = \lambda/(\lambda-t)\), how do you find \(E[X]\)?

Question 3 of 4

For ANY valid moment generating function \(M_X(t)\), what must \(M_X(0)\) equal?

Question 4 of 4

\(X\) has MGF \(M_X(t) = \lambda/(\lambda-t)\), with \(M_X'(0) = 1/\lambda\) and \(M_X''(0) = 2/\lambda^2\). Using \(\text{Var}(X) = E[X^2] - (E[X])^2\), what is \(\text{Var}(X)\)?

Recap

\(M_X(t) = E[e^{tX}]\); differentiate n times and set t = 0 to get the n-th raw moment: \(M_X^{(n)}(0) = E[X^n]\).
\(E[X] = M_X'(0)\), and \(\text{Var}(X) = M_X''(0) - [M_X'(0)]^2\).
Two distributions with the same MGF near t = 0 are the same distribution (uniqueness) — the standard way to identify a sum's distribution.
For independent X and Y: \(M_{X+Y}(t) = M_X(t)\cdot M_Y(t)\).

Dive deeper

StatQuest: Moment Generating Functions. An intuitive first look at what an MGF does.
Harvard Stat 110, Lecture 17: Moment Generating Functions. See moments extracted from an MGF rigorously.
jbstatistics: The Moment Generating Function. Work MGF examples for the named distributions.
