Joint Distributions, Marginals & Independence
A joint density spreads probability over pairs (x, y); marginals integrate the other variable away, and independence is the special case where the joint splits cleanly into a product of those marginals.
By the end you'll be able to compute a marginal from a joint density, write a conditional density, and tell independence from dependence by checking whether the joint factors.
Predict: if you switch from the rectangle to the triangle support below, will the top and right strips still multiply back to the heatmap? Switch it and check the verdict.
The heatmap is the joint density f(x,y) on the unit square. The strip above sums out y to give the marginal fX(x); the strip to the right sums out x to give fY(y). Toggle the support below, hover a cell or a marginal strip for its exact value, or use the sliders to probe a point from the keyboard.
You've got a joint distribution when you describe two or more variables together; marginals integrate (or, for discrete variables, sum) out the others, and the conditional density \(f(y|x) = f(x,y)/f_X(x)\) lets you condition on one variable to describe the other. Independence holds iff the joint factors as the product of the marginals.
Picture a joint density as a landscape over the (x, y) plane — taller terrain means more likely combinations. Squash the whole landscape flat along the y-axis (add up all the "y-slices" at each x) and you get the marginal fX(x): the shape of X's distribution alone, forgetting Y ever existed. That's exactly what the top strip in the interactive is doing.
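The "add up the y-slices" picture can be sketched numerically. The snippet below is a minimal illustration, assuming the lesson's example density \(f(x,y) = 6xy^2\) on the unit square; the function names `joint` and `marginal_x` and the grid size `n` are made up for this sketch and are not part of the interactive.

```python
def joint(x, y):
    """Example joint density f(x, y) = 6xy^2 on the unit square, 0 elsewhere."""
    if 0.0 < x < 1.0 and 0.0 < y < 1.0:
        return 6.0 * x * y ** 2
    return 0.0


def marginal_x(x, n=2000):
    """Approximate f_X(x) = integral of f(x, y) dy with a midpoint Riemann sum."""
    dy = 1.0 / n
    # Add up thin y-slices at this fixed x, each weighted by its width dy.
    return sum(joint(x, (j + 0.5) * dy) for j in range(n)) * dy


for x in (0.25, 0.5, 0.9):
    print(f"x={x}: numeric f_X = {marginal_x(x):.4f}, exact 2x = {2 * x:.4f}")
```

The numeric column should track \(2x\) closely, which is the marginal this density gives when you do the integral by hand.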
\(f_X(x) = \int f(x,y)\,dy\) and \(f(y|x) = f(x,y) / f_X(x)\), defined wherever \(f_X(x) > 0\). You get independence exactly when \(f(x,y) = f_X(x)\cdot f_Y(y)\) for all \(x, y\), including points where the joint density is zero. That forces the support itself to be a product region (typically a rectangle). If the support couples \(x\) and \(y\) (e.g. \(0 < x < y < 1\)), independence is impossible even when the formula on the support looks like a product of single-variable pieces; you saw this in the "coupled" toggle above.
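One way to test the factorization condition is to compare the joint with the product of its marginals on a grid. The sketch below assumes the two densities used in this lesson (\(6xy^2\) on the unit square and \(6x\) on \(0 < x < y < 1\)); the helper `max_factorization_gap` and the grid size are hypothetical choices for illustration, and the marginals are only Riemann-sum approximations.

```python
def rect_density(x, y):
    """f(x, y) = 6xy^2 on the unit square (a product support)."""
    return 6.0 * x * y ** 2 if 0.0 < x < 1.0 and 0.0 < y < 1.0 else 0.0


def tri_density(x, y):
    """f(x, y) = 6x on the triangle 0 < x < y < 1 (a coupled support)."""
    return 6.0 * x if 0.0 < x < y < 1.0 else 0.0


def max_factorization_gap(f, n=200):
    """Largest |f(x, y) - f_X(x) * f_Y(y)| over an n-by-n grid of midpoints."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    grid = [[f(x, y) for y in pts] for x in pts]
    fx = [sum(row) * h for row in grid]  # integrate out y at each x
    fy = [sum(grid[i][j] for i in range(n)) * h for j in range(n)]  # integrate out x
    return max(abs(grid[i][j] - fx[i] * fy[j])
               for i in range(n) for j in range(n))


rect_gap = max_factorization_gap(rect_density)
tri_gap = max_factorization_gap(tri_density)
print(f"rectangle gap: {rect_gap:.4f}")  # near 0: joint matches the product
print(f"triangle gap:  {tri_gap:.4f}")   # clearly positive: it does not
```

A gap near zero is consistent with independence. A large gap rules it out, and for the triangle the mismatch shows up both inside the support and at points outside it, where the joint is zero but the product of marginals is not.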
Say you're an insurer modeling loss severity Y jointly with a policyholder attribute X (e.g. vehicle age). The marginal fY(y) prices the "average" policy; the conditional \(f(y|x)\) supports rate segmentation — pricing differently once you know X. Check whether \(f(x,y)\) factors and you'll know whether segmentation actually changes the loss model at all.
Same joint density, \(f(x,y) = 6xy^2\). You already found \(f_X(x) = 2x\). Use the conditional-density formula to find \(f(y|x)\):
\(f(y|x) = \dfrac{f(x,y)}{f_X(x)} = \dfrac{6xy^2}{2x} = \) ____
Reveal the answer
\(f(y|x) = 3y^2\) — the \(x\) cancels, so \(f(y|x) = f_Y(y)\) exactly. That's the signature of independence: knowing X told you nothing new about Y's distribution. Compare this to the "coupled" support in the interactive, where \(f(y|x)\) would still depend on x.
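The cancellation can also be checked numerically. This is a small sketch, assuming the same density \(6xy^2\) and the marginal \(f_X(x) = 2x\) from the exercise; the function names are illustrative only.

```python
def joint(x, y):
    """f(x, y) = 6xy^2 on the unit square, 0 elsewhere."""
    return 6.0 * x * y ** 2 if 0.0 < x < 1.0 and 0.0 < y < 1.0 else 0.0


def marginal_x(x):
    """f_X(x) = 2x on (0, 1), the marginal found earlier in the lesson."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0


def conditional_y_given_x(y, x):
    """f(y | x) = f(x, y) / f_X(x); only defined where f_X(x) > 0."""
    fx = marginal_x(x)
    if fx <= 0.0:
        raise ValueError("f(y | x) is undefined where f_X(x) = 0")
    return joint(x, y) / fx


y = 0.5
for x in (0.2, 0.5, 0.8):
    print(f"f(y={y} | x={x}) = {conditional_y_given_x(y, x):.4f}")
print(f"3 * y**2 = {3 * y ** 2:.4f}")
```

Every line should print the same value, because changing x does not move the conditional density of Y.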
More info — why checking the support matters as much as the algebra
It's tempting to just check whether \(f(x,y)\) factors into two single-variable pieces and call it done. But look back at the "coupled" toggle in the interactive: \(f(x,y) = 6x\) on \(0 < x < y < 1\) looks like it factors trivially (a function of \(x\) times a constant in \(y\)), yet X and Y are not independent, because the triangular support means the range of valid \(x\) depends on \(y\) (and vice versa). Always check that the support is a product region (a rectangle, here) before you trust the algebra. The MIT 14.310x link in Dive deeper below walks through more examples of this exact trap.
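To see the trap concretely, here is a worked check that uses only the coupled density and support stated above. Integrating over the triangle gives the two marginals:

\[
f_X(x) = \int_x^1 6x\,dy = 6x(1-x), \qquad f_Y(y) = \int_0^y 6x\,dx = 3y^2 .
\]

Their product does not reproduce the joint:

\[
f_X(x)\,f_Y(y) = 18x(1-x)y^2 \neq 6x = f(x,y) \quad \text{on } 0 < x < y < 1 .
\]

The mismatch is starkest outside the triangle. At \((x,y) = (0.8, 0.5)\) the joint density is \(0\), while the product of marginals is \(18 \cdot 0.8 \cdot 0.2 \cdot 0.25 = 0.72\).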
Check your understanding
Two random variables X and Y have joint density f(x,y) = 6xy² on 0<x<1, 0<y<1. What must be true if this joint density factors as f_X(x)·f_Y(y) for the given support?
Given a joint density f(x,y) defined on 0<x<y<1 (a triangular support), can X and Y be independent?
With f(x,y) = 6xy² on 0<x<1, 0<y<1, you derived f_Y(y) = 3y². What is f_Y(0.5)?
Recall from independence of events: A and B are independent iff P(A∩B) = P(A)·P(B). For X and Y with f(x,y) = 6xy² (independent, rectangular support), let A = {X > 0.5} and B = {Y > 0.5}. Are A and B independent events?
Recap
- A joint density/PMF \(f(x,y)\) describes two variables together; a marginal \(f_X(x) = \int f(x,y)\,dy\) integrates the other variable out.
- The conditional density is \(f(y|x) = f(x,y)/f_X(x)\).
- X and Y are independent iff \(f(x,y) = f_X(x)\cdot f_Y(y)\) for every \((x,y)\) and the support is a product region — a coupled support (like \(x < y\)) rules out independence regardless of the algebra.
Dive deeper
- Harvard Stat 110 — Lecture 19: Joint, Conditional, and Marginal Distributions Connect joint, marginal, and conditional distributions.
- MIT 6.012 — From the Joint to the Marginal See marginals obtained by summing out a variable.
- MIT 14.310x — Joint, Marginal, and Conditional Distributions Test joint independence by factoring into marginals.
Sources
- Joint Distributions, Marginals, and Independence