Random Variables, pmf/pdf & the CDF
A random variable turns outcomes into numbers; its pmf or pdf says where probability sits, and its CDF — non-decreasing from 0 to 1 — accumulates it.
By the end you'll be able to tell whether a variable needs a pmf or a pdf, read probabilities off a CDF, and check whether a candidate F could even be valid.
Predict: before you draw, which face's bar do you expect to be tallest once you've rolled 100 times? Click "Draw 100" and see if you were right.
Roll a fair die and watch the sample build up. The left panel is the empirical pmf (how often each face actually appeared); the right panel is the empirical CDF climbing from 0 to 1. The dashed orange line marks the theoretical shape — draw enough and the blue converges to it. Hover or focus a bar for its exact count.
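If you'd like to reproduce the experiment outside the widget, here is a minimal sketch using only Python's standard library. The function name `empirical_pmf_cdf` and the fixed seed are illustrative choices, not part of the interactive panel.

```python
import random
from collections import Counter

def empirical_pmf_cdf(n_rolls, seed=0):
    """Roll a fair six-sided die n_rolls times; return (pmf, cdf) dicts keyed by face."""
    rng = random.Random(seed)  # seeded so repeated runs give the same sample
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    # Empirical pmf: relative frequency of each face (0 if a face never appeared).
    pmf = {face: counts[face] / n_rolls for face in range(1, 7)}
    # Empirical CDF: running total of the pmf, face by face.
    cdf = {}
    running = 0.0
    for face in range(1, 7):
        running += pmf[face]
        cdf[face] = running
    return pmf, cdf

for n in (100, 10_000):
    pmf, cdf = empirical_pmf_cdf(n)
    worst_gap = max(abs(pmf[face] - 1 / 6) for face in pmf)
    print(f"n={n}: largest |empirical - 1/6| = {worst_gap:.4f}")
```

With more rolls the largest gap from the theoretical 1/6 should typically shrink, which is the convergence the dashed line illustrates.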
A random variable \(X\) maps outcomes to numbers. If \(X\) is discrete, it has a pmf; if it's continuous, a pdf. Either way it has a cumulative distribution function \(F(x) = P(X \le x)\) that's non-decreasing with limits 0 and 1.
Think of \(X\) as a labeling machine: you feed it whatever happened in the real world (a coin flip, a claim amount, a lifetime) and it hands back a number you can do arithmetic with. Ask \(F(x)\) "what's the chance the outcome is at most \(x\)?" — the answer only ever climbs (or stays flat) as you raise \(x\), because you're accumulating more probability, starting at 0 and ending at 1.
Discrete \(X\): pmf \(p(x) = P(X = x)\) with \(p(x) \ge 0\) and \(\sum_x p(x) = 1\); \(F(x) = \sum_{t \le x} p(t)\) is a step function. Continuous \(X\): pdf \(f(x) \ge 0\) with \(\int_{-\infty}^{\infty} f(x)\,dx = 1\), \(P(a \le X \le b) = \int_a^b f(x)\,dx\), \(P(X = x) = 0\) for any single point, and \(F(x) = \int_{-\infty}^{x} f(t)\,dt\), so \(f = F'\) wherever \(F\) is differentiable. The support of \(X\) is the set of values where the pmf (or pdf) is positive.
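The discrete half of these definitions can be checked mechanically. The sketch below uses exact fractions for a fair die; the helper names `is_valid_pmf` and `cdf_from_pmf` are made up for illustration.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """A pmf needs non-negative values that sum to exactly 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

def cdf_from_pmf(pmf):
    """Return F with F(x) = sum of p(t) over support points t <= x."""
    def F(x):
        return sum((p for t, p in pmf.items() if t <= x), Fraction(0))
    return F

die = {face: Fraction(1, 6) for face in range(1, 7)}
F = cdf_from_pmf(die)

print(is_valid_pmf(die))          # True
print(F(0), F(3), F(3.5), F(6))   # 0 1/2 1/2 1
```

Note that `F(3)` and `F(3.5)` agree: between support points the CDF stays flat, which is what makes it a step function.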
Say you're pricing an auto policy. You'd model annual claim count as discrete (0, 1, 2, … claims) and claim severity (dollar loss given a claim) as continuous. \(F(x)\) for severity tells you what fraction of claims are \(\le \$x\) — exactly what you need to set a deductible or a policy limit.
Let \(X\) have pdf \(f(x) = 3x^2\) on \([0,1]\) (and 0 elsewhere). Find \(P(X > 0.5)\).
Step 1: integrate to get the CDF. \(F(x) = \int_0^x 3t^2\,dt = \) ____.
Step 2: use the complement. \(P(X > 0.5) = 1 - F(0.5) = \) ____.
Check your steps
Step 1: \(F(x) = x^3\) for \(0 \le x \le 1\) (power rule on the integral; as a check, \(\frac{d}{dx}x^3 = 3x^2 = f(x)\)).
Step 2: \(F(0.5) = 0.5^3 = 0.125\), so \(P(X > 0.5) = 1 - 0.125 = 0.875\).
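As a sanity check on the calculus, you can approximate both integrals numerically. This is a rough sketch using a midpoint Riemann sum; the `integrate` helper and its slice count are illustrative assumptions, not a recommended numerical method.

```python
def pdf(x):
    """The density from the worked example: 3x^2 on [0, 1], zero elsewhere."""
    return 3 * x**2 if 0 <= x <= 1 else 0.0

def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum of f over [a, b] using n equal-width slices."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

total = integrate(pdf, 0, 1)    # should be close to 1 for a valid pdf
tail = integrate(pdf, 0.5, 1)   # approximates P(X > 0.5)
print(round(total, 6), round(tail, 6))
```

The two printed values should land very close to 1 and 0.875, matching the requirement that the pdf integrates to 1 and the answer from Step 2.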
More info — reading a CDF backwards
You can also run \(F\) in reverse: given a probability \(p\), the \(p\)-th quantile of \(X\) is the value \(x\) with \(F(x) = p\) (the median is the 0.5 quantile). When \(F\) jumps past \(p\) without ever equalling it, as a step-function CDF like the die's can, the usual convention is to take the smallest \(x\) with \(F(x) \ge p\). This is exactly what the empirical CDF panel above is building up to: once you've drawn enough rolls, you can read "where does the curve reach 0.5?" off the chart instead of computing it. The Seeing Theory link in Dive deeper below lets you watch this happen for several named distributions, not just the die.
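Here is a small sketch of reading a CDF backwards for two cases from this lesson: the fair die and the worked example with \(F(x) = x^3\). The function names are hypothetical, and the die version assumes the common "smallest \(x\) with \(F(x) \ge p\)" convention for step-function CDFs.

```python
from fractions import Fraction

def die_quantile(p):
    """Smallest face x with F(x) >= p for a fair die, for 0 < p <= 1."""
    running = Fraction(0)
    for face in range(1, 7):
        running += Fraction(1, 6)  # CDF steps up by 1/6 at each face
        if running >= p:
            return face
    raise ValueError("p must be in (0, 1]")

def cubic_quantile(p):
    """Inverse of F(x) = x**3 on [0, 1]: solve x**3 = p for x."""
    return p ** (1 / 3)

print(die_quantile(Fraction(1, 2)))    # 3
print(round(cubic_quantile(0.5), 4))   # 0.7937
```

For the continuous example the inverse is an ordinary function; for the die, many values of \(p\) map to the same face because the CDF is flat between jumps.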
Check your understanding
X has CDF F(x) = P(X ≤ x). Which property must F always satisfy?
X takes values 1, 2, and 3 with P(X=1) = 0.2, P(X=2) = 0.3, and P(X=3) = k. What must k equal for this to be a valid pmf?
Two fair dice are rolled and X is their sum. There are 36 equally likely outcomes; how many give X = 7, and what is P(X = 7)?
A random variable \(X\) assigns a number to every outcome. Discrete \(X\) has a pmf \(p(x) = P(X=x)\) that sums to 1; continuous \(X\) has a pdf \(f(x)\) that integrates to 1 and gives \(P(X=x)=0\) at any single point. Either way, the CDF \(F(x) = P(X \le x)\) is non-decreasing, runs from 0 to 1, and lets you get any interval probability as \(P(a < X \le b) = F(b) - F(a)\). For continuous \(X\) the endpoints carry no probability, so this also equals \(P(a \le X \le b)\); for discrete \(X\), add \(P(X = a)\) if you want to include the left endpoint.
Dive deeper
- StatQuest, Probability Distributions (PMF, PDF, CDF): tell PMFs, PDFs, and CDFs apart at a glance.
- Seeing Theory, Probability Distributions: interactively watch a CDF build up from a distribution.
- MIT 6.012, Definition of Random Variables: a precise lecture definition of a random variable.