SQL Window Functions, Explained Without the Headache

You have a table of sales, and your boss asks a deceptively simple question: "For each salesperson, show me their monthly total and how that compares to the month before." You know how to get the monthly total — a GROUP BY handles that. But the moment you need to compare one row to another row, the grouping falls apart. You reach for a subquery, then another, and suddenly your query is a tangle you're afraid to touch.

This is the exact moment window functions were built for. They let you run calculations across rows while keeping every individual row intact. Once the idea clicks, a whole class of "I guess I'll export this to a spreadsheet" problems disappears.

A window function performs a calculation over a set of rows related to the current row — without collapsing those rows into one.

The one idea that makes it all click

The difference between GROUP BY and a window function comes down to a single question: do you want fewer rows, or the same rows with more information?

GROUP BY reduces. Ten sales rows for one person become a single row with a total. That's great when you only want the summary. But you lose the detail — you can no longer see the individual sales next to that total.

A window function keeps every row and simply adds a column computed from a "window" of related rows. Ten sales rows stay ten rows, but now each one can show the running total, the person's overall total, or where that sale ranks. Nothing is thrown away. That single property is why window functions feel like a superpower once you start using them.

Anatomy of the OVER clause

Every window function has the same shape: a function, then the keyword OVER, then a definition of the window. The OVER clause is where the real logic lives, and it has two main levers — PARTITION BY and ORDER BY.

SELECT
    salesperson,
    sale_month,
    amount,
    SUM(amount) OVER (
        PARTITION BY salesperson
        ORDER BY sale_month
    ) AS running_total
FROM sales;

PARTITION BY is like GROUP BY for the window — it splits the rows into independent buckets. Here, each salesperson gets their own running total that resets when the next person begins. ORDER BY inside OVER controls the order in which the function walks through each partition, which matters enormously for anything cumulative. Leave out ORDER BY and SUM gives you the grand total for the whole partition on every row; add it and the same SUM becomes a running total. Same function, completely different meaning, decided by one clause.

Ranking rows: ROW_NUMBER, RANK, and DENSE_RANK

The most common real-world use of window functions is ranking. Say you want each department's top three earners, or you need to de-duplicate rows by keeping only the most recent one per customer. Three closely related functions handle this, and the difference between them trips up almost everyone at first.

SELECT
    department,
    employee,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num,
    RANK()       OVER (PARTITION BY department ORDER BY salary DESC) AS rank_val,
    DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_val
FROM employees;

Here's how they differ when two people tie. ROW_NUMBER never ties — it always produces 1, 2, 3, 4 even if two salaries are identical, picking an arbitrary winner. RANK gives ties the same number but then skips — two people at rank 1, and the next person is rank 3. DENSE_RANK also ties but doesn't skip — 1, 1, 2.

Function	Two people tie for 1st	Next person
`ROW_NUMBER`	1, 2 (arbitrary)	3
`RANK`	1, 1	3
`DENSE_RANK`	1, 1	2

A practical rule of thumb: use ROW_NUMBER when you need to pick exactly one row per group (like the latest order), RANK when gaps in the numbering are meaningful (leaderboards), and DENSE_RANK when you want clean consecutive tiers.

Looking at neighboring rows with LAG and LEAD

Back to that opening question — comparing each month to the month before. This is where LAG and LEAD shine. LAG reaches backward to a previous row; LEAD reaches forward to a later one. Both respect the window's ordering, so "previous" means whatever your ORDER BY says it means.

SELECT
    sale_month,
    amount,
    LAG(amount) OVER (ORDER BY sale_month) AS prev_month,
    amount - LAG(amount) OVER (ORDER BY sale_month) AS change
FROM monthly_sales;

That change column — this month minus last month — is the kind of thing people used to build in a spreadsheet with a manual formula, copying it down a thousand rows. Now the database does it in one pass. You can pass a second argument to jump further back (LAG(amount, 12) for year-over-year), and a third for a default value when there's no previous row, which spares you from awkward NULL handling.

A quick word on frames: ROWS and RANGE

Once you're comfortable with running totals, you'll eventually want more control over exactly which rows the window includes — say, a three-month moving average instead of a total from the very beginning. That's what a frame clause does. It sits inside OVER after the ORDER BY and defines a sliding window relative to the current row.

SELECT
    sale_month,
    amount,
    AVG(amount) OVER (
        ORDER BY sale_month
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_3mo
FROM monthly_sales;

ROWS BETWEEN 2 PRECEDING AND CURRENT ROW tells the database to average the current month and the two before it — a classic smoothing technique for noisy data. You don't need frames for basic ranking or a simple cumulative sum (the default frame already handles those), but knowing they exist means you're never stuck when a report calls for "the last N rows" instead of "everything so far."

Where people get tripped up

The single most common mistake is forgetting that window functions run after the WHERE clause. You cannot filter on a window function's result inside WHERE, because at that stage the window hasn't been computed yet. If you want "only the top row per group," you have to compute the ROW_NUMBER in a subquery or CTE first, then filter on it in an outer query.

WITH ranked AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
    FROM orders
)
SELECT * FROM ranked WHERE rn = 1;

The other frequent surprise is the ORDER BY inside OVER versus at the end of the query. They're independent. The one inside OVER shapes the calculation; the one at the end shapes the final display. It's perfectly normal to have both, and they can even use different columns. Keeping those two roles separate in your head clears up most of the confusion.

Bringing it together

Window functions reward a small mental shift: stop thinking about squishing rows together and start thinking about adding a computed column that looks sideways at related rows. Ranking, running totals, and period-over-period comparisons are the three patterns you'll reach for constantly, and they all share the same OVER skeleton.

The best way to internalize this is to take one query you currently solve with a messy subquery — a "most recent per customer," a "rank within category" — and rewrite it with a window function. It'll be shorter, it'll usually be faster, and the next time your boss asks for a comparison across rows, you won't reach for the export button. You'll just add one more line to your OVER clause.