A window function is a feature developed in PostgreSQL — available since version 8.4 — to analyze data beyond the current row (hence the term “window”).
These windows can aggregate information to each row of your output. Apart from aggregate functions and groupings, window functions provide another way to perform calculations based on the values of several records.
Using window functions removes the hassle of using subqueries and
JOINS to aggregate neighboring rows that can be related to the
This tutorial will show you how aggregate functions are used as window
functions. You will learn how to use
RANK() , and
DENSE_RANK() as window functions. But first, you need to know why you
should use them.
✅ You can use the CoderPad sandbox in a new browser window to run the queries in this tutorial — the table is already loaded for you!
Why use window functions
To get an idea of why you should use window functions, let’s start with an example database:
Using this data, we want to get a breakdown of salaries for each employee in every department. We want to analyze the max salary across each department to compare it with each employee record.
To get that query result, you would probably use a Common Table
(CTE) with a
JOIN statement like the following:
WITH dep_stats AS ( SELECT dep_name, max(salary) max_salary FROM emp_salaries GROUP BY dep_name ) SELECT e.*, d.max_salary FROM emp_salaries e LEFT OUTER JOIN dep_stats d ON d.dep_name = e.dep_name;
That query will give you this output:
But a better way to get the same result is to use a window function like the following:
SELECT e.*, MAX(salary) OVER(PARTITION BY dep_name) max_salary FROM emp_salaries e;
As you can see, using a window function makes writing complex SQL queries easy. It allows you to output the current rows and add aggregate values accordingly.
ℹ️ Notice that window functions allowed us to perform aggregation without using the
GROUP BYclause, as in the case of aggregate functions.
Create a sample table
In this tutorial, you can experiment with your own table or create a sample table that we will borrow from the PostgreSQL documentation with a simple tweak. If you’re doing this tutorial from your own IDE, you can create your sample table by running the following script:
DROP TABLE IF EXISTS emp_salaries; CREATE TABLE emp_salaries ( emp_id SERIAL PRIMARY KEY, emp_name VARCHAR(30), dep_name VARCHAR(40), salary DECIMAL ); INSERT INTO emp_salaries VALUES (11, 'Ahmed', 'Engineering', 5200); INSERT INTO emp_salaries VALUES (7, 'Ali', 'Engineering', 4200); INSERT INTO emp_salaries VALUES (9, 'Ibrahim', 'Engineering', 4500); INSERT INTO emp_salaries VALUES (8, 'Mohamed', 'Engineering', 6000); INSERT INTO emp_salaries VALUES (10, 'Ayman', 'Engineering', 5200); INSERT INTO emp_salaries VALUES (5, 'Moemen', 'HR', 3500); INSERT INTO emp_salaries VALUES (2, 'Moetaz', 'HR', 3900); INSERT INTO emp_salaries VALUES (3, 'Abdullah', 'Sales', 4800); INSERT INTO emp_salaries VALUES (1, 'Assem', 'Sales', 5000); INSERT INTO emp_salaries VALUES (4, 'Omar', 'Sales', 4800);
⚠️ If you’re testing in your own environment, you’ll drop a table of the
emp_salariesname if it does exist.
Alternatively if you want to skip the table creation task, you can use the CoderPad sandbox to run the rest of the tutorial, as the sandbox already has the table loaded for you.
PostgreSQL window function syntax
The syntax of a window function is simple; it looks something like this:
<function_name>(<argument(s)>) OVER(PARTITION BY <column(s)> ORDER BY <column(s)>) <alias>
<function_name> is an aggregate function like
etc. This function can take a column as an argument and can also take
other arguments depending on the use case.
OVER() clause defines the window size. If it’s empty and
PARTITION BY and
ORDER BY are omitted, it means the window is
across the whole table. However,
PARTITION BY partitions the
window by the column(s) that follow the clause. In addition,
ORDER BY sorts the partitioned window by the column(s) indicated by
ORDER BY .
PostgreSQL window functions examples
To get an idea of how to best use window functions, we’ll start with a simple SQL query using an aggregate function:
SELECT MAX(salary) max_salary FROM emp_salaries;
This query outputs the maximum salary across all the records.
What if you want this max value next to each record in the table?
You might say this is easy; just add
* next to the
max_salary column. However, that query would create the following error:
This means that the first column of the
emp_salaries table —
emp_id column must appear in the
GROUP BY clause or be used in
an aggregate function.
To avoid this error, you have two options:
- A long route using a CTE along with a
- A shorter route using a window function.
The shorter (and preferred) option for us is to take the window function route and write the following query:
SELECT *, MAX(salary) OVER() max_salary FROM emp_salaries;
OVER() , you can use
MAX() as a window function to
aggregate the maximum salary to each record in the table:
GROUP BY vs. PARTITION BY clauses
What if you want a breakdown of max salaries across each department?
SELECT MAX(salary) max_salary FROM emp_salaries e GROUP BY dep_name;
This query gets only the maximum salary across each department:
Here, there is one column that represents the max salary per department. Again you wouldn’t easily be able to combine the other columns with the max_salary column unless you use window functions.
To compare each record with its corresponding maximum salary by
PARTITION BY :
SELECT *, MAX(salary) OVER(PARTITION BY dep_name) max_salary FROM emp_salaries;
This query results in the following records:
As you can see, the aggregate is done over the department names. So you
get a breakdown of the max salary of each department next to each record
If you’re still confused about the difference between
GROUP BY and
PARTITION BY , remember this:
GROUP BY reduces the number
of rows returned by
rolling them up to calculate the aggregate function for each row.
PARTITION BY partitions the window based on a specific column; thus
it doesn’t affect the number of rows in the result.
MAX() , MIN() , AVG() , SUM() , COUNT() functions
It’s important to note that the previous examples that you used with
MAX() can also be applied with either
SUM() , or
COUNT() . These are other aggregate functions and can also be used
as window functions as follows:
SELECT emp_id, salary, MAX(salary) OVER(PARTITION BY dep_name) max_salary, MIN(salary) OVER(PARTITION BY dep_name) min_salary, SUM(salary) OVER(PARTITION BY dep_name) total_salaries, COUNT(*) OVER(PARTITION BY dep_name) emp_count FROM emp_salaries;
which outputs the following result:
While we don’t cover
AVG()here, you can find more information on it in the PostgreSQL docs here .
ROW_NUMBER() is another useful window function. It assigns
serial numbers to records according to the window. See the following
SELECT *, MAX(salary) OVER( PARTITION BY dep_name ) max_salary, ROW_NUMBER() OVER() row_num FROM emp_salaries;
As you can see, the row_num column contains an ordered list of numbers
across the whole table. That’s because the window here has boundaries
across the table, as the
OVER() clause doesn’t have arguments.
A practical use case for
ROW_NUMBER() is to get the most senior
employee in each department. Try it on your own and see what you get.
SELECT *, MAX(salary) OVER(PARTITION BY dep_name) max_salary, ROW_NUMBER() OVER(PARTITION BY dep_name) seniority FROM emp_salaries;
Notice that in this case we partition by the department name. That partition would give you a result like this:
As you can see, there is an ordered list according to each department.
The engineering department has numbers
5 . HR:
2 . Sales:
PARTITION BY and ORDER BY clauses
A window function can have
ORDER BY inside an
To see what that looks like, assume that
seniority and that an
9 , meaning
that the former joined the company before the latter. We will then sort
according to the
emp_id to see the first employees who joined each
SELECT *, MAX(salary) OVER( PARTITION BY dep_name ) max_salary, ROW_NUMBER() OVER( PARTITION BY dep_name ORDER BY emp_id ) seniority FROM emp_salaries;
The seniority column represents ordered numbers according to the
emp_id . If you filter by seniority with
1 values, you get the
first employee for each department who joined the company.
Give it a try here using a subquery:
SELECT * FROM ( SELECT *, MAX(salary) OVER( PARTITION BY dep_name ) max_salary, ROW_NUMBER() OVER( PARTITION BY dep_name ORDER BY emp_id ) seniority FROM emp_salaries ) e WHERE e.seniority < 2;
RANK() window function returns the rank of the current row
with gaps. But what do we mean by gaps?
SELECT *, RANK() OVER( PARTITION BY dep_name ORDER BY salary DESC ) rank FROM emp_salaries;
This query returns the ranks of every employee, in descending order, according to each department. Gaps exist when there are duplicate values:
As you can see,
Ahmed share the same rank (
2 ) on engineering department salaries. Rank
3 is a gap, and the
RANK() function skips it. The next rank is
4 taken by
DENSE_RANK() window function is similar to
that it returns the rank without gaps. If you replace
DENSE_RANK() in the previous query, you’ll find a result query
like the following:
3 is not skipped, as in the case of
Window functions can make you productive. You will spend less time writing your queries with fewer lines of SQL.
In this tutorial, you saw how to use them in PostgreSQL. You learned how
RANK , and
DENSE_RANK . There are also other
window functions that you can check out in PostgreSQL