SQL Having

The having clause is used when filtering has to be performed on aggregate functions since the where clause can not be used in this situation. Since the having clause is often used in conjunction with the group by clause the examples on this page use the same data as was used in the group by article and follow on directly from that article.

Having generated a set of results showing the best customer on each particular day we decide that what we actually really want to do is just reward those customers that spent over a certain amount on a each day. The first thing most people try is the where clause but this doesn’t work with aggregate functions. This statement, for example, will cause an error:

SELECT Customer, Date, SUM(OrderTotal) FROM orders 
GROUP BY Customer, Date WHERE SUM(OrderTotal) > 100;

What you need is to replace the where with a having clause which is specifically designed to work with aggregate functions:

SELECT Customer, Date, SUM(OrderTotal) FROM orders 
GROUP BY Customer, Date HAVING SUM(OrderTotal) > 100;

This query produces these results:

Customer Date SUM(OrderTotal)
Alice 2012-02-25 225
Dave 2012-02-05 150

So, we find out that we only have two customers that we need to reward.

Why Won’t Where Work?

Where can’t work with aggregate functions because of when it’s clauses get processed during query execution. Where filters the result set before the aggregation takes place, having filters the result set after aggregation takes place.

Imagine in our example above we only wanted to reward those people who has placed large orders before 2010-02-20. We need to filter out all the rows before that date but at the same time still group together the remaining rows and then filter them. The query we need looks like this:

SELECT Customer, Date, SUM(OrderTotal) FROM orders 
WHERE Date < '2012-02-20'
GROUP BY Customer, Date HAVING SUM(OrderTotal) > 100;

Only the Dave row from the previous set of results is returned.

You can almost think of this as being two queries in one. The first query is everything up to an including the where clause which removes all the rows before the given date. The second query then groups together and filters those rows to provide the final results. The row for Alice on 2012-02-25 doesn’t show up in the results because the first part of the query had already removed the rows which her large purchases.

As you can see the where clause can’t work with the aggregate function results because the result of the aggregation can’t be known until after the where clause has been executed.

It’s worth pointing out that sometimes it’s possible to omit the where clause and just place the condition in the having like this:

SELECT Customer, Date, SUM(OrderTotal) FROM orders 
GROUP BY Customer, Date HAVING SUM(OrderTotal) > 100 AND Date < '2012-02-20';

In this particular case the results will be the same but care should be taken when doing this as you could end up with different results to those that you were expecting. For example this query won’t work because anything in the having clause must be present in the select:

SELECT Customer, SUM(OrderTotal) FROM orders 
GROUP BY Customer 
HAVING SUM(OrderTotal) > 100 AND Date < '2012-02-20';

This query though will turn up a useful result showing us all customers who ordered more than £100 of good before 2012-02-20:

SELECT Customer, SUM(OrderTotal) FROM orders 
WHERE Date < '2012-02-20'
GROUP BY Customer 
HAVING SUM(OrderTotal) > 100;
Posted in SQL and tagged .

Leave a Reply

Your email address will not be published. Required fields are marked *