GROUP BY Modifiers
As of MySQL 4.1.1, the GROUP BY clause allows a WITH
ROLLUP modifier that causes extra rows to be added to the summary
output. These rows represent higher-level (or super-aggregate) summary
operations. ROLLUP thus allows you to answer questions at multiple
levels of analysis with a single query. It can be used, for example,
to provide support for OLAP (Online Analytical Processing) operations.
As an illustration, suppose that a table named sales has year,
country,
product, and profit columns for recording sales profitability:
CREATE TABLE sales
(
year INT NOT NULL,
country VARCHAR(20) NOT NULL,
product VARCHAR(32) NOT NULL,
profit INT
);
The table's contents can be summarized per year with a simple GROUP BY
like this:
mysql> SELECT year, SUM(profit) FROM sales GROUP BY year; +------+-------------+ | year | SUM(profit) | +------+-------------+ | 2000 | 4525 | | 2001 | 3010 | +------+-------------+
This output shows the total profit for each year, but if you also want to determine the total profit summed over all years, you must add up the individual values yourself or run an additional query.
Or you can use ROLLUP, which provides both levels of analysis with a
single query.
Adding a WITH ROLLUP modifier to the GROUP BY clause causes the query
to produce another row that shows the grand total over all year values:
mysql> SELECT year, SUM(profit) FROM sales GROUP BY year WITH ROLLUP; +------+-------------+ | year | SUM(profit) | +------+-------------+ | 2000 | 4525 | | 2001 | 3010 | | NULL | 7535 | +------+-------------+
The grand total super-aggregate line is identified by the value NULL in
the year column.
ROLLUP has a more complex effect when there are multiple GROUP BY
columns. In this case, each time there is a ``break'' (change in value)
in any but the last grouping column, the query produces an extra
super-aggregate summary row.
For example, without ROLLUP, a summary on the sales table based
on year,
country, and product might look like this:
mysql> SELECT year, country, product, SUM(profit)
-> FROM sales
-> GROUP BY year, country, product;
+------+---------+------------+-------------+
| year | country | product | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2001 | Finland | Phone | 10 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 2700 |
| 2001 | USA | TV | 250 |
+------+---------+------------+-------------+
The output indicates summary values only at the year/country/product level of
analysis. When ROLLUP is added, the query produces several extra rows:
mysql> SELECT year, country, product, SUM(profit)
-> FROM sales
-> GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+-------------+
| year | country | product | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | NULL | 1600 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
| 2000 | India | NULL | 1350 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2000 | USA | NULL | 1575 |
| 2000 | NULL | NULL | 4525 |
| 2001 | Finland | Phone | 10 |
| 2001 | Finland | NULL | 10 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 2700 |
| 2001 | USA | TV | 250 |
| 2001 | USA | NULL | 3000 |
| 2001 | NULL | NULL | 3010 |
| NULL | NULL | NULL | 7535 |
+------+---------+------------+-------------+
For this query, adding ROLLUP causes the output to include summary
information at four levels of analysis, not just one. Here's how to
interpret the ROLLUP output:
product column set to NULL.
country and products columns set to NULL.
year, country, and products columns set to
NULL.
Other Considerations When using ROLLUP
The following items list some behaviours specific to the MySQL implementation
of ROLLUP:
When you use ROLLUP, you cannot also use an ORDER BY clause to
sort the results. (In other words, ROLLUP and ORDER BY are mutually
exclusive.) However, you still have some control over sort order.
GROUP BY
in MySQL sorts results, and you can use explicit ASC and DESC keywords
with columns named in the GROUP BY list to specify sort order for
individual columns. (The higher-level summary rows added by ROLLUP
still appear after the rows from which they are calculated, regardless
of the sort order.)
LIMIT can be used to restrict the number of rows returned to the
client. LIMIT is applied after ROLLUP, so the limit applies
against the extra rows added by ROLLUP. For example:
mysql> SELECT year, country, product, SUM(profit)
-> FROM sales
-> GROUP BY year, country, product WITH ROLLUP
-> LIMIT 5;
+------+---------+------------+-------------+
| year | country | product | SUM(profit) |
+------+---------+------------+-------------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | NULL | 1600 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
+------+---------+------------+-------------+
Note that using LIMIT with ROLLUP may produce results
that are more difficult to interpret, because you have less context
for understanding the super-aggregate rows.
The NULL indicators in each super-aggregate row are produced when the
row is sent to the client. The server looks at the columns named in
the GROUP BY clause following the leftmost one that has changed value.
For any column in the result set with a name that is a lexical match to
any of those names, its value is set to NULL. (If you specify grouping
columns by column number, the server identifies which columns to set to
NULL by number.)
Because the NULL values in the super-aggregate rows are placed into the
result set at such a late stage in query processing, you cannot test them
as NULL values within the query itself. For example, you cannot add
HAVING product IS NULL to the query to eliminate from the output all
but the super-aggregate rows.
On the other hand, the NULL values do appear as NULL
on the client side and can be tested as such using any MySQL client
programming interface.