Though this is an age old concept, but after seeing the ignorance factor of many regarding this, I thought of writing a bit about it. ROLLUP is a GROUP BY modifier that adds extra rows to the output which contain summary data. At each level of aggregation the aggregated column is filled with a NULL value.
Let's see a small example. I know the values are too small for the data presented, but just to keep it readable.
1: mysql> CREATE TABLE `rollup_1` (
2: `url` varchar(255) DEFAULT NULL,
3: `year` int(11) DEFAULT NULL,
4: `country` char(2) DEFAULT NULL,
5: `visit` int(11) DEFAULT NULL
6: ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
7: Query OK, 0 rows affected (0.01 sec)
8:
9: mysql> SELECT * FROM rollup_1;
10: +----------------------+------+---------+-------+
11: | url | year | country | visit |
12: +----------------------+------+---------+-------+
13: | http://www.yahoo.com | 2005 | US | 12000 |
14: | http://www.yahoo.com | 2005 | IN | 10000 |
15: | http://www.yahoo.com | 2006 | US | 16000 |
16: | http://www.yahoo.com | 2006 | IN | 10700 |
17: | http://www.yahoo.com | 2007 | US | 21000 |
18: | http://www.yahoo.com | 2007 | IN | 13200 |
19: +----------------------+------+---------+-------+
20: 6 rows in set (0.00 sec)
21:
22: mysql> SELECT IFNULL(url, 'TOTAL') AS url, SUM(visit) AS visit FROM rollup_1 GROUP BY url WITH ROLLUP;
23: +----------------------+-------+
24: | url | visit |
25: +----------------------+-------+
26: | http://www.yahoo.com | 82900 |
27: | TOTAL | 82900 |
28: +----------------------+-------+
29: 2 rows in set, 1 warning (0.00 sec)
30:
1: mysql> SELECT IFNULL(url, 'ALL_URLS') AS url,
2: -> IFNULL(year, 'ALL_YEARS') AS year,
3: -> IFNULL(country, 'ALL_COUNTRIES') AS country,
4: -> SUM(visit) FROM rollup_1
5: -> GROUP BY url, year, country WITH ROLLUP;
6: +----------------------+-----------+---------------+------------+
7: | url | year | country | SUM(visit) |
8: +----------------------+-----------+---------------+------------+
9: | http://www.yahoo.com | 2005 | IN | 10000 |
10: | http://www.yahoo.com | 2005 | US | 12000 |
11: | http://www.yahoo.com | 2005 | ALL_COUNTRIES | 22000 |
12: | http://www.yahoo.com | 2006 | IN | 10700 |
13: | http://www.yahoo.com | 2006 | US | 16000 |
14: | http://www.yahoo.com | 2006 | ALL_COUNTRIES | 26700 |
15: | http://www.yahoo.com | 2007 | IN | 13200 |
16: | http://www.yahoo.com | 2007 | US | 21000 |
17: | http://www.yahoo.com | 2007 | ALL_COUNTRIES | 34200 |
18: | http://www.yahoo.com | ALL_YEARS | ALL_COUNTRIES | 82900 |
19: | ALL_URLS | ALL_YEARS | ALL_COUNTRIES | 82900 |
20: +----------------------+-----------+---------------+------------+
21: 11 rows in set, 3 warnings (0.00 sec)
1: mysql> INSERT INTO rollup_1 values ('http://www.yahoo.com', 2007, null, 12000);
2: Query OK, 1 row affected (0.00 sec)
3:
4: mysql> SELECT IFNULL(url, 'ALL_URLS') AS url,
5: -> IFNULL(year, 'ALL_YEARS') AS year,
6: -> IFNULL(country, 'ALL_COUNTRIES') AS country,
7: -> SUM(visit) FROM rollup_1
8: -> GROUP BY url, year, country WITH ROLLUP;
9: +----------------------+-----------+---------------+------------+
10: | url | year | country | SUM(visit) |
11: +----------------------+-----------+---------------+------------+
12: | http://www.yahoo.com | 2005 | IN | 10000 |
13: | http://www.yahoo.com | 2005 | US | 12000 |
14: | http://www.yahoo.com | 2005 | ALL_COUNTRIES | 22000 |
15: | http://www.yahoo.com | 2006 | IN | 10700 |
16: | http://www.yahoo.com | 2006 | US | 16000 |
17: | http://www.yahoo.com | 2006 | ALL_COUNTRIES | 26700 |
18: | http://www.yahoo.com | 2007 | ALL_COUNTRIES | 12000 |
19: | http://www.yahoo.com | 2007 | IN | 13200 |
20: | http://www.yahoo.com | 2007 | US | 21000 |
21: | http://www.yahoo.com | 2007 | ALL_COUNTRIES | 46200 |
22: | http://www.yahoo.com | ALL_YEARS | ALL_COUNTRIES | 94900 |
23: | ALL_URLS | ALL_YEARS | ALL_COUNTRIES | 94900 |
24: +----------------------+-----------+---------------+------------+
25: 12 rows in set, 3 warnings (0.00 sec)
- One that appears late in the result set.
- One that has a higher value for SUM(visit) field. This is arguable in general as there might be negative values, for our example we can say that's impossible.
While using ROLLUP, you may further like to consider few points.
- ROLLUP and ORDER BY are mutually exclusive, they cannot be used in the same statement.
- If you are using LIMIT clause along with ROLLUP, remember LIMIT clause is applied after ROLLUP, so it may include the aggregated rows.
- In MySQL documentation, it is mentioned that "Because the
NULL
values in the super-aggregate rows are placed into the result set at such a late stage in query processing, you cannot test them asNULL
values within the query itself.". You can see in the above queries, we are actually testing this value to be replaced by TOTAL_* string. Also if your data itself does not have NULL values in it, it works wonders. Look at the output below, I have removed the row inserted above.
1: mysql> SELECT IFNULL(url, 'ALL_URLS') AS url,
2: -> IFNULL(year, 'ALL_YEARS') AS year,
3: -> IFNULL(country, 'ALL_COUNTRIES') AS country,
4: -> SUM(visit) FROM rollup_1
5: -> GROUP BY url, year, country WITH ROLLUP HAVING country is null;
6: +----------------------+-----------+---------------+------------+
7: | url | year | country | SUM(visit) |
8: +----------------------+-----------+---------------+------------+
9: | http://www.yahoo.com | 2005 | ALL_COUNTRIES | 22000 |
10: | http://www.yahoo.com | 2006 | ALL_COUNTRIES | 26700 |
11: | http://www.yahoo.com | 2007 | ALL_COUNTRIES | 34200 |
12: | http://www.yahoo.com | ALL_YEARS | ALL_COUNTRIES | 82900 |
13: | ALL_URLS | ALL_YEARS | ALL_COUNTRIES | 82900 |
14: +----------------------+-----------+---------------+------------+
15: 5 rows in set, 4 warnings (0.00 sec)
2 comments:
This was a very informative article for the neophyte like me. And the practical examples gave me just the hints I needed to get WITH ROLLUP working in my queries. Thanks!
IFNULL is a great way to improve readability of results, thanks for the tip!
Post a Comment