GROUP BY

jdavidb on 2007-04-13T15:09:50

The GROUP BY clause of SQL queries is completely redundant. All of the information in it can be derived from the rest of the query. Redundancy in communication is a great thing for human communication, to prevent misunderstandings. It allows for error correction. But it's not a great thing when I'm hand-typing such a verbose language.

GROUP BY should be made optional. In some cases it might allow the SQL parser to detect an error I've made in my query. And in some cases it might be necessary (but so far I can't think of any). I'd probably want to leave it in to programs that I am going to leave lying around. But for queries I'm throwing into an interactive system, I don't want to deal with it.


You assume...

Alias on 2007-04-14T01:54:08

Machine communication is not human communication.

SQL is not about reducing redundancy.

SQL is about offloading work related to data storage retrieval to a dedicated machine.

Should I have to implement my own grouping code in every single application?

Re:You assume...

bart on 2007-04-14T09:52:57

Should I have to implement my own grouping code in every single application?
No, I think his basic idea is that the database should be able to deduce the GROUP BY clause from the rest of the statement. Now, a database complains if you output a column that isn't in an aggregate function, or in the GROUP BY clause. Instead, it could implicitly add the column to the GROUP BY clause.

I don't think that overall it would save much typing. And having to be explicit has its advantages, in cases the SQL statement implies a different grouping than what you had in mind.

Offered without further comment

Aristotle on 2007-04-14T11:08:52

SELECT foo, bar, SUM(quux) FROM qux GROUP BY foo, bar;

SELECT foo, bar, SUM(quux) FROM qux GROUP BY bar, foo;

SELECT foo, SUM(quux) FROM qux GROUP BY foo, bar;

SELECT foo, SUM(quux) FROM qux GROUP BY bar;

Re:Offered without further comment

jdavidb on 2007-04-16T14:47:24

Maybe I need the comments to clue me in. The last two won't work ... is your point that in such an instance having the GROUP BY allows the programmer to figure out what the previous programmer meant and correct? Because I thought I clearly limited myself to talking about interactive, one-shot queries. This is like requiring strict for a perl-one liner. :)

As near as I can tell, there is no difference between the first two, although I'm willing to be enlightened. I may just be sleepy on this early Monday morning.

Re:Offered without further comment

Aristotle on 2007-04-16T17:08:44

Why won’t the last two work?

As for the first two, you’re right, there is no difference. Order matters when you use window functions, but not in ordinary grouping.

Re:Offered without further comment

jdavidb on 2007-04-17T15:17:54

Never mind, they will. It's just not something I do often. And I think you've shown me why that clause can't be made optional. Of course, I'd prefer that it be made optional and that expressing the fields to group by in the SELECT be made mandatory, but I guess that would probably bork a lot of people up. :)