Why does SQL Server require an aggregate function when grouping by the primary key?

Let's say I have two tables

CREATE TABLE users ( name VARCHAR(50) PRIMARY KEY, gender INTEGER );
CREATE TABLE likes ( username VARCHAR(50), object VARCHAR(50) );

Now I want to know the genders and the number of likes for each user

SELECT 
    u.name, u.gender, COUNT(*) 
FROM users u
INNER JOIN likes l ON u.name = l.username
GROUP BY u.name

Here I am grouping by the primary key, which means that there will be exactly one user row in each group. However, SQL Server is giving me the following error:

The column "users.gender" is not valid in the select list because it is not contained in either an aggregate function or a GROUP BY clause.

Why is it complaining? Is there a way that I can achieve the desired behavior?

EDIT: Apparently all non-aggregate columns have to be added to the GROUP BY clause. I guess the real question is, why does it behave this way?


2 answers


The behavior you are describing is indeed ANSI standard. If the GROUP BY columns define a unique row, then the other values are "functionally dependent" on those columns: within a group they cannot take any other value. Such columns can be included in the SELECT list without being included in the GROUP BY.

The way functional dependencies are defined and applied is through primary and unique keys.



So, your desire to include only name in the GROUP BY is quite reasonable. SQL Server, like most other databases, doesn't support this (I think this feature is optional for ANSI compliance). Postgres has supported functional dependencies since version 9.1. MySQL changes its GROUP BY handling to support this in 5.7.
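
As for getting the desired result in SQL Server today, two common rewrites are sketched below against the users and likes tables from the question. The like_count alias is added here for illustration and is not part of the original query.

```sql
-- Option 1: list every non-aggregated column in GROUP BY.
-- name is the primary key, so adding gender does not change the groups.
SELECT u.name, u.gender, COUNT(*) AS like_count
FROM users u
INNER JOIN likes l ON u.name = l.username
GROUP BY u.name, u.gender;

-- Option 2: wrap the dependent column in an aggregate.
-- Every row in a group carries the same gender, so MAX returns that value.
SELECT u.name, MAX(u.gender) AS gender, COUNT(*) AS like_count
FROM users u
INNER JOIN likes l ON u.name = l.username
GROUP BY u.name;
```

Both forms should return the same rows as the query in the question was meant to. Option 1 is usually the easier one to read, since it does not suggest that a real maximum is being computed.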


Indexes and constraints do not affect which queries you are allowed to write. Allowing this would require inference, which is possible, but it is not in the standard and not in SQL Server.



Actually, the optimizer does make all of these inferences. The plan will have no aggregation at all. This is a language design question, and they understandably decided not to allow it. It would make for fragile queries and bring almost zero benefit to customers.
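
Since the restriction is on how the query is written, another way to sidestep it is to aggregate likes on its own and join the result to users, so that gender never appears in a grouped query. This is only a sketch. The lc and like_count names are made up for illustration.

```sql
-- Count likes per username first, then join the counts to users.
-- The outer query has no GROUP BY, so u.gender can be selected directly.
SELECT u.name, u.gender, lc.like_count
FROM users u
INNER JOIN (
    SELECT username, COUNT(*) AS like_count
    FROM likes
    GROUP BY username
) AS lc ON u.name = lc.username;
```

As with the INNER JOIN in the question, users without any likes do not appear in the result.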
