Introduction
We often get questions asking how to fix the SQL error "Column 'name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause." but it's not an error in the SQL - it's an error in what you are trying to do. Let's look at GROUP BY and see what it does, and why you get the error.
Background
If we have a couple of tables:
Transactions
ID UserID TimeStamp Value
1 1 2016-01-01 14:37:00.000 123.87
2 1 2016-01-01 15:43:00.000 -84.15
3 1 2016-01-02 09:41:00.000 92.34
4 2 2016-01-05 12:14:00.000 37.88
5 3 2016-01-31 17:27:00.000 102.44
6 1 2016-02-02 00:00:00.000 453.92
7 2 2016-02-14 00:00:00.000 87.32
Users
ID UserName
1 Joe White
2 Mike Green
3 Sarah Brown
4 Gillian Black
We can easily do queries on these to give us activity reports:
SELECT u.UserName, t.TimeStamp, t.Value FROM Transactions t JOIN Users u ON u.ID = t.UserID
And we get:
UserName TimeStamp Value
Joe White 2016-01-01 14:37:00.000 123.87
Joe White 2016-01-01 15:43:00.000 -84.15
Joe White 2016-01-02 09:41:00.000 92.34
Mike Green 2016-01-05 12:14:00.000 37.88
Sarah Brown 2016-01-31 17:27:00.000 102.44
Joe White 2016-02-02 00:00:00.000 453.92
Mike Green 2016-02-14 00:00:00.000 87.32
But what if we want a balance for each user instead?
That's not complicated - that's what GROUP BY is there for
So we use a GROUP BY:
SELECT u.UserName, SUM(t.Value) AS Balance
FROM Transactions t
JOIN Users u ON u.ID = t.UserID
GROUP BY u.UserName
And we get a balance for all active users:
UserName Balance
Joe White 585.98
Mike Green 125.20
Sarah Brown 102.44
What the GROUP BY has done is collect all the related rows (in this case all those with the same UserName) together and then apply an Aggregate Function (in this case SUM) to each group:
UserName TimeStamp Value
Joe White 2016-01-01 14:37:00.000 123.87
Joe White 2016-01-01 15:43:00.000 -84.15
Joe White 2016-01-02 09:41:00.000 92.34
Joe White 2016-02-02 00:00:00.000 453.92
------
585.98
Mike Green 2016-01-05 12:14:00.000 37.88
Mike Green 2016-02-14 00:00:00.000 87.32
------
125.20
Sarah Brown 2016-01-31 17:27:00.000 102.44
------
102.44
So each group has got two unique elements: the UserName ("Joe White", "Mike Green", "Sarah Brown") and the Balance (585.98, 125.20, 102.44)
But...each group also has a bunch of TimeStamp values. What happens if we try to use that as well?
SELECT u.UserName, t.TimeStamp, (t.Value) AS Balance
FROM Transactions t
JOIN Users u ON u.ID = t.UserID
GROUP BY u.UserName
Try it, and you get an error:
Msg 8120, Level 16, State 1, Line 1
Column 'Transactions.TimeStamp' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Why? Think about it for a moment, and it's pretty obvious: Which TimeStamp value do you want SQL to use? It's got a choice of 5 for Joe White, 2 for Mike Brown, and only one for poor Sarah! It can't decide between them, because it doesn't know what information you are trying to get, and it doesn't want to give you the wrong info.
You can only list columns which are in the GROUP BY list, or in a function which returns a single value for a range of rows - called an "Aggregate function" because it aggregates (or combines) information.
For SQL Server, the Aggregate functions are:
AVG Average of values
CHECKSUM_AGG Returns the checksum of the values
COUNT Number of items
COUNT_BIG Number of items as a BIGINT
GROUPING }
GROUPING_ID } - See https:
MAX Largest value
MIN Smallest value
STDEV }
STDEVP } - Returns the statistical standard deviation
SUM Total
VAR }
VARP } - Returns the statistical variance
If you try to use any other function, you will get an error.
So think about your data, think about what you are trying to report on, and find the appropriate function.
Don't think "oh, that's easy! I can just add to the GROUP BY list!"
SELECT u.UserName, t.TimeStamp SUM(t.Value) AS Balance
FROM Transactions t
JOIN Users u ON u.ID = t.UserID
GROUP BY u.UserName, t.TimeStamp
Because that creates more groups, not allows you to use more columns in your SELECT LIST.
When you use more than one grouping field, SQL creates a group for each of the rows with distinct values in any of the named fields: in our case it will put each row in a separate group as there are no two identical timestamp values!
History
2016-07-02 First version
2012-08-23 Spelling error: "agregate" for "aggregate". Six times. I got it right once!:O