Click here to Skip to main content
Click here to Skip to main content

The Evil That is "Select *"

By , 10 Aug 2011
 
There I was working on a web page, when I decided that I needed to add some new columns to an existing table. So I did. Forty-five minutes later, or system admin calls, and says a bunch of people are having problems wiht the web sight (and a related desktop app). A co-worker asked if anyone had added new columns to any of the tables, and I raised my hand.
 
It seems that SOMEONE had used SELECT * in one or more SQL queries into the affected table, and the act of adding a column completely hosed up the code as a result of my adding the new columns.
 
Nobody here really knows where the offending code is, or even if it might be in a stored procedure, but the fastest way out of the mess was to remove the columns from the existing table, and create a new table to hold them.
 
This is PRECISELY how spaghetti code is propagated throughout a project, along with apparently pointless tables in the database.
 
For the record, we don't know exactly why we're having problems because we don't have time to find out (because we wouldn't fix it if we did find out). We suspect someone wrote an INSERT or UPDATE statement that tries to do a SELECT * to move some column data around between tables. I can think of only one word to describe this, and the first syllable is "cluster".
 
The tip:
 
DO NOT USE SELECT * IN YOUR SQL QUERIES - FOR ANY REASON.
 
EDIT ================
 
For those of you who felt the need to tell me that the word "cluster" has two syllables - I already know that, and I have no idea where you people got the idea that I couldn't count. That particular statement is part of my endearing yet sardonic wit.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

John Simmons / outlaw programmer
Software Developer (Senior)
United States United States
Member
I've been paid as a programmer since 1982 with experience in Pascal, and C++ (both self-taught), and began writing Windows programs in 1991 using Visual C++ and MFC. In the 2nd half of 2007, I started writing C# Windows Forms and ASP.Net applications, and have since done WPF, Silverlight, WCF, web services, and Windows services.
 
My weakest point is that my moments of clarity are too brief to hold a meaningful conversation that requires more than 30 seconds to complete. Thankfully, grunts of agreement are all that is required to conduct most discussions without committing to any particular belief system.

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
GeneralMy vote of 2memberpt140115 Jul '12 - 9:21 
This is just a rant. If you're going to give a tip, you should give reasons why your suggestion is worth following - the only reason you give is a single example of someone using Select * in a way that broke *your* code because they used it in a particular way. Select * can be used safely and there are many situations where it's use is valid - though I agree it's generally wasteful on bandwidth and the number of rows returned should be limited. And using it to transfer data between tables is a *BAD* idea.
 
Incidentally, it's also a *BAD* idea to alter the schema and code on a live site without validating the changes on a test system. Don't you have any test/QA procedures in place?
GeneralRe: Yes, there is a way. I just can't recall that now... I have...membergiadich2 Aug '11 - 7:59 
Yes, there is a way. I just can't recall that now... I haven't been doing much work with SQL these days.
GeneralRe: Yes, it's kind of implied that it is "select * where (some l...memberBobJanova2 Aug '11 - 6:57 
Yes, it's kind of implied that it is "select * where (some limiting condition)" to get the rows you want, or if you just want the schema you can use top 0 (but isn't there a proper way to get the schema with column types and everything for that?).
GeneralReason for my vote of 3 John's later post indicates that SEL...memberdmjm-h29 Aug '11 - 5:22 
Reason for my vote of 3
John's later post indicates that SELECT * is the least of his problems. It sounds like everything about the project is a cluster thing. Chris's comment articulates it well.
Lashing out against SELECT * in frustration is understandable but unwarranted. We don't ban every SQL statement variation that might be misused.
GeneralReason for my vote of 5 I couldn’t agree with you more, and ...memberS Douglas25 Aug '11 - 13:31 
Reason for my vote of 5
I couldn’t agree with you more, and to Chris’s point, I always, always alias my tables so its absolutely clear where that data point is coming from.
GeneralReason for my vote of 4 Not entirely true. E.g. select top 1...memberArmando de la Torre4 Aug '11 - 6:40 
Reason for my vote of 4
Not entirely true. E.g. select top 1 * from myTable where key = myValue
GeneralThis article is right but doesn't explain why... There are a...membermsmits2 Aug '11 - 2:26 
This article is right but doesn't explain why... There are a few reasons, the main one being that you want your queries to be predictable:
- If a column is added, (poorly written) client code that uses column indexes may fail.
- If a column is added with a lot of data (say big documents) this may create a big performance impact.
- SQL may use different indexes (or no index at all) if it needs all columns since it may decide that e.g. an index seek + bookmark lookup is more work then doing a table scan.
 
Exceptions (since they return a predictable number of columns and are optimized properly by SQL):
- Using * in subqueries (such as SELECT A FROM (SELECT * FROM T)).
- Using * from a subquery (such as SELECT * FROM (SELECT A FROM T)).
 
Cheers,
Michel
GeneralReason for my vote of 5 Everybody learns this the hard waymemberShahar Eldad1 Aug '11 - 18:56 
Reason for my vote of 5
Everybody learns this the hard way
GeneralReason for my vote of 5 I agree, except "cluster is 2 syllab...subeditorWalt Fair, Jr.1 Aug '11 - 18:48 
Reason for my vote of 5
I agree, except "cluster is 2 syllables, but the following word is only 1.
GeneralReason for my vote of 4 Good read, interesting perspectives,...memberBrianBissell1 Aug '11 - 16:44 
Reason for my vote of 4
Good read, interesting perspectives, especially in the alternatives.
 
Good input from alot of sources here.
GeneralAlso always list columns on an insert INSERT INTO mytab...memberdjj5528 Jul '11 - 3:53 
Also always list columns on an insert
INSERT INTO mytable
SELECT col1, col2, col3 from modtable
If you add a column to modtable this will break.
General'Cluster' is two syllables. Select * is okay when you reall...memberBobJanova26 Jul '11 - 0:26 
'Cluster' is two syllables.
 
Select * is okay when you really do want the whole table schema, typically for taking a temporary copy of some rows for a complex update operation or for displaying it to the user. In business logic it is almost always a mistake, and if you are making assumptions about what it returns it can be a dangerous one.
GeneralRe: Select * isn't okay if you only want the whole table schema....membergiadich2 Aug '11 - 6:39 
Select * isn't okay if you only want the whole table schema.
 
Back in the days when I was still new at [the company] and doing data analyst (with a little background in programming) work, one of the application keeps getting slower and slower by the week. Every time we loaded some data into SQL, it would take longer to run the app - from 1 minute to 5 minutes then 10 minutes... I was so frustrated and started to dig into the code and found the culprit - "Select *".
 
It turned out, one of our programmers wanted to get the schema of the table and used Select * to retrieve it. That is okay if the table is small but if your table is growing constantly then this will be a huge problem.
 
After discovering the problem, I changed the code to Select Top 0 * and the problem was gone and the app ran faster than ever. So if you really want to get the schema of the table and do not know it before hand and need to used Select *, make sure you limit the number of rows returned.
GeneralRe: I *know* "cluster" is two syllables. It's part of my endeari...mvpJohn Simmons / outlaw programmer10 Aug '11 - 11:19 
I *know* "cluster" is two syllables. It's part of my endearing yet sardonic wit.
GeneralReason for my vote of 5 It is the evil death* and should alw...memberNagy Vilmos25 Jul '11 - 19:54 
Reason for my vote of 5
It is the evil death* and should always be avoided.
GeneralReason for my vote of 5 Totally agree with John! Using SELEC...memberDrABELL25 Jul '11 - 17:15 
Reason for my vote of 5
Totally agree with John! Using SELECT* is also a bad practice from the performance prospective. Thanks for sharing this useful tip/use case!
GeneralJust to be Devil's advocate here - well, not quite - doing a...adminChris Maunder25 Jul '11 - 15:19 
Just to be Devil's advocate here - well, not quite - doing a Select Column1,..., ColumnN from Table can be just as dangerous. Was the issue because you had two tables both with a column name with the same name? Select Table.Column1,..., Table.ColumnN from Table will at least allow the stored proc to compile, but even then your rowset will contain duplicate column names.
 
Correct database change management is what will save you here. Select * is not the problem, it's a symptom.
GeneralRe: Very wise. I think you really hit the underlying problem her...memberBrianBissell1 Aug '11 - 16:37 
Very wise. I think you really hit the underlying problem here...
 

GeneralRe: We didn't have the time to figure out exactly why the querie...mvpJohn Simmons / outlaw programmer2 Aug '11 - 1:34 
We didn't have the time to figure out exactly why the queries went nuts. We have until 08/08 to get the changes I'm working on implemented, tested and deployed. All I know is that I added some new columns to an existing table, and another part of the site (AND an associated desktop app) got buggered up as a result. Removing the columns brought the universe back into alignment. None of MY queries use SELECT *.
 
Regarding change mamangement, that would be great if we had an actual DBA on the team. Essentially, there is NO management regarding the source code, and it even appears that much of it isn't even in source control. The only documentation associated with any of the projects we're working on is the documentation I'VE written, and I would venture to guess that fully 99% of the code I didn't write has absolutely NO comments. What we need is three more programmers and a full-time DBA on the team to completely rwrite everything from scratch and get rid of the external processes (like DTS scripts and batch files that move data from a hodgepodge of disparate Access databases into SQL server. The problem with that plan is that we work for a defense contractor, so not only are we not getting anyone new, one of the seats on our contract is going to be deleted, meaning one of the four existing programmers will no longer be with us by next June. Since I'm the new guy, it's highly probable it will be me.

GeneralThere is a script floating aroun that will search the procs ...memberMycroft Holmes25 Jul '11 - 13:10 
There is a script floating aroun that will search the procs and functions for a substring, this will help locate the offending procs. The code that uses select * is a whole nuther problem.
GeneralRe: This is the one I "borrowed" from the net. Searches SPs and...memberSteve Echols25 Jul '11 - 16:02 
This is the one I "borrowed" from the net. Searches SPs and Views:
 
CREATE PROCEDURE [dbo].[Util_FindTextInSP]
(
@StringToSearch varchar(100)
)
AS
BEGIN
 
SET @StringToSearch = '%' +@StringToSearch + '%'
 
SELECT DISTINCT so.Name
FROM sysobjects so (NOLOCK)
INNER JOIN syscomments sc (NOLOCK) on so.Id = sc.ID
AND (so.Type = 'P' OR so.type='V')
AND sc.Text LIKE @StringToSearch
ORDER BY so.Name
 
END

GeneralReason for my vote of 5 So true so many debug sessions that...memberSimon_Whale25 Jul '11 - 11:49 
Reason for my vote of 5
 
So true so many debug sessions that I have done with SQL have been down to a SELECT * moment
GeneralReason for my vote of 5 Agree, causes much confusion and ext...mvpMika Wendelius25 Jul '11 - 10:52 
Reason for my vote of 5
Agree, causes much confusion and extra maintenance. I think the only situation where * is beneficial is when you have no other means to figure out the table's structure (columns, data types etc)

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web02 | 2.6.130523.1 | Last Updated 10 Aug 2011
Article Copyright 2011 by John Simmons / outlaw programmer
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid