Click here to Skip to main content
11,409,854 members (55,304 online)
Click here to Skip to main content
Technical Blog

Tagged as

SQL Query Run Time Estimation

, 31 May 2011 CPOL
Rate this:
Please Sign up or sign in to vote.
A followup to a post about subqueries with an estimation for how long it would take for the query to run.In a recent post about subqueries, I compared 4 queries and how they handled 4 million rows of data. The queries involved:1) Joins with AND conditions.2) Subqueries with AND conditions.3) Joins

A followup to a post about subqueries with an estimation for how long it would take for the query to run.

In a recent post about subqueries, I compared 4 queries and how they handled 4 million rows of data. The queries involved:

1) Joins with AND conditions.
2) Subqueries with AND conditions.
3) Joins with OR conditions.
4) Subqueries with OR conditions.

I compared the speed differential between all of them [on SQL Server], but the last of the queries never finished. As I indicated I might do, I left it running overnight – it still never finished!

Well, my curiosity about how long it would take for the query to finish wasn’t diminished – it was only enhanced. How could I find out and at least be able to estimate how long it would take? I needed to have some sort of understanding of how the performance was degrading as more records were added to the table. So I did that. I ran the query using a smaller number of records in the StudentTestData table. Here’s the results:

# records Run Time
25,000 6 seconds
50,000 23 seconds
100,000 99 seconds
200,000 505 seconds

My anticipation is SQL Server was doing some sort of cross-join type comparison in the background, and the numbers above seem to support this. The efficiency, in Big-O notation, is around O(n^2). Since I want to work with actual numbers to estimate how long the query would run with N = 4 million rows, here’s a rough function:

Number of Seconds = 6*(N/25000)^2, where N is the number of rows.

This function underestimates the time requirement as you add more records, as it is slightly too low for N=100,000 and it is about 2 minutes off for N=200,000. Despite this, it gives me something to work with.

As a spot-check, I tried N = 80,000 students. The function expectation is 61 seconds, and the actual time the query ran was 60 seconds. That’s about as close as I can ask for.

But what about if N is 4 million rows as it was in the previous post? Based on the function, the expectation is over 43 hours. Also, this is an understatement given the results above for N=200,000. It’s a good thing I didn’t wait for it to finish!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Andrew Zwicker

United States United States
No Biography provided

Comments and Discussions

 
-- There are no messages in this forum --
| Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.150414.5 | Last Updated 31 May 2011
Article Copyright 2011 by Andrew Zwicker
Everything else Copyright © CodeProject, 1999-2015
Layout: fixed | fluid