Click here to Skip to main content
13,201,452 members (70,248 online)
Click here to Skip to main content
Add your own
alternative version


104 bookmarked
Posted 22 Aug 2011

Database performance optimization part 2 (Index maintenance)

, 23 Aug 2011
Rate this:
Please Sign up or sign in to vote.
In this second part of series of articles about database performance optimization I will discuss about index maintenance.


In my first article about database performance tuning I have described indexing strategies and I have mentioned some aspects you should keep in mind during designing your indexes. Now I will continue to discuss about database tuning. In this second article I will tell you about index maintenance.

There is nothing really new that indexes have their right to exist in SQL Server. Implementing indexes is not so trivial task because well designed index strategy can rapidly increase performance of database but on the other hand awkward indexes can cause performance degradation. As almost everything in database need maintenance, indexes need it too. Index maintenance is an uncommon and very difficult task because some operations could take a long time and you must keep this in mind when you are planning maintenance.

As DBA you must ensure optimal performance of the database. As I mentioned above one of the key tasks is index maintenance. Making a good index means to minimize I/O operations. There are some activities you as DBA should perform:

  • Reorganizing and rebuilding indexes. This process involves defragmenting of indexes and optimization of disk space.
  • Using the Fill factor option to fine tune the index data storage and performance.
  • Performing index operations online by using the LINE index option to provide user access to data during index operations.
  • Configuring parallel Index operations by using the max degree of parallelism option.
  • Using the Query optimizer to develop optimal query plans based on statistics.
  • Improving the performance of bulk copy operations on tables with clustered and non clustered indexes.
  • Selecting suitable Recovery model of the database for index operations and listing the types of logging available.
About all of this activities we will discuss later int this article. At the beginning I consider to be appropriate to determine some basic terms which we will use later.

Index selection

Understand Clustered indexes is very easy in the area of index selection. Clustered indexes are basically keys that reference each row uniquely. Even if you define a clustered index and do not declare it as unique, SQL Server still makes the clustered index unique behind the scenes by adding a 4-byte "uniqueifier" to it. The additional "uniqueifier" increases the width of the clustered index, which causes increased maintenance time and slower searches. Since clustered indexes are the key that identifies each row, they are used in every query.

When we start talking about non-clustered indexes, things get confusing. Queries can ignore non-clustered indexes for the following reasons:

  1. High fragmentation – If an index is fragmented over 40%, the optimizer will probably ignore the index because it's more costly to search a fragmented index than to perform a table scan.
  2. Uniqueness – If the optimizer determines that a non-clustered index is not very unique, it may decide that a table scan is faster than trying to use the non-clustered index. For example: If a query references a bit column (where bit = 1) and the statistics on the column say that 75% of the rows are 1, then the optimizer will probably decide a table scan will get the results faster versus trying to scan over a non-clustered index.
  3. Outdated statistics – If the statistics on a column are out of date, then SQL Server can misguide the benefit of a non-clustered index. Automatically updating statistics doesn't just slow down your data modification scripts, but over time it also becomes out of sync with the real statistics of the rows. Occasionally it's a good idea to run sp_updatestats or UPDATE STATISTICS.
  4. Function usage – SQL Server is unable to use indexes if a function is present in the criteria. If you're referencing a non-clustered index column, but you're using a function such as convert(varchar, Col1_Year) = 2004, then SQL Server cannot use the index on Col1_Year.
  5. Wrong columns – If a non-clustered index is defined on (col1, col2, col3) and your query has a where clause, such as "where col2 = 'somevalue'", that index won't be used. A non-clustered index can only be used if the first column in the index is referenced within the where clause. A where clause, such as "where col3 = 'someval'", would not use the index, but a where clause, like "where col1 = 'someval'" or "where col1='someval and col3 = 'someval2'" would pick up the index.
  6. The index would not use col3 for its seek, since that column is not after col1 in the index definition. If you wanted col3 to have a seek occur in situations such as this, then it is best if you define two separate non-clustered indexes, one on col1 and the other on col3.

Page splits

To store data, SQL Server uses pages that are 8 kb data blocks. The amount of data filling the pages is called the fill factor, and the higher the fill factor, the more full the 8 kb page is. A higher fill factor means fewer pages will be required resulting in less IO/CPU/RAM usage. At this point, you might want to set all your indexes to 100% fill factor; however, here is the gotcha: Once the pages fill up and a value comes in that fits within a filled-up index range, then SQL Server will make room in an index by doing a "page split."

In essence, SQL Server takes the full page and splits it into two separate pages, which have substantially more room at that point. You can account for this issue by setting a fill-factor of 70% or so. This allows 30% free space for incoming values. The problem with this approach is that you continually have to "re-index" the index so that it maintains a free space percentage of 30%.


While indexes can speed up execution of queries several fold as they can make the querying process faster, there is overhead associated with them. They consume additional disk space and require additional time to update themselves whenever data is updated, deleted or appended in a table. Also when you perform any data modification operations (INSERT, UPDATE, or DELETE statements) index fragmentation may occur and the information in the index can get scattered in the database. Fragmented index data can cause SQL Server to perform unnecessary data reads and switching across different pages, so query performance against a heavily fragmented table can be very poor. In this article I am going to write about fragmentation and different queries to determine the level of fragmentation. There are two major types of fragmentation: logical (external fragmentation) and SQL Server fragmentation.

Logical fragmentation (external)

Logical fragmentation occurs when an index leaf page is not in logical order, in other words it occurs when the logical ordering of the index does not match the physical ordering of the index. This causes SQL Server to perform extra work to return ordered results. For the most part, external fragmentation isn’t too big of a deal for specific searches that return very few records or queries that return result sets that do not need to be ordered. Page splits could occur in case of:

  • insert or updates
  • heavy deletes that can cause pages be removed from the page chain, resulting in dis-contiguous page chain

SQL Server fragmentation (internal)

This type of fragmentation occurs when there is too much free space in the index pages. Typically, some free space is desirable, especially when the index is created or rebuilt. You can specify the Fill Factor setting when the index is created or rebuilt to indicate a percentage of how full the index pages are when created. If the index pages are too fragmented, it will cause queries to take longer (because of the extra reads required to find the dataset) and cause your indexes to grow larger than necessary. If no space is available in the index data pages, data changes (primarily inserts) will cause page splits as discussed above, which also require additional system resources to perform. Internal fragmentation usually occurs when:

  • random deletes resulting in empty space on data pages
  • page-splits due to insert or updates
  • shrinking the row such as when updating a large value to a smaller value
  • using fill factor of less than 100
  • using a large row sizes

Extent fragmentation

Extent fragmentation occurs when the extents of a table or index are not contiguous with the database leaving extents from one or more indexes intermingled in the file. This can occur due to:

  • random deletes, which could leave some of the pages in an extent unused while the extent itself is still reserved as part of the table’s space allocation. Think of it like Internal fragmentation, but in extents instead of pages
  • deletes on ranges of contiguous rows within the table, causing one or more entire extents to become de-allocated, thus leaving a gap between the surrounding extents of the table or index
  • interleaving of a table’s data extents with the extents of other objects

How to check fragmentation

Heavily fragmented indexes can degrade query performance significantly and cause the application accessing it to respond slowly. To help you identify the amount of fragmentation SQL Server 2005 and 2008 come with sys.dm_db_index_physical_stats dynamic management function.  

The following query returns DETAILED information about fragmentation level of AdvendureWorks2008R2 database.

	OBJECT_NAME(stats.object_id) as [Object_Name], as [Index_Name] ,
	(select OBJECT_ID, index_id, avg_fragmentation_in_percent, avg_page_space_used_in_percent
		from sys.dm_db_index_physical_stats (DB_ID('AdventureWorks2008R2'),null,null,null,'DETAILED')
		where index_id<>0) as stats
join sys.indexes idx
on idx.object_id=stats.object_id 
and idx.index_id = stats.index_id
You have to focus on 2 values. First is avg_fragmentation_in_percent column to identify external fragmentation and the second is avg_page_space_used_in_percent to identify internal fragmentation. When index has avg_fragmentation_in_percent value greater than 10 that indicates, that index is highly fragmented. Internal fragmentation is indicated when avg_page_space_used_in_percent value falls below 75.

Resolving fragmentation

When you want to resolve some fragmentation issues, SQL Server offers you 2 ways how to fix issues: ALTER INDEX ... REORGANIZE or ALTER INDEX ... REBUILD.

Reorganizing indexes

In SQL Server 2005 the ALTER INDEX REORGANIZE statement has replaced the DBCC INDEXDEFRAG statement. A single partition of a partitioned index can be reorganized using this statement.

This statement reorganizes the leaf level of the clustered and non clustered indexes on tables and views are reorganized and reordered to match the logical order—i.e. left to right of the leaf nodes. The index is organized within the allocated pages and if they span more than one file they are reorganized one at a time. No pages are migrated between files. Moreover, pages are compacted and empty pages created as a consequence are removed and the disk space released. The compaction is determined by the fill factor value in sys.indexes catalog view. Large object data types contained in clustered index or underlying tables will also be compacted by default if the LOB clause is present.

The good news is that the reorganize process is economical on the system resources and is automatically performed online. There are no long term blocking locks which jam up the works!

DBAs are advised to reorganize the index when it is minimally fragmented. Heavily fragmented indexes will require rebuilding. Microsoft recommends reorganising an index when 60 < avg_page_space_used_in_percent < 75 or 5 < avg_fragmentation_in_percent  < 30. 

Rebuilding indexes

When an index is rebuilt, it is dropped and a new one is created. This process is very resource intensive and removes external and internal fragmentation.  External fragmentation is removed by reordering the index rows in contiguous pages. Internal fragmentation is removed by fill factor. You can use ONLINE option which causes that table and it's indexes are available for selecting and data modification. 

Table locks during index rebuilds

When you are rebuilding an index by defaul SQL Server locks table to prevent any data modification. To override this behavior you can use ONLINE option. In this case SQL Server does index rebuild in3 phases.

  • Preparation phase
  • Build   
  • Final 

In preparation phase all system metadata are collected to create a new empty index structure. Snapshot of table is defined and row versioning is used to provide transaction-level read consistency. Any concurrent  write operations on the table are blocked for a very short time.

During build phase the data is scanned, sorted, merged and inserted into the target. Concurrent user select, insert, update, and delete operations are applied to both the preexisting indexes and any new indexes being built. 

Before the final phase, all uncommitted update transactions must be completed.  All read and write operations are blocked for a very short time until this phase is completed. System metadata is updated to replace source object by the target. 

Parallel index rebuilds 

SQL Server 2005 Enterprise Edition supports using multiple processors by the following index creation and rebuild statements:

  • DROP INDEX (only for clustered indexes)
  • ALTER TABLE ADD CONSTRAINT (only for index constraints)
  • ALTER TABLE DROP CONSTRAINT (only for clustered indexes)
The number of processors used depends on the 'max degree of parallelism' configuration option and the amount of free resources at the time when the index is being created to rebuilt. Using all available processors for building large indexes would provide the best performance; however, this could also cause severe shortage of resources for user queries. Therefore if SQL Server determines that the system is busy it will only use a subset of processors for index operations. You can impose further restrictions for processor usage on index operations by using MAXDOP query hint, for example, the following statement limits the ALTER INDEX statement to 2 processors:

ALTER INDEX PK_DimProduct_ProductKey ON DimProduct REBUILD  WITH (MAXDOP = 2)

If MAXDOP hint specifies 1 all index operations will be sequential and not parallel; 0 is the default value and does not restrict the number of processors. Any value above 1 will be interpreted as number of processors to be considered, however, the actual number of processors used for building an index might be lower depending on system load.

Parallel index operations might be particularly resource intensive for non-aligned partitioned indexes. This is because SQL Server must build one sort table for each partition (either on the respective file group where the partition is stored or in tempdb, depending on whether SORT_IN_TEMPDB option is used). If the index is aligned with the table partitioning scheme or with the clustered index then sort tables are built sequentially for each partition. But for non-aligned indexes all sort tables are built in one operation unless you set maximum degree of parallelism to 1. The higher the degree of parallelism the more sort tables must be built for non-aligned partitioned indexes at the same time and therefore, the more memory will be required. If the system doesn't have enough memory to create enough sort tables for an index on a table with numerous partitions the operation will fail. If so, you can normally work around the problem by specifying lower degree of parallelism.

How to rebuild all indexes in database

As I mentioned earlier one of the key task of DBA is to maintain indexes. Sometimes indexes need to rebuild. This task could be a time consuming in case of many indexes. ALTER INDEX statement allows you to rebuild your indexes but you must specify object name for which you want to rebuild indexes.Using ALTER INDEX ALL option allows you to rebuild all indexes for specified table or view but what if you have a lot of tables? That could be a problem because you must create ALTER statement for all tables or views you want to rebuild indexes. Here I provide you a script for rebuilding all indexes for all tables and databases.

DECLARE @Database VARCHAR(255)   
DECLARE @fillfactor INT 

SET @fillfactor = 90 

SELECT name FROM MASTER.dbo.sysdatabases   
WHERE name NOT IN ('master','msdb','tempdb','model','distribution')   

OPEN DatabaseCursor  

FETCH NEXT FROM DatabaseCursor INTO @Database  

   SET @cmd = 'DECLARE TableCursor CURSOR FOR SELECT ''['' + table_catalog + ''].['' + table_schema + ''].['' + 
  table_name + '']'' as tableName FROM ' + @Database + '.INFORMATION_SCHEMA.TABLES 
  WHERE table_type = ''BASE TABLE'''   

   -- create table cursor  
   EXEC (@cmd)  
   OPEN TableCursor   

   FETCH NEXT FROM TableCursor INTO @Table   

   	IF (@@MICROSOFTVERSION / POWER(2, 24) >= 9)
           -- SQL 2005 or higher command 
           SET @cmd = 'ALTER INDEX ALL ON ' + @Table + ' REBUILD WITH (FILLFACTOR = ' +CONVERT(VARCHAR(3),@fillfactor) + ')' 
           EXEC (@cmd) 
          -- SQL 2000 command 
          DBCC DBREINDEX(@Table,' ',@fillfactor)  

   	FETCH NEXT FROM TableCursor INTO @Table   

   CLOSE TableCursor   
   DEALLOCATE TableCursor  

   FETCH NEXT FROM DatabaseCursor INTO @Database  
CLOSE DatabaseCursor   
DEALLOCATE DatabaseCursor

This script rebuilds all indexes for all tables and databases. If you want to rebuild indexes only for some databases, just rewrite:

WHERE name NOT IN ('master','msdb','tempdb','model','distribution')  
WHERE name IN ('myDB1','myDB2')   

This script was created by Greg Robidoux and more about this you can find here

Index usage information

Indexes can rapidly affects performance of database. Too few indexes can cause scans and hence dramatically increase response time of your application. Too many indexes causes overhead for index maintenance during data updates and also a bloated database. It is very difficult to know what indexes are being used and how they are being used. Sometimes is very good to get rid of indexes that are not used. For this type of task SQL Server comes with many new dynamic management views. Two new views that provide data about index usage are sys.dm_db_index_operational_stats and sys.dm_db_index_usage_stats.


This dynamic management view gives you information about insert, update and delete operations for particular index. This dmv also provides information about locking, latching and access methods. There are several columns that are returned from this type of view:

  • leaf_insert_count - total count of leaf level inserts
  • leaf_delete_count - total count of leaf level deletes
  • leaf_update_count - total count of leaf level updates
  • row_lock_count - total count of row locks
  • row_lock_wait_count - total count of times the Database Engine waited on a row lock
  • page_latch_wait_count - total count of times the Database Engine waited, because of latch contention

       I.[NAME] AS [INDEX NAME], 
         ON I.[OBJECT_ID] = A.[OBJECT_ID] 
            AND I.INDEX_ID = A.INDEX_ID 

Here is the output from the above query. From this view we can get an idea of how many inserts, updates and delete operations were performed on each table and index.



This view gives you information about overall access methods to your indexes. There are several columns that are returned from this DMV, but here are some helpful columns about index usage:

  • user_seeks - number of index seeks
  • user_scans - number of index scans
  • user_lookups - number of index lookups
  • user_updates - number of insert, update or delete operations

         I.[NAME] AS [INDEX NAME], 
           ON I.[OBJECT_ID] = S.[OBJECT_ID] 
              AND I.INDEX_ID = S.INDEX_ID 

Here is the output from the above query. From this view we can get an idea of how many seeks, scans, lookups and overall updates (insert, update and delete) occurred.


Following example returns all indexes that have not been used.

    DB_NAME() AS DatabaseName, 
    OBJECT_NAME(idx.OBJECT_ID) AS TableName, 
    idx.NAME AS IndexName, 
	sys.objects obj 
    JOIN sys.indexes idx ON obj.OBJECT_ID = idx.OBJECT_ID 
	not exists (SELECT 
					sys.dm_db_index_usage_stats stats
                WHERE  idx.OBJECT_ID = stats.OBJECT_ID 
					AND idx.INDEX_ID = stats.INDEX_ID) AND obj.TYPE <> 'S' 
ORDER BY DatabaseName, TableName, IndexName

In this query we are listing each user table, all of its indexes and the columns that make up the index. The issue with this query is that you have a row for each column in the index which could get confusing if you have a lot of indexes.

SELECT as TableName, as IndexName, 
    idxcols.key_ordinal, as ColumnName 
	sys.objects obj 
    join sys.indexes idx 
		ON obj.object_id = idx.object_id 
    join sys.index_columns idxcols 
        ON idx.object_id = idxcols.object_id 
    and idx.index_id = idxcols.index_id 
    join sys.columns cols 
		ON idxcols.object_id = cols.object_id 
    and idxcols.column_id = cols.column_id 
	obj.TYPE <> 'S' 
ORDER BY TableName, IndexName, key_ordinal

In this query we use most of Query 3, but we are doing a PIVOT so we can see the index and the index columns in one row. This only accounts for 7 index columns, but it could easily be increased to handle more in the PIVOT operation. Here is another tip related to the use of PIVOT, Crosstab queries using PIVOT in SQL Server 2005. if you would like to better understand how PIVOT works.

         [4] AS COL4,  [5] AS COL5, [6] AS COL6, [7] AS COL7 
                 B.NAME AS INDEXNAME, 
                 B.INDEX_ID AS INDEXID, 
                 D.NAME AS COLUMNNAME, 
          FROM   SYS.OBJECTS A 
                 INNER JOIN SYS.INDEXES B 
                   ON A.OBJECT_ID = B.OBJECT_ID 
                   ON B.OBJECT_ID = C.OBJECT_ID 
                      AND B.INDEX_ID = C.INDEX_ID 
                 INNER JOIN SYS.COLUMNS D 
                   ON C.OBJECT_ID = D.OBJECT_ID 
                      AND C.COLUMN_ID = D.COLUMN_ID 
          WHERE  A.TYPE <> 'S') P 
          FOR KEY_ORDINAL IN ( [1],[2],[3],[4],[5],[6],[7] ) ) AS PVT 

In this query we tie in our PIVOT query above with sys.dm_db_index_usage_stats so we can look at only the indexes that have been used since the last time the stats were reset.

         [4] AS COL4, [5] AS COL5, [6] AS COL6, [7] AS COL7 
                 B.NAME AS INDEXNAME, 
                 D.NAME AS COLUMNNAME, 
          FROM   SYS.OBJECTS A 
                 INNER JOIN SYS.INDEXES B 
                   ON A.OBJECT_ID = B.OBJECT_ID 
                   ON B.OBJECT_ID = C.OBJECT_ID 
                      AND B.INDEX_ID = C.INDEX_ID 
                 INNER JOIN SYS.COLUMNS D 
                   ON C.OBJECT_ID = D.OBJECT_ID 
                      AND C.COLUMN_ID = D.COLUMN_ID 
          WHERE  A.TYPE <> 'S') P 
          FOR KEY_ORDINAL IN ( [1],[2],[3],[4],[5],[6],[7] ) ) AS PVT 
                 FROM   SYS.DM_DB_INDEX_USAGE_STATS B 
                 WHERE  DATABASE_ID = DB_ID(DB_NAME()) 
                        AND PVT.OBJECT_ID = B.OBJECT_ID 
                        AND PVT.INDEX_ID = B.INDEX_ID) 

This query also uses the PIVOT query along with sys.dm_db_index_usage_stats so we can also see the stats on the indexes that have been used.

         [4] AS COL4, [5] AS COL5, [6] AS COL6, [7] AS COL7, B.USER_SEEKS, 
                 B.NAME AS INDEXNAME, 
                 D.NAME AS COLUMNNAME, 
          FROM   SYS.OBJECTS A 
                 INNER JOIN SYS.INDEXES B 
                   ON A.OBJECT_ID = B.OBJECT_ID 
                   ON B.OBJECT_ID = C.OBJECT_ID 
                      AND B.INDEX_ID = C.INDEX_ID 
                 INNER JOIN SYS.COLUMNS D 
                   ON C.OBJECT_ID = D.OBJECT_ID 
                      AND C.COLUMN_ID = D.COLUMN_ID 
          WHERE  A.TYPE <> 'S') P 
          FOR KEY_ORDINAL IN ( [1],[2],[3],[4],[5],[6],[7] ) ) AS PVT 
              AND PVT.INDEX_ID = B.INDEX_ID 
              AND B.DATABASE_ID = DB_ID() 

All of this examples are taken from


  • 22 August - Original version posted


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Kanasz Robert
Architect The Staffing Edge & Marwin Cassovia Soft
Slovakia Slovakia
My name is Robert Kanasz and I have been working with ASP.NET, WinForms and C# for several years.
MCSD - Web Applications
MCSE - Data Platform
MCPD - ASP.NET Developer 3.5
- Web Developer 4
MCITP - Database Administrator 2008
- Database Developer 2008
MCSA - SQL Server 2012
MCTS - .NET Framework 3.5, ASP.NET Applications
- SQL Server 2008, Database Development
- SQL Server 2008, Implementation and Maintenance
- .NET Framework 4, Data Access
- .NET Framework 4, Service Communication Applications
- .NET Framework 4, Web Applications
MS - Programming in HTML5 with JavaScript and CSS3 Specialist

Open source projects: DBScripter - Library for scripting SQL Server database objects

Please, do not forget vote

You may also be interested in...

Comments and Discussions

Question5+ Pin
memlon mulas29-Oct-12 5:15
membermemlon mulas29-Oct-12 5:15 
AnswerRe: 5+ Pin
Kanasz Robert29-Oct-12 5:18
mvpKanasz Robert29-Oct-12 5:18 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Permalink | Advertise | Privacy | Terms of Use | Mobile
Web03 | 2.8.171020.1 | Last Updated 23 Aug 2011
Article Copyright 2011 by Kanasz Robert
Everything else Copyright © CodeProject, 1999-2017
Layout: fixed | fluid