Click here to Skip to main content
14,300,951 members

List<T> - Is it really as efficient as you probably think?

Rate this:
4.84 (124 votes)
Please Sign up or sign in to vote.
4.84 (124 votes)
5 Mar 2015CPOL
A .NET List is a list you can use as an array. Isn't it? Yeah, I'm sure it is. And I can use it anytime, can't I? Well, "No". And "No". Sometimes it's not a good idea at all.

Introduction

What is a List<T>? Well, according to MSDN[^]:

List<T> Class

Represents a strongly typed list of objects that can be accessed by index. Provides methods to search, sort, and manipulate lists.

So, a List<T> is a list of items that you can treat as an array. Excellent. We’re done here.

Well…no. Not really. MSDN is not exactly telling the whole truth here.

Let's look a little more closely

So, a List<T> is a List of elements of type T that you can treat as an Array. Simple – we’re done.

Hold your horses there! Are you sure?

Microsoft released the Reference Sources for .NET and they can make interesting reading. One in particular: http://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs,cf7f4095e4de7646[^]

That’s the source code for the List<T> class. And let’s have a quick look at the fields:

public class List<t> : IList<t>, System.Collections.IList, IReadOnlyList<t>
    {
        private const int _defaultCapacity = 4;
 
        private T[] _items;
        [ContractPublicPropertyName("Count")]
        private int _size;</t></t></t>

Hang on a moment…

private T[] _items;

Does that mean what I think it does?

        static readonly T[]  _emptyArray = new T[0];        
            
        // Constructs a List. The list is initially empty and has a capacity
        // of zero. Upon adding the first element to the list the capacity is
        // increased to 16, and then increased in multiples of two as required.
#if !FEATURE_CORECLR
        [TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")]
#endif
        public List() {
            _items = _emptyArray;
        }

Yes. A List<T> is an Array that you can treat as a List, not the other way round.

That's a big difference: it implies some operations do not work as the class name might imply:

  • A List is an Array, so when you try to add an element and there is no room, and bigger array is allocated and all existing elements are copied into it (If you are familiar with a StringBuilder, that is exactly what it does as well).
  • When you insert an element, the data for all elements with a higher index has to be copied into a new location.
  • When you delete an element, the data for all elements with a higher index has to be copied to a new location.
  • If you add an element and the internal array is full, a new, larger internal array is allocated and all existing data is copied to the new list before the new element can be added at the end.

And that's important, because it defines what the various operations you can do on a List<T> can cost in terms of time. 

The first thing to note is that as an array, a List<T> is very efficient to access individual elements via the index.

i = myList[1234567];

is a trivial operation: get the start of the array in memory from the myList reference, multiply the index value by the number of bytes in the type T, add it to the start and fetch the element value. In performance terms, it's (almost) as fast as a "straight" array - not quite, as there is some additional checking code added which makes it slightly slower.

But the other operations which make a List<T> more useful than an Array all come with a cost - and that can be quite a large cost.

Adding items

When you us the Add method to add an item to a List<T>, it checks to see if there is room in the existing array, and if not it:

  • Creates a new array, twice the size of the old one.
  • Copies each entry from the old array to the new one,
  • Sets the "last" entry in the new array to the new value, and moves the "last" indicator on by one.

Image 1

Think about what this means, just in terms of memory allocation and usage:

If you want to add 2 million elements to a List:

  • Adding the first creates an Array containing 4 elements.
  • Adding the fifth creates an Array containing 8 elements, and copies the existing 4
  • Adding the ninth creates an Array containing 16 elements, and copies the existing 8
  • Adding one more creates an Array containing 32 elements, and copies the existing 16
  • Adding one more creates an Array containing 64 elements, and copies the existing 32
  • Adding one more creates an Array containing 128 elements, and copies the existing 64
  • Adding one more creates an Array containing 256 elements, and copies the existing 128
  • Adding one more creates an Array containing 512 elements, and copies the existing 256
  • Adding one more creates an Array containing 1024 elements, and copies the existing 512
  • Adding one more creates an Array containing 2048 elements, and copies the existing 1024
  • Adding one more creates an Array containing 4096 elements, and copies the existing 2048
  • Adding one more creates an Array containing 8192 elements, and copies the existing 4096
  • Adding one more creates an Array containing 16384 elements, and copies the existing 8192
  • Adding one more creates an Array containing 32768 elements, and copies the existing 16384
  • Adding one more creates an Array containing 65536 elements, and copies the existing 32768
  • Adding one more creates an Array containing 131072 elements, and copies the existing 65536
  • Adding one more creates an Array containing 262144 elements, and copies the existing 131072
  • Adding one more creates an Array containing 524288 elements, and copies the existing 262144
  • Adding one more creates an Array containing 1048576 elements, and copies the existing 524288
  • Adding one more creates an Array containing 2097152 elements, and copies the existing 1048576

So to get sufficient space to hold your data, .NET will do 20 separate memory allocations (using both the small and large object heaps), and allocate and copy 2,097,148 elements that you will never use.

OK, that’s not a lot of memory these days – if the elements are reference types it’s 8 bytes per element for a 64bit system - but that means nearly twice the memory you need will be allocated unnecessarily, and the copy operations will not be trivial in time terms. It may even kick in the Garbage Collector a couple of times if your element count / size combination gets big enough.

Inserting items

When you use the Insert method, it's worse. A lot worse.

Start by doing the Add checks, and allocate and copy if there isn't enough room.

Then, move every element of the array with an index greater than or equal to the value of the insert point index one place to the right.

So if you insert a single element at the begining of a List<T> you potentially will copy the entire array twice in effect: once to allocate enough space, and then again to make room at the front.

Image 2

Deleting items

Delete is better than Insert, becasue it doesn't need the check-the-size-and-copy operations - but it still requires the processor to move every single element of the array with an index greater than or equal to the value of the delete point index one place to the left.

Image 3

Cost of moving memory

Moving and copying memory is slow: there are hardware instructions in most processors to speed the operation up (mostly by removing the setup costs at the start of each instruction) but it still requires the processor to fetch the data from memory (or it's cache(s)) and write the value to a new location (which will normally be write-through on the cache rather than "held" internally).

If you are copying or moving a lot of information, it is quite possible that you will exceed the processor caches, and that means that the data caches will end up containing nothing except your data! Bulk data copies are slow, and should be avoided at all costs - they can slow your whole system!

For example, you might be trying to add elements of an existing array in reverse order, so instead of using Add, you use Insert with a zero index:

int elements = 50000;
int[] data = Enumerable.Range(1, elements).ToArray();
long addSum = 0;
long insertSum = 0;
for (int i = 0; i < 20; i++)
    {
    Stopwatch s1 = new Stopwatch();
    s1.Start();
    List<int> lAdd = new List<int>();
    for (int index = 0; index < elements; index++)
        {
        lAdd.Add(data[index]);
        }
    s1.Stop();
    Stopwatch s2 = new Stopwatch();
    s2.Start();
    List<int> lInsert = new List<int>();
    for (int index = 0; index < elements; index++)
        {
        lInsert.Insert(0,data[index]);
        }
    s2.Stop();
    addSum += s1.ElapsedMilliseconds;
    insertSum += s2.ElapsedMilliseconds;
    Console.WriteLine("{0}:{1}", s1.ElapsedMilliseconds, s2.ElapsedMilliseconds);
    }
Console.WriteLine("-----");
Console.WriteLine("Add:    Total   = {0}\n        Average = {1}", addSum, addSum / 20);
Console.WriteLine("Insert: Total   = {0}\n        Average = {1}", insertSum, insertSum / 20);

 

And the results we get:

1:694
0:632
0:649
0:636
0:635
0:632
0:633
0:623
0:627
0:635
0:631
0:632
0:708
0:654
0:642
0:629
0:639
0:636
0:625
0:629
-----
Add:    Total   = 1
        Average = 0
Insert: Total   = 12821
        Average = 641

Speak for themselves!

Over 600 times slower to do inserts than to add items. And bear in mind, these are only 50,000 integers we are moving here! (I initially tried with 500,000 to get a "reasonable" figure for the Add operations - but the Inserts crashed my debugger :omg: )

 

Summary

I’m not saying that you shouldn't use List<T> - no way, it’s one of the most useful classes in .NET – but you should be aware that using it can have a serious performance impact if you do the wrong thing, or you don’t think carefully about exactly how you are using it.

  • If you know (even roughly) how many elements you will be adding to your list in advance, use the overloaded constructor to allocate a suitable sized array to start with: That will save both memory allocations and time in copying elements form one array to another.
  • If you want a one dimensional array, you are probably better off using a List<T> instead. You will use the same amount of memory, access time will be about the same, and you will gain flexibility in your code.
  • Avoid inserting elements near the front of a list: each time you do, you will move all subsequent elements in memory. Where ever possible, use Add instead to avoid this.
  • Use InsertRange and RemoveRange whenever possible instead of individual Insert and Remove operations.
  • When deleting elements, think very carefully. If you can delete from the tail forward instead of head back, do – it saves moving elements you are going to delete anyway.
  • If you are going to insert or delete more than a couple of elements that do not have sequential indexes, think carefully: it may be more efficient to construct a new List<T> and do the copying yourself. But this is a difficult call: Array.Copy (used internally by the insert and Remove methods) is a lot faster than individual element copying as it can use “bulk move” codes. It may be worth timing your code to work this out - but it''s going to depend on too many variables to provide a generalized "this is faster" recommendation.
  • Always use List<T>.Clear or create a new List<T> instead of deleting the elements of an existing List<T> one-by-one.
  • You should also be aware that it means that there is a maximum size for a List<T> - because it is an Array internally, that means it is subject to the same size restrictions. And that means that a List<T> can never be bigger than 2GB - the largest any single object in .NET can be. So for a 64 bit system, a List<AnyReferenceType> can only hold 268,435,456 elements (twice that for 32 bit applications). The number of elements you can get in a List<AnyValueType> will depend on the size of the value type and any "packing" .NET does or doesn't do.

History

2019-03-18 Typos.

2015-01-27 Original Version.

2014-03-05 V1.0

The first version got some strange comments from a very small number of readers: 13 out of 12,009 views were negative - and it seemed that having explanations of what an Array and a List were at the top meant that some people just didn't read beyond that, and immediately voted this a one. That's not a problem in itself - everyone is allowed an opinion - but it implies that they didn't get to the "meat" of the article and missed why the use of an Array for something called a List can cause problems if you don't know what is going on behind the scenes.

So, the initial intro was ripped out, and replaced with a more detailed explanation of what actually happens when you use a List<T> - and particularly when you misuse a List<T>

 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

OriginalGriff
CEO
Wales Wales
Born at an early age, he grew older. At the same time, his hair grew longer, and was tied up behind his head.
Has problems spelling the word "the".
Invented the portable cat-flap.
Currently, has not died yet. Or has he?

Comments and Discussions

 
General.net arrays probably aren't as efficient as many think either Pin
PIEBALDconsult24-Mar-17 13:36
protectorPIEBALDconsult24-Mar-17 13:36 
Generaldon't panic - and moreover an alternative Pin
Mr.PoorEnglish6-Sep-15 11:06
memberMr.PoorEnglish6-Sep-15 11:06 
GeneralMy Vote of 5 Pin
_Asif_25-Mar-15 20:30
professional_Asif_25-Mar-15 20:30 
GeneralMy vote of 5 Pin
Raul Iloc15-Mar-15 23:18
professionalRaul Iloc15-Mar-15 23:18 
QuestionAlternatives? Pin
Nadeem Jamali9-Mar-15 4:58
memberNadeem Jamali9-Mar-15 4:58 
AnswerRe: Alternatives? Pin
Dmitriy Gakh29-Apr-16 16:38
professionalDmitriy Gakh29-Apr-16 16:38 
GeneralMy vote of 2 Pin
Member 76977548-Mar-15 21:31
memberMember 76977548-Mar-15 21:31 
QuestionIt's as good as an array Pin
darrellp7-Mar-15 13:31
memberdarrellp7-Mar-15 13:31 
AnswerRe: It's as good as an array Pin
Dmitriy Gakh29-Apr-16 16:36
professionalDmitriy Gakh29-Apr-16 16:36 
GeneralNo mention of amortization? Pin
Andrew Raybould7-Mar-15 10:15
memberAndrew Raybould7-Mar-15 10:15 
QuestionIs using a linked list any better than using an array Pin
Alaajabre6-Mar-15 18:26
memberAlaajabre6-Mar-15 18:26 
AnswerRe: Is using a linked list any better than using an array Pin
OriginalGriff6-Mar-15 23:21
protectorOriginalGriff6-Mar-15 23:21 
AnswerRe: Is using a linked list any better than using an array Pin
cor287912-Mar-15 6:34
membercor287912-Mar-15 6:34 
AnswerRe: Is using a linked list any better than using an array Pin
Dmitriy Gakh29-Apr-16 16:43
professionalDmitriy Gakh29-Apr-16 16:43 
GeneralMy vote of 5! Pin
jediYL6-Mar-15 17:14
professionaljediYL6-Mar-15 17:14 
Questiongreat article Pin
DaveBlack6-Mar-15 10:53
memberDaveBlack6-Mar-15 10:53 
AnswerRe: great article Pin
Dmitriy Gakh29-Apr-16 16:49
professionalDmitriy Gakh29-Apr-16 16:49 
Suggestion[My vote of 1] Could be reduced to a two-sentence tip Pin
mldisibio6-Mar-15 8:49
membermldisibio6-Mar-15 8:49 
GeneralMy vote of 5 Pin
Vladimir Nikitenko6-Mar-15 8:36
memberVladimir Nikitenko6-Mar-15 8:36 
GeneralWhile this is definitely good information Pin
cor28796-Mar-15 8:02
membercor28796-Mar-15 8:02 
Question13 out of 12,009 views were negative?? Pin
George Swan5-Mar-15 8:47
memberGeorge Swan5-Mar-15 8:47 
AnswerRe: 13 out of 12,009 views were negative?? Pin
OriginalGriff6-Mar-15 5:44
protectorOriginalGriff6-Mar-15 5:44 
GeneralRe: 13 out of 12,009 views were negative?? Pin
Super Lloyd14-Jul-16 19:16
memberSuper Lloyd14-Jul-16 19:16 
GeneralMy vote of 5 Pin
newton.saber5-Mar-15 3:49
membernewton.saber5-Mar-15 3:49 
GeneralRe: My vote of 5 Pin
OriginalGriff5-Mar-15 3:56
protectorOriginalGriff5-Mar-15 3:56 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Article
Posted 1 Feb 2015

Tagged as

Stats

72.2K views
45 bookmarked