ABC#
The purpose of this document is to point out what I firmly believe is a bug in the .NET compiler for C#. This bug is very well hidden, but easy to demonstrate in testing.
Before I get too far along, a little background. I have been coding professionally since 1982. I have a very deep understanding of programming concepts, and I do not need to hear about which variables need to be public, etc. I do not need a programming lesson. I would like to see nontrivial examples in books and on the internet, but I guess that void will be filled by me at some time in the future.
Here is the task: count the number of times event X occurs throughout time Y. The reality is that there are millions of events and thousands of event types, covering over a decade, and we’d like to see the percentage of change over time. To capture the events and the times, we load them into the first available slot, and whenever we add a new value, we sort the array so that we can perform a binary search prior to the next increment. Executing millions of "For loops" in two dimensions makes the computer squirm.
Simple task. Simple solution. Let’s look at the very basic parts. We need a two-dimensional array with a counter to increment.
So let’s be like everybody else and create a trivial example. Both in .NET. One in C#, the other in C++. We’ll compare and contrast the code and the results, line by line, concept by concept. Prepare to be baffled.
Part I: Define / Declare variables
In C#
public int i;
public int j;
public struct ytimeframe
{
    public int ytime;
}
public struct xevents
{
    public int id;
    public ytimeframe[] xy_array;
}
public xevents[] xevents_array;
In English, variables i and j are numbers used to index the X and Y axes of the array. The Y axis is a number; in the real world it would be a month and a year. The X axis is the id of the event. We do not know up front what the ids are or how many there could be, but we make a reasonable estimate and test the limit in real life. For this trivial example we’ll have 5 events and 5 time periods. We can see that in C# we declare that there will be an array containing an array, but we cannot size it yet.
In C++
int i = 0;
int j = 0;
struct ytimeframe
{
    int ytime;
} cppisdumb;
struct xevents
{
    int id;
    ytimeframe xy_array[5];
} xevents_array[5];
The difference in C++ is that the size of the arrays is set up front. I declared cppisdumb right after the struct definition out of habit, although C++ does not require it: a struct definition can end with just the closing brace and a semicolon. I know that a struct with only one item is silly, but let’s put that to the side for now.
Part II: Initialize the size of the array
In C#
xevents_array = new xevents[5];
for (j = 0; j < 5; j++)
    xevents_array[j].xy_array = new ytimeframe[5];
In C#, here is where we need to define the size of the arrays. This is not for the timid, as each Y array within X must be sized too.
In C++ this is already done above; the size is part of the declaration.
Part III: Initialize the array values
In C#
for (j = 0; j < 5; j++)
    xevents_array[0].xy_array[j].ytime = j;
xevents_array[0].id = 0;
for (i = 1; i < 5; i++)
{
    xevents_array[i] = xevents_array[0];
    xevents_array[i].id = i;
}
Now for the uninitiated, no pun intended, this is an advanced concept. We set the initial value of each Y element of the first X. Then we set the id of the first X element. Then we copy the first X element into each of the other X elements and give each its own id. Build a real array and then do the math.
In C++
for (j = 0; j < 5; j++)
    xevents_array[0].xy_array[j].ytime = j;
xevents_array[0].id = 0;
for (i = 1; i < 5; i++)
{
    xevents_array[i] = xevents_array[0];
    xevents_array[i].id = i;
}
Notice that, for sanity, I did not set everything to zero, but instead set each value to the index of its axis. Also note that the code is exactly the same in C# and C++.
Part IV: display the contents of the array – before incrementing (or setting) any values
In C#
Console.WriteLine("Before ");
for (i = 0; i < 5; i++)
    for (j = 0; j < 5; j++)
        Console.WriteLine(xevents_array[i].id
            + " "
            + xevents_array[i].xy_array[j].ytime);
In C++
std::cout << "Before "
    << std::endl;
for (i = 0; i < 5; i++)
    for (j = 0; j < 5; j++)
        std::cout << xevents_array[i].id
            << " "
            << xevents_array[i].xy_array[j].ytime << std::endl;
There is nothing strange here other than the syntax. The output is the same.
Before
0 0
0 1
0 2
0 3
0 4
1 0
1 1
1 2
1 3
1 4
2 0
2 1
2 2
2 3
2 4
3 0
3 1
3 2
3 3
3 4
4 0
4 1
4 2
4 3
4 4
Part V: increment the counter
In C#
xevents_array[0].id = 454;
xevents_array[0].xy_array[0].ytime = 454;
In C++
xevents_array[0].id = 454;
xevents_array[0].xy_array[0].ytime = 454;
Here is where I get tongue in cheek. When I first noticed the issue I was doing the normal "++" thing, until I noticed that it was acting poorly. Notice that the syntax, again, is exactly the same. Because I knew how screwy the results were, I changed the instruction to a simple assignment of a specific value at a specific index. The real code is an increment of a counter.
Part VI: display the contents of the array – after incrementing
In C#
Console.WriteLine("After ");
for (i = 0; i < 5; i++)
    for (j = 0; j < 5; j++)
        Console.WriteLine(xevents_array[i].id
            + " "
            + xevents_array[i].xy_array[j].ytime);
After
454 454
454 1
454 2
454 3
454 4
1 454
1 1
1 2
1 3
1 4
2 454
2 1
2 2
2 3
2 4
3 454
3 1
3 2
3 3
3 4
4 454
4 1
4 2
4 3
4 4
Even though the assignment had both indexes hard-coded, the first Y of every X was set to 454!
In C++
std::cout << "After "
    << std::endl;
for (i = 0; i < 5; i++)
    for (j = 0; j < 5; j++)
        std::cout << xevents_array[i].id
            << " "
            << xevents_array[i].xy_array[j].ytime
            << std::endl;
After
454 454
454 1
454 2
454 3
454 4
1 0
1 1
1 2
1 3
1 4
2 0
2 1
2 2
2 3
2 4
3 0
3 1
3 2
3 3
3 4
4 0
4 1
4 2
4 3
4 4
Part VII: a discussion about what the compiler did
In C#
We got what we wanted in the definition and in the declaration of the size of the array. We even got the initialization exactly as we expected.
Where C# went "south" is in the assignment. Any and all attempts to set 0,0 and nothing else have failed.
In C++
We got exactly what we wanted. We got what we would get in Fortran, Pascal, Cobol, VB – all languages I have had the pleasure of solving this task in.
My conclusion is that the .Net C# Compiler has a booboo.