
Pointer Arithmetic and Portable Code

By Zeeshan Amjad, 3 Mar 2004
 

Introduction

One of my students, after completing her degree, went to a job interview and was given this question in her test:

    char* pC = "Hello World";

    int* pInt = (int*)pC;
    ++pInt;
    char* pChar = (char*)pInt;

    cout << *pChar << endl;

She was asked to predict the output of this program, and she tried her best to answer. After coming home, she contacted me to confirm her understanding. I was surprised by this code for two reasons. First, the use of cout shows that the code is C++, and C-style casts are discouraged in C++ code [MEY98]; the C++-style casts should be used instead. So the first improvement to the code would look something like this:

    char* pC = "Hello World";

    int* pInt = reinterpret_cast<int*>(pC);
    ++pInt;
    char* pChar = reinterpret_cast<char*>(pInt);

    cout << *pChar << endl;

Although this code is better than the previous version, and is standard C++ that a C++98-conforming compiler will accept, it is not portable, which is the second problem. The output depends on the platform on which the program runs. According to the C++ standard, section 3.9.1, paragraph 2, "Plain ints have the natural size suggested by the architecture of the execution environment" [ISO98].

One might think of using the sizeof operator instead. Before discussing the problems with sizeof, remember that this is pointer arithmetic: adding 1 to an int pointer does not add one to the address, it advances the address by sizeof(int) bytes. In addition, the result of sizeof is itself not portable across platforms. According to section 5.3.3 of the C++ standard, "the result of sizeof applied to any other fundamental type is implementation-defined" [ISO98]. Here, "any other" means any type other than char, signed char and unsigned char, for which sizeof is always 1.

The increment moves the pointer by 4 bytes on a typical 32-bit platform and by 2 bytes on a typical 16-bit platform. The output of this program is therefore "o" where the size of int is 4 bytes and "l" where it is 2 bytes.
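The effect described above can be checked with a small sketch. This is not the interview code: it measures the stride of an int pointer on a real int array, so no misaligned pointer is ever formed, and then indexes the string at a given byte offset. The helper names strideOfInt, charAtOffset and describeOutputs are made up for this illustration.

```cpp
#include <cstddef>
#include <ostream>

// Hypothetical helper: the number of bytes between two adjacent ints,
// which is how far ++pInt really moves the address.
std::size_t strideOfInt()
{
    int a[2] = {0, 0};
    const char* first  = reinterpret_cast<const char*>(&a[0]);
    const char* second = reinterpret_cast<const char*>(&a[1]);
    return static_cast<std::size_t>(second - first);
}

// Hypothetical helper: the character 'offset' bytes into s.
// The caller must keep offset inside the string.
char charAtOffset(const char* s, std::size_t offset)
{
    return s[offset];
}

// Writes the int stride of this platform and the characters the
// interview program would reach with 2-byte and 4-byte ints.
void describeOutputs(std::ostream& out)
{
    const char* pC = "Hello World";
    out << "int stride here: " << strideOfInt() << " byte(s)\n";
    out << "2-byte int prints: " << charAtOffset(pC, 2) << '\n';
    out << "4-byte int prints: " << charAtOffset(pC, 4) << '\n';
}
```

Calling describeOutputs(std::cout) from main (with <iostream> included) prints the stride for the current platform next to the two cases discussed in the text.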

This is not limited to int. The sizes of bool and wchar_t are also implementation-defined [ISO98], and any code that makes assumptions about them is not portable. For example, this code is not portable:

    char* pC = "Hello World";
    pC += sizeof(int);

    cout << *pC << endl;

Similarly, this code is not portable either:

    char* pC = "Hello World";
    pC += sizeof(bool);

    cout << *pC << endl;
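One possible portable alternative, sketched under the assumption that the intent of such code is to skip a fixed number of characters: step by an explicit character count rather than by the size of an unrelated type. The helper name advanceChars is hypothetical, and static_assert assumes a C++11 or later compiler.

```cpp
#include <cstddef>

// The one size the standard does guarantee: sizeof(char) is always 1.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

// Hypothetical helper: advance through a string by a count of characters.
// The caller must keep count within the string (including its terminator).
const char* advanceChars(const char* s, std::size_t count)
{
    return s + count;
}
```

With this helper, *advanceChars("Hello World", 4) refers to the same character on every platform, because the offset no longer depends on sizeof(int) or sizeof(bool).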

It is even worse when you call a function that uses pointer arithmetic internally and you pass a different type to it.

void fun(wchar_t* pC)
{
    int iLen = strlen(reinterpret_cast<char*>(pC));
    // do something
}

Where wchar_t is wider than char, the value of iLen for ordinary ASCII text is typically one rather than the actual length of the string, because the unused high-order bytes of each wide character are zero and strlen stops at the first zero byte. (On a big-endian platform the zero byte comes first, so the result would be zero.) Some situations are even more dangerous, namely when you write to memory using pointer arithmetic directly or indirectly. One such example is:

void fun(wchar_t* pC1, wchar_t* pC2)
{
    // Do something
    strcpy(reinterpret_cast<char*>(pC1), 
            reinterpret_cast<char*>(pC2));
    // Do something
}

This code may run correctly on platforms where char and wchar_t have the same size, but where they differ, strcpy stops at the first zero byte, leaving a truncated and improperly terminated wide string that can crash later code. Write code that is portable across all platforms. Do not assume anything about the size of fundamental types, and be careful when using pointer arithmetic.
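A minimal sketch of how the two functions above could be written without reinterpreting the wide string as bytes, using the standard wide-character routines std::wcslen and std::wcscpy from <cwchar>. The wrapper names wideLength, wideCopy and byteWiseLength are hypothetical; byteWiseLength reproduces the problematic strlen call only for comparison.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Length in wide characters, counted by the wchar_t-aware function.
std::size_t wideLength(const wchar_t* s)
{
    return std::wcslen(s);
}

// Copies src into dest, including the terminating wide null.
// The caller must provide room for wideLength(src) + 1 wide characters.
void wideCopy(wchar_t* dest, const wchar_t* src)
{
    std::wcscpy(dest, src);
}

// For comparison only: the byte-wise count from the article's fun().
// The result depends on sizeof(wchar_t) and on byte order.
std::size_t byteWiseLength(const wchar_t* s)
{
    return std::strlen(reinterpret_cast<const char*>(s));
}
```

For example, wideLength(L"Hello") is 5 regardless of how wide wchar_t is, while byteWiseLength(L"Hello") varies with the platform.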

Thanks to Mahwish Waheed Khan for sharing her experience and giving me the example code, which is not portable across platforms.

References

  1. [ISO98] International Standard, Programming Language C++, ISO/IEC 14882
  2. [MEY98] Scott Meyers, Effective C++: 50 Specific Ways to Improve Your Programs and Designs, 2nd edition

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Zeeshan Amjad
Software Developer (Senior) Bechtel Corporation
United States
Working as a C++ Developer at Bechtel Corporation.


Comments and Discussions

 
General: It's really C code that used cout instead of printf (scienceprogrammer, 18 Jan '09 - 18:43)
Since clearly that's a C-style string and not a char array. lol
It just makes more sense that way: if she's using C and printf, the whole program is very typical. Most of the article just comes down to saying the output is "o" on most machines, but that depends on the architecture you're on and how many bytes are in an int.

General: reinterpret_cast<type *>(...) & (type *)... (TKD, 10 Mar '04 - 0:25)
Hi.
I tested your program in VC6.0.
I find that there is no difference between using reinterpret_cast<type *>(pointer variable) and (type *)pointer variable.
Here is the disassembly:
(1)int* pInt = reinterpret_cast<int*>(pC);
mov eax,dword ptr [ebp-4]
mov dword ptr [ebp-8],eax
(2)int* pInt = (int *)(pC);
mov eax,dword ptr [ebp-4]
mov dword ptr [ebp-8],eax
There is no difference between them.

 
The only educated men are self-educated.--J.Bennett

General: Re: reinterpret_cast<type *>(...) & (type *)... (Mike Dimmick, 10 Mar '04 - 1:32)
No, there isn't, and there shouldn't be.
 
However, there are two main problems with the C-style cast. The first is that it can be quite difficult to see in an expression, causing problems for maintenance programmers. The second is that it's uncontrollable - many semantically different operations are possible with the same syntax. It requires the programmer to understand the deep semantics of the language in order to understand the cast. This can make it difficult to determine the programmer's intent.
 
This is the reason for the separation of cast keywords in C++. const_cast is only ever used for removing const (and volatile) qualifiers. dynamic_cast allows safe casting down or across an object hierarchy by examining run-time type information (not available with a C-style cast). static_cast offers conversions that invert the promotion rules (e.g. from an int to a char), and 'unsafe' casting down a hierarchy without examining RTTI (it's also permissible, but not necessary, to use static_cast to perform conversions that would happen automatically). reinterpret_cast is for situations where a pattern of bits should be interpreted as a different type.
 
Using the new cast operators can make the maintenance programmer's life a lot easier.
 
Stability. What an interesting concept. -- Chris Maunder

Question: Is sizeof(char) guaranteed to be one byte/machine word? (Don Clugston, 1 Mar '04 - 14:14)
I don't have access to the standard, so I'm not sure about what's technically permissible.
In Microsoft's docs for sizeof() in VS 7.1, it states:
"The sizeof operator yields the size of its operand with respect to the size of type char." This leaves open the possibility of a bizarre compiler where a char is (say) 16 bits, and sizeof(int) returns 2 for a 32-bit int, and yet __pointers to bytes still exist__. In this case when you increment a char * pointer, its internal value would increase by 2.
 
In my application, I need to increment a void * pointer by 1. How to do this portably?
You can't cast it to an int and then increment it, because void * and int might not be the same size. So I've been casting to char *, incrementing, then casting back.
Is this effect guaranteed by the standard?
 
-Don.

Answer: Re: Is sizeof(char) guaranteed to be one byte/machine word? (edger, 1 Mar '04 - 14:37)
The size of a char is sure to be one byte (i.e. 8 bits), regardless of the compiler, while the size of int may be 16 bits or 32 bits depending on the compiler.

General: Re: Is sizeof(char) guaranteed to be one byte/machine word? (Don Clugston, 1 Mar '04 - 17:12)
I've done a web search, and discovered that's not always true:
 
e.g. on TI C30/C40 DSPs
sizeof(char) = sizeof(float) = sizeof(long) = sizeof(int) = 1
 
1 just happens to be 32 bits, which is the smallest addressable unit.
 
sizeof(char) always has to be 1, but I'm not sure that it has to be one byte. But it does seem that
char *p; p++; will always increment (int)p by 1.
Systems as obscure as this one are probably not worth worrying about. Interesting, though.

General: missing the point of the test (Harold Bamford, 1 Mar '04 - 5:21)
I completely agree with you wrt. coding style and portability of the test code.
 
But what do you suppose is the purpose of the test code? I would guess that the company was trying to determine if your student really understood what happens with pointer arithmetic.
 
Further, the style of coding is something you might very well find in legacy code.
 
It isn't an unreasonable test, IMHO.

General: Re: missing the point of the test (WREY, 1 Mar '04 - 7:07)
I agree, in that it looks like the main reason for the test was to see if the person understood pointer arithmetic, plus one more thing. AAMOF, I had to squint a couple of times before realizing the trick behind the question, in that on a Win16 machine, the result would be different than on a Win32 one (due to the size for type 'int').
 
I believe the company was asking a multiple question in the form of one, because by stating that the answer would be different depending on the OS, they would know that you are responding in part to the boundary alignment aspect of the question. If then they were to state that the OS was Win32, then the part about pointer arithmetic would have been the issue.
 
:suss:
 
William
 
Fortes in fide et opere!

General: Re: missing the point of the test (Harold Bamford, 1 Mar '04 - 7:13)
There is indeed that 'gotcha'! It could, however, be that the examiner hadn't even thought about that aspect. My father always says, "If something could be subtle or it could be stupid, bet on Stupid!"
 

General: Re: missing the point of the test (WREY, 1 Mar '04 - 10:52)
A wise man (your father)!
 
:-D
 
William
 
Fortes in fide et opere!

Last Updated 4 Mar 2004
Article Copyright 2004 by Zeeshan Amjad
Everything else Copyright © CodeProject, 1999-2013