|
Hi all,
I'm writing my own matrix library and have been copying data around the standard way. For example, if I have a pointer to a block of doubles, say
T * my_doubles = new T[64];
representing an 8 by 8 array, and I want a class's copy constructor to duplicate it, I just use the following code in the copy constructor:
for (unsigned int count = 0; count < 64; count++)
    new_doubles[count] = my_doubles[count];
However, I have read online that there are faster ways to copy memory, mainly memcpy. I have also learned that some people are against using it, because the compiler already optimizes code like the loop above to be as efficient as possible.
So which is it? Regarding SPEED, is it really best to let the compiler optimize, or to use memcpy, or some other method? And if so, what are these methods?
Many thanks,
Paul
|
|
|
|
|
Why don't you do some tests?
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler.
-- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong.
-- Iain Clarke
[My articles]
|
|
|
|
|
Hmm... Fastest way to copy memory.
I was thinking of a memory copying API tied to a Hayabusa.
It is a crappy thing, but it's life -^ Carlo Pallini
|
|
|
|
|
Indeed: my pen drive is quite fast memory when I ride the GSR...
BTW: nice THHB action, today...
|
|
|
|
|
|
So, you mean to say that we cannot talk among ourselves? Did you not see the joke icon even? Why the "unhelpful" vote?
Take it easy, those replies were not given to you.
|
|
|
|
|
Rajesh R Subramanian wrote: Why the "unhelpful" vote?
It doesn't come from him. It comes from his friend...
|
|
|
|
|
Hey, I didn't write your replies for you. Chill.
However, I did run some tests, and memcpy is way faster. In fact, whereas memcpy comes in at 0 clocks (for a small copy routine), the pointer copy and the index copy both come in at around 1300 cycles.
WOW !
|
|
|
|
|
I did actually suggest that you run some tests. What's better than experimental evidence?
As a guess, I suppose that a library function may be faster than your (even optimized) code. I would try a Win32 API function like CopyMemory (if code portability is not a concern).
|
|
|
|
|
Thanks - the memcpy function seems lightning fast. For a small copy routine, memcpy came in at 0 cycles, whereas the index and pointer copies came in at around 1300 cycles for the same test.
I'm sticking with memcpy!
Cheers,
|
|
|
|
|
Rajesh R Subramanian wrote: tied to a Hayabusa
Is this how your little monkeys get their answers so fast?
Best Wishes,
-David Delaune
|
|
|
|
|
Usually, the steroid injected, genetically altered bananas do the magic. But yes, sometimes they need a superbike.
|
|
|
|
|
I tested this before:
in my test, incrementing pointers was faster than indexing.
e.g.
WORD *p0 = ...;
WORD *p1 = ...;
this code was faster:
for (i = 0; i < 1000; i++)
{
    *p0 = *p1;
    p0++;
    p1++;
}
than:
for (i = 0; i < 1000; i++)
{
    p0[i] = p1[i];
}
|
|
|
|
|
Thanks, I'll try something similar to this. Cheers for the helpful comments.
|
|
|
|
|
Such timing experiments are tricky. Here are some things to consider:
- compiler settings, including debug vs release
- timer inaccuracies; make sure to use a timer with good accuracy
- cache effects; when testing several alternatives, the first will often lose because the data hasn't been cached yet.
etc.
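As a rough sketch of how those points could be handled in a test (this is only an illustration; `best_time_ns`, `copy_by_index` and `copy_by_memcpy` are hypothetical names, not something from this thread): do one untimed warm-up run so every alternative starts with the data cached, repeat the measurement, and keep the best time. Build it in release mode, and read something from the destination afterwards so the optimizer cannot throw the copy away. A reading of 0 may simply mean the timer is too coarse or the copy was optimized out.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>

// Hypothetical helper: runs copy_fn once as a warm-up (not timed), then
// 'reps' more times, and returns the smallest single-run time in nanoseconds.
template <typename Fn>
long long best_time_ns(Fn copy_fn, int reps)
{
    assert(reps > 0);
    using clock = std::chrono::steady_clock;

    copy_fn();  // warm-up: pulls the data into cache before any timing

    long long best = -1;
    for (int r = 0; r < reps; ++r) {
        const auto start = clock::now();
        copy_fn();
        const auto stop = clock::now();
        const long long ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        if (best < 0 || ns < best) {
            best = ns;
        }
    }
    return best;
}

// Two of the alternatives discussed in this thread.
void copy_by_index(double* dst, const double* src, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}

void copy_by_memcpy(double* dst, const double* src, std::size_t n)
{
    std::memcpy(dst, src, n * sizeof(double));
}
```

A call would look like `best_time_ns([&] { copy_by_memcpy(dst, src, n); }, 20)`. Use a buffer much larger than 64 elements if the timer resolution is in doubt.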
Luc Pattyn [Forum Guidelines] [My Articles]
The quality and detail of your question reflects on the effectiveness of the help you are likely to get.
Show formatted code inside PRE tags, and give clear symptoms when describing a problem.
|
|
|
|
|
memcpy is generally fastest. I've found several places where VS2008 (at least) optimized a loop like the one above into a memcpy.
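One caveat for a templated matrix: memcpy is only safe when T is trivially copyable (double is, a class with its own copy constructor is not). A minimal sketch, assuming a hypothetical `Matrix` class (the original poster's class is not shown, so the member names here are made up): `std::copy` works for any T, and common standard library implementations lower it to a memmove-style block copy for trivially copyable types, so it usually costs nothing over calling memcpy directly.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical minimal matrix class, for illustration only.
template <typename T>
class Matrix
{
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(new T[rows * cols]())  // value-initialized
    {
    }

    // Copy constructor: allocate a new block, then copy every element.
    Matrix(const Matrix& other)
        : rows_(other.rows_), cols_(other.cols_), data_(new T[other.rows_ * other.cols_])
    {
        std::copy(other.data_, other.data_ + rows_ * cols_, data_);
    }

    // Copy assignment: build the new block first, then release the old one.
    Matrix& operator=(const Matrix& other)
    {
        if (this != &other) {
            const std::size_t count = other.rows_ * other.cols_;
            T* fresh = new T[count];
            std::copy(other.data_, other.data_ + count, fresh);
            delete[] data_;
            data_ = fresh;
            rows_ = other.rows_;
            cols_ = other.cols_;
        }
        return *this;
    }

    ~Matrix() { delete[] data_; }

    T& at(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }
    const T& at(std::size_t r, std::size_t c) const { return data_[r * cols_ + c]; }

private:
    std::size_t rows_;
    std::size_t cols_;
    T* data_;
};
```

Whether this beats a hand-written loop on a given compiler is still something to measure rather than assume.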
|
|
|
|
|
I chose a few memory copying functions and profiled them, and I see that memcpy is the fastest (VS 2008). I haven't gone as far as checking what the compiler optimized the code into; I only profiled, several times. memcpy was slightly faster on average.
Also, a modern compiler should be able to optimize your code into something that's fastest, unless of course you write deliberately bad code.
|
|
|
|
|
Many compilers (including MS C++) support intrinsic function calls, which are inlined during the compile phase. This includes calls to memcpy, memcmp, memset, strcpy, acos, asin and more. If you have not already enabled intrinsic functions, you might want to give it a try.
For more information on intrinsic functions click --> here <--
1300 calories of pure beef goodness can't be wrong!
|
|
|
|
|
I knew of intrinsic functions, but I hadn't known that memcpy was amongst the list of functions that the MS C++ compiler has intrinsic versions of.
Thanks.
|
|
|
|
|
memcpy will be faster, in general.
and if you want to get crazy, an SSE/MMX version of memcpy can be even faster.
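For the curious, here is a minimal sketch of what an SSE2 copy of doubles could look like. `copy_doubles_sse2` is a hypothetical name, and the code falls back to plain memcpy when SSE2 is not available at compile time. It uses unaligned loads and stores, so it makes no alignment assumptions. No speed-up is promised: library memcpy implementations often use SIMD instructions already, so measure before adopting anything like this.

```cpp
#include <cstddef>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2_COPY 1
#endif

// Hypothetical helper: copies 'count' doubles from src to dst.
// The two buffers must not overlap.
void copy_doubles_sse2(double* dst, const double* src, std::size_t count)
{
#ifdef HAVE_SSE2_COPY
    std::size_t i = 0;
    // Copy two doubles (16 bytes) per iteration with unaligned load/store.
    for (; i + 2 <= count; i += 2) {
        const __m128d v = _mm_loadu_pd(src + i);
        _mm_storeu_pd(dst + i, v);
    }
    // Copy the last element when count is odd.
    if (i < count) {
        dst[i] = src[i];
    }
#else
    // No SSE2 at compile time: fall back to the library routine.
    std::memcpy(dst, src, count * sizeof(double));
#endif
}
```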
|
|
|
|
|
Here is some info copied from a Google search:
============================
FILETIME
Introduction: It holds the elapsed time since 00:00:00 on Jan 1, 1601, expressed in units of 100 nanoseconds.
The conversion formula from FILETIME to time_t is
( FILETIME - 0x19DB1DED53E8000 ) / 10000000;
Recordable range from : 00:00:00 on Jan 1, 01601 ( 00000000 : 00000000 )
to : 14:36:10 on Mar 28, 60056 ( FFFFFFFF : FFFFFFFF )
=============================
From the info above, the unit of FILETIME is 100 nanoseconds, or 10^(-4) milliseconds.
Is my understanding of the FILETIME unit correct?
If yes, how do we generate such a small time value for the unit?
|
|
|
|
|
The reported info is correct.
includeh10 wrote: from info above, unit in FILETIME is 100 nano-seconds, or 10^(-4) millisecond.
Is my understanding about FILETIME unit correct?
If you're curious, try calling the QueryPerformanceFrequency function on your system.
Anyway, I guess the system (HW + OS) is allowed to be less accurate than the 0.1 microsecond interval specified by the FILETIME struct. For instance, if the system is accurate only to a 1 millisecond interval, the FILETIME struct info is still valid (you'll always get 4 zeroes at the end).
|
|
|
|
|
Yes, the units are 100ns. So, to convert a FILETIME to seconds, divide by 10,000,000.
includeh10 wrote: If yes, how do we generate a so small time value for the unit?
? You don't. That's the point - the units of FILETIME are small enough that 1 FILETIME is smaller than just about any time you'll ever want to express.
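To make the arithmetic concrete, here is a small sketch of the conversion quoted in the question, written without windows.h so it compiles anywhere. The function names are hypothetical. It assumes the FILETIME value has already been combined into one 64-bit tick count and that it is not earlier than Jan 1, 1970 (otherwise the unsigned subtraction wraps around).

```cpp
#include <cstdint>
#include <ctime>

// Number of 100 ns ticks in one second.
const std::uint64_t kTicksPerSecond = 10000000ULL;

// Ticks between Jan 1, 1601 and Jan 1, 1970 (0x19DB1DED53E8000).
const std::uint64_t kEpochDifference = 116444736000000000ULL;

// Hypothetical helper: joins the two 32-bit halves of a FILETIME
// (high part, low part) into a single 64-bit tick count.
std::uint64_t combine_filetime(std::uint32_t high, std::uint32_t low)
{
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Hypothetical helper: converts a tick count to whole seconds since
// Jan 1, 1970. Assumes ticks >= kEpochDifference.
std::time_t filetime_ticks_to_time_t(std::uint64_t ticks)
{
    return static_cast<std::time_t>((ticks - kEpochDifference) / kTicksPerSecond);
}
```

The subtraction shifts the starting point from 1601 to 1970, and the division turns 100 ns ticks into seconds.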
Java, Basic, who cares - it's all a bunch of tree-hugging hippy cr*p
|
|
|
|
|
Stuart Dootson wrote: That's the point - the units of FILETIME are small enough that 1 FILETIME is smaller than just about any time you'll ever want to express.
Not valid for me: I reach almost the light speed on my GSR...
|
|
|
|
|
CPallini wrote: Not valid for me: I reach almost the light speed on my GSR
Is too valid! - c = 30m/FILETIME!
|
|
|
|