I have made a VS2013 project to test OpenCL. It is in the OpenCL directory of this GitHub repository:

GitHub - jlopez2022/cpp_utils: Example of c++ programs


In that example I calculate the differential RMS of a large vector (200 mega elements). On the CPU in debug mode it ran at about 100 Mops/data.

On the CPU in release mode the rate was about 400 Mops/data (so I suppose it used the 4 cores in parallel).

I then ran it on the GPU and obtained 600 Mops/data.

So in theory, if I used a 6-core CPU I should beat the GPU, unless the CPU-GPU bandwidth were increased.

The CPU was a 4-core E5 at 3.5 GHz.
The GPU was a Radeon R9 390 with 2560 cores at 1 GHz.

In theory the GPU is 182 times faster than the CPU, but I suppose that unfortunately the CPU needs a lot of time to copy the data to GPU memory.

What I have tried:

GitHub - jlopez2022/cpp_utils: Example of c++ programs
Updated 29-Aug-17 1:56am

1 solution

The speed-up in release mode does not come from parallel processing. It comes from the compiler optimising the code in release mode and omitting the additional checks that debug builds perform.

You have to explicitly write code for parallel processing.
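
As an illustration of what explicit parallel code could look like on the CPU, here is a minimal sketch using std::thread. It is not the code from the repository. It assumes that "differential RMS" means the root mean square of the differences between consecutive elements, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Sum of squared differences x[i] - x[i-1] for i in [begin, end).
static double sum_sq_diff(const std::vector<float>& x, std::size_t begin, std::size_t end) {
    double sum = 0.0;
    for (std::size_t i = begin; i < end; ++i) {
        const double d = static_cast<double>(x[i]) - x[i - 1];
        sum += d * d;
    }
    return sum;
}

// Assumed definition: sqrt(mean((x[i] - x[i-1])^2)) over all consecutive pairs.
double differential_rms_parallel(const std::vector<float>& x, unsigned num_threads) {
    assert(num_threads > 0);
    if (x.size() < 2) return 0.0;
    const std::size_t n_diff = x.size() - 1;  // number of differences
    num_threads = static_cast<unsigned>(std::min<std::size_t>(num_threads, n_diff));
    const std::size_t chunk = (n_diff + num_threads - 1) / num_threads;

    std::vector<double> partial(num_threads, 0.0);  // one slot per thread, no locking needed
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = 1 + t * chunk;
        const std::size_t end = std::min(begin + chunk, x.size());
        workers.emplace_back([&x, &partial, t, begin, end] {
            partial[t] = sum_sq_diff(x, begin, end);
        });
    }
    for (auto& w : workers) w.join();

    double total = 0.0;
    for (double p : partial) total += p;
    return std::sqrt(total / n_diff);
}
```

A call such as differential_rms_parallel(data, std::thread::hardware_concurrency()) would use one thread per logical core, provided hardware_concurrency() returns a non-zero value. A loop this simple is often limited by memory bandwidth, so the speed-up may be well below the number of cores.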

Which method is ultimately faster (CPU code with parallel processing or GPU code) cannot be answered in general. The only reliable approach is to implement both and compare the results. The outcome depends on the hardware used (CPU / GPU, number of threads and cores), so the results differ from system to system.

In such cases you can implement both and provide a user option to select the method, or run short tests so that your application chooses the fastest method.
 
 
Comments
CPallini 29-Aug-17 8:41am    
5.
Javier Luis Lopez 30-Aug-17 2:50am    
You may be right. In my code there is a lot of memory transfer between CPU and GPU, which is typical in video applications. But if there are many operations per data element, the GPU will be a good option.

>For such cases you can still implement both and provide a user option to select the method or perform short tests to let your application choose the fastest method.

My code on GitHub includes both options. Anybody can modify it to compare GPU and CPU methods.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


