Can anybody point me in the right direction for writing code that calculates pi using OpenCL? If you have any pi calculator sample code, please share it with me. Thanks for your interest.
Posted 28-Jul-11 23:51pm
Richard MacCutchan 29-Jul-11 6:31am
Search for math papers on how PI is calculated. Turning such formulae into code should not be too difficult.
caglarozbek89 29-Jul-11 6:36am
I have already searched the math papers and understood the logic of the calculation, but I am missing some points about writing the OpenCL code. My C++ code compiles successfully, but unfortunately the OpenCL code does not.
SAKryukov 2-Aug-11 5:11am
Do you need to employ parallelism?
caglarozbek89 5-Aug-11 7:53am
Yes, exactly..

Solution 2

The first question is "How many decimal places?" 1000000, 10000000, 100000000, 1000000000... ? This will influence the choice of an algorithm.

Then there is the issue of parallelizing the algorithm, possibly just by parallelizing the extended-precision arithmetic primitives (+, -, *, /, sqrt). Maybe this has already been done...
caglarozbek89 29-Jul-11 9:53am
Of course the number of decimal places influences the algorithm, but for me 20 digits is enough.

Solution 3

Then parallelizing the extended-precision arithmetic is pointless.

I recommend using the Machin formula:

16 arctg(1/5) - 4 arctg(1/239),

where each arctg is evaluated using the Maclaurin expansion

arctg(x) = Sum over i >= 0 of (-1)^i x^(2i+1)/(2i+1).

(Compute the powers of x by recurrence.)

20 digits can nearly be handled by a 64-bit integer (fixed-point), but you'll need a bit more. With three 32-bit integers, you are on the safe side. For convenience, you can work in arithmetic base 10000000000.

You'll need to implement long addition, multiplication, and division by a small integer. Given the small operand length, it is probably not worthwhile to use efficient multiplication algorithms (like Karatsuba).
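
To make the fixed-point idea concrete, here is a minimal sketch in Python. It is an illustration only: it leans on Python's built-in big integers instead of hand-written three-word arithmetic, and the names DIGITS, GUARD, SCALE, arctg_inverse and machin_pi_digits are made up for this example. Note that the only operations needed are addition, subtraction, multiplication by a small integer and division by a small integer.

```python
# Fixed-point Machin formula using integers only.
# Assumption: Python's arbitrary-precision ints stand in for the
# multi-word arithmetic you would have to write by hand in OpenCL.
DIGITS = 20                     # decimal digits wanted after the leading 3
GUARD = 5                       # extra digits that absorb truncation error
SCALE = 10 ** (DIGITS + GUARD)  # fixed-point scale factor

def arctg_inverse(n):
    """Return arctg(1/n) * SCALE (truncated) for an integer n > 1."""
    term = SCALE // n           # scaled x^(2i+1), starting with x = 1/n
    total = term
    n2 = n * n
    divisor = 3
    sign = -1
    while term:
        term //= n2             # next odd power of x, by recurrence
        total += sign * (term // divisor)
        sign = -sign
        divisor += 2
    return total

def machin_pi_digits():
    """Return pi * 10**DIGITS as an integer, using Machin's formula."""
    pi_scaled = 16 * arctg_inverse(5) - 4 * arctg_inverse(239)
    return pi_scaled // 10 ** GUARD   # drop the guard digits

print(machin_pi_digits())
```

The guard digits are there because every integer division truncates; without them the last printed digits could be off.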

Some hints for parallelization:

- let every processor compute a range of consecutive terms; every processor will need to start at some power of x (powers will be 2kN+1, 2kN+3, 2kN+5... for processor k among N), hence the need for fast power computation (by repeated squaring) to initialize.

- alternatively, let every processor accumulate every N-th term (powers 2k+1, 2N+2k+1, 4N+2k+1, 6N+2k+1... for processor k among N), multiplying every time by x^(2N).
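
The second hint (every N-th term per processor) can be sketched as follows. This is only a sequential simulation in floating-point, not OpenCL code: each call to the hypothetical arctg_partial plays the role of one work-item, and the final sum stands in for the reduction step. The function names and the default values n_workers=4 and n_terms=16 are assumptions chosen for the example.

```python
def arctg_partial(x, k, n_workers, n_terms):
    """Sum the series terms i = k, k+N, k+2N, ... (i < n_terms) for worker k."""
    step = x ** (2 * n_workers)   # x^(2N), applied once per iteration
    power = x ** (2 * k + 1)      # first power handled by worker k
    total = 0.0
    for i in range(k, n_terms, n_workers):
        sign = -1.0 if i % 2 else 1.0
        total += sign * power / (2 * i + 1)
        power *= step             # jump ahead to this worker's next power
    return total

def arctg_strided(x, n_workers=4, n_terms=16):
    # In a real kernel each k would be a separate work-item;
    # here the workers simply run one after another.
    partials = [arctg_partial(x, k, n_workers, n_terms)
                for k in range(n_workers)]
    return sum(partials)          # final reduction over the partial sums

pi_estimate = 16 * arctg_strided(1 / 5) - 4 * arctg_strided(1 / 239)
print(pi_estimate)
```

With this layout every worker needs only one initial power and one multiplication per term, at the cost of a small reduction at the end.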

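Below is a very crude implementation of Machin's formula in Python, using floating-point:

def ArcTg(X):
    # Maclaurin series of arctg(X), truncated after the X^15 term
    Sum = X
    Term = X
    Y = -X * X  # takes one odd power to the next and flips the sign
    for I in range(3, 17, 2):
        Term *= Y
        Sum += Term / I
    return Sum

print(16 * ArcTg(1. / 5) - 4 * ArcTg(1. / 239))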
caglarozbek89 2-Aug-11 3:56am
Thank you for your solution, but my problem is not about the logic of the pi calculation as an algorithm. My problem is about how to structure the pi calculation in the OpenCL language. That is why I would really appreciate it if you could suggest a solution in OpenCL.
YvesDaoust 2-Aug-11 4:09am
If you show your code and the places where compilation errors are raised, maybe I can help you further.
YvesDaoust 2-Aug-11 4:09am
Do you need advice about the OpenCL language syntax or about parallelizing the algorithm ?

Solution 4

Please visit my web site [^]. This page from my blog is a cursory comparison of CUDA vs. OpenCL using just your example, estimating the value of pi.

Basically, the solution is via numerical integration using the Composite Simpson's Rule. The solution uses IEEE 754 single-precision floating-point numbers, so it is very limited in precision.
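
For readers who have not seen the method, here is a minimal sketch of the Composite Simpson's Rule in Python. It is not the code from the blog: the integrand 4/(1+x^2) on [0, 1] is an assumption (it is a common choice whose integral equals pi), the function name simpson_pi is made up, and Python floats are double precision rather than single precision.

```python
def simpson_pi(n=1000):
    """Estimate pi as the integral of 4/(1+x^2) over [0, 1].

    n is the number of subintervals and must be even.
    """
    if n % 2:
        raise ValueError("n must be even")
    h = 1.0 / n

    def f(x):
        return 4.0 / (1.0 + x * x)

    total = f(0.0) + f(1.0)            # end points have weight 1
    for i in range(1, n):
        weight = 4 if i % 2 else 2     # interior weights alternate 4, 2
        total += weight * f(i * h)
    return total * h / 3.0

print(simpson_pi())
```

The loop body is independent for each i, which is what makes this method easy to spread over many work-items followed by a reduction.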

If you have questions, or cannot get it to run, please let me know.

Ken Domino
caglarozbek89 5-Aug-11 5:15am
Thank you so much. I visited your web site and understood the method that was used. It is more understandable than the method that generates random numbers on the circle. But I have some difficulties with some of the sections. 6(A) and 6(B) are about OpenCL, but I couldn't work out how to compile that code. Normally we should have at least a .cpp file and a .cl file, but there is no .cl file...
Ken Domino 5-Aug-11 7:51am
Ah... Figure 6(a) and (b) are both .cpp files. You can put them in separate files, or just merge them into one large .cpp file. To compile, you need to add a "-I" include option for finding the OpenCL include files. To link, you need to add a reference to the OpenCL library.

The example does not have a ".cl" file because it would then have to be read into a string (char *) for clCreateProgramWithSource(), using for example oclLoadProgSource(), which is not part of OpenCL. Instead, the kernel code is set as a string, which is passed to clCreateProgramWithSource(). There is also a bunch of code in Figure 6(b) in an "#else ... #endif" block. That code is for reading a binary from a hacked version of the CLCC OpenCL compiler, which I don't use in this example. For clarity, I removed that stuff and combined the code as I described. I'll update the blog to contain a link to an MSVC project to make it easier to build.

The MSVC++ Project is here:

caglarozbek89 5-Aug-11 7:56am
Thanks for your help. Now I am trying to write a Makefile for your code. Then I will compile it.
caglarozbek89 5-Aug-11 8:02am
I am trying to build that code on Linux.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 5 Aug 2011