Comments by wuling (Top 6 by date)

wuling 4-Sep-12 21:42pm View    
Dear Kryukov,
If I don't use TBB, I get the right result.
But when I use TBB, I get the wrong result.
I don't know what's happening.
wuling 20-Aug-12 0:43am View    
Dear Graus: The result is that I can't debug in this case. The only way that works is:

User control compiled for: "Any CPU"
Main program compiled for: "Any CPU"


How do I call an x64 DLL file? Below is the P/Invoke declaration list for the CUDA DLL:


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;

namespace CudaPinvoke {

public static class CDLL {

internal const string CUDA_DRIVER_API_DLL_NAME = "nvcuda";

[DllImport("cudadll.dll", CallingConvention = CallingConvention.Cdecl)]

public static extern CUResult addWithCuda(int[] c, int[] d, int[] a, int[] b, ushort size);

[DllImport("cudadll.dll", CallingConvention = CallingConvention.Cdecl)]

public static extern CUResult addWithCuda1(int[] d, int[] c, ushort size);

[DllImport("cudadll.dll", CallingConvention = CallingConvention.Cdecl)]

public static extern CUResult ScanWithCuda(int[] src, int[] dsc2d, int[] dsc3d, int thresholdvalue, int sdown, int sup, int height, int size);

[DllImport("cudadll.dll", CallingConvention = CallingConvention.Cdecl)]

public static extern CUResult ScanWithCuda(byte[] src, int[] dsc2d, int[] dsc3d, int thresholdvalue, int sdown, int sup, int height, int size);

[DllImport("cudadll.dll", CallingConvention = CallingConvention.Cdecl)]

public static extern CUResult ScanWithCuda(IntPtr src, int[] dsc2d, int[] dsc3d, int thresholdvalue, int sdown, int sup, int height, int size);

[DllImport("cudadll.dll", CallingConvention = CallingConvention.Cdecl)]

public static extern CUResult gpuDeviceInit(int devID);

[DllImport(CUDA_DRIVER_API_DLL_NAME)]

public static extern CUResult cuDeviceGetProperties(ref CUDeviceProperties prop, CUdevice dev);

}

}

And here is the Emgu CV P/Invoke declaration used to call a DLL:


[DllImport(OPENCV_IMGPROC_LIBRARY, CallingConvention = CvInvoke.CvCallingConvention)]

public static extern void cvLaplace(IntPtr src, IntPtr dst, int apertureSize);


How can I call the DLL file when building for the "Any CPU" platform?
wuling 19-Aug-12 22:31pm View    
Dear Graus:
I think I can try this:
Compile the user control for the "Any CPU" platform only, not for "x86" or "x64".
Compile the main program for the "x64" platform and then call the DLL files.
I will let you know the result.
wuling 19-Aug-12 19:24pm View    
I tried to build and run an x64 exe; the program has some user-control GUI components. When I compile and run the x64 exe, I get a "BadImageFormatException" error. So the only way is to compile for the "Any CPU" platform, and I can only debug and run in that mode.
wuling 11-Jul-12 0:08am View    
Hi,
Sorry, let me try to explain my question.

The original code has an SSE section under "case THRESH_BINARY:". There are two for loops (see part lists 1 and 2) that do the same job and store into "dst"; the only difference is how the input data is loaded: "_mm_loadu_si128" versus "_mm_loadl_epi64".
So here is the question that confuses me: why not use a single for loop?


//part list 1
for( j = 0; j <= roi.width - 32; j += 32 )
{
__m128i v0, v1;
v0 = _mm_loadu_si128( (const __m128i*)(src + j) );
v1 = _mm_loadu_si128( (const __m128i*)(src + j + 16) );
v0 = _mm_cmpgt_epi8( _mm_xor_si128(v0, _x80), thresh_s );
v1 = _mm_cmpgt_epi8( _mm_xor_si128(v1, _x80), thresh_s );
v0 = _mm_and_si128( v0, maxval_ );
v1 = _mm_and_si128( v1, maxval_ );
_mm_storeu_si128( (__m128i*)(dst + j), v0 );
_mm_storeu_si128( (__m128i*)(dst + j + 16), v1 );
}

//part list 2
for( ; j <= roi.width - 8; j += 8 )
{
__m128i v0 = _mm_loadl_epi64( (const __m128i*)(src + j) );
v0 = _mm_cmpgt_epi8( _mm_xor_si128(v0, _x80), thresh_s );
v0 = _mm_and_si128( v0, maxval_ );
_mm_storel_epi64( (__m128i*)(dst + j), v0 );
}
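One way to see how the two loops above divide the work is to count how many pixels of a row each stage would handle. The sketch below is only an illustration, not OpenCV code: `LoopSplit` and `split_row` are hypothetical names, and the loop bounds are copied from part lists 1 and 2. The 32-byte loop takes as much of the row as it can, the 8-byte loop takes what is left in blocks of 8, and anything smaller than 8 pixels is left for scalar code.

```cpp
// Hypothetical helper (not part of OpenCV): reports how many pixels of a row
// each stage would process, using the same loop bounds as part lists 1 and 2.
struct LoopSplit {
    int wide;    // pixels handled 32 at a time (two 16-byte loads)
    int narrow;  // pixels handled 8 at a time (one 8-byte load)
    int scalar;  // leftover pixels, fewer than 8
};

LoopSplit split_row(int width)
{
    int j = 0;
    for (; j <= width - 32; j += 32) {}  // same bounds as part list 1
    const int wide = j;
    for (; j <= width - 8; j += 8) {}    // same bounds as part list 2
    const int narrow = j - wide;
    return LoopSplit{wide, narrow, width - j};
}
```

For a row of 77 pixels this gives 64 pixels for the first loop, 8 for the second and 5 left over, which suggests the second loop exists to pick up the tail that the first loop cannot cover without reading past the end of the row.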

The other question is whether "#if CV_SSE2" is true or not. In part list 3, however, the code must be executed, yet the variable "tab" is declared in thresh_8u and not inside "#if CV_SSE2 ......". You will find that the result dst is stored again using the variable tab, right? So does the original code do the same work using three different methods?



//part list 3
if( j_scalar < roi.width )
{
for( i = 0; i < roi.height; i++ )
{
const uchar* src = (const uchar*)(_src.data + _src.step*i);
uchar* dst = (uchar*)(_dst.data + _dst.step*i);
j = j_scalar;
#if CV_ENABLE_UNROLLED
for( ; j <= roi.width - 4; j += 4 )
{
uchar t0 = tab[src[j]];
uchar t1 = tab[src[j+1]];

dst[j] = t0;
dst[j+1] = t1;

t0 = tab[src[j+2]];
t1 = tab[src[j+3]];

dst[j+2] = t0;
dst[j+3] = t1;
}
#endif
for( ; j < roi.width; j++ )
dst[j] = tab[src[j]];
}
}