I am multiplying two matrices, A[4x3] and B[3x4].
The resultant matrix is C[4x4].

The number of parallel operations per block is 4.
Therefore,
THREADS_PER_BLOCK = 4

dim3 dimBlock(THREADS_PER_BLOCK, THREADS_PER_BLOCK);
dim3 dimGrid(B.width/dimBlock.x, A.height/dimBlock.y);
MatrixMultKernel<<<dimGrid, dimBlock>>>(d_A, d_B, d_C);


CUDA reports an "invalid configuration" error. I need help with this.
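
One common cause of an invalid launch configuration is a grid dimension of zero, which integer division produces whenever a matrix dimension is smaller than the block dimension. Whether that is the cause here cannot be determined from the snippet alone. The following is a minimal sketch, not the poster's actual code: the `Matrix` struct, the kernel body, and the test data are assumptions made for illustration. It rounds the grid size up, guards against out-of-range threads, and checks the launch status.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define THREADS_PER_BLOCK 4

// Hypothetical row-major matrix descriptor (the post does not show its own type).
struct Matrix {
    int width;
    int height;
    float *elements;
};

// One thread per element of C = A * B.
__global__ void MatrixMultKernel(Matrix A, Matrix B, Matrix C)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    // Threads in a partially filled edge block fall outside C and must exit.
    if (row >= C.height || col >= C.width) return;

    float sum = 0.0f;
    for (int k = 0; k < A.width; ++k)
        sum += A.elements[row * A.width + k] * B.elements[k * B.width + col];
    C.elements[row * C.width + col] = sum;
}

int main()
{
    const int M = 4, K = 3, N = 4;  // A is 4x3, B is 3x4, C is 4x4
    float hA[M * K], hB[K * N], hC[M * N];
    for (int i = 0; i < M * K; ++i) hA[i] = 1.0f;
    for (int i = 0; i < K * N; ++i) hB[i] = 2.0f;

    Matrix d_A = {K, M, nullptr}, d_B = {N, K, nullptr}, d_C = {N, M, nullptr};
    cudaMalloc((void **)&d_A.elements, sizeof(hA));
    cudaMalloc((void **)&d_B.elements, sizeof(hB));
    cudaMalloc((void **)&d_C.elements, sizeof(hC));
    cudaMemcpy(d_A.elements, hA, sizeof(hA), cudaMemcpyHostToDevice);
    cudaMemcpy(d_B.elements, hB, sizeof(hB), cudaMemcpyHostToDevice);

    dim3 dimBlock(THREADS_PER_BLOCK, THREADS_PER_BLOCK);
    // Round up so the grid is never zero-sized and always covers all of C.
    dim3 dimGrid((d_B.width + dimBlock.x - 1) / dimBlock.x,
                 (d_A.height + dimBlock.y - 1) / dimBlock.y);
    MatrixMultKernel<<<dimGrid, dimBlock>>>(d_A, d_B, d_C);

    // A bad configuration is reported here, right after the launch.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        printf("launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(hC, d_C.elements, sizeof(hC), cudaMemcpyDeviceToHost);
    printf("C[0][0] = %.1f\n", hC[0]);  // 3 products of 1.0 * 2.0 sum to 6.0

    cudaFree(d_A.elements);
    cudaFree(d_B.elements);
    cudaFree(d_C.elements);
    return 0;
}
```

With the rounded-up grid, the same launch code also works when the matrix dimensions are not multiples of `THREADS_PER_BLOCK`; the bounds check in the kernel is what makes the extra threads harmless.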
Updated 3-Apr-10 8:44am

Solution 2

Using cuBLAS, we can multiply two arbitrarily sized matrices.
For more details, see CUBLAS[^].
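
As a rough illustration of this suggestion, here is a minimal sketch that multiplies the 4x3 and 3x4 matrices from the question with `cublasSgemm`. The host arrays, their fill values, and the variable names are assumptions for the example. cuBLAS expects column-major storage, so the sketch computes C^T = B^T * A^T, which lets row-major arrays be passed unchanged.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main()
{
    const int M = 4, K = 3, N = 4;  // A is MxK, B is KxN, C is MxN (row-major)
    float hA[M * K], hB[K * N], hC[M * N];
    for (int i = 0; i < M * K; ++i) hA[i] = 1.0f;
    for (int i = 0; i < K * N; ++i) hB[i] = 2.0f;

    float *dA, *dB, *dC;
    cudaMalloc((void **)&dA, sizeof(hA));
    cudaMalloc((void **)&dB, sizeof(hB));
    cudaMalloc((void **)&dC, sizeof(hC));
    cudaMemcpy(dA, hA, sizeof(hA), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, sizeof(hB), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS) {
        printf("cublasCreate failed\n");
        return 1;
    }

    const float alpha = 1.0f, beta = 0.0f;
    // A row-major MxN matrix has the same memory layout as a column-major
    // NxM matrix (its transpose). So request C^T (NxM) = B^T (NxK) * A^T (KxM):
    // B is passed first with leading dimension N, A second with leading dimension K.
    cublasStatus_t st = cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                                    N, M, K,
                                    &alpha, dB, N, dA, K,
                                    &beta, dC, N);
    if (st != CUBLAS_STATUS_SUCCESS) {
        printf("cublasSgemm failed\n");
        return 1;
    }

    cudaMemcpy(hC, dC, sizeof(hC), cudaMemcpyDeviceToHost);
    printf("C[0][0] = %.1f\n", hC[0]);  // 3 products of 1.0 * 2.0 sum to 6.0

    cublasDestroy(handle);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return 0;
}
```

The program needs to be linked against cuBLAS, for example with `nvcc example.cu -lcublas` (the file name is a placeholder). No block or grid sizes appear because the library chooses its own launch configuration.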
   
v2
Comments
Addy Tas 13-Jan-12 17:07pm
   
Seems a bit late, but I figured: while I'm reading it, why not fix the link?
Cheers, AT

Solution 1

Try this link[^].

Hope it helps.
   

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Last Updated 13 Jan 2012
Copyright © CodeProject, 1999-2019
All Rights Reserved.