How do I convert this code into inline assembly?

C++
void tom::Transform(void* btr)
{
    short* block = (short*)btr;
    __declspec(align(8)) __m64* block1 = (__m64*)block;
    int j;

    /// Only do IT modes = 0 & 1.
    if (_mode != QuantOnly)  //< 2 = Q only.
    {
        __m64 s0, s1, s2, s3, f0, f1, f2, f3, temp4, temp5, temp6, temp7;
        j = 0;
        // transpose input: f0..f3 become the four columns of the 4x4 block
        temp4 = _mm_unpacklo_pi16(block1[j], block1[j+1]);
        temp5 = _mm_unpacklo_pi16(block1[j+2], block1[j+3]);
        temp6 = _mm_unpackhi_pi16(block1[j], block1[j+1]);
        temp7 = _mm_unpackhi_pi16(block1[j+2], block1[j+3]);
        f0 = _mm_unpacklo_pi32(temp4, temp5);
        f2 = _mm_unpacklo_pi32(temp6, temp7);
        f1 = _mm_unpackhi_pi32(temp4, temp5);
        f3 = _mm_unpackhi_pi32(temp6, temp7);
        // stage one
        s0 = _mm_add_pi16(f0, f3);
        s3 = _mm_sub_pi16(f0, f3);
        s1 = _mm_add_pi16(f1, f2);
        s2 = _mm_sub_pi16(f1, f2);
        // stage two
        block1[j]   = _mm_add_pi16(s0, s1);
        block1[j+2] = _mm_sub_pi16(s0, s1);
        block1[j+1] = _mm_add_pi16(s2, _mm_slli_pi16(s3, 1));
        block1[j+3] = _mm_sub_pi16(s3, _mm_slli_pi16(s2, 1));
    }
}


Thanks in advance!
Updated 12-Feb-11 9:53am
v2
Comments
Manfred Rudolf Bihy 12-Feb-11 17:10pm    
Edit: Code tags added!
Andrew Brock 13-Feb-11 3:14am    
Just note that inline assembly is not supported by the 64-bit Microsoft C/C++ compiler.
If your code needs to compile as 64-bit (32-bit builds can still run on 64-bit computers), you need to either use #ifdef/#else/#endif to select the inline assembly for 32-bit builds and the C/C++ code for 64-bit builds, or write the assembly in a separate .asm file.

As for how to do it, I'm not too familiar with MMX/SSE.
All I can say is that it will start with __asm { /*assembly code here*/ }
MMX registers are mm0, mm1, ...
To load/save data to/from the MMX registers you use movq
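
To illustrate the points above, here is a minimal sketch (not the asker's transform) of guarding an `__asm` block so it is only compiled by the 32-bit Microsoft compiler, with plain C++ used everywhere else. The function name `add4_words` and its parameters are made up for the example. `_M_IX86` is the macro MSVC defines for 32-bit x86 targets, and `paddw` is the instruction that `_mm_add_pi16` corresponds to.

```cpp
#include <cstdint>

// Hypothetical helper: adds four 16-bit words lane by lane (wrapping on overflow).
void add4_words(const int16_t* lhs, const int16_t* rhs, int16_t* dst)
{
#if defined(_MSC_VER) && defined(_M_IX86)
    // 32-bit MSVC only: inline assembly using MMX registers.
    __asm {
        mov   eax, lhs        ; load the three pointers into scratch registers
        mov   ecx, rhs
        mov   edx, dst
        movq  mm0, [eax]      ; load 4 words from lhs
        paddw mm0, [ecx]      ; packed 16-bit add with the 4 words from rhs
        movq  [edx], mm0      ; store the 4 results
        emms                  ; leave MMX state so x87 floating point works again
    }
#else
    // Any other compiler or a 64-bit build: plain C++ with the same wrapping behaviour.
    for (int i = 0; i < 4; ++i) {
        const uint16_t sum = static_cast<uint16_t>(
            static_cast<uint16_t>(lhs[i]) + static_cast<uint16_t>(rhs[i]));
        dst[i] = static_cast<int16_t>(sum);
    }
#endif
}
```

Calling `add4_words` with `{1, 2, 3, 4}` and `{10, 20, 30, 40}` should fill `dst` with `{11, 22, 33, 44}` on either path.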

1 solution

But what is the great benefit of changing the MMX intrinsics used now to inline assembly? Using MMX intrinsics already uses the MMX capabilities of your processor, so converting them to inline assembly won't really help. Have a look at the following article. You will notice that no difference was measured between MMX intrinsics and MMX inline assembly.
Introduction to MMX Programming[^]

Good luck!
 
 
Comments
SMART LUBOBYA 14-Feb-11 9:57am    
No difference? See this article: http://blog.graphtech.co.il/experiments-with-intels-sse-simd-instruction-set/
SMART LUBOBYA 14-Feb-11 9:57am    
I think there is a difference.
E.F. Nijboer 14-Feb-11 10:12am    
Well, this link does indeed show a difference, although I am wondering why the example in the CodeProject article shows none. It could be that the example at the link you gave was not built in release configuration with compiler optimizations enabled. On the other hand, it could also be that the image used in the CodeProject article was simply too small to reveal any difference.
What the link did provide is an example of how to convert standard C++ to MMX intrinsics and then to MMX assembly, meaning you basically found the answer to your own question :-D
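
Whichever route is taken, a scalar reference makes it easier to check a hand-written assembly port against the intrinsic code in the question. The sketch below is my reading of that sequence: after the transpose, lane i of f0..f3 holds the four values of input row i, so output row k receives coefficient k of each input row. The function name `transform_pass_scalar` is made up, and the sketch assumes intermediate values stay within the 16-bit range (the MMX instructions wrap on overflow, the `int` arithmetic here does not).

```cpp
#include <cstdint>

// Hypothetical scalar equivalent of the transpose + two butterfly stages
// applied to a 4x4 block of 16-bit values stored row by row.
void transform_pass_scalar(int16_t block[16])
{
    int16_t out[16];
    for (int i = 0; i < 4; ++i) {
        // The four values of input row i (lane i of f0..f3 after the transpose).
        const int f0 = block[4 * i + 0];
        const int f1 = block[4 * i + 1];
        const int f2 = block[4 * i + 2];
        const int f3 = block[4 * i + 3];

        // stage one
        const int s0 = f0 + f3;
        const int s3 = f0 - f3;
        const int s1 = f1 + f2;
        const int s2 = f1 - f2;

        // stage two: results go to lane i of output rows 0..3
        out[0 * 4 + i] = static_cast<int16_t>(s0 + s1);
        out[1 * 4 + i] = static_cast<int16_t>(s2 + 2 * s3);
        out[2 * 4 + i] = static_cast<int16_t>(s0 - s1);
        out[3 * 4 + i] = static_cast<int16_t>(s3 - 2 * s2);
    }
    for (int k = 0; k < 16; ++k)
        block[k] = out[k];
}
```

Running both versions on the same input block and comparing all 16 outputs is a quick way to catch a swapped operand or a wrong unpack in the assembly.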

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


