Hello everyone,


Two questions after reading this article:

http://www.microsoft.com/msj/0298/hood0298.aspx

1.

Why is using LEA to do multiplication faster than using MUL?

"Using "LEA EAX,[EAX*4+EAX]" turns out to be faster than the MUL instruction."

2.

"The TEB's linear address can be found at offset 0x18 in the TEB." -- what does linear address mean? Is it something like an array, whose elements are stored next to each other? And what is a non-linear address?


Thanks in advance,
George

George_George wrote:
Why is using LEA to do multiplication faster than using MUL?


You'd have to know about the internal architecture and circuitry of the CPU to answer that. I don't, and I doubt many people would, apart from those who work (or have worked) at Intel.

George_George wrote:
"The TEB's linear address can be found at offset 0x18 in the TEB." -- what does linear address mean? Is it something like an array, whose elements are stored next to each other? And what is a non-linear address?


To understand what's going on here you have to know a little about Intel CPUs and segment registers. Basically C/C++ has no concept of segment registers and such (it assumes a linear address space) so this is a page-table mapping trick done by the OS to make the TEB addressable in such an environment.

 
 
1.
From the same article, just below the sentence you quoted:

The LEA instruction uses hardwired address generation tables that makes multiplying by a select set of numbers very fast (for example, multiplying by 3, 5, and 9). Twisted, but true.

That means the LEA instruction is faster than MUL only for a small set of multiplier values.

2.
George_George wrote:
"The TEB's linear address can be found at offset 0x18 in the TEB." -- what does linear address mean? Is it something like an array, whose elements are stored next to each other?


I think it means the direct (flat) address, i.e. one that can be used without going through a segment register. For example,
mov         eax,dword ptr fs:[00000018h]
loads eax with the address of the TEB, hence the following instruction
mov         eax,dword ptr [eax+24h]
loads eax with the value found at offset 0x24 in the TEB (the Thread ID).

George_George wrote:
What is a non-linear address?

I suppose it is addressing via a segment register (the FS register in this context).
:)

 
 
Some more reading material: Pentium Optimization Cross-Reference.

From the page: LEA is better than SHL on the Pentium because it pairs in both pipes, SHL pairs only in the U pipe.

Also, as CPallini pointed out, the document states that lea can be more beneficial than mul only when multiplying by 2, 3, 4, 5, 7, 8, or 9.

 
 
George_George wrote:
Why is using LEA to do multiplication faster than using MUL?

"Using "LEA EAX,[EAX*4+EAX]" turns out to be faster than the MUL instruction."


lea seems to be an addressing instruction, which means it probably executes in a single cycle and would therefore be faster than mul. On the other hand, because it is an addressing instruction, consider whether you could multiply very large numbers this way. There are surely limitations on that, whereas mul can work with large numbers too. :)

Thanks for the question, the search for an answer got me an interesting read. :)

Link: Wikibooks -> Reverse Engineering -> Code Transformations -> Common instruction substitutions

Warning: I'm not a full-time assembly programmer and I may not be accurate.

My assumptions: an x86 processor, Pentium class.

 
 

