Something You May Not Know About the Macro in MASM

Zuoliu Ding

Feb 23, 2016

CPOL

A discussion on some MASM Macro usages, including ECHO directive, parameter type/size check, and repetitions with location counter $.

Download source code - 4.8 KB

Introduction

A macro is a symbolic name that you give either to a series of characters, called a text macro, or to one or more statements, called a macro procedure or function. As the assembler evaluates each line of your program, it scans the source code for the names of previously defined macros and substitutes the macro definition for each macro name. A macro procedure is a named block of assembly language statements. Once defined, it can be invoked (called) many times, even with different arguments, so you can avoid writing the same code repeatedly in several places.

In this article, we’ll talk about some usages that are not thoroughly discussed or not clearly documented. We'll show and analyze examples in the Microsoft Visual Studio IDE. The topics cover the ECHO directive, checking the parameter type and size in a macro procedure, and generating memory in repetitions with the current location counter $.

All the material presented here comes from my years of teaching [2]. A general understanding of Intel x86-64 assembly language is therefore assumed, and familiarity with Visual Studio 2010 or later is required. Ideally, you have read a textbook such as [3], or the MASM Programmer's Guide [5], which Microsoft originally published in 1992 but which remains valuable for learning MASM today. If you are taking an Assembly Language Programming class, this article could serve as supplemental reading or a study reference.

Using ECHO to the Output Window

According to MSDN, the ECHO directive displays a message to the standard output device. In Visual Studio, you can use ECHO to send a string to the IDE’s Output pane to show assembling/compiling messages, such as warnings or errors, similar to what MSBuild does.

A simplified snippet to test ECHO would be like this:

.386
.model flat,stdcall
ExitProcess proto,dwExitCode:DWORD

mTestEcho MACRO
   ECHO ** Test Echo with Nothing **  
ENDM

.code
main PROC
   mTestEcho
   invoke ExitProcess,0
main ENDP
END main

This code worked fine in VS 2008. Unfortunately, since VS 2010, the ECHO directive no longer writes to the Output window in Visual Studio. As seen in [4], it doesn’t unless you configure the build to generate verbose output when assembling your code. To do this, go to:

Tools->Options->Projects and Solutions->Build and Run

And from the "MSBuild project build output verbosity" dropdown box, choose the "Detailed" option (notice that default is "Minimal"):

To test the macro, you only need to compile an individual .ASM file, instead of building the whole project. Simply right click on your file and choose Compile:

To verify, you have to watch the display in the VS Output pane. The ECHO directive really is working; however, the message you are interested in, "** Test Echo with Nothing **", is buried somewhere in the hundreds of lines that MSBuild generated. You must search tediously to find it:

When learning MASM assembly programming, this is definitely not a preferred way to practice. What I suggest instead is to use the text "Error:" or "Warning:" with ECHO, while leaving the default MSBuild output setting, "Minimal", unchanged.

1. Output as Error

Simply add "Error:" in the ECHO statement in the macro and name it as mTestEchoError:

mTestEchoError MACRO
   ECHO Error: ** Test Echo with Error **
ENDM

Now let’s call mTestEchoError in main PROC. When you compile the code, the minimal output is as concise as shown below. Notice that because of the Error here, the result is reasonably reported as failed.

2. Output as Warning

Simply add "Warning:" in the ECHO statement and name it as mTestEchoWarning:

mTestEchoWarning MACRO
    ECHO Warning: ** Test Echo with Warning **  
ENDM

Then call mTestEchoWarning in main PROC and compile. The minimal output is much simpler, as shown below. Since only a Warning is designated, the compilation succeeds.

As you can see, this way the ECHO directive generates concise and clear messages without you having to search the output. The sample is in TestEcho.asm for download.

Checking Parameter Type and Size

When you pass an argument to a macro procedure, the procedure receives it through the parameter, although this is just a text substitution. Usually, you will check some condition on the parameter and act accordingly. Since this happens at assembly time, the assembler emits one set of instructions if the condition is satisfied and, if needed, another set otherwise. You can certainly check constant argument values, whether strings or numbers. Another useful check is based on the parameter's type or size, for registers and variables. For example, one macro procedure might accept only unsigned integers and reject signed ones, while another might handle 16- and 32-bit arguments but not 8- and 64-bit ones.

1. Argument as a Memory Variable

Let’s define three variables here:

.data
   swVal SWORD    1
   wVal  WORD     2
   sdVal SDWORD   3

When applying the TYPE and SIZEOF operators to these variables, we simply have:

   mov eax, TYPE swVal     ; 2 bytes
   mov eax, SIZEOF swVal   ; 2 bytes
   mov eax, TYPE wVal      ; 2 bytes
   mov eax, SIZEOF wVal    ; 2 bytes
   mov eax, TYPE sdVal     ; 4 bytes
   mov eax, SIZEOF sdVal   ; 4 bytes

As seen above, there is no numeric difference either between TYPE and SIZEOF, or between WORD and SWORD. The first four instructions all are moving the byte count 2 to EAX. However, TYPE can do more than just returning byte counts. Let’s try to check SWORD type and size with the parameter par:

mParameterTYPE MACRO par
   IF TYPE par EQ TYPE SWORD 
      ECHO warning: ** TYPE par is TYPE SWORD
   ELSE 
      ECHO warning: ** TYPE par is NOT TYPE SWORD   
   ENDIF
ENDM   
 
mParameterSIZEOF MACRO par
   IF SIZEOF par EQ SIZEOF SWORD
      ECHO warning: ** SIZEOF par is SIZEOF SWORD
   ELSE 
      ECHO warning: ** SIZEOF par is NOT SIZEOF SWORD    
   ENDIF   
ENDM

Then call the two macros, passing the variables defined above:

   ECHO warning: --- Checking TYPE and SIZEOF for wVal ---   
   mParameterTYPE wVal
   mParameterSIZEOF wVal
 
   ECHO warning: --- Checking TYPE and SIZEOF for swVal ---   
   mParameterTYPE swVal
   mParameterSIZEOF swVal
 
   ECHO warning: --- Checking TYPE and SIZEOF for sdVal ---   
   mParameterTYPE sdVal
   mParameterSIZEOF sdVal

See the following results in Output:

Evidently, the TYPE operator can be used to differentiate signed from unsigned arguments, as SWORD and WORD are different types, while SIZEOF is simply a comparison of byte counts, as SWORD and WORD are both 2 bytes. The last two checks mean that the type of SDWORD is not SWORD and that the size of SDWORD is 4 bytes, not 2.

Furthermore, let’s make direct checks, since both operators can also be applied to data type names:
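To make the signed/unsigned distinction concrete, here is a minimal sketch of a macro that chooses between sign extension and zero extension for a 16-bit argument. It relies on the TYPE comparison behaving as observed above. The macro name mWordToEAX and the sample values are hypothetical and are not part of the downloadable code.

```asm
.386
.model flat,stdcall
ExitProcess proto,dwExitCode:DWORD

; Hypothetical macro: load a 16-bit argument into EAX,
; choosing the extension instruction from the argument's type.
mWordToEAX MACRO par
   IF TYPE par EQ TYPE SWORD
      movsx eax, par        ;; signed 16-bit: sign-extend
   ELSEIF TYPE par EQ TYPE WORD
      movzx eax, par        ;; unsigned 16-bit: zero-extend
   ELSE
      ECHO Error: ** mWordToEAX expects a WORD or SWORD argument
   ENDIF
ENDM

.data
   swVal SWORD -4
   wVal  WORD   5

.code
main PROC
   mWordToEAX swVal         ; expected to expand to movsx
   mWordToEAX wVal          ; expected to expand to movzx
   invoke ExitProcess,0
main ENDP
END main
```

Generating a listing file is an easy way to confirm which instruction the assembler actually emitted for each call.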

mCheckTYPE MACRO
    IF TYPE SWORD EQ TYPE WORD 
        ECHO warning: ** TYPE SWORD EQ TYPE WORD
    ELSE 
        ECHO warning: ** TYPE SWORD NOT EQ TYPE WORD
    ENDIF
ENDM
 
mCheckSIZEOF MACRO
    IF SIZEOF SWORD EQ SIZEOF WORD
        ECHO warning: ** SIZEOF SWORD EQ SIZEOF WORD
    ELSE 
        ECHO warning: ** SIZEOF SWORD NOT EQ SIZEOF WORD   
    ENDIF
ENDM

The following result is intuitive and straightforward:

2. Argument as a Register

Since an argument can be a register, let’s call the two previous macros to check its TYPE and SIZEOF:

   mParameterTYPE AL     
   mParameterSIZEOF AL  
   mParameterTYPE AX
   mParameterSIZEOF AX

We receive such messages:

As we see here, for the type check, neither AL nor AX (even though it is 16-bit) is a signed WORD. In fact, you cannot apply SIZEOF to a register at all; doing so causes assembly error A2009. You can verify it directly:

    mov ebx, SIZEOF al    ; error A2009: syntax error in expression
    mov ebx, TYPE al

But what type do registers have? The answer is that all registers are unsigned by default. Simply write this:

mParameterTYPE2 MACRO par
   IF TYPE par EQ WORD 
      ECHO warning: ** TYPE par is WORD
   ELSE 
      ECHO warning: ** TYPE par is NOT WORD   
   ENDIF
ENDM

And call:

mParameterTYPE2 AL   ; 1>MASM : warning : ** TYPE AL is NOT WORD
mParameterTYPE2 AX   ; 1>MASM : warning : ** TYPE AX is WORD

Also notice that I use the data type name WORD directly here, which is equivalent to using TYPE WORD.

3. An Example in Practice

Now let’s take a look at a concrete example that requires moving an 8-, 16-, or 32-bit signed integer argument into EAX. To create such a macro, we have to use either the mov instruction or the sign-extending movsx, depending on the parameter size. The following is one possible solution, which compares the parameter's type with the required sizes. %OUT is an alternative that behaves the same as ECHO.

mParToEAX MACRO intVal
   IF TYPE intVal LE SIZEOF WORD         ;; 8- or 16-bit
      movsx eax, intVal
   ELSEIF TYPE intVal EQ SIZEOF DWORD    ;; 32-bit 
      mov eax,intVal
   ELSE
     %OUT Error: ***************************************************************
     %OUT Error: Argument intVal passed to mParToEAX must be 8, 16, or 32 bits.
     %OUT Error:****************************************************************
   ENDIF
ENDM

Test it with different sizes and types for variables and registers:

; Test memory
   mParToEAX bVal       ; BYTE
   mParToEAX swVal      ; SWORD
   mParToEAX wVal       ; WORD
   mParToEAX sdVal      ; SDWORD
   mParToEAX qVal       ; QWORD

; Test registers
   mParToEAX AH         ; 8 bit
   mParToEAX BX         ; 16 bit
   mParToEAX EDX        ; 32 bit
   mParToEAX RDX        ; 64 bit

As expected, the Output shows the following messages, reasonably rejecting qVal. The error reported for RDX is also fine, as our 32-bit project doesn’t recognize a 64-bit register.

You can try the downloadable code in ParToEAX.asm. Furthermore, let’s generate its listing file to see which instructions the assembler created in place of the macro calls. As expected, instructions are generated for bVal, swVal, wVal, and sdVal but not for qVal, and for AH, BX, and EDX but not for RDX:

 00000000         .data
 00000000 03            bVal  BYTE     3
 00000001 FFFC          swVal SWORD   -4
 00000003 0005          wVal  WORD     5
 00000005 FFFFFFFA      sdVal SDWORD   -6
 00000009               qVal  QWORD    7
      0000000000000007

 00000000         .code
 00000000         main_pe PROC 
            ; Test memory
               mParToEAX bVal       ; BYTE
 00000000  0F BE 05        1         movsx eax, bVal
      00000000 R
               mParToEAX swVal      ; SWORD
 00000007  0F BF 05        1         movsx eax, swVal
      00000001 R
               mParToEAX wVal       ; WORD
 0000000E  0F BF 05        1         movsx eax, wVal
      00000003 R
               mParToEAX sdVal      ; SDWORD
 00000015  A1 00000005 R     1       mov eax,sdVal
               mParToEAX qVal       ; QWORD

            ; Test registers
               mParToEAX AH         ; 8 bit
 0000001A  0F BE C4        1         movsx eax, AH
               mParToEAX BX         ; 16 bit
 0000001D  0F BF C3        1         movsx eax, BX
               mParToEAX EDX        ; 32 bit
 00000020  8B C2        1            mov eax,EDX
               mParToEAX RDX        ; 64 bit
              1      IF TYPE RDX LE SIZEOF WORD       
AsmCode\ParToEAX.asm(45) : error A2006:undefined symbol : RDX
 mParToEAX(1): Macro Called From
  AsmCode\ParToEAX.asm(45): Main Line Code
              1      ELSE
AsmCode\ParToEAX.asm(45) : error A2006:undefined symbol : RDX
 mParToEAX(3): Macro Called From
  AsmCode\ParToEAX.asm(45): Main Line Code

               invoke ExitProcess,0
 00000029         main_pe ENDP
            END ; main_pe

Generating Data in Repetitions

In this section, we’ll talk about using macros to generate a memory block, an array of integers in the data segment, rather than calling macros in code. We’ll show three ways to create the same linked list: using an unchanged location counter $, retrieving changed values from the counter $, and calling a macro in the data segment.

1. Using an Unchanged Location Counter $

I simply borrowed the LinkedList snippet from the textbook [3] to modify it with eight nodes as:

LinkedList ->11h ->12h ->13h ->14h ->15h ->16h ->17h ->18h ->00h

I added six extra DWORDs of 01111111h at the end as padding. They are unnecessary, but they make the list easier to pick out in the Memory window:

ListNode STRUCT
  NodeData DWORD ?
  NextPtr  DWORD ?
ListNode ENDS

TotalNodeCount = 8
 
.data
; LinkedList created with $ not changed:
 
   Counter = 0
   LinkedList LABEL PTR ListNode
   REPT TotalNodeCount
      Counter = Counter + 1
      ListNode <Counter+10h, ($ + Counter * SIZEOF ListNode)>
   ENDM
   ListNode <0,0>   ; tail node
   DWORD 01111111h, 01111111h, 01111111h, 01111111h, 01111111h, 01111111h

The memory is created. The list header is the label LinkedList, an alias that points to 0x00404010:

Each node contains a 4-byte DWORD for NodeData and another DWORD for NextPtr. Since Intel IA-32 is little-endian, the first integer in memory, 11 00 00 00, is 00000011 in hexadecimal, and its next pointer, 18 40 40 00, is 0x00404018. So the first two rows cover all eight list nodes. In the third row, the first node, with two zero DWORDs, acts as the tail (although it wastes a node). It is immediately followed by the padding of six 01111111h values.

Now let’s see what happens to the current location counter $. As mentioned in [3]:

The expression ($ + Counter * SIZEOF ListNode) tells the assembler to multiply the counter by the ListNode size and add their product to the current location counter. The value is inserted into the NextPtr field in the structure. [It’s interesting to note that the location counter’s value ($) remains fixed at the first node of the list.]

It is indeed true that the value of $ remains 0x00404010, unchanged in every iteration of the REPT block. The NextPtr address calculated by ($ + Counter * SIZEOF ListNode) links the nodes one by one to eventually form LinkedList. However, you might ask whether we could get the actual current memory address to use in each iteration. Yes, and here is how.
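The arithmetic behind the fixed $ can be sketched by writing out what the REPT block effectively emits. This is an illustration only: it assumes $ stays at 00404010h (the address seen in this article's run, which may differ on your machine) and that SIZEOF ListNode is 8. ListBase is a hypothetical name introduced here to stand for that fixed value.

```asm
.386
.model flat,stdcall

ListNode STRUCT
  NodeData DWORD ?
  NextPtr  DWORD ?
ListNode ENDS

ListBase = 00404010h                  ; hypothetical: the fixed value of $ in the article's run

.data
   ListNode <11h, ListBase + 1 * 8>   ; Counter = 1, NextPtr = 00404018h
   ListNode <12h, ListBase + 2 * 8>   ; Counter = 2, NextPtr = 00404020h
   ListNode <13h, ListBase + 3 * 8>   ; Counter = 3, NextPtr = 00404028h
   ; nodes 4 to 7 follow the same pattern
   ListNode <18h, ListBase + 8 * 8>   ; Counter = 8, NextPtr = 00404050h (the tail node)
   ListNode <0,0>                     ; tail node
END
```

Because every NextPtr is measured from the same base, multiplying by Counter is what moves each pointer one node further along.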

2. Retrieving Changed Values From the Location Counter $

.data
; LinkedList2 created with $ changed:

   Counter = 0
   LinkedList2 LABEL PTR ListNode
   REPT TotalNodeCount
      Counter = Counter + 1
      ThisPointer = $
      ListNode <Counter+20h, (ThisPointer + SIZEOF ListNode)>
   ENDM
   ListNode <0,0>   ; tail node
   DWORD 02222222h, 02222222h, 02222222h, 02222222h, 02222222h, 02222222h
   len = ($ - LinkedList)/TYPE DWORD

Hey, almost nothing changed except for a new symbolic constant, ThisPointer = $, which simply assigns the current memory address held in $ to ThisPointer. Now we can use ThisPointer in a similar calculation to initialize the NextPtr field of each ListNode object with a simpler expression, (ThisPointer + SIZEOF ListNode). This also links the nodes one by one, generating LinkedList2 this time. You can check LinkedList2’s memory at 0x00404070:

To distinguish it from the first LinkedList, I use Counter+20h for NodeData, which gives:

LinkedList2 ->21h ->22h ->23h ->24h ->25h ->26h ->27h ->28h ->00h

Comparing the two memory blocks shows that both lists work in exactly the same way. Notice that at the end, I purposely calculate len to see how many DWORDs have been generated so far.

len = ($ - LinkedList)/TYPE DWORD

As an interesting exercise, please think of the value of len in your mind. In code, move len to a register to verify.
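As an alternative check at assembly time, the ECHO technique from the first section can display a numeric constant. The following is a minimal sketch on a small stand-in array, not on the linked lists themselves. It assumes that the expansion operator % works with TEXTEQU and ECHO as described in the MASM Programmer's Guide [5]. The names array, count, and countText are hypothetical.

```asm
.386
.model flat,stdcall
ExitProcess proto,dwExitCode:DWORD

.data
   array DWORD 6 DUP(01111111h)
   count = ($ - array) / TYPE DWORD   ; numeric constant, computed like len

   countText TEXTEQU %count           ; convert the numeric value to text
%  ECHO Warning: ** DWORDs generated so far = countText

.code
main PROC
   mov eax, count                     ; runtime check, as suggested above
   invoke ExitProcess,0
main ENDP
END main
```

The leading % on the ECHO line asks the assembler to expand text macros on that line before echoing it. Without it, ECHO would print the name countText literally.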

3. Calling a Macro in Data Segment

By making the third linked list, we can see that you can call a macro not only in code but also in the data segment. For this purpose, I define a macro named mListNode with a parameter called start, in which a ListNode object is simply initialized. To distinguish it from the previous two, I use Counter+30h for NodeData and assign NextPtr as (start + Counter * SIZEOF ListNode).

.data
; LinkedList3 created with a macro call: 
 
   mListNode MACRO start 
      Counter = Counter + 1
      ListNode <Counter+30h, (start + Counter * SIZEOF ListNode)>
   ENDM
 
   LinkedList3 = $     ; still cannot directly use $, Must get the current value out
   Counter = 0   
   REPT TotalNodeCount
      mListNode LinkedList3   ; What if pass $ as an argument?
   ENDM
   ListNode <0,0>   ; tail node
   DWORD 03333333h, 03333333h, 03333333h, 03333333h, 03333333h, 03333333h

The third list looks like:

LinkedList3 ->31h ->32h ->33h ->34h ->35h ->36h ->37h ->38h ->00h

We now apply the lesson from LinkedList2 by setting LinkedList3 = $ at the very beginning. Notice that I simply use the symbolic constant LinkedList3 as the third list header, instead of the LABEL directive. The REPT block now contains only one macro call, which passes the header address LinkedList3 to mListNode. That’s it! See the memory at 0x004040D0:

Imagine what if you pass $ as an argument to mListNode, without LinkedList3 = $?

4. Checking an Address and Traversing a Linked List

Finally, let us put the generation of all three lists together and run LinkedList.asm (available for download). In the code segment, I first retrieve the addresses of the three list headers as below:

   mov  edx,OFFSET LinkedList
   mov  ebx,OFFSET LinkedList2
   mov  esi,OFFSET LinkedList3  ; or directly LinkedList3
   mov  eax, len

As expected, EDX holds 00404010 for LinkedList, EBX holds 00404070 for LinkedList2, and ESI holds 004040D0 for LinkedList3. The three lists are adjacent in memory, as shown:

Notice that because LinkedList3 is a symbolic constant, we don’t even have to use the OFFSET operator here. Let’s keep ESI pointing at LinkedList3 and traverse the list to see each NodeData value with a loop like this:

; Display the integers in the NodeData members.
NextNode:
   ; Check for the tail node.
   mov  eax, (ListNode PTR [esi]).NextPtr
   cmp  eax, 0
   je   quit
 
   ; Display the node data.
   mov  eax, (ListNode PTR [esi]).NodeData
   ; call a PROC to show EAX here
 
   ; Get pointer to next node.
   mov  esi, (ListNode PTR [esi]).NextPtr
   jmp  NextNode 
quit:

Unfortunately, we haven’t included an output procedure to call here to display the NodeData value moved into EAX. But when debugging, simply setting a breakpoint there and watching EAX should be enough to verify the values from 31h, 32h, …, to 38h.
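If a console display is preferred over the debugger, one option is a library output routine. The sketch below assumes that the Irvine32 link library is installed and configured for the project. This is an assumption and not part of the downloadable code. WriteHex, Crlf, and the exit macro come from that library, while the list definition repeats the LinkedList3 code from above.

```asm
INCLUDE Irvine32.inc          ; assumption: Irvine32 library is available and linked

ListNode STRUCT
  NodeData DWORD ?
  NextPtr  DWORD ?
ListNode ENDS

TotalNodeCount = 8

mListNode MACRO start
   Counter = Counter + 1
   ListNode <Counter+30h, (start + Counter * SIZEOF ListNode)>
ENDM

.data
   LinkedList3 = $
   Counter = 0
   REPT TotalNodeCount
      mListNode LinkedList3
   ENDM
   ListNode <0,0>             ; tail node

.code
main PROC
   mov  esi, OFFSET LinkedList3
NextNode:
   ; Check for the tail node.
   mov  eax, (ListNode PTR [esi]).NextPtr
   cmp  eax, 0
   je   quit

   ; Display the node data.
   mov  eax, (ListNode PTR [esi]).NodeData
   call WriteHex              ; Irvine32: print EAX in hexadecimal
   call Crlf                  ; Irvine32: start a new line

   ; Get pointer to next node.
   mov  esi, (ListNode PTR [esi]).NextPtr
   jmp  NextNode
quit:
   exit                       ; Irvine32 macro that ends the process
main ENDP
END main
```

Any other routine that prints EAX would serve equally well; only the two call lines depend on the library.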

Summary

By scrutinizing the above examples, we have exposed something that you may not know about macros in MASM assembly programming. An assembly language program executes Intel- or AMD-specified instructions at runtime. On the other hand, MASM provides many directives, operators, and symbols to control and organize instructions and memory variables at assembly time, similar to preprocessing in other programming languages. In fact, with all these features, the MASM macro facility could itself be considered a mini programming language with three control mechanisms: sequence, condition, and repetition.

However, some usages of MASM macros have not been discussed in detail elsewhere. In this article, we first introduced a better way to output your error or warning text, which makes it easy to trace macro behavior. Then, with if-else structures, we showed how to check the type and size of a macro’s parameter, a common practice for both memory and register arguments. Finally, we discussed macro repetitions with three examples that generate the same linked list, and gained a better understanding of how to use the current location counter symbol $. The downloadable zip file contains all samples in .asm files. The project file MacroTest.vcxproj was created in VS 2010, and it can be opened and upgraded in any more recent VS version.

This article does not involve hot technologies like .NET or C#. Assembly Language is comparatively traditional and draws little attention. However, according to the TIOBE Programming Community Index, the ranking of Assembly Language has been rising recently, which suggests that the various assembly languages play an important role in new device development. Academically, Assembly Language Programming is considered a demanding course in Computer Science. Therefore, I hope this article can serve as a set of problem-solving examples for students and perhaps for developers as well.

References

History

February 26, 2016 -- Original version posted