Introduction

Asmex is a viewer for the internals of .NET assembly files. While the world is not particularly short of .NET assembly viewers, Asmex has some unique features and the source might prove useful in various contexts. Asmex's features include:

Rationale

Asmex was an educational project; the idea was to make an application that involved knowledge of the very lowest possible level of .NET, yet also took advantage of the clean GUI model of WinForms. It is used in our company for training and debugging purposes.

In terms of low-level .NET, Asmex contains code to read raw metadata tables and such-like. I was generally impressed by the efficient and ingenious way .NET metadata is stored.

The elegance (relative to MFC) of the WinForms model is demonstrated by fitting the heterogeneous data obtained by reflection and binary file parsing into a common tree format for display. Again, I was impressed by how much less work this was than the MFC equivalent. A generic object properties viewer (taken from another project) is also shoehorned into Asmex -- it uses .NET's interesting Attribute functionality to provide a properties list for each item in the tree.

Asmex was not intended to win prizes for canonically correct design, and that is why the data is held in classes derived from the GUI tree node. Sorry :)

This article will discuss the (hopefully) more reusable and interesting areas of the Asmex source code.

PE File Reader / .NET Metadata Reader

The FileViewer namespace contains the most useful part of Asmex. A very short and vague description of the structures dealt with in FileViewer follows. If you are an expert on Microsoft executable formats, you will want to skip it. Otherwise, you might find it too vague, in which case you can either look at Asmex's source or ask me about it. I love talking about file formats. Incidentally, if there is any demand for a brief overview of PE/.NET file formats and .NET type, resource and metadata concepts, I would love an excuse to write one.

Background -- PE Files

Almost every Windows executable, DLL or EXE, is a Portable Executable (PE) format file. Although there is little in the PE format that lends itself to .NET, in the current implementation of .NET all assemblies are contained in special PE format files, which have some traditional bits left out and quite a lot of new bits put in.

Very generally, a PE file consists of a PE header, which contains a list of Data Directory entries, and a number of Sections which are defined just after the PE header. Not all the Data Directories have meaning in a .NET file, and not many Sections are present either. Nevertheless, those that remain are still important -- in particular, the last Data Directory entry in use (the CLI header entry at index 14; the final slot, index 15, is reserved) points to the start of .NET information.
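As a rough illustration of the layout just described, the sketch below walks from the DOS header to the Data Directory array and reads one entry. The PeDirectories class and its member names are invented for this example and are not taken from Asmex; the offsets used are those of the standard PE32 and PE32+ optional headers.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of Asmex.
public static class PeDirectories
{
    // Index of the CLI header entry in the Data Directory array.
    public const int CliHeaderIndex = 14;

    // Returns the RVA and size stored in the given Data Directory entry.
    public static (uint Rva, uint Size) ReadDataDirectory(byte[] image, int index)
    {
        using (var reader = new BinaryReader(new MemoryStream(image)))
        {
            // The DOS header stores the file offset of the PE signature at 0x3C.
            reader.BaseStream.Position = 0x3C;
            uint peOffset = reader.ReadUInt32();

            reader.BaseStream.Position = peOffset;
            if (reader.ReadUInt32() != 0x00004550) // "PE\0\0" read as little-endian
                throw new InvalidDataException("Not a PE file.");

            // Skip the 4-byte signature and the 20-byte COFF header.
            long optionalHeader = peOffset + 4 + 20;
            reader.BaseStream.Position = optionalHeader;
            ushort magic = reader.ReadUInt16();

            // The directories start 96 bytes into a PE32 optional header
            // and 112 bytes into a PE32+ one.
            int directoriesOffset;
            if (magic == 0x10B) directoriesOffset = 96;
            else if (magic == 0x20B) directoriesOffset = 112;
            else throw new InvalidDataException("Unknown optional header magic.");

            // NumberOfRvaAndSizes is the 4 bytes just before the directories.
            reader.BaseStream.Position = optionalHeader + directoriesOffset - 4;
            uint count = reader.ReadUInt32();
            if (index < 0 || index >= count)
                throw new ArgumentOutOfRangeException(nameof(index));

            // Each entry is 8 bytes: a 4-byte RVA followed by a 4-byte size.
            reader.BaseStream.Position = optionalHeader + directoriesOffset + index * 8;
            uint rva = reader.ReadUInt32();
            uint size = reader.ReadUInt32();
            return (rva, size);
        }
    }
}
```

Calling PeDirectories.ReadDataDirectory(File.ReadAllBytes(path), PeDirectories.CliHeaderIndex) on a managed assembly should give a non-zero RVA; for a native DLL the entry is normally zero.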

Background -- .NET PE Files

The real starting point of a PE file, from the .NET point of view, is the COR20 Header, which tells the .NET runtime where to find the metadata. The COR20 header, like the PE header, specifies some Data Directories, as well as the entry point for the assembly. Most of these Data Directories point to things like fixup information which is not useful for examining the assembly, but one of them points to the start of the Metadata Streams.
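To make the header's role concrete, here is a minimal sketch that reads the leading fields of the COR20 (CLI) header, including the metadata RVA and the entry point token. The CliHeader class is a hypothetical stand-in rather than Asmex's own class, and it assumes the reader has already been positioned at the first byte of the header; the field order follows ECMA-335 Partition II.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for a COR20 header reader, not Asmex's class.
public sealed class CliHeader
{
    public uint HeaderSize;            // cb: size of the header in bytes (72)
    public ushort MajorRuntimeVersion;
    public ushort MinorRuntimeVersion;
    public uint MetadataRva;           // where the metadata root starts
    public uint MetadataSize;
    public uint Flags;
    public uint EntryPointToken;       // metadata token of the entry point, or 0

    // Assumes the reader is positioned at the first byte of the header.
    public static CliHeader Read(BinaryReader reader)
    {
        var header = new CliHeader();
        header.HeaderSize = reader.ReadUInt32();
        header.MajorRuntimeVersion = reader.ReadUInt16();
        header.MinorRuntimeVersion = reader.ReadUInt16();
        header.MetadataRva = reader.ReadUInt32();
        header.MetadataSize = reader.ReadUInt32();
        header.Flags = reader.ReadUInt32();
        header.EntryPointToken = reader.ReadUInt32();
        // The remaining RVA/size pairs (resources, strong name signature,
        // fixups and so on) are not needed to locate the metadata.
        return header;
    }
}
```

MetadataRva is still an RVA, so it has to be converted to a file offset before the metadata can be read from disk.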

Background -- Metadata streams

.NET holds metadata in streams (usually four of them). Each of these streams has a different format:

Background -- Metadata tables

Metadata tables are just regions of data, lying end-to-end inside the file. There is a fixed, known number of tables, and each table has a fixed, known range of tables that its tokens (see below) can refer to. Tables do not actually contain things like strings, method signatures, etc.; rather, they contain either:

In general, the structure of the tables is such that you must know the properties of the particular column you are looking at in order to interpret the numbers found in it. This leads to a remarkably small data size, considering how rich .NET metadata is. (It's a pity that this is then stuck into the not-very-efficient PE format.) Asmex unpacks the tables, looks up the strings etc. for each row, and presents them in a relatively friendly format. Todo: UCS-2 strings are shown in hex form.
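As a rough illustration of why the column type matters, the sketch below decodes a metadata token and a coded index, and works out how wide a coded index column is. The MetadataIndex class and its method names are made up for this example and do not come from Asmex; the bit layouts and the width rule are those given in ECMA-335 Partition II.

```csharp
using System;

// Hypothetical helper, not part of Asmex.
public static class MetadataIndex
{
    // A token holds the table number in its top byte and a 1-based row
    // number in its low 24 bits.
    public static (int Table, int Row) DecodeToken(uint token)
    {
        return ((int)(token >> 24), (int)(token & 0x00FFFFFF));
    }

    // A coded index keeps a tag in its low tagBits bits. The tag selects one
    // table from a short list that depends on the column; the remaining
    // bits are the row number.
    public static (int Tag, int Row) DecodeCodedIndex(uint value, int tagBits)
    {
        uint mask = (1u << tagBits) - 1;
        return ((int)(value & mask), (int)(value >> tagBits));
    }

    // A coded index column is 2 bytes wide only if every table it can refer
    // to has fewer than 2^(16 - tagBits) rows; otherwise it is 4 bytes wide.
    public static int CodedIndexSize(int tagBits, int maxRowCount)
    {
        return maxRowCount < (1 << (16 - tagBits)) ? 2 : 4;
    }
}
```

For example, DecodeToken(0x06001314) yields table 0x06 (MethodDef) and row 0x1314. A HasCustomAttribute column uses 5 tag bits, so it widens to 4 bytes as soon as any table it can point to has 2048 rows or more, long before any table reaches 64K rows.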

Background -- Types

There are two types which it is very important to understand when looking at .NET files at a binary level:

The Classes

Generally, each class in FileViewer represents some chunk of the information described above. Where possible, each class describes an actual physical range of bytes in the file, and is therefore inherited from Region, which is an abstract base class with 'start' and 'length' properties. Even though the information about where a given structure is physically located is not that useful in Asmex's treeview, we keep track of it in case we ever want to create a visual PE file examiner or a PE file emitter.
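A minimal sketch of the idea, under the assumption that a region simply records where the reader was before and after parsing. The member names and the DataDirectoryRegion example class are hypothetical; Asmex's actual Region may look different.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for a Region-style base class.
public abstract class Region
{
    public long Start { get; protected set; }
    public long Length { get; protected set; }
    public long End { get { return Start + Length; } }
}

// Example region: an 8-byte Data Directory entry (RVA followed by size).
public sealed class DataDirectoryRegion : Region
{
    public uint Rva { get; private set; }
    public uint Size { get; private set; }

    // Assumes the reader is already positioned at the start of the entry.
    public DataDirectoryRegion(BinaryReader reader)
    {
        Start = reader.BaseStream.Position;
        Rva = reader.ReadUInt32();
        Size = reader.ReadUInt32();
        // The physical length is however far the reader advanced.
        Length = reader.BaseStream.Position - Start;
    }
}
```

Deriving the length from the reader position keeps the recorded range in step with whatever the constructor actually consumed.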

There are also some classes that do not represent a particular range of bytes, but encapsulate other information; these include the Metadata table related classes Table and TableCell, and also the classes related to PE import and relocation tables.

Each class takes a BinaryReader in its constructor. This reader is assumed to refer to the assembly file and to be 'wound' to the right offset. In some cases, it was necessary to adjust the reader's offset by hand, because some arithmetic is required in converting RVAs and so on.
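The RVA arithmetic mentioned above can be sketched roughly as follows. SectionInfo and RvaConverter are hypothetical names used only for this example (Asmex's own routine lives in its header-handling code), and the three fields are the section header values VirtualAddress, SizeOfRawData and PointerToRawData.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder for the three section header fields needed here.
public sealed class SectionInfo
{
    public uint VirtualAddress;    // RVA at which the section is loaded
    public uint SizeOfRawData;     // number of bytes the section occupies on disk
    public uint PointerToRawData;  // file offset at which the section's data starts
}

public static class RvaConverter
{
    // Finds the section containing the RVA and maps it to a file offset.
    public static long RvaToFileOffset(uint rva, IEnumerable<SectionInfo> sections)
    {
        foreach (var section in sections)
        {
            // Only bytes within SizeOfRawData exist in the file.
            if (rva >= section.VirtualAddress &&
                rva - section.VirtualAddress < section.SizeOfRawData)
            {
                // Offset within the section, plus where the section starts on disk.
                return (long)(rva - section.VirtualAddress) + section.PointerToRawData;
            }
        }
        throw new ArgumentOutOfRangeException(nameof(rva),
            "RVA is not backed by any section's file data.");
    }
}
```

With a single section whose VirtualAddress is 2000, SizeOfRawData 9000 and PointerToRawData 500, an RVA of 8000 maps to file offset 8000 - 2000 + 500 = 6500. The range check uses SizeOfRawData rather than VirtualSize because the sketch is looking for bytes that are physically present in the file.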

These classes should serve as documentation for a wide range of PE and .NET structures. For comprehensive documentation, please see the Bibliography below.

Reflection Tree

The reflection tree is a simple system for representing hierarchical data obtained from the PE file parser or by reflection. Each item is represented by a BaseNode-derived class, which holds a reference to a data object. Each node then uses its GenerateChildren method to populate the items below it, creating new data objects of various types as necessary. The logic for viewing .NET types by reflection is contained in these tree node classes. This logic is not very complicated and has been described in many places already, so there's no need to go through it here.

It is easy to add new data items to Asmex by deriving a new node class, and modifying the GenerateChildren method of another node so that your new node is sometimes generated. You can also override your node's GetMenu method to add context menu operations for that node type.
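The on-demand pattern can be sketched without any GUI dependency, as below. LazyNode, TypeNode and MemberNode are hypothetical names for this illustration only; in Asmex the equivalent classes derive from the WinForms tree node, which this sketch deliberately leaves out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, GUI-free stand-in for a BaseNode-style class.
public abstract class LazyNode
{
    private List<LazyNode> _children;

    public string Text { get; private set; }
    public object Data { get; private set; }

    protected LazyNode(string text, object data)
    {
        Text = text;
        Data = data;
    }

    // True once the children have been generated.
    public bool IsPopulated { get { return _children != null; } }

    // Children are only created the first time somebody asks for them.
    public IList<LazyNode> Children
    {
        get
        {
            if (_children == null)
                _children = new List<LazyNode>(GenerateChildren());
            return _children;
        }
    }

    protected abstract IEnumerable<LazyNode> GenerateChildren();
}

// A node for a type: its children are the type's public methods.
public sealed class TypeNode : LazyNode
{
    public TypeNode(Type type) : base(type.FullName, type) { }

    protected override IEnumerable<LazyNode> GenerateChildren()
    {
        foreach (var method in ((Type)Data).GetMethods())
            yield return new MemberNode(method.Name, method);
    }
}

// A leaf node with no children.
public sealed class MemberNode : LazyNode
{
    public MemberNode(string text, object data) : base(text, data) { }

    protected override IEnumerable<LazyNode> GenerateChildren()
    {
        yield break;
    }
}
```

Adding a new kind of item then means writing one more derived class and yielding it from some existing node's GenerateChildren.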

This design is not a work of genius, but it does the job of presenting the data in a unified way and generating nodes only on demand. In MFC it would probably have been necessary to build a large tree infrastructure and connect it to a CTreeCtrl by some horrible tangle of messages.

Property Viewer

The ObjViewer namespace contains a few classes that define a generic property-viewer control. ObjViewer is a UserControl that presents a list of name-value pairs for any given object. Of course, the properties available on an object don't necessarily add up to a friendly view of the object, so you can use the ObjViewerAttribute attribute to modify how the properties of a target object are presented -- for instance, to specify that a property be shown in hex or not shown at all.
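The general technique, an attribute that steers how reflected properties are listed, might look something like the sketch below. DisplayHintAttribute, PropertyLister and SampleHeader are hypothetical names invented for this example; they are not ObjViewerAttribute's real members, which are best checked in the Asmex source.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical attribute, standing in for something like ObjViewerAttribute.
[AttributeUsage(AttributeTargets.Property)]
public sealed class DisplayHintAttribute : Attribute
{
    public bool Hidden { get; set; }
    public bool Hex { get; set; }
}

public static class PropertyLister
{
    // Builds name/value rows for an object's public properties,
    // honouring any DisplayHint attributes found on them.
    public static List<KeyValuePair<string, string>> Describe(object target)
    {
        var rows = new List<KeyValuePair<string, string>>();
        foreach (PropertyInfo property in target.GetType().GetProperties())
        {
            if (property.GetIndexParameters().Length > 0) continue; // skip indexers

            var hint = (DisplayHintAttribute)Attribute.GetCustomAttribute(
                property, typeof(DisplayHintAttribute));
            if (hint != null && hint.Hidden) continue;

            object value = property.GetValue(target, null);
            // The Hex hint is only meaningful on integer properties.
            string text = (hint != null && hint.Hex && value is IFormattable)
                ? "0x" + ((IFormattable)value).ToString("X", null)
                : Convert.ToString(value);
            rows.Add(new KeyValuePair<string, string>(property.Name, text));
        }
        return rows;
    }
}

// Example target object.
public sealed class SampleHeader
{
    [DisplayHint(Hex = true)] public uint Rva { get; set; }
    public string Name { get; set; }
    [DisplayHint(Hidden = true)] public int Internal { get; set; }
}
```

Describing a SampleHeader lists Rva in hex and Name as-is, and leaves Internal out entirely.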

GAC Browser

The GACPicker class allows the user to select an assembly from the Global Assembly Cache. It does this by looking at the filesystem representation of the GAC, since there appears to be no actual API in the current .NET environment.

Ridiculous Star-Wars Writing

The HintDlg class presents hints in preposterous Star-Wars style perspective scrolling text. It uses GraphicsPath.Warp to apply a pseudo-perspective transformation to the text. Annoying, but I felt it had to be done.

Bibliography

For PE/.NET file format information, I would suggest reading sections 21-24 of ECMA-335 Partition II, available all over the web. Inside Microsoft .NET IL Assembler is also a good book, despite the occasional inaccuracy.

If you want to go further and understand the actual CIL instructions in your assembly, Compiling for the .NET Common Language Runtime is an excellent book.

If you want to examine your binary files in comfort, may I humbly plug my own AXE program.

General: Can't see entire assembly
infal
23:02 22 Apr '08  
Hi Ben!

First of all many thanks for this great work! It helps me a lot to find some problems in resources.

But now I have one .NET DLL for which Asmex doesn't show the reflection tree at all! I see "Headers", "Imports/Exports", "Relocations", "Heaps" and "Tables", but not the first entry (with its subtree), marked by the small blue book symbol.

The DLL was compiled the same way as all the other DLLs, and other tools can show the reflection tree. Would it be possible to send the DLL to you, so you can examine where the problem with the viewer is?

One more question: is there some way to extract/find an embedded manifest resource (i.e. a linked XML file) and probably the "Copyright string in unmanaged block" without loading the assembly for reflection, just via "Tables"->"Manifest Resource"->what is the next step?... and to check whether there are any assemblies in the DLL at all?

Many thanks in advance!
Alexander
General: Very Helpful Thanks
Ennis Ray Lynch, Jr.
5:46 9 Nov '07  
!


Need a C# Consultant? I'm available.

Happiness in intelligent people is the rarest thing I know. -- Ernest Hemingway

General: Many GAC assemblies not showing
ToolmakerSteve2
8:37 16 Dec '06  
My C:\Windows\assembly folder has 581 items in it.
Assembly Viewer / Select Assembly from GAC lists ~ 100 choices.

Most importantly, the Ajax Control Toolkit I dragged in yesterday, and am referencing from an ASP.NET website under development, does not show.

I can see it in my assembly folder in Windows Explorer, and can right-click, examine its properties. My @ Register directive in my code sees it.
Assembly Viewer / Select File is able to view the dll (when selected from its original download location, before I dragged it into GAC).

What is Assembly Viewer depending on, when it lists GAC items?

General: Re: Many GAC assemblies not showing
ToolmakerSteve2
9:32 16 Dec '06  
Found the problem. To fix:

GACPicker.cs / GACPicker_Load()
replace:
root += "\\Assembly\\GAC";
with:
root += "\\Assembly";

Explanation: There are other directories in \Windows\Assembly, such as GAC_32. ANY directory in \Windows\Assembly is a valid source of GAC items.

General: Lookup by hex token value
bvoigt
15:48 12 Jun '06  
I did a little Reflection.Emit, and PEVerify is giving me a whole bunch of errors looking like:

[MD]: Error: Method has a duplicate, token=0x06001314. [token:0x06001300]
[MD]: Error: Method has a duplicate, token=0x06001316. [token:0x06001301]
[MD]: Error: Method has a duplicate, token=0x0600130f. [token:0x06001302]
[MD]: Error: Method has a duplicate, token=0x0600130e. [token:0x06001304]
[MD]: Error: Method has a duplicate, token=0x06001304. [token:0x0600130E]
[MD]: Error: Method has a duplicate, token=0x06001302. [token:0x0600130F]
[MD]: Error: Method has a duplicate, token=0x06001300. [token:0x06001314]
[MD]: Error: Method has a duplicate, token=0x06001301. [token:0x06001316]

It looks like Asmex could help me troubleshoot this; however, the method table doesn't seem to display the tokens, so I end up looking through each class. Also, I don't know if Asmex shows the duplicates. Any ideas?
Question: File/MethodList Indexes
DigitalBay
20:47 21 Dec '05  
When I drill down to
MMModule >> MDTables >> TypeDef Table

There are three columns (MethodList, FieldList, Extends) that return a cooked value like "Field 04000001", or a raw value like "04000001".

What do these mean? Is it a byte value in a string? If so, how do I convert it to the proper byte to get the correct position?

Thanks
Answer: Re: File/MethodList Indexes
DigitalBay
21:02 26 Dec '05  
Sorry, just re-read your article - your answer was there all the time - in the MDTables class.
General: WHAT THE F#$@?
csmac3144
20:12 7 Dec '05  
Did it ever occur to you that people might want to, you know, COPY AND FRICKIN' PASTE the information you display about the assembly?

This thing is so frustrating it's beyond words.

Copy and paste. It's an old but useful concept, Dude!
General: Re: WHAT THE F#$@?
Dylan Morley
0:10 22 Dec '05  
Erm, he's given you the source code - want something extra from it, write it yourself!!


General: Re: WHAT THE F#$@?
ToolmakerSteve2
9:35 16 Dec '06  
This was freely given to us all, appreciate the contribution, add to the contribution, or shut up and go away.
General: Protecting .Net application from piracy
Fad B
6:12 7 May '04  
Hello
Many thanks for this great sample...
I have been working with PE for a year, BUT I'm a beginner with PE.NET.
So I would be very glad if anyone could help me with the following problem:

I need to protect my .NET application from piracy with a dongle (hardware key),
BUT the problem is: the code generated by VC.NET is reversible. That means anyone can turn my project back into the original code and change it to work without the dongle. This was hard with the old PE, BUT now it is so easy with PE.NET...

So does anyone have any idea how to make a PE.NET executable irreversible?

Thanks for any help
Fad

General: Re: Protecting .Net application from piracy
Naveen K Kohli
7:56 18 Jan '05  
Use a good obfuscator to protect against decompilation. Code obfuscation does not guarantee 100% protection, but it makes the code hard to reverse engineer.

Naveen Kohli
http://www.netomatix.com
General: Re: Protecting .Net application from piracy
redcheek
15:32 7 Jan '08  
Use a protector to protect your assemblies.
You can try DNGuard HVM.

DNGuard HVM - Advanced .NET Code Protection and Obfuscation Technology
http://www.dnguard.net

General: Possible Bug in Code for CustomAttribute table
Profox Jase
23:17 30 Jul '03  
Hello,

Whilst debugging my own app, I have been using Asmex for comparison, and for learning from. As part of this process, I have discovered a problem with the CustomAttribute table - I am not sure whether I am misunderstanding Ben's code, or whether the Asmex code is incorrect or whether the ECMA documentation is incorrect. Additionally, although I can see Ben's code works, I can't figure out some of the logic...more in a moment. First the potential bug:

Asmex code:
_td[0x0C] = new Table(Types.CustomAttribute, new Types[] { Types.HasCustomAttribute, Types.CustomAttributeType, Types.Blob }, new String[] { "Type", "Parent", "Value" }, this, reader);
Ecma details:
The CustomAttribute table has the following columns:
· Parent (index into any metadata table, except the CustomAttribute table itself; more precisely, a HasCustomAttribute coded index)
· Type (index into the MethodDef or MethodRef table; more precisely, a CustomAttributeType coded index)
· Value (index into Blob heap)

By my reckoning, it seems that the coded field types of HasCustomAttribute and CustomAttributeType are in the correct order, but parent and type are the wrong way round.

Now my problem: I am inspecting two DLLs -- one is Microsoft.VisualBasic.dll and the other is a very small one of my own, for test purposes. For my DLL, CustomAttribute.Parent must be a 2-byte field for correct parsing of the data; this is the case for my code and for Asmex. For the VisualBasic DLL, CustomAttribute.Parent is 2 bytes in my code, and 4 bytes in Asmex (although in Asmex, it's actually the CustomAttribute.Type field). If I change my code to use 4 bytes, it works and matches Asmex.

I can't figure out the rationale (and thus can't code to cope with it) behind Asmex getting it right. My interpretation of ECMA goes like this: the first byte of the field encodes the table and the remaining byte (for a two byte index) or remaining 3 bytes (for a 4 byte index) gives the row number. In fact, by my logic, the field should always be 4 bytes...yet it isn't.

My initial solution was to look at the number of rows in each of the 19 tables that are relevant, and if any of them had a row count > 64k, then it would be 4 bytes, but in both my DLL and the Visual Basic one, the 19 tables each have far fewer than 64k rows.

Can anybody help in explaining the logic behind deciding between 2 and 4 bytes please? Sadly, I don't seem to be clever enough to understand how Asmex gets it right.

Many thanks to anyone who responds and puts me out of my misery!

Jason King

Struggling Author
General: Amendment
Ben Peterson
4:01 10 Jan '03  
Asmex builds a map of offsets into the string heap, and uses this map to resolve string references in the MD tables. This works fine for ordinary assemblies, but it is possible to make an assembly whose strings can't be resolved in this way.

The .NET specification allows a string reference to point anywhere in the string heap, not just to the start of a string. Therefore, it is possible (although probably not very useful) to create an assembly in which some strings overlap with each other. Such an assembly can be read by Asmex if the GetByOffset method of MDHeap is overridden in the MDStringHeap class thus:



public override object GetByOffset(int i)
{
    int originalKey = i;
    object value = _data[i];

    // Walk back to the nearest offset at which a string starts
    // (there's probably a more efficient way of doing this).
    while (value == null && i > 0)
    {
        value = _data[--i];
    }

    if (value != null && originalKey != i)
    {
        // Re-index into the string: keep the tail that begins
        // at the requested offset.
        int diff = originalKey - i;
        string str = (string)value;
        value = str.Substring(diff, str.Length - diff);
    }
    return value;
}


Thanks to Sami Vaaraniemi for finding this .NET feature and suggesting the fix.



URL: http://www.jbrowse.com
Favorite Toy: http://www.ruby-lang.org

General: Cool..
David Stone
15:26 29 Nov '02  
Very nice. You may want to look at Reflector to see some other cool ideas of what you could do with an app like this.


I don't know whether it's just the light but I swear the database server gives me dirty looks everytime I wander past. -Chris Maunder
Microsoft has reinvented the wheel, this time they made it round. -Peterchen on VS.NET

General: Re: Cool..
Anonymous
12:32 25 Mar '03  
Seems like they already did. It would be amazing if they came up with some of that stuff (e.g. icon assignment) independently.
General: Re: Cool..
Profox Jase
2:29 22 Jul '03  
Hiya,

Any of you guys understand relative virtual addressing that Ben seems able to cope with but I can't?

Jason King

jason.king@profox.co.uk
Feel the love at www.profox.co.uk
General: Re: Cool..
Ben Peterson
3:12 22 Jul '03  

Hi,

I got your email but I thought I'd reply here. An RVA is an offset relative to the address the module loaded into memory at, right? So if the module loaded at 0x10000, then an RVA of 0x100 refers to the address 0x10100.

This is complicated by the fact that PE modules are divided into sections, and different sections may be loaded into memory in different places. You have to find out what section an RVA is in, and find out the address the section was (or in this case 'will be') loaded at, to find the real offset.


In Asmex, we are dealing with a file on disk, so we are working backwards to find an offset in the file based on what the RVA would be if the app was loaded and running. That's what ModHeaders::Rva2Offset() does; it loops through the sections, looking for one that the RVA is in. When it has found that section, it subtracts the section's VirtualAddress, i.e. the address the section would have been loaded at.

There is one further complication -- the section begins at some arbitrary point within the file, and we have to factor that in to find the offset in the file that corresponds to an RVA. That's why we add the section's PointerToRawData value.

HTH

Benjamin


URL: http://www.jbrowse.com
Favorite Toy: http://www.ruby-lang.org

General: Re: Cool..
Profox Jase
4:05 22 Jul '03  
Hi Ben,

Thanks, you are a star. I think with this information I will understand it. My problem lies in my ignorance of very low level stuff such as this, it was a little bit over my head, but with your help and your code, I am starting to figure it out. I have determined from your RVA2Offset code what I need to do, and your explanation helps.

I have a CLI Directory that lists the RVA as 8000. To find this I need to:

1. Produce a collection of sections headers and populate their data.
2. Examine each section header and see if an RVA of 8000 lies in the range between VirtualAddress and (VirtualAddress + RawDataSize).
3. When I find the matching section, I subtract the section's start point from the RVA, in order to get an offset from the start of the section.
4. When I have my offset from the start of the section, I then look at the start point on the disk for that section (PointerToRawData) and add it to my section offset to find the place on the disk where the stuff I want lies.

Phew, that is a pretty nasty read. To clarify:

As an example (figures are arbitrary): CLI Directory offset = 8000
mySection: Virtual Address = 2000, SizeOfRawData = 9000, PointerToRawData = 500
8000 lies between 2000 and 11000, so I have found my section.
Offset from start of section is therefore 8000-2000 = 6000, so my data lies 6000 from wherever the section starts on the disk.
PointerToRawData = 500, so my data starts at 500 + 6000 on the disk.

Does that make sense? I think I understand; it's this damn mapping between memory and disk that I don't get. Never mind.

One last point, I don't understand why you add sizeOfRawData (the size on disk) to the virtual address as opposed to adding VirtualSize(the size in memory) when you are calculating your section range.

Could you please verify my example so that I know that I am on the right tracks, and answer that last point there? I realise you are probably busy, but I need to get some code working and I need to understand it as I have to write about it.

Many thanks



Jason King

jason.king@profox.co.uk
Feel the love at www.profox.co.uk
General: Nice Project
Gevik Babakhani
14:02 29 Nov '02  
Hi,
cool project


C:\>csc *.cs
Microsoft (R) Visual C# .NET Compiler
error CS2001: Source file 'brains.cs' could not be found
fatal error CS2008: No [brains.cs] specified

C:\>

General: Re: Nice Project
Anonymous
0:23 19 May '03  
hi


Last Updated 29 Nov 2002 | Copyright © CodeProject, 1999-2010