This article is about resolving symbolic references in a
codeDOM. Sources are included. This is Part 7 of a series on codeDOMs, but
it may be useful for anyone who wishes to resolve symbolic references in C#
source code. In the previous parts, I’ve
discussed “CodeDOMs”, provided a C# codeDOM, a WPF UI and IDE, a C# parser,
solution/project codeDOM classes, and covered loading type metadata.
Resolving a CodeDOM Tree
In the previous articles of this series, I’ve built up a C#
codeDOM that can parse itself from existing C# source files, resulting in a
tree of code objects with most symbolic references unresolved and represented
by UnresolvedRef objects. In the last article, I added
support for loading type metadata from referenced assemblies, and now it’s time
to add the necessary logic to replace those
UnresolvedRef objects with
specific references using the classes shown in the table below.
| CodeDOM Class
| Base class for all symbolic references.
| Base class for all type references (supports generic parameters, arrays).
| UnresolvedRef
| Unresolved reference.
| Unresolved ‘this’ for an explicit interface implementation of an indexer.
| References a type declaration.
| References an alias (to a type or namespace).
| TypeParameterRef
| References a type parameter.
| References a method.
| References an anonymous method.
| References a constructor.
| References an operator declaration.
| Base class for ‘goto’ target references.
| References a label.
| References a switch item (case or default).
| Base class for current object instance references.
| References the base class of the current object instance.
| References the current object instance.
| Base class for all variable references.
| References a property.
| References an indexer.
| References an event.
| References an enum member.
| References a field.
| References a local variable.
| References a parameter.
| References an extern alias.
| References a namespace.
| References a compiler directive symbol.
We need to traverse the codeDOM tree from the top down looking
for UnresolvedRef objects and then attempt to resolve each one into the appropriate specific
reference object to the correct declaration by following the various scoping
and resolution rules dictated in the C# language specification. One of the things that becomes apparent when
thinking about how to resolve references is that often the type of the
reference can be determined based upon the context. For example, references after a ‘using’
directive should represent namespaces, and variable types and method return
types should represent types (with optional namespace prefixes), etc. A resolve category enum
defines the possible categories of reference types based on the current context.
The virtual Resolve() method, which takes a resolve category and a
ResolveFlags flags parameter, is declared on the code object base class. It
is used to perform the top-down traversal of the tree, and is overridden as
necessary by all code objects to perform any special logic and to resolve all
child objects. For UnresolvedRef objects, the override attempts to resolve them, and returns the appropriate new
reference object if successful, with parent objects assigning the result to
their child properties.
The ResolveFlags enum parameter is used to represent special modes, such as the 3 phases below,
resolving inside documentation comments, and “unresolve” mode (used to convert resolved
references back into UnresolvedRefs when necessary).
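The traversal described above can be sketched roughly as follows. This is only an illustration under stated assumptions: every type here (SketchNode, SketchUnresolvedRef, SketchResolvedRef, SketchAssignment, SketchResolveFlags) is a hypothetical stand-in and not one of the actual Nova classes, and the "lookup" is reduced to a set of known names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the flags parameter; no real modes are modeled here.
[Flags]
public enum SketchResolveFlags { None = 0 }

public abstract class SketchNode
{
    // Returns the node that should occupy this position after resolving:
    // either 'this' or a replacement object.
    public virtual SketchNode Resolve(SketchResolveFlags flags) { return this; }
}

public sealed class SketchResolvedRef : SketchNode
{
    public readonly string Target;
    public SketchResolvedRef(string target) { Target = target; }
}

public sealed class SketchUnresolvedRef : SketchNode
{
    public readonly string Name;
    private readonly ISet<string> _knownNames; // assumption: stands in for real scope lookup

    public SketchUnresolvedRef(string name, ISet<string> knownNames)
    {
        Name = name;
        _knownNames = knownNames;
    }

    public override SketchNode Resolve(SketchResolveFlags flags)
    {
        // On success, return a new specific reference; otherwise stay unresolved.
        return _knownNames.Contains(Name) ? (SketchNode)new SketchResolvedRef(Name) : this;
    }
}

public sealed class SketchAssignment : SketchNode
{
    public SketchNode Left;
    public SketchNode Right;

    public SketchAssignment(SketchNode left, SketchNode right) { Left = left; Right = right; }

    public override SketchNode Resolve(SketchResolveFlags flags)
    {
        // The parent assigns each child's result back to its own property.
        Left = Left.Resolve(flags);
        Right = Right.Resolve(flags);
        return this;
    }
}
```

Returning the replacement from Resolve() keeps the swap local: a child never needs to know which property of its parent holds it.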
The Resolve() process executes in 3 phases:
- Phase 1 resolves all types
– all statements in
headers including any base lists, stopping at type bodies.
- Phase 2 resolves all type
members – definitions of methods, properties, fields, stopping at the
bodies of methods, properties, or initializers of fields.
- Phase 3 resolves all of the
code – the bodies of methods, properties, and field initializers.
Note that this is not
3 full passes, but rather more of a breadth-first than a depth-first
traversal of the tree, and it allows for all references to be resolved in a
single pass without order-of-evaluation dependency problems. Another special
case is that a switch statement
needs to resolve all of its
Case expressions before any of the bodies in order to
handle possible forward references via “goto case …”.
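As a rough illustration of why the phased ordering removes order-of-declaration problems, here is a minimal sketch. The names (PhaseFlags, SketchTypeDecl, SketchPhases) are hypothetical and the "resolving" is reduced to logging what would be visited; the real traversal is certainly more involved. Each phase touches only one layer of each type, so the total work is still about one visit per node.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical phase flags, modeled loosely on the three phases described above.
[Flags]
public enum PhaseFlags { Phase1 = 1, Phase2 = 2, Phase3 = 4 }

public sealed class SketchTypeDecl
{
    public readonly string Name;
    public readonly List<string> Members = new List<string>();

    public SketchTypeDecl(string name, params string[] members)
    {
        Name = name;
        Members.AddRange(members);
    }

    // Visits only the layer of this type that belongs to the given phase.
    public void Resolve(PhaseFlags phase, List<string> log)
    {
        if (phase == PhaseFlags.Phase1)
        {
            log.Add("header " + Name);
        }
        else if (phase == PhaseFlags.Phase2)
        {
            foreach (string member in Members) { log.Add("signature " + Name + "." + member); }
        }
        else
        {
            foreach (string member in Members) { log.Add("body " + Name + "." + member); }
        }
    }
}

public static class SketchPhases
{
    // All headers first, then all member signatures, then all bodies.
    public static List<string> ResolveAll(IList<SketchTypeDecl> types)
    {
        var log = new List<string>();
        foreach (PhaseFlags phase in new[] { PhaseFlags.Phase1, PhaseFlags.Phase2, PhaseFlags.Phase3 })
        {
            foreach (SketchTypeDecl type in types) { type.Resolve(phase, log); }
        }
        return log;
    }
}
```

By the time any body is visited, every type header and member signature in the tree has already been handled, so a body in the first type can refer to a member of the last one.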
Resolving an UnresolvedRef
The Resolve() override of UnresolvedRef first calls Resolve()
on any type argument children, and then it attempts to resolve itself by
creating an instance of the Resolver class, passing itself to the
constructor, and calling Resolve() on it. The Resolver class operates according to the
resolve category, behaving differently when looking for specific reference types as compared to
references in expressions (which can be of almost any type). Validation of the type of a possible matching
object and the text of any error message is also based upon the category.
The Resolver class contains various special-case
logic, but the primary functionality consists of looking
for declarations with a matching name at the current scope, and if
nothing is found, calling ResolveRefUp() to continue searching at
higher levels of the tree, eventually stopping if nothing is found. Depending upon the resolve category, it might
stop before reaching the top of the tree if that makes sense.
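The "look here, then look up" search can be sketched as a chain of scopes. This is a simplified illustration, not the Resolver's actual code: SketchScope, Declare(), TryFindLocal(), and FindUp() are hypothetical names, and declarations are reduced to a name mapped to a description string.

```csharp
using System;
using System.Collections.Generic;

public sealed class SketchScope
{
    public readonly SketchScope Parent; // null at the top of the tree
    private readonly Dictionary<string, string> _declarations = new Dictionary<string, string>();

    public SketchScope(SketchScope parent) { Parent = parent; }

    public void Declare(string name, string kind) { _declarations[name] = kind; }

    // Looks only at declarations in this scope.
    public bool TryFindLocal(string name, out string kind)
    {
        return _declarations.TryGetValue(name, out kind);
    }

    // Searches this scope first, then each enclosing scope in turn.
    // Returns null if the top of the tree is reached without a match.
    public string FindUp(string name)
    {
        for (SketchScope scope = this; scope != null; scope = scope.Parent)
        {
            string kind;
            if (scope.TryFindLocal(name, out kind)) { return kind; }
        }
        return null;
    }
}
```

Because the innermost scope is searched first, a local declaration naturally hides a same-named declaration further up.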
When a declaration with a matching name is found, a
method is called on the
Resolver instance, which creates a
match candidate instance and then validates that the type of the matched object is valid for
the resolve category. If the match is a
method, it must then attempt to infer any omitted type arguments if the method
is generic, and go through a lot of complicated overload logic to determine if the
parameter types match the types of the supplied arguments. There are also checks to verify that the candidate object is static or not as appropriate, and that the access
specifiers allow it to be accessed in the current scope. The Resolver also determines whether the candidate is a
“complete” match or only partial, and this in turn determines whether or not
the search will continue into other higher scopes.
If this process finds a single valid match, the Resolver
creates a new reference to the matched declaration
and returns it, causing the
UnresolvedRef object to be replaced with
it. If no matches are found, or if
multiple matches are found, error messages are generated as appropriate and
attached to the UnresolvedRef
object (and they get propagated up to the
Solution level and logged to
the console or displayed in the UI).
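The three outcomes (one match, no match, several matches) might look something like the sketch below. It is deliberately naive and is not the Resolver's overload logic: SketchOverloads and Pick() are hypothetical, and applicability is checked only with Type.IsAssignableFrom, which ignores type inference, implicit numeric conversions, and the "better conversion" rules that real C# overload resolution uses to break ties.

```csharp
using System;
using System.Collections.Generic;

public static class SketchOverloads
{
    // Returns the index of the single applicable overload, or -1 with an error message.
    // Each overload is described only by its parameter types.
    public static int Pick(IList<Type[]> overloads, Type[] argumentTypes, out string error)
    {
        var matches = new List<int>();
        for (int i = 0; i < overloads.Count; i++)
        {
            if (IsApplicable(overloads[i], argumentTypes)) { matches.Add(i); }
        }

        if (matches.Count == 1)
        {
            error = null;
            return matches[0];
        }

        error = matches.Count == 0
            ? "No matching overload"
            : "Ambiguous: " + matches.Count + " overloads match";
        return -1;
    }

    // An overload is applicable here if the counts agree and every argument
    // type is assignable to the corresponding parameter type.
    private static bool IsApplicable(Type[] parameterTypes, Type[] argumentTypes)
    {
        if (parameterTypes.Length != argumentTypes.Length) { return false; }
        for (int i = 0; i < argumentTypes.Length; i++)
        {
            if (!parameterTypes[i].IsAssignableFrom(argumentTypes[i])) { return false; }
        }
        return true;
    }
}
```

Note that this sketch reports an ambiguity between overloads taking object and string for a string argument, where the C# compiler would simply prefer the string overload; ranking applicable candidates is the hard part that is left out here.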
Expression Type Evaluation
In order to determine the proper match for an overloaded
method, it’s necessary to evaluate the types of the argument expressions to see
if they match the parameter types. A virtual
“TypeRefBase EvaluateType()” method has been added to
Expression, and is
overridden as necessary by subclasses to evaluate their type. Also, a virtual
method has been added
to evaluate the types of any generic type arguments on a type or method reference.
The resolving logic uses these to evaluate the type of each argument expression passed to a method,
and then determines whether that type matches the parameter type.
Various other methods necessary to the type evaluation
process include
FindTypeArgument(), used during type evaluation,
a method to handle implicit conversions, and
GetCommonType(), which determines a common
type that can represent two given types.
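To make the idea of per-expression type evaluation concrete, here is a small sketch built on System.Type rather than on the article's TypeRefBase. The names EvaluateType() and GetCommonType() are borrowed from the text, but the classes (SketchExpr, SketchConstant, SketchConditional, SketchTypes) and both implementations are assumptions made for illustration only; in particular, the common-type rule below considers only reference assignability and ignores implicit numeric conversions.

```csharp
using System;

public abstract class SketchExpr
{
    // Each expression kind knows how to compute its own type.
    public abstract Type EvaluateType();
}

public sealed class SketchConstant : SketchExpr
{
    private readonly object _value;
    public SketchConstant(object value) { _value = value; }

    // A constant's type is simply the runtime type of its value.
    public override Type EvaluateType() { return _value.GetType(); }
}

public sealed class SketchConditional : SketchExpr
{
    private readonly SketchExpr _whenTrue;
    private readonly SketchExpr _whenFalse;

    public SketchConditional(SketchExpr whenTrue, SketchExpr whenFalse)
    {
        _whenTrue = whenTrue;
        _whenFalse = whenFalse;
    }

    // The type of 'c ? a : b' is a type that can represent both branches.
    public override Type EvaluateType()
    {
        return SketchTypes.GetCommonType(_whenTrue.EvaluateType(), _whenFalse.EvaluateType());
    }
}

public static class SketchTypes
{
    // Returns whichever type can hold values of the other, or null if neither can.
    public static Type GetCommonType(Type a, Type b)
    {
        if (a.IsAssignableFrom(b)) { return a; }
        if (b.IsAssignableFrom(a)) { return b; }
        return null;
    }
}
```

Composite expressions delegate to their children, so evaluating the type of a whole argument expression is a single recursive call from the root.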
Sometimes the name of an overloaded method is used by itself
in code, without any parentheses or parameters.
This is known as a “method group”, and it is usually assigned to a
variable of delegate type or passed to a parameter of delegate type. Such a method group is represented by a
reference that has multiple match candidates. The method group is then normally resolved to
a single method reference using the delegate type to which it is assigned (or
passed) to determine the parameter types and thus the single matching method.
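A small, plain C# example shows the situation being resolved. The class and method names (MethodGroupDemo, Describe, Run) are made up for the illustration; the behavior itself is standard C# method group conversion.

```csharp
using System;

public static class MethodGroupDemo
{
    public static string Describe(int value) { return "int:" + value; }
    public static string Describe(string value) { return "string:" + value; }

    public static string Run()
    {
        // 'Describe' on its own is a method group with two candidates.
        // The delegate type on the left determines which overload is chosen.
        Func<int, string> forInt = Describe;       // selects Describe(int)
        Func<string, string> forString = Describe; // selects Describe(string)
        return forInt(7) + "," + forString("x");
    }
}
```

The bare name Describe cannot be resolved to one method in isolation; only the parameter types of the target delegate narrow the group down to a single candidate.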
In some cases, C# source files are generated at compile time
with partial classes that must be combined by the compiler with “code-behind”
files. These files have extensions such
as “.Designer.cs” or “.g.cs”, and may be located in the output directory as
temporary files. Now that we are
resolving symbolic references, we need to also load and process these generated
files or we’ll have many symbols which can’t be resolved. This is done by logic that detects
such files and includes them in the project. Logic has also been added to ignore automatic
code cleanup and save attempts for such files.
Nova does not yet automatically
generate these files if they are missing (like VS does), so if a project hasn’t
been built for a particular configuration, the generated files will show as
missing from the project and resolve errors will be generated for symbolic
references to declarations in them.
Code Inside Documentation Comments
Code inside documentation comments, whether inside a code
tag or in an attribute that refers to a code element,
is automatically parsed and resolved.
However, any parse or resolve
errors that occur in such code will be treated as only warnings.
Parsing of content inside code
tags can be turned off by setting the corresponding option to
false, and if it’s not parsed then there won’t be anything to resolve, either.
Nova Studio Improvements
Nova Studio now resolves all symbolic references
automatically upon loading solutions or projects. So, missing references or other such issues
can now cause a lot of error messages. A
“Go To Declaration” option has been added to the context menu to navigate to
the target of a symbolic reference.
Also, any expression which evaluates to a constant value now displays the
constant value in its tooltip. The screenshot below shows that references are resolved.
The Nova Studio IDE now has similar functionality to the VS
IDE as far as loading solutions, projects, files, and referenced assemblies
into memory and also parsing and resolving all sources and displaying error messages. Just out of curiosity, here’s a performance comparison:
|| Code Objects
|| Load (secs)
|| Memory (MB)
| SubText 2.5.2
| Mono Tests
| 4,000+
| 3,000+
| MS EntLib Tests
| Large Proprietary
| SharpDevelop 4.2
The numbers shown are approximate, using Task Manager to
check peak working sets while loading and the approximate time until the UI is
highly responsive (CPU usage less than 15%).
Nova isn’t optimized yet, especially not the resolving, but it seems to
be performing up to 3-4 times faster than VS.
I’ve noticed over the years that VS (or perhaps MSBuild, which it uses
to load) seems to be doing something for each project that takes about a half
to a full second, and so makes loading large numbers of projects painfully slow. Despite some improvements over the years, it’s
always been and still is slow if you have dozens of projects, much less a
hundred or so. The test case shown of a
solution with almost 2,000 test projects from Mono is laughable – I gave up
after well over ONE HOUR of waiting and with the working set over 3 GB! That seems to indicate that the load time
grows much faster than linearly with the number of projects. Nova does all the work that needs to be done
in less than 10 seconds. I know this is
not a typical use case, but it’s still very sad… because it probably wouldn’t
be that hard to fix this, and if they did then VS would be much snappier
loading more typical solutions. I think millions
of users out there loading solutions in say 1/2 the time would be well worth
the effort of a few man-months of dev work to clean up the loading process. Hey, MS, anyone listening?
Using the Attached Source Code
A Resolving folder has been added with new
classes related to the resolving support, including the Resolver class.
It also contains the related
enums. Various resolving-related methods,
such as Resolve(), EvaluateType(),
and others, have been added to most of the existing CodeDOM classes,
segregated into regions with a comment of “RESOLVING”. New examples have been added to the
Nova.Examples project, including a demonstration of the
flag that loads solutions/projects without resolving if it’s not required. Nova Studio now resolves by default, and all
of the red unresolved references should be gone (other than for post-C# 2.0
features, which I’ve implemented but not yet released as open source). Use it to load the provided source code
(“Nova.sln”) and inspect the source files to see. As usual, a separate ZIP file containing
binaries is provided so that you can run them without having to build them.
My codeDOM now has support for loading, parsing, and
resolving C# solutions and projects.
Nova Studio is starting to look like a real IDE! Now that entire solutions can be loaded and
resolved, it’s time to add some basic features for analyzing the code. Everything up to this point may have been
interesting, but now it’s time to actually do something useful! In my next article, I’ll look at calculating metrics
on a codeDOM and also doing various types of searches on it.