Introduction
I am currently writing an open source SIP stack and TAPI interface based on RFC 3261. The .NET Framework's network classes are surprisingly complete when it comes to familiar schemes like HTTP or even Gopher, but schemes like SIP and MailTo are not supported, which forced me to implement them on my own. In good object-oriented style, Microsoft allows you to modify existing schemes or even create your own, so adding support for a given scheme should be trivial, right? Well, Microsoft hasn't gotten around to documenting much of the UriParser classes. After hours of experimentation and some help from Jason Kemp, I have figured out how to extend and register a UriParser, allowing your program to understand URIs from any scheme. This article shows you how to write your own UriParser in an attempt to fill the documentation void.
Extending UriParser
UriParser is an abstract class that provides some methods for parsing a URI. Some callbacks are included as well: for example, whenever a Uri is created, the UriParser registered for its scheme is notified through OnNewUri. If the URI that you need to parse closely resembles a scheme that is already supported, it may benefit you to extend that scheme's UriParser. For most purposes, extending GenericUriParser is the best choice, because its constructor lets you choose options (GenericUriParserOptions) that control how things are parsed. Here is a skeleton class showing the most important methods you may need to override:
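public class SipStyleUriParser : GenericUriParser
{
    public SipStyleUriParser(GenericUriParserOptions options)
        : base(options) { }

    // Called when a Uri is constructed; report problems through parsingError.
    protected override void InitializeAndValidate(Uri uri,
        out UriFormatException parsingError)
    {
        base.InitializeAndValidate(uri, out parsingError);
    }

    // Called when this parser is registered for a scheme.
    protected override void OnRegister(string schemeName,
        int defaultPort)
    {
        base.OnRegister(schemeName, defaultPort);
    }

    // Hides Object.Equals(Object, Object); scheme-specific comparison rules go here.
    public static new bool Equals(Object objA, Object objB)
    {
        return Object.Equals(objA, objB);
    }

    protected override bool IsWellFormedOriginalString(Uri uri)
    {
        return base.IsWellFormedOriginalString(uri);
    }

    // Called for each new Uri; returns the parser instance that should handle it.
    protected override UriParser OnNewUri()
    {
        return base.OnNewUri();
    }

    // Returns the requested parts of the URI as a string.
    protected override string GetComponents(Uri uri,
        UriComponents components, UriFormat format)
    {
        return base.GetComponents(uri, components, format);
    }
}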
Parsing a URI with GetComponents
The first thing you might want to do is set up some regular expressions (Regex) designed to parse out the different parts of your URI. If you use code snippets to set up a switch statement on the components parameter, you will be given a complete set of all the members of UriComponents.
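protected override string GetComponents(Uri uri,
    UriComponents components, UriFormat format)
{
    switch (components)
    {
        case UriComponents.UserInfo:
            // TODO: apply your user-info Regex to uri.OriginalString and return the match.
            break;
        case UriComponents.Port:
            // TODO: apply your port Regex and return the match.
            break;
        // ... remaining UriComponents cases ...
    }
    // Until a case is implemented, fall back to the generic parsing.
    return base.GetComponents(uri, components, format);
}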
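As a rough illustration of the kind of Regex meant here, the sketch below splits a SIP-style URI string into named parts. The class name SipUriParts, the method Get, and the pattern itself are hypothetical and are not taken from the source included with this article. The pattern is deliberately simplified: it assumes the layout scheme:user@host:port;params?headers and ignores IPv6 literals and the escaping rules of RFC 3261.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not the article's SipStyleUriParser.
public static class SipUriParts
{
    // Simplified layout: scheme ":" [user "@"] host [":" port] [";" params] ["?" headers].
    // Assumption: no IPv6 literals and no escaped delimiters.
    private static readonly Regex Pattern = new Regex(
        @"^(?<scheme>sips?):(?:(?<user>[^@;?]+)@)?(?<host>[^:;?]+)" +
        @"(?::(?<port>\d+))?(?<params>(?:;[^?]*)?)(?<headers>(?:\?.*)?)$",
        RegexOptions.IgnoreCase);

    // Returns the named part ("scheme", "user", "host", "port", "params",
    // "headers") or an empty string when that part is absent.
    public static string Get(string uriText, string part)
    {
        Match match = Pattern.Match(uriText);
        if (!match.Success)
            throw new FormatException("Not a SIP-style URI: " + uriText);
        return match.Groups[part].Value;
    }
}
```

Inside GetComponents, each case could then return something like SipUriParts.Get(uri.OriginalString, "user"). The params and headers groups keep their leading delimiter, which is convenient for the KeepDelimiter variants.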
All you need to do is apply the correct Regex in each case and return the value. Microsoft leaves out a few possibilities, though. Two of them are UriComponents.Path | UriComponents.KeepDelimiter and UriComponents.Query | UriComponents.KeepDelimiter. (You can get rid of the case for UriComponents.KeepDelimiter on its own; it is just an option switch and shouldn't return anything.) They return the path or query, respectively, with the leading delimiter intact (surprise). SIP URIs have no queries or paths, so I made the Path component return the SIP parameters and the Query component return the headers, because the syntax for SIP headers is identical to that of HTTP queries. Your URI scheme may need similar adjustments. If you have any doubts, instantiate a new Uri with a Google query URL, run your program in debug mode, and step through the code to see which components are requested when you access each property of Uri. Knowing which flags make up each components case will help you reuse parsing code through further GetComponents calls. It also gives you a good idea of what you should return in each case.
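To see which flags make up a combined components value without stepping through a debugger, a small helper like the one below can break the value apart. This is only a sketch: ComponentFlags and Split are made-up names, and the list of single flags is limited to the ones discussed in this article.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; the names are not from the .NET Framework.
public static class ComponentFlags
{
    // The individual flags this sketch knows about.
    private static readonly UriComponents[] SingleFlags =
    {
        UriComponents.Scheme, UriComponents.UserInfo, UriComponents.Host,
        UriComponents.Port, UriComponents.Path, UriComponents.Query,
        UriComponents.Fragment, UriComponents.StrongPort, UriComponents.KeepDelimiter
    };

    // Lists the single flags that are set in a combined value, in the order above.
    public static List<UriComponents> Split(UriComponents components)
    {
        var parts = new List<UriComponents>();
        foreach (UriComponents flag in SingleFlags)
        {
            if ((components & flag) == flag)
                parts.Add(flag);
        }
        return parts;
    }
}
```

For instance, Split(UriComponents.PathAndQuery) yields Path and Query, which suggests that the PathAndQuery case can be built from the results of the Path and Query cases.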
Registering your UriParser
As I mentioned earlier, you need to register your UriParser before you can start instantiating Uris that require it. Registration associates the scheme string (e.g., "sip", "sips", "http") with your parser and a default port. Keep in mind that the scheme string must be present and longer than one character, and the port must be either -1 (no default port) or an integer from 0 to 65534. Here is some code showing the right way to do it, along with some ways that will fail:
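// Works: each call registers a fresh parser instance for a new scheme.
UriParser.Register(new SipStyleUriParser(), "sip", 5060);
UriParser.Register(new PresStyleUriParser(), "pres", -1);

// Fails: "http" already has a built-in parser.
UriParser.Register(new CustomHttpStyleUriParser(), "http", 80);

// Fails: "sip" was registered above, and one parser instance cannot be registered twice.
SipStyleUriParser s = new SipStyleUriParser();
UriParser.Register(s, "sip", 5060);
UriParser.Register(s, "sips", 5061);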
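One way to avoid the failing cases is to check whether a scheme is already known before registering, and to create a new parser instance for every registration. The sketch below assumes a plain GenericUriParser as a stand-in for your own parser class. SchemeRegistration and TryRegister are hypothetical names; UriParser.IsKnownScheme and UriParser.Register are the framework calls.

```csharp
using System;

// Hypothetical wrapper for illustration; swap GenericUriParser for your own parser.
public static class SchemeRegistration
{
    // Registers a fresh parser for the scheme unless the scheme is already known.
    // Returns true if a registration took place.
    public static bool TryRegister(string scheme, int defaultPort)
    {
        if (UriParser.IsKnownScheme(scheme))
            return false; // built in, or registered earlier

        // A new instance per call, because a parser can only be registered once.
        UriParser parser = new GenericUriParser(GenericUriParserOptions.Default);
        UriParser.Register(parser, scheme, defaultPort);
        return true;
    }
}
```

With this wrapper, TryRegister("http", 80) returns false instead of throwing, and calling TryRegister twice for the same new scheme only registers it the first time.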
Examples
I have included the source for my SipStyleUriParser with this article. It is fully RFC 3261 compliant, and even follows the rules for URI comparison. I have also included an easy way to parse headers and parameters into a Dictionary, so that they can be checked against each other regardless of order and so that values can be retrieved by parameter name. It passes all the test cases given in the specification. You are welcome to use it in your own applications, and please let me know if you have any suggestions.
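The Dictionary idea can be sketched roughly as follows. This is not the code from the download: SipParameters, Parse, and SameEntries are hypothetical names, and the comparison shown is a plain order-independent equality check, which is simpler than the full URI comparison rules of RFC 3261.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; simpler than the article's actual source.
public static class SipParameters
{
    // Splits text such as ";transport=tcp;lr" (separator ';') or
    // "subject=hi&priority=urgent" (separator '&') into name/value pairs.
    // Names are matched case-insensitively; a name without '=' gets an empty value.
    public static Dictionary<string, string> Parse(string text, char separator)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string[] pieces = text.Split(new[] { separator },
            StringSplitOptions.RemoveEmptyEntries);
        foreach (string piece in pieces)
        {
            int eq = piece.IndexOf('=');
            string name = eq < 0 ? piece : piece.Substring(0, eq);
            string value = eq < 0 ? string.Empty : piece.Substring(eq + 1);
            result[name] = value;
        }
        return result;
    }

    // True when both dictionaries hold the same names and values, in any order.
    // Assumption: values are compared case-insensitively.
    public static bool SameEntries(Dictionary<string, string> a,
        Dictionary<string, string> b)
    {
        if (a.Count != b.Count)
            return false;
        foreach (KeyValuePair<string, string> pair in a)
        {
            string other;
            if (!b.TryGetValue(pair.Key, out other))
                return false;
            if (!string.Equals(pair.Value, other, StringComparison.OrdinalIgnoreCase))
                return false;
        }
        return true;
    }
}
```

Looking up a value is then just a dictionary access, for example Parse(";transport=tcp;lr", ';')["transport"].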
Conclusion
Despite MSDN's lack of documentation on the subject, writing your own UriParser is not very difficult. As long as you have a complete specification to work with, the implementation becomes fairly straightforward. Using this in combination with extensions of WebRequest and WebResponse will enable you to write a complete network stack! If you have any questions, comments, or suggestions, feel free to email me at augsod@gmail.com.