Introduction
I am currently writing an open source SIP stack and TAPI interface based on RFC 3261. Although the .NET framework's network classes are surprisingly complete when it comes to familiar formats like HTTP or even Gopher, formats like SIP and MailTo are not supported, forcing me to implement them on my own. In good object-oriented style, Microsoft allows you to modify existing formats, or even create your own, so adding support for a given scheme should be trivial, right? Well, Microsoft hasn't gotten around to documenting much of the UriParser classes. After hours of experimentation and some help from Jason Kemp, I have figured out how to extend and register a UriParser, allowing your program to understand URIs from any scheme. This article shows you how to write your own UriParser, in an attempt to fill the documentation void.
Extending UriParser
UriParser is an abstract class that provides some methods for parsing a URI. It also includes some callbacks: for example, OnNewUri is invoked when a Uri is constructed for a scheme the parser is registered for, and OnRegister is invoked when the parser itself is registered. If the URI that you need to parse closely resembles a scheme that is already supported, it may benefit you to extend that scheme's UriParser. For most purposes, extending GenericUriParser is the best choice, because its constructor takes a GenericUriParserOptions value that lets you choose how certain things are parsed. Here is a skeleton class showing the most important methods you may need to override:
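using System;

public class SipStyleUriParser : GenericUriParser
{
    public SipStyleUriParser(GenericUriParserOptions options)
        : base(options) { }

    // Called while a Uri is constructed; report problems through parsingError.
    protected override void InitializeAndValidate(Uri uri,
        out UriFormatException parsingError)
    {
        base.InitializeAndValidate(uri, out parsingError);
    }

    // Called when this parser is registered for a scheme.
    protected override void OnRegister(string schemeName,
        int defaultPort)
    {
        base.OnRegister(schemeName, defaultPort);
    }

    // Hides Object.Equals(Object, Object); put scheme-specific comparison here.
    public static new bool Equals(Object objA, Object objB)
    {
        return Object.Equals(objA, objB);
    }

    protected override bool IsWellFormedOriginalString(Uri uri)
    {
        return base.IsWellFormedOriginalString(uri);
    }

    // Called by the Uri constructor to obtain the parser instance to use.
    protected override UriParser OnNewUri()
    {
        return base.OnNewUri();
    }

    // Returns the requested parts of the URI; most of the parsing lives here.
    protected override string GetComponents(Uri uri,
        UriComponents components, UriFormat format)
    {
        return base.GetComponents(uri, components, format);
    }
}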
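To make the role of the constructor options concrete, here is a minimal sketch of a GenericUriParser subclass being constructed with an option and then used. The class name DemoUriParser, the scheme name "demo", the port numbers, and the choice of GenericUriParserOptions.NoFragment are all assumptions made for illustration; they are not taken from my SIP parser.

```csharp
using System;

// Hypothetical parser for illustration only; it overrides nothing and
// relies entirely on the behaviour selected through the options.
public class DemoUriParser : GenericUriParser
{
    public DemoUriParser(GenericUriParserOptions options)
        : base(options) { }
}

public static class OptionsDemo
{
    public static void Main()
    {
        // Assumption: the made-up "demo" scheme has no fragment part.
        var parser = new DemoUriParser(GenericUriParserOptions.NoFragment);

        // A parser must be registered before Uris of its scheme are created.
        UriParser.Register(parser, "demo", 5060);

        var withPort = new Uri("demo://alice@example.com:5070/resource?x=1");
        var withoutPort = new Uri("demo://example.com/resource");

        Console.WriteLine(withPort.Scheme);
        Console.WriteLine(withPort.UserInfo);
        Console.WriteLine(withPort.Host);
        Console.WriteLine(withPort.Port);
        // No port in the string, so the registered default should apply.
        Console.WriteLine(withoutPort.Port);
    }
}
```

Several options can be combined with the | operator, since GenericUriParserOptions is a flags enumeration.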
Parsing a URI with GetComponents
The first thing that you might want to do is set up some Regex objects designed to parse out the different parts of your URI. If you use Visual Studio's code snippets to set up a switch statement on the components parameter, you will be given a case for every member of UriComponents.
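protected override string GetComponents(Uri uri,
    UriComponents components, UriFormat format)
{
    switch (components)
    {
        case UriComponents.UserInfo:
            // Apply the user-info Regex here and return the match.
            return base.GetComponents(uri, components, format);
        case UriComponents.Port:
            // Apply the port Regex here and return the match.
            return base.GetComponents(uri, components, format);
        // ...one case per remaining UriComponents value...
        default:
            return base.GetComponents(uri, components, format);
    }
}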
All you need to do is apply the correct Regex in each case and return the value. The generated switch leaves out a few possibilities, though, because UriComponents is a flags enumeration and callers can combine its members. The two combinations that matter first are UriComponents.Path | UriComponents.KeepDelimiter and UriComponents.Query | UriComponents.KeepDelimiter, which return the path or query, respectively, with the leading delimiter intact. (You can remove the case for UriComponents.KeepDelimiter on its own; it is only an option flag and shouldn't return anything.) In SIP, you don't have queries or paths, so I made the Path component return the SIP parameters and the Query component return the headers, because the syntax for SIP headers is identical to that of HTTP queries. Your URI scheme may need similar adjustments.
If you have any doubts, instantiate a new Uri with a Google query, run your program in the debugger, and step through the code to see which components are requested when you access each property of Uri. Knowing which flags make up each components value helps you reuse parsing code through nested GetComponents calls, and it gives you a good idea of what you should return in each case.
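As an alternative to stepping through with the debugger, the same information can be gathered by logging each request. The following is a rough sketch, not code from my SIP parser: LoggingUriParser and the scheme name "demolog" are hypothetical, and which requests actually show up may depend on the runtime version you are using.

```csharp
using System;

// Hypothetical parser that reports which component flags the Uri class
// asks for, then delegates the real work to GenericUriParser.
public class LoggingUriParser : GenericUriParser
{
    public LoggingUriParser()
        : base(GenericUriParserOptions.Default) { }

    protected override string GetComponents(Uri uri,
        UriComponents components, UriFormat format)
    {
        string value = base.GetComponents(uri, components, format);
        Console.WriteLine("  requested: {0} ({1}) -> \"{2}\"",
            components, format, value);
        return value;
    }
}

public static class ComponentLoggingDemo
{
    public static void Main()
    {
        // "demolog" is a made-up scheme; -1 means no default port.
        UriParser.Register(new LoggingUriParser(), "demolog", -1);

        var uri = new Uri("demolog://user@example.com:8080/search?q=uriparser");

        // Touch a few properties and watch which flags are requested.
        Console.WriteLine("Query = " + uri.Query);
        Console.WriteLine("AbsolutePath = " + uri.AbsolutePath);
        Console.WriteLine("UserInfo = " + uri.UserInfo);

        // The combined flags discussed above can also be requested directly.
        Console.WriteLine(uri.GetComponents(
            UriComponents.Path | UriComponents.KeepDelimiter,
            UriFormat.UriEscaped));
    }
}
```

Because UriComponents is a flags enumeration, the logged value prints as a comma-separated list of member names, which makes it easy to see which case labels your switch needs.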
Registering your UriParser
I mentioned earlier that you need to register your UriParser before you can start instantiating Uris that require it. Registration associates the scheme string (e.g., "sip", "sips", "http") with a parser and a default port. Keep in mind that the scheme string must be present and at least two characters long, and the port must be either -1 (no default port) or, as far as I can tell, an integer from 0 through 65534. Here is some code to show you the right way to do it, and some ways that will fail:
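// Assumes each parser class has a parameterless constructor that picks its own options.
// Works: a fresh parser instance for each new scheme.
UriParser.Register(new SipStyleUriParser(), "sip", 5060);
UriParser.Register(new PresStyleUriParser(), "pres", -1);
// Fails: "http" already has a registered parser.
UriParser.Register(new CustomHttpStyleUriParser(), "http", 80);
// Fails on the second call: one parser instance cannot be registered twice.
// (The first call also fails if "sip" was already registered above.)
SipStyleUriParser s = new SipStyleUriParser();
UriParser.Register(s, "sip", 5060);
UriParser.Register(s, "sips", 5061);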
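If you want to see these failures for yourself without crashing your program, a sketch along the following lines may help. SketchUriParser and the "sketch" scheme names are hypothetical, and the helper catches every exception type and prints its name rather than assuming a particular one.

```csharp
using System;

// Hypothetical parser class used only for this sketch.
public class SketchUriParser : GenericUriParser
{
    public SketchUriParser()
        : base(GenericUriParserOptions.Default) { }
}

public static class RegistrationDemo
{
    // Attempts a registration and reports the outcome instead of throwing.
    static void TryRegister(UriParser parser, string scheme, int port)
    {
        try
        {
            UriParser.Register(parser, scheme, port);
            Console.WriteLine(scheme + ": registered");
        }
        catch (Exception ex)
        {
            Console.WriteLine(scheme + ": failed with " + ex.GetType().Name);
        }
    }

    public static void Main()
    {
        var shared = new SketchUriParser();

        TryRegister(shared, "sketch", 5060);                   // fresh instance, new scheme
        TryRegister(shared, "sketchs", 5061);                  // same instance reused
        TryRegister(new SketchUriParser(), "http", 80);        // scheme already registered
        TryRegister(new SketchUriParser(), "x", 80);           // one-character scheme
        TryRegister(new SketchUriParser(), "sketch2", 70000);  // port out of range

        // IsKnownScheme reports whether a parser exists for the scheme.
        Console.WriteLine(UriParser.IsKnownScheme("sketch"));
    }
}
```

Registration applies to the whole process, so it is usually simplest to register all of your parsers once at startup.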
Examples
I have included the source for my SipStyleUriParser with this article. It is fully RFC 3261 compliant, and even follows the rules for URI comparison. I have also included an easy way to parse headers and parameters into a Dictionary so that they may easily be checked against each other regardless of order, and so that the values can be retrieved by the parameter name. It successfully completes all the test cases given by the specifications. You are welcome to use it in your own applications, and please let me know if you have any suggestions.
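To give a feel for the Dictionary approach without opening the download, here is a simplified sketch. It is not the code from the attached source: NameValueSketch, Parse, and SameEntries are hypothetical names, and the comparison shown is a plain order-insensitive match rather than the full RFC 3261 comparison rules.

```csharp
using System;
using System.Collections.Generic;

public static class NameValueSketch
{
    // Splits text such as ";transport=tcp;lr" (separator ';') or
    // "subject=hi&priority=urgent" (separator '&') into name/value pairs.
    public static Dictionary<string, string> Parse(string raw, char separator)
    {
        // Names are matched case-insensitively.
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        if (string.IsNullOrEmpty(raw)) return result;

        string[] pairs = raw.Split(new[] { separator },
            StringSplitOptions.RemoveEmptyEntries);
        foreach (string pair in pairs)
        {
            int eq = pair.IndexOf('=');
            string name = eq < 0 ? pair : pair.Substring(0, eq);
            string value = eq < 0 ? string.Empty : pair.Substring(eq + 1);
            result[Uri.UnescapeDataString(name)] = Uri.UnescapeDataString(value);
        }
        return result;
    }

    // True when both dictionaries hold the same names with the same values,
    // ignoring order and letter case. A simplification of the real rules.
    public static bool SameEntries(Dictionary<string, string> a,
        Dictionary<string, string> b)
    {
        if (a.Count != b.Count) return false;
        foreach (KeyValuePair<string, string> entry in a)
        {
            string other;
            if (!b.TryGetValue(entry.Key, out other)) return false;
            if (!string.Equals(entry.Value, other,
                StringComparison.OrdinalIgnoreCase)) return false;
        }
        return true;
    }

    public static void Main()
    {
        var first = Parse(";transport=tcp;lr", ';');
        var second = Parse(";lr;Transport=TCP", ';');

        Console.WriteLine(first["transport"]);          // prints "tcp"
        Console.WriteLine(SameEntries(first, second));  // prints "True"
    }
}
```

Headers can go through the same method by stripping the leading "?" and splitting on "&" instead of ";".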
Conclusion
Despite MSDN's lack of documentation on the subject, writing your own UriParser is not very difficult. As long as you have a complete specification to work with, the implementation becomes fairly straightforward. Using this in combination with extensions of WebRequest and WebResponse will enable you to write a complete network stack! If you have any questions, comments, or suggestions, feel free to email me at augsod@gmail.com.