There are various tools available to monitor the HTTP traffic sent and received by different processes; Fiddler is a good example. All these programs open a port and filter HTTP traffic based on process id. But if a C# app hosts multiple browser controls, they cannot identify which request was sent by which browser.
The C# browser control only provides Navigating and Navigated events and does not give any idea about the requests that it sends (e.g. loading of images, etc.).
This article provides an ATL COM DLL that can monitor HTTP traffic from individual browsers.
While working on a project which required the same, I stumbled upon PassThruApp by Igor Tandetnik.
The csExWBDLMan COM library (from "The most complete C# Webbrowser wrapper control") is one implementation of PassThru App that provides the requests, but it does not provide detailed information about redirections, data received, cookies, etc. So I decided to write custom code just for monitoring HTTP traffic, based on both PassThru App and csExWBDLMan.dll.
About PassThru App
"It is an object that implements both sides of URL moniker-to-APP communication, that is, it implements both IInternetProtocol and a sink (IInternetProtocolSink, IInternetBindInfo). We register it as a temporary handler for a standard protocol, such as HTTP. Now whenever an HTTP request needs to be sent, URL moniker will create an instance of our pAPP and ask it to do the job. The pAPP then creates an instance of a standard APP for the protocol in question (I call it a target APP, or tAPP...) and acts as its client. At this point, our pAPP becomes a proverbial man-in-the-middle. In the simplest case, any method call made by URL Moniker on pAPP is forwarded to tAPP, and any method call made by tAPP on pAPP is forwarded back to URL Moniker. The pAPP gets to observe, and if desired modify, every bit of information relevant to this request passing back and forth between the moniker and the tAPP. QED" - Igor Tandetnik
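The forwarding idea in the quote can be illustrated without any COM plumbing. The following is only a sketch in portable C++: `Protocol`, `Sink`, `TargetHandler`, `PassthroughHandler` and `CollectingSink` are hypothetical stand-ins for the real interfaces and are not part of PassThru App. It shows an object that sits between a client and a target handler, forwards calls in both directions, and records what passes through.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the protocol side (what the moniker calls).
struct Protocol {
    virtual ~Protocol() {}
    virtual void Start(const std::string& url) = 0;
};

// Hypothetical stand-in for the sink side (what the handler calls back).
struct Sink {
    virtual ~Sink() {}
    virtual void ReportData(const std::string& chunk) = 0;
};

// Plays the role of the target handler (tAPP): does the real work.
class TargetHandler : public Protocol {
public:
    explicit TargetHandler(Sink* sink) : sink_(sink) { assert(sink_ != nullptr); }
    void Start(const std::string& url) override {
        sink_->ReportData("body of " + url);
    }
private:
    Sink* sink_;
};

// Plays the role of the pass-through handler (pAPP): implements both sides,
// forwards every call, and records what it sees on the way.
class PassthroughHandler : public Protocol, public Sink {
public:
    explicit PassthroughHandler(Sink* client) : client_(client), target_(this) {}
    void Start(const std::string& url) override {
        seen.push_back("request: " + url);
        target_.Start(url);            // forward to the target
    }
    void ReportData(const std::string& chunk) override {
        seen.push_back("data: " + chunk);
        client_->ReportData(chunk);    // forward back to the client
    }
    std::vector<std::string> seen;     // everything observed in transit
private:
    Sink* client_;
    TargetHandler target_;
};

// Plays the role of the URL moniker's sink: collects what it receives.
struct CollectingSink : Sink {
    std::string received;
    void ReportData(const std::string& chunk) override { received += chunk; }
};
```

Calling `Start` on a `PassthroughHandler` built around a `CollectingSink` delivers the data to the client unchanged, while `seen` holds one entry for the request and one for the data.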
The code extends classes provided by PassThru App. There are two main classes:
PassthroughAPP::CInternetProtocolSinkWithSP that implements
PassthroughAPP::CInternetProtocol that implements
class MonitorSink :
class CTestAPP :
Now we can intercept requests using:
- Request -
- Response -
- Redirection, Cookies Transferred, Error, Cache Loaded, etc. - BINDSTATUS_REDIRECTING etc., depending on the information required.
- Data received -
The problem is that we are using an Asynchronous Pluggable Protocol, so all requests are handled asynchronously. We get all the information, but cannot tell which response belongs to which request. Moreover, the data is received asynchronously in chunks.
The best solution is to obtain a unique id for each transaction (i.e. one id attached to the request, the response and the data received); with it, we can weave the async calls back together. Here we get lucky.
- IInternetBindInfo for a request is unique most of the time and is available in all the methods. But sometimes it is reused by the browser.
- IHTMLWindow2 exists, and is unique, when the request is sent from an iframe.
- The URL to which the request is made is also likely to be unique.
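A minimal sketch of how these three values could be folded into a single id. It is plain C++ rather than COM code: `MakeTransactionId` and `Fnv1a` are hypothetical names, the two interface pointers are treated as opaque addresses, and the FNV-1a hash is chosen here only for illustration. It is not necessarily how HttpMonitor.dll computes its ids.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// One FNV-1a pass over a run of bytes; short and deterministic.
inline std::uint32_t Fnv1a(std::uint32_t h, const void* data, std::size_t len) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 16777619u;              // FNV prime
    }
    return h;
}

// Hypothetical helper: folds the bind-info address, the window address
// (null when the request does not come from an iframe) and the URL into one id.
inline std::uint32_t MakeTransactionId(const void* bindInfo,
                                       const void* htmlWindow,
                                       const std::string& url) {
    assert(bindInfo != nullptr);     // assumed to be present for every request
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(bindInfo);
    const std::uintptr_t b = reinterpret_cast<std::uintptr_t>(htmlWindow);
    std::uint32_t h = 2166136261u;   // FNV offset basis
    h = Fnv1a(h, &a, sizeof a);
    h = Fnv1a(h, &b, sizeof b);
    h = Fnv1a(h, url.data(), url.size());
    return h;
}
```

Because the URL is part of the hash, a bind-info object that the browser reuses for a different URL still yields a different id in this sketch.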
So if we create an id from all of them, we will get a unique id for a transaction. We need not stop there. If there are multiple iframes on the page, we can traverse the page and tell which iframe sent which request based on the referrer. In addition, each iframe fires NavigationComplete events on navigating. Objects of the iframes are available from the InternetExplorer interface. If you link all the available information, you can draw a complete picture of:
- What is the hierarchy of the iframes on the page
- Which iframe navigated to which URLs
- What were the requests sent by a particular iframe
- What URLs failed to load
- Which URLs used files from the local computer (with their actual file location)
- What headers, cookies were sent and received
- How much time did each request take, etc.
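Once every callback carries the same transaction id, putting the pieces back together is mostly bookkeeping. The sketch below assumes a hypothetical `TransactionLog` class (not part of HttpMonitor.dll) that receives request, response and data notifications in any interleaving and groups them by id, appending data chunks in arrival order.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical record of one HTTP transaction, filled in as callbacks arrive.
struct Transaction {
    std::string url;
    int responseCode = 0;
    std::string body;
    bool complete = false;
};

// Hypothetical log keyed by transaction id.
class TransactionLog {
public:
    void OnRequest(int id, const std::string& url) { byId_[id].url = url; }
    void OnResponse(int id, int code) { byId_[id].responseCode = code; }
    void OnData(int id, const std::string& chunk, bool isComplete) {
        Transaction& t = byId_[id];
        assert(!t.complete);         // data after completion would mean the id was reused
        t.body += chunk;             // chunks of one transaction arrive in order
        t.complete = isComplete;
    }
    // Throws std::out_of_range when the id was never seen.
    const Transaction& Get(int id) const { return byId_.at(id); }
private:
    std::map<int, Transaction> byId_;
};
```

Feeding it interleaved chunks for two ids leaves each `Transaction` with its own URL, response code and complete body.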
- If you set Silent=true in the native COM control or ScriptErrorsSuppressed=true in the .NET WebBrowser, the above code stops working.
- The assumption behind filtering requests based on iframes is that every iframe on a page has a different location.
- One more assumption is that a Flash object from a particular URL is loaded in only one webbrowser control (if many controls are present) and in only one page. Otherwise the requests sent from the Flash control will get mixed up. The reason is that requests sent by Flash show the Flash object as referer instead of the page URL, so we cannot determine which Flash object (from which control) actually sent the request.
Using the Code
When you attach your browser with HttpMonitor.dll, on each request, response, etc. an event is fired with all the required arguments. There are twelve plus one events available:
OnRequest(int id, int containerId, string url,
    string headers, string method, object postData)
OnRedirect(int id, int containerId, int redirectedId, string url,
    string redirectedUrl, string responseHeaders, string requestHeaders)
OnResponse(int id, int containerId, string url, int responseCode, string headers)
OnDataRecieved(int id, int containerId, string url, object data, bool isComplete)
OnCookieSent(int id, int containerId, string url, string cookies)
OnCookieRecieved(int id, int containerId, string url, string cookies)
OnMimeTypeAvailable(int id, int containerId, string url, string type)
OnCacheLoaded(int id, int containerId, string url, string location)
OnP3PHeaderRecieved(int id, int containerId, string url, string p3PHeader)
OnError(int id, int containerId, string url, int result, int errorCode)
OnProgress(int id, int containerId, string url,
    int grfBSCF, uint progress, uint progressMax)
GetIServiceProviderOnStart(int id, int containerId, string url, int ptr)
One more event is available:
ConfirmRequest(int id, int containerId, string url,
    int totalInstances, ref bool itsMine)
If a request is for a Flash object or is sent from a Flash object, the DLL cannot determine which browser sent it. It fires the above event and asks you to look into your request logs and decide, based on the container id, whether the request belongs to your browser. A sample implementation is provided in the demo app. If you set itsMine=true by default, you can trace all the requests made by the current process.
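As a rough illustration of that decision (this is not the demo app's code), the sketch below keeps a set of the container ids a browser has already logged and claims a request only when its container id is in that set. `BrowserRequestLog` and `RecordContainer` are hypothetical names, and the `url` and `totalInstances` arguments of the real event are left out for brevity.

```cpp
#include <cassert>
#include <set>

// Hypothetical per-browser log of the container ids it has already seen.
class BrowserRequestLog {
public:
    // Called for every ordinary request this browser is known to have sent.
    void RecordContainer(int containerId) { containers_.insert(containerId); }

    // Mirrors the by-reference flag of ConfirmRequest: *itsMine becomes true
    // only when this browser has logged traffic for the same container id.
    void ConfirmRequest(int containerId, bool* itsMine) const {
        assert(itsMine != nullptr);
        *itsMine = containers_.count(containerId) != 0;
    }
private:
    std::set<int> containers_;
};
```

Replacing the lookup with an unconditional `*itsMine = true` corresponds to the "trace everything in the current process" behaviour described above.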
One common mistake is to think that CHttpMon is used for only one transaction at a time. If CHttpMon contains a private variable, it is shared across different requests, so we cannot store data for any single request in it. Also, the referer for requests made by a Flash object is the location of the Flash object instead of the page, so we need to keep track of these Flash objects. All these objects behave the way Microsoft intended and not the way we would like them to. Most of this is undocumented, and we need to run A/B tests to find out how they actually work.
IHTMLWindow2 is available for requests made by iframes. This can also be exploited further (i.e. for new events like OnPageRequest). I was hoping that I would receive it in all requests and create an id from IHtmlWindow2 and the referer that would be totally unique, but it looks like that is not possible :(
You will first need to navigate to about:blank so that the "Internet Explorer_Server" window is available. Then attach the browser using the handle of the "Internet Explorer_Server" window.
if (monitor == null)
    monitor = new HttpMonitorLib.HttpMonClass();
monitor.IEWindow = GetTopWindow(GetTopWindow
monitor.OnRequest += new HttpMonitorLib._IHttpMonEvents_OnRequestEventHandler(monitor_OnRequest);
For example, the following function will be executed whenever the browser sends a request.
private void monitor_OnRequest(int id, int containerId, string url,
string headers, string method, object postData)
The id is the unique id associated with that particular HTTP transaction, and containerId is the uniqueId of the iframe sending the request. Basically, it is a hash of the iframe's current location.
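The exact hash used by the DLL is not shown here, so the following is only a sketch of the idea: a location string is reduced to a 32-bit integer that fits the `int containerId` parameter. `ContainerIdFromLocation` is a hypothetical name and FNV-1a is an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: 32-bit FNV-1a hash of the frame's location,
// returned as a signed value to match the int containerId parameter.
inline std::int32_t ContainerIdFromLocation(const std::string& location) {
    assert(!location.empty());
    std::uint32_t h = 2166136261u;   // FNV offset basis
    for (unsigned char c : location) {
        h ^= c;
        h *= 16777619u;              // FNV prime
    }
    return static_cast<std::int32_t>(h);
}
```

Any scheme of this kind maps two iframes that show the same location to the same container id, which is why the filtering relies on every iframe on a page having a different location.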
- Not all request headers were reported in the OnRequest event in the last version, as all the headers are only available after
- Modified the demo app implementation so that a request of an iframe belongs to its page instance instead of its parent's page instance.
- Code refactoring/optimization in both demo app and httpmonitor DLL.
The demo app provides an implementation of HttpMonitor that detects complete page navigations. The main class, Page, accepts the "Internet Explorer_Server" window as a parameter. It can be considered a wrapper of the InternetExplorer interface. The class has the following properties:
Children - List of children pages
Entries - List of requests/response sent by this page/iframe
Navigations - Navigations done by this page or iframe
AllIEWithNavigations - All instances of InternetExplorer interfaces with their navigations
Webrowser - Actual instance of InternetExplorer interface of current page
AllEntries - All requests/response sent by current control
AllPages - Instances of all Page objects
The reason I made containerId an integer instead of a string (as containerId is a hash of the referral URL) is that I thought I would somehow be able to get the IHTMLWindow2 pointer and pass it as containerId. So far, I have been unsuccessful. If anybody is able to figure out how to do this, please do tell me.
There are crude ways to achieve this, e.g., in the BeforeNavigate2 event, set Cancel=true and renavigate to url + "IHTMLWindow2=<pointer to IHTMLWindow2>" in the query string, or navigate with "IHTMLWindow2:<pointer to IHTMLWindow2>" in the headers, and then parse the requests to get the value. But firstly, this breaks regular navigation, and secondly, requests from Flash objects again do not carry these values.
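To make the query-string variant concrete, here is a small sketch of the tagging and parsing steps only. `TagUrl` and `ExtractTag` are hypothetical helpers, the pointer is treated as a plain integer, and URL fragments are ignored. It does not address the drawbacks listed above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

static const char kTag[] = "IHTMLWindow2=";

// Appends the pointer value as a query parameter (hypothetical helper).
inline std::string TagUrl(const std::string& url, std::uintptr_t windowPtr) {
    assert(url.find(kTag) == std::string::npos);   // do not tag a URL twice
    const char sep = (url.find('?') == std::string::npos) ? '?' : '&';
    return url + sep + kTag + std::to_string(windowPtr);
}

// Recovers the pointer value from a tagged URL; returns 0 when no tag is present.
inline std::uintptr_t ExtractTag(const std::string& url) {
    const std::string::size_type pos = url.find(kTag);
    if (pos == std::string::npos) return 0;
    // sizeof(kTag) - 1 is the tag length without the terminating null.
    return static_cast<std::uintptr_t>(
        std::strtoull(url.c_str() + pos + sizeof(kTag) - 1, nullptr, 10));
}
```

A monitor that sees the tagged URL in a request could call `ExtractTag` to learn which window issued it, and would then need to strip the parameter again before the request is reported.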
I want to thank Igor Tandetnik for his wonderful PassThruApp that made all this possible.