Introduction

I first encountered the data: protocol when I saw the JavaScript Draw site, an AJAX implementation of a scribble application. The problem with the site was that it did not work with Internet Explorer. Going through the source code, I found that one of the reasons was that it used data: URLs as the source for dynamically created images, which Internet Explorer does not support.

The data: protocol is described in RFC 2397. Currently, the only browsers that support the data: protocol are Opera and Mozilla Firefox. This article describes an asynchronous pluggable protocol implementation that adds support for the data: protocol to Internet Explorer.

One possible use of the data: protocol is to embed small images in the HTML itself to avoid server hits. It can also be useful in AJAX applications like JavaScript Draw to return images, encoded in base64, as the response text.

The data: URL format

The protocol itself, as described in the RFC, is quite simple. The format is:

dataurl := "data:" [ mediatype ] [ ";base64" ] "," data
mediatype := [ type "/" subtype ] *( ";" parameter )
data := *urlchar
parameter := attribute "=" value

The media type indicates the type of the data and its encoding. The default media type is text/plain;charset=US-ASCII.

The next sections describe how the protocol was implemented.

Parsing the URL

ATL's regular expression classes come in handy for parsing data: URLs. The following regular expression parses the URL:

data:{(.*?/.*?)}?(;{.*?}={.*?})?{;(base64)?}?,{.*}

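The regular expression captures the various portions of the URL into five different groups, one for each pair of braces in the expression:

1. The MIME type (type/subtype)
2. The parameter attribute (for example, charset)
3. The parameter value
4. The ;base64 marker, if present
5. The data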
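
For readers without ATL at hand, the same five-way split can be sketched in portable C++ with std::regex. This is only an illustration, not the code used by the handler: the DataUrlParts struct, the ParseDataUrl function, and the ECMAScript pattern are assumptions made for this example, and, like the ATL expression above, the pattern handles at most one parameter.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>

// Hypothetical holder for the five captured portions of a data: URL.
struct DataUrlParts {
    std::string mimeType;    // group 1, e.g. "text/plain"
    std::string paramName;   // group 2, e.g. "charset"
    std::string paramValue;  // group 3, e.g. "iso-8859-8-i"
    bool isBase64 = false;   // group 4, true when ";base64" is present
    std::string data;        // group 5, everything after the comma
};

// Splits a data: URL into its parts; returns std::nullopt if it does not match.
std::optional<DataUrlParts> ParseDataUrl(const std::string& url) {
    static const std::regex pattern(
        R"(data:([^/;,]+/[^;,]+)?(?:;([^=;,]+)=([^;,]+))?(;base64)?,(.*))",
        std::regex::icase);

    std::smatch m;
    if (!std::regex_match(url, m, pattern)) {
        return std::nullopt;
    }

    DataUrlParts parts;
    parts.mimeType = m[1].str();
    parts.paramName = m[2].str();
    parts.paramValue = m[3].str();
    parts.isBase64 = m[4].matched;
    parts.data = m[5].str();

    // RFC 2397: an omitted media type means text/plain;charset=US-ASCII.
    if (parts.mimeType.empty()) {
        parts.mimeType = "text/plain";
        if (parts.paramName.empty()) {
            parts.paramName = "charset";
            parts.paramValue = "US-ASCII";
        }
    }
    return parts;
}

// Small self-check using the example URL from this article.
void CheckParseDataUrl() {
    auto parts = ParseDataUrl("data:text/plain;charset=iso-8859-8-i,%f9%ec%e5%ed");
    assert(parts.has_value());
    assert(parts->mimeType == "text/plain");
    assert(parts->paramName == "charset");
    assert(parts->paramValue == "iso-8859-8-i");
    assert(!parts->isBase64);
    assert(parts->data == "%f9%ec%e5%ed");
}
```

The sketch uses negated character classes instead of the lazy .*? matches so that the media type cannot swallow the ;base64 marker or the comma.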

After capturing the different portions of the URL, the Base64 encoded data is converted into bytes. The ATL function Base64Decode does the conversion:

    int nReqLen = Base64DecodeGetRequiredLength(strData.GetLength());
    m_pvData = new BYTE[nReqLen];
    int nDestLen = nReqLen;
    bRet = Base64Decode(strData, strData.GetLength(), m_pvData, &nDestLen) != 0;
    m_dwDataLength = nDestLen;

Converting the Text Data to Unicode

If the data format is text, the text is converted into Unicode so that Internet Explorer can handle it correctly. The encoding of the source data comes from the charset attribute specified in the parameter portion of the media type. An example of such a URL is data:text/plain;charset=iso-8859-8-i,%f9%ec%e5%ed, which is some Hebrew text encoded in ISO-8859-8-i.

To convert the multibyte text to Unicode, we have to use the famous MultiByteToWideChar function, which needs the code page of the source text. The code page for a charset name is looked up through the IMultiLanguage2 interface:

    CComPtr<IMultiLanguage2> spMLang;
    if (SUCCEEDED(hr = spMLang.CoCreateInstance(CLSID_CMultiLanguage)))
    {
        MIMECSETINFO mi;
        if (SUCCEEDED(hr = spMLang->GetCharsetInfo(CComBSTR(GetCharset()), &mi)))
        ...
    }

The MIMECSETINFO structure is defined as:

    typedef struct tagMIMECSETINFO
    {
        UINT uiCodePage;
        UINT uiInternetEncoding;
        WCHAR wszCharset[MAX_MIMECSET_NAME];
    } MIMECSETINFO, *PMIMECSETINFO;

The structure offers two candidate code pages, and at first glance it is not obvious which one to use. The code tries uiInternetEncoding first and falls back to uiCodePage if the conversion fails:

    int nSrcLen = strData.GetLength();
    UINT uCodePage = mi.uiInternetEncoding;
    // First call only measures the required number of wide characters.
    int nWideChar = MultiByteToWideChar(uCodePage, 0, (LPCSTR)strData, nSrcLen, NULL, 0);
    if (nWideChar == 0)
    {
        uCodePage = mi.uiCodePage;
        nWideChar = MultiByteToWideChar(uCodePage, 0, (LPCSTR)strData, nSrcLen, NULL, 0);
    }
    if (nWideChar != 0)
    {
        // One extra WCHAR at the front is reserved for the lead bytes.
        WCHAR* sz = new WCHAR[nWideChar + 1];
        MultiByteToWideChar(uCodePage, 0, (LPCSTR)strData, nSrcLen, sz + 1, nWideChar);
        m_pvData = (BYTE*)sz;
        m_dwDataLength = (nWideChar + 1) * 2;
        // If data is in Unicode it should have Unicode lead bytes
        m_pvData[0] = 0xFF;
        m_pvData[1] = 0xFE;
    }

Once the characters are converted to a Unicode stream of bytes, the byte stream needs to be prefixed with the Unicode lead bytes to indicate to Internet Explorer that the data is Unicode. The lead bytes are 0xFF 0xFE, the UTF-16 little-endian byte order mark.
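
To make the text conversion concrete without the Windows APIs, here is a rough portable C++ sketch that percent-decodes the example data %f9%ec%e5%ed and turns it into UTF-16 little-endian bytes with the 0xFF 0xFE lead bytes. It is a stand-in for MultiByteToWideChar, not the handler's code: the function names are hypothetical, and the hand-written mapping covers only ASCII and the Hebrew letters of ISO-8859-8 (0xE0 to 0xFA map to U+05D0 to U+05EA); every other byte becomes U+FFFD.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decodes %XX escapes; all other characters are copied unchanged.
std::string PercentDecode(const std::string& in) {
    auto hexValue = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    };
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            int hi = hexValue(in[i + 1]);
            int lo = hexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out.push_back(static_cast<char>(hi * 16 + lo));
                i += 2;  // skip the two hex digits just consumed
                continue;
            }
        }
        out.push_back(in[i]);
    }
    return out;
}

// Maps one ISO-8859-8 byte to a UTF-16 code unit (partial mapping, see lead-in).
char16_t Iso88598ToUtf16(unsigned char b) {
    if (b < 0x80) return b;  // ASCII range is identical
    if (b >= 0xE0 && b <= 0xFA) {
        return static_cast<char16_t>(0x05D0 + (b - 0xE0));  // Hebrew letters
    }
    return 0xFFFD;  // replacement character for bytes this sketch does not map
}

// Produces UTF-16LE bytes prefixed with the 0xFF 0xFE lead bytes.
std::vector<std::uint8_t> ToUtf16LeWithBom(const std::string& bytes) {
    std::vector<std::uint8_t> out = {0xFF, 0xFE};
    for (unsigned char b : bytes) {
        char16_t unit = Iso88598ToUtf16(b);
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));  // low byte first
        out.push_back(static_cast<std::uint8_t>(unit >> 8));
    }
    return out;
}
```

Calling ToUtf16LeWithBom(PercentDecode("%f9%ec%e5%ed")) yields the lead bytes followed by the code units U+05E9, U+05DC, U+05D5, U+05DD, which spell the Hebrew word "שלום".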
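
The URL parsing gave us the data and the MIME type of the data. The actual implementation of the pluggable protocol is pretty simple.

Implementing the Asynchronous Pluggable Protocol Handler

An asynchronous pluggable protocol handler is a COM object that implements the IInternetProtocol interface. The handler for the data: scheme is registered under the PROTOCOLS\Handler key of HKEY_CLASSES_ROOT, using the following registry script:

HKCR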
{ ...
    NoRemove PROTOCOLS
    {
        NoRemove Handler
        {
            ForceRemove data = s 'data: pluggable protocol'
            {
                val CLSID = s '{C79BF22F-25C4-4D3D-8183-14149EAB9C0C}'
            }
        }
    }
}

The interesting work in the implementation of the pluggable protocol handler happens in the Start method:

STDMETHODIMP CDataPluggableProtocol::Start(
    LPCWSTR szUrl,
    IInternetProtocolSink *pIProtSink,
    IInternetBindInfo *pIBindInfo,
    DWORD grfSTI,
    DWORD dwReserved)
{
    HRESULT hr = S_OK;
    // Parsing the URL also decodes the data into the handler's buffer.
    if (m_url.Parse(szUrl))
    {
        m_dwPos = 0; // reset the read position
        CAtlString strData = m_url.GetDataString();
        pIProtSink->ReportProgress(BINDSTATUS_FINDINGRESOURCE, strData);
        pIProtSink->ReportProgress(BINDSTATUS_CONNECTING, strData);
        pIProtSink->ReportProgress(BINDSTATUS_SENDINGREQUEST, strData);
        // Tell the caller the MIME type taken from the URL.
        pIProtSink->ReportProgress(BINDSTATUS_VERIFIEDMIMETYPEAVAILABLE,
                                   m_url.GetMimeType());
        // All of the data is already in memory, so report it as fully available.
        pIProtSink->ReportData(BSCF_FIRSTDATANOTIFICATION, 0,
                               m_url.GetDataLength());
        pIProtSink->ReportData(BSCF_LASTDATANOTIFICATION |
                               BSCF_DATAFULLYAVAILABLE,
                               m_url.GetDataLength(),
                               m_url.GetDataLength());
    }
    else
    {
        if (grfSTI & PI_PARSE_URL)
            hr = S_FALSE;
    }
    return hr;
}

The function parses the URL, which automatically extracts the data. The code then sends a series of notifications to the caller: progress reports, the verified MIME type, and finally the ReportData calls that tell the caller all of the data is available.

Testing the Protocol Handler

The protocol handler is automatically registered when the project is built. Once the handler is registered, data: URLs will start working in Internet Explorer. The protocol handler has been tested with the data: URL Tests at the mozilla.com testing website. The handler passes all the tests except one, which fails because of Internet Explorer's limit on URL length.

So far, no security issues have been identified. I welcome readers to point out any possible security issues with the protocol handler.

History