Display a non-US-ASCII filename in File Download dialog box

14 Nov 2004
Simple ways to display a non-US-ASCII filename in the File Download dialog box.

Introduction

Uploading and downloading files are common tasks in an ASP.NET application. When a user downloads a file that was previously uploaded to a web server, he expects the File Download dialog box to show the filename exactly as it was uploaded. Developers normally use the “Content-Disposition” header field to force the download, with the “filename” parameter suggesting a name for the downloaded file. If the filename contains only US-ASCII characters, there is no problem: the name shown in the File Download dialog box is the same as when the file was uploaded. The problem appears when the filename contains non-US-ASCII characters, such as Vietnamese or Arabic ones. The name is then corrupted and is not displayed the way the user expects, because the filename parameter is limited to US-ASCII. For complete information on the Content-Disposition field, see RFC 2183.

Figure 1: a non-US-ASCII filename is corrupted

In this article, I present three simple alternative ways to display a non-US-ASCII filename accurately in the File Download dialog box:

  • Encoding filename
  • URL Rewriting
  • “Encoded-word” mechanism

At the end of each section, I also briefly explain when that solution is appropriate.

Figure 2: a non-US-ASCII filename is correctly displayed

Encoding filename

In this solution, we use the HTML <a> element to link directly to the requested file instead of using the Content-Disposition header field to implement the download. One thing to take into account is that the browser normally sends URLs as UTF-8, so when a file is uploaded to the server, the filename needs to be encoded before the file is saved. Below is the code snippet used to encode the filename:

C#
// Requires: using System; using System.Text;
public static string EncodeFilename(string filename)
{
    // Get the UTF-8 bytes of the name, then store each byte as one character.
    UTF8Encoding utf8 = new UTF8Encoding();
    byte[] bytes = utf8.GetBytes(filename);
    char[] chars = new char[bytes.Length];
    for (int index = 0; index < bytes.Length; index++)
    {
        chars[index] = Convert.ToChar(bytes[index]);
    }

    return new string(chars);
}

This solution is simple, and we can use it when the File Download dialog box does not have to be forced. However, it takes some extra work to handle duplicate filenames, because users of the application might upload many files with the same name. In addition, this solution does not work properly when the encoded value contains characters that are not allowed in a filename.
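
Because the encoding maps each UTF-8 byte to exactly one character, it can be reversed when the original name is needed again, for example to list the uploaded files. The following is a minimal sketch under that assumption; it is not part of the sample download, and the FilenameCodec class and the DecodeFilename helper are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text;

public static class FilenameCodec
{
    // Same mapping as the article's EncodeFilename: one char per UTF-8 byte.
    public static string EncodeFilename(string filename)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(filename);
        char[] chars = new char[bytes.Length];
        for (int index = 0; index < bytes.Length; index++)
        {
            chars[index] = Convert.ToChar(bytes[index]);
        }
        return new string(chars);
    }

    // Hypothetical inverse: turn each char back into a byte, then decode as UTF-8.
    // Assumes the input came from EncodeFilename, so every char is <= 0xFF;
    // Convert.ToByte throws OverflowException otherwise.
    public static string DecodeFilename(string encoded)
    {
        byte[] bytes = new byte[encoded.Length];
        for (int index = 0; index < encoded.Length; index++)
        {
            bytes[index] = Convert.ToByte(encoded[index]);
        }
        return Encoding.UTF8.GetString(bytes);
    }
}
```

Calling DecodeFilename on the result of EncodeFilename should give back the name the user uploaded, as long as the stored name was not altered in between.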

URL Rewriting

As mentioned above, developers normally use the “Content-Disposition” header field to force the download. Looking at how a Mail User Agent (MUA) processes the Content-Disposition header field, we can see that the receiving MUA uses the filename parameter value as a basis for the actual filename shown in the File Download dialog box. If this parameter is absent, the MUA is likely to display the name of the web page that writes the file contents to the client (in an ASP.NET application, this is normally the name of an aspx page). So the idea in this solution is to request the file to be downloaded directly; when the request arrives at the server, the URL is rewritten to an aspx page that reads the originally requested file and sends it back to the client. Because the rewriting happens on the server, the browser only sees the original URL and takes the name to display from it. In this aspx page, we use the Content-Disposition header field without specifying the filename parameter.
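
The part of the download page that matters here can be sketched as follows. This is only an illustration, not the code from the sample download: the DownloadSketch class and the WriteDownload method are hypothetical, and the headers dictionary and output stream stand in for the HTTP response so that the sketch stays self-contained. In a real aspx page they would correspond to Response.AddHeader and Response.OutputStream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DownloadSketch
{
    // "headers" and "output" are stand-ins for the HTTP response (assumption).
    public static void WriteDownload(string path, IDictionary<string, string> headers, Stream output)
    {
        headers["Content-Type"] = "application/octet-stream";

        // Force the download, but leave out the filename parameter on purpose,
        // so the client has to derive the name from the URL it requested.
        headers["Content-Disposition"] = "attachment";

        // Copy the stored file to the response body.
        using (FileStream input = File.OpenRead(path))
        {
            input.CopyTo(output);
        }
    }
}
```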

Generally speaking, URL rewriting can be implemented either at the IIS level or at the ASP.NET level, and rewriting at the ASP.NET level only happens once the request has been routed from IIS to the ASP.NET engine. Only requests for resources with an extension such as aspx, ascx, ashx… are processed by the ASP.NET engine. Since we as developers cannot know in advance what types of files users will upload to the server, URL rewriting at the IIS level is likely the answer. In this article, I am not going to show how to implement URL rewriting in IIS with ISAPI filters; there are a number of third-party ISAPI filters that can be used instead. For the demo, I am using the ISAPI_Rewrite Lite version, which is simple and free. You can download the latest version from here.

Once ISAPI_Rewrite is installed, we can define rewriting rules in the httpd.ini file, which should be located in the ISAPI_Rewrite installation directory. For the purposes of this article, we define a single rewriting rule:

RewriteRule (/Sample2/Download/.*?)(id=[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}) /Sample2/Download.aspx\?$2 [L]

Let me briefly explain this rule: when the user requests any file in the Download directory with an id in GUID format in the query string, the URL is rewritten to the Download.aspx page, which is responsible for reading the file and sending it back to the client. The RewriteRule directive defines the rule. For example, suppose the user downloads the report.doc file with the URL:

http://localhost/Sample2/Download/report.doc?id=647faba9-a223-45d3-90d5-3bc4de95bd39

In IIS, the URL is rewritten to:

http://localhost/Sample2/Download.aspx?id=647faba9-a223-45d3-90d5-3bc4de95bd39
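
The match-and-substitute step of the rule can be tried out with the .NET regular expression classes. The sketch below only imitates the rule for illustration; it is not how ISAPI_Rewrite itself is implemented. The RewriteSketch class is a hypothetical name, and anchoring the pattern with ^ and $ is an assumption made so that the whole path and query must match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RewriteSketch
{
    // The GUID part of the pattern, copied from the rewriting rule.
    private const string GuidPattern =
        "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}";

    // Group 1: anything under /Sample2/Download/; group 2: the id=<guid> pair.
    private static readonly Regex Rule =
        new Regex("^(/Sample2/Download/.*?)(id=" + GuidPattern + ")$");

    // Returns the rewritten path and query, or the input unchanged if the rule does not match.
    public static string Rewrite(string pathAndQuery)
    {
        return Rule.Replace(pathAndQuery, "/Sample2/Download.aspx?$2");
    }
}
```

With the example URL above, only the id=… pair captured by the second group survives the rewrite; the requested filename is dropped from the internal URL but stays in the address the browser knows.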

With this solution, no further filename processing is required in code, and the problem is probably resolved. However, installing a third-party component in IIS may not be acceptable to everyone; in that case, the “Encoded-word” mechanism may be the answer.

“Encoded-word” mechanism

As noted above, the filename parameter is limited to US-ASCII, so a filename that contains non-US-ASCII characters must be encoded in order to be displayed exactly in the File Download dialog box. RFCs 2184 and 2231 define extensions to the encoded-word mechanism in RFC 2047 that provide a means to specify parameter values in character sets other than US-ASCII. The encoding mechanism is quite simple: in a given filename, every non-US-ASCII character, as well as every character that is neither alphanumeric nor reserved, is replaced with its %xx encoding, where each xx is the hexadecimal value of one byte of the character's UTF-8 encoding. Below is the code snippet used to encode a character:

C#
// Requires: using System.Text;
private static string ToHexString(char chr)
{
   // Encode the character as UTF-8 and write each byte as %xx.
   UTF8Encoding utf8 = new UTF8Encoding();
   byte[] encodedBytes = utf8.GetBytes(chr.ToString());
   StringBuilder builder = new StringBuilder();
   for (int index = 0; index < encodedBytes.Length; index++)
   {
      // "x2" always emits two hex digits, so byte values below 0x10 keep their leading zero.
      builder.AppendFormat("%{0:x2}", encodedBytes[index]);
   }

   return builder.ToString();
}

For example, if the original filename is Bản Kiểm Kê.doc (to view the filename correctly, the encoding should be set to Unicode (UTF-8) in your web browser), the encoded value is B%e1%ba%a3n%20Ki%e1%bb%83m%20K%c3%aa.doc, and the Content-Disposition field is specified as below:

Content-Disposition: attachment; filename=B%e1%ba%a3n%20Ki%e1%bb%83m%20K%c3%aa.doc

With this approach, we need some code to encode the filename before passing it to the Content-Disposition field. In my opinion, it is a good choice because no third-party component has to be installed. However, we should be aware that the “Encoded-word” mechanism only works if the MIME processor on the client side understands encoded parameter values; otherwise, the filename is not displayed as we expect. In addition, according to RFCs 2231 and 2184, the extensions defined in these documents should not be used lightly; they should be reserved for situations where a real need for them exists. For more information, see RFCs 2231 and 2184.
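
Putting the pieces together, a whole filename can be encoded and placed in the header value as sketched below. This is an illustration under stated assumptions, not the code from the sample download: the DispositionSketch class and its methods are hypothetical names, and the set of characters left unencoded (ASCII letters, digits, '.', '-' and '_') is an assumption chosen to reproduce the example above.

```csharp
using System;
using System.Text;

public static class DispositionSketch
{
    // Assumption: only ASCII letters, digits, '.', '-' and '_' are left as-is.
    private static bool IsUnescaped(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
               (c >= '0' && c <= '9') || c == '.' || c == '-' || c == '_';
    }

    // Percent-encodes the UTF-8 bytes of the name, two lowercase hex digits per byte.
    public static string EncodeFilename(string filename)
    {
        StringBuilder builder = new StringBuilder();
        foreach (byte b in Encoding.UTF8.GetBytes(filename))
        {
            char c = (char)b;
            if (b < 0x80 && IsUnescaped(c))
            {
                builder.Append(c);
            }
            else
            {
                builder.AppendFormat("%{0:x2}", b);
            }
        }
        return builder.ToString();
    }

    // Builds the header value in the form shown in the article.
    public static string BuildContentDisposition(string filename)
    {
        return "attachment; filename=" + EncodeFilename(filename);
    }
}
```

For the example filename, BuildContentDisposition returns the header value shown above. Clients that implement the extended parameter notation of RFC 2231 expect the parameter to be written as filename*=UTF-8''B%e1%ba%a3n…, so which form works depends on the client.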

Running sample code

The download contains three web applications that demonstrate the three solutions. To try out the demo applications, create three virtual directories in IIS and point them at Sample1, Sample2, and Sample3. The start page should be the ListDocument.aspx page. For Sample2, you also need to download and install ISAPI_Rewrite and add the above rewriting rule to the httpd.ini file.

The ASPNET account must have read/write access to the Data and Download directories in each application. For the sake of simplicity, I am using XML as the back-end data store.

Conclusion

At the moment, the filename parameter is limited to US-ASCII, so some extra work is needed for the filename to be displayed exactly in the File Download dialog box. We hope that this limitation will be removed someday so that non-US-ASCII characters can be used as easily as US-ASCII ones.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


Written By
Web Developer
Vietnam

Comments and Discussions

Suggestion: tried with spanish letter, didnt work for me, but good article (Leonardo Paneque, 23-Nov-11 8:54)
General: CrossBrowser solution (igon_ghost, 4-Dec-09 4:12)
Answer: Why not just set the HeaderEncoding property (Magnus_, 10-May-09 22:53)
Answer: The solution, or as good as it gets (java_osborn, 23-Apr-09 7:21)
General: Err (phucntbk, 20-Mar-09 23:03)
General: File Download Dialog Box (There is always the way to do it, but I don't know, 22-Oct-07 10:59)
Question: file name with ; (lakshmi patil, 30-Aug-07 2:08)
General: Mixed filenames (jonmy, 24-Aug-06 4:00)
General: File name is shown incorrectly on opening (3-Aug-05 10:36)
General: Re: File name is shown incorrectly on opening (ernest_elias, 24-Jan-06 2:23)
General: Re: File name is shown incorrectly on opening (ernest_elias, 24-Jan-06 5:29)
General: Re: File name is shown incorrectly on opening (java_osborn, 17-May-07 11:04)
General: Re: File name is shown incorrectly on opening (darqer, 30-Aug-07 0:18)
Question: Re: File name is shown incorrectly on opening - Please Help!!! (dancingintherain, 11-Aug-08 10:43)
Answer: Re: File name is shown incorrectly on opening - Please Help!!! (Member 1365758, 30-Jan-09 15:41)
General: A simple Server.UrlEncode would solve this... (C. Augusto Proiete, 7-Jul-05 23:23)
If only it didn't encode the spaces to '+' :(

Microsoft's HttpUtility class has some very good functions we could use, but we are not allowed to call them - they made them private :(

Below is the class I'm using (after playing with Reflector):

	// Requires: using System.Text;
	public class HttpUtil
	{
		private HttpUtil()
		{
		}

		// Letters, digits and the listed punctuation (including the space) are left unencoded.
		private static bool IsSafe(char ch)
		{
			if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || (ch >= '0' && ch <= '9'))
			{
				return true;
			}
			switch (ch)
			{
				case ' ':
				case '\'':
				case '(':
				case ')':
				case '*':
				case '-':
				case '.':
				case '!':
				case '_':
					return true;
				default:
					return false;
			}
		}

		// Converts a value from 0 to 15 to a lowercase hex digit.
		private static char IntToHex(int n)
		{
			if (n <= 9)
			{
				return (char) ((ushort) (n + 0x30));
			}
			return (char) ((ushort) ((n - 10) + 0x61));
		}

		private static byte[] StringEncodeBytesToBytesInternal(byte[] bytes, int offset, int count, bool alwaysCreateReturnValue)
		{
			// First pass: count the spaces and the bytes that need %xx encoding.
			int num1 = 0;
			int num2 = 0;
			for (int num3 = 0; num3 < count; num3++)
			{
				char ch1 = (char) bytes[offset + num3];
				if (ch1 == ' ')
				{
					num1++;
				}
				else if (!HttpUtil.IsSafe(ch1))
				{
					num2++;
				}
			}
			if ((!alwaysCreateReturnValue && (num1 == 0)) && (num2 == 0))
			{
				return bytes;
			}
			// Second pass: each unsafe byte grows from one byte to three (%xx).
			byte[] buffer1 = new byte[count + (num2 * 2)];
			int num4 = 0;
			for (int num5 = 0; num5 < count; num5++)
			{
				byte num6 = bytes[offset + num5];
				char ch2 = (char) num6;
				if (HttpUtil.IsSafe(ch2))
				{
					buffer1[num4++] = num6;
				}
				else
				{
					buffer1[num4++] = 0x25;
					buffer1[num4++] = (byte)HttpUtil.IntToHex((num6 >> 4) & 15);
					buffer1[num4++] = (byte)HttpUtil.IntToHex(num6 & 15);
				}
			}
			return buffer1;
		}

		private static byte[] StringEncodeToBytes(string str, Encoding e)
		{
			if (str == null)
			{
				return null;
			}
			byte[] buffer1 = e.GetBytes(str);
			return HttpUtil.StringEncodeBytesToBytesInternal(buffer1, 0, buffer1.Length, false);
		}

		public static string StringEncode(string str, Encoding e)
		{
			if (str == null)
			{
				return null;
			}
			return Encoding.ASCII.GetString(HttpUtil.StringEncodeToBytes(str, e));
		}

		public static string StringEncode(string s)
		{
			return HttpUtil.StringEncode(s, Encoding.UTF8);
		}
	}

C. Augusto Proiete
https://augustoproiete.net


modified 25-May-20 21:31pm.

General: Re: A simple Server.UrlEncode would solve this... (minhpc_bk, 8-Jul-05 15:27)
General: it seems, that most browsers doesn't support Encoded-word mechanism. (Anonymous, 12-May-05 0:44)
General: Re: it seems, that most browsers doesn't support Encoded-word mechanism. (minhpc_bk, 12-May-05 19:00)
General: Re: it seems, that most browsers doesn't support Encoded-word mechanism. (naspoon, 20-Feb-06 13:19)
General: blank window (vishalKhedkar, 11-Apr-05 23:58)
General: Re: blank window (minhpc_bk, 17-Apr-05 21:25)
General: FireFox (2006 Flauzer, 6-Apr-05 6:34)
General: Re: FireFox (minhpc_bk, 7-Apr-05 12:00)
General: Re: FireFox (minhpc_bk, 8-Apr-05 5:59)