internet-explorer - Corrupted file name on file download (IE)

Question

I have implemented a simple file upload/download mechanism. When a user clicks a file name, the file is downloaded with the following HTTP headers:

HTTP/1.1 200 OK
Date: Tue, 30 Sep 2008 14:00:39 GMT
Server: Microsoft-IIS/6.0
Content-Disposition: attachment; filename=filename.doc;
Content-Type: application/octet-stream
Content-Length: 10754

I also support Japanese file names. To do that, I encode the file name with this Java method:

private String encodeFileName(String name) throws Exception{
    String agent = request.getHeader("USER-AGENT");
    if(agent != null && agent.indexOf("MSIE") != -1){ // is IE
        StringBuffer res = new StringBuffer();
        char[] chArr = name.toCharArray();
        for(int j = 0; j < chArr.length; j++){
            if(chArr[j] < 128){ // plain ASCII char
                if (chArr[j] == '.' && j != name.lastIndexOf("."))
                    res.append("%2E");
                else
                    res.append(chArr[j]);
            }
            else{ // non-ASCII char
                byte[] byteArr = name.substring(j, j + 1).getBytes("UTF8");
                for(int i = 0; i < byteArr.length; i++){
                    // byte must be converted to unsigned int
                    res.append("%").append(Integer.toHexString((byteArr[i]) & 0xFF));
                }
            }
        }
        return res.toString();
    }
    // Firefox/Mozilla
    return MimeUtility.encodeText(name, "UTF8", "B");
}

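This worked well until someone found that it fails for long file names, for example: あああああああああああああああ2008.10.1あ.doc. If I change one of the single-byte periods to a single-byte underscore, or if I remove the first character, it works fine. In other words, it depends on the length and on the URL encoding of the period character. Here are a few examples.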

This is broken (あああああああああああああああ2008.10.1あ.doc):

Content-Disposition: attachment; filename=%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%822008%2E10%2E1%e3%81%82.doc;

This is OK (あああああああああああああああ2008_10.1あ.doc):

Content-Disposition: attachment; filename=%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%822008_10%2E1%e3%81%82.doc;

This is fine too (first character removed: ああああああああああああああ2008.10.1あ.doc):

Content-Disposition: attachment; filename=%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%82%e3%81%822008%2E10%2E1%e3%81%82.doc;

Does anyone have a clue?

score 6 · Accepted Answer

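Gmail handles file name escaping somewhat differently: the file name is quoted (double quotes), and single-byte periods are not URL-escaped. That way, the long file name from the question works.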

Content-Disposition: attachment; filename="%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%82%E3%81%822008.10.1%E3%81%82.doc"

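However, there is still a limit on the byte length of the file name (apparently in IE only, and I assume it is a bug). Even when the file name consists only of single-byte characters, the beginning of the file name gets truncated. The limit is around 160 bytes.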
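A minimal sketch of how a header in the Gmail style described above could be built in Java. The class and method names are hypothetical, and the set of characters left unescaped is an assumption chosen for illustration, not something Gmail documents. It does not address the length limit of roughly 160 bytes.

```java
import java.nio.charset.StandardCharsets;

public class QuotedFileNameHeader {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Hypothetical helper, not from the answer: builds a quoted filename
    // parameter with non-ASCII bytes percent-escaped as UTF-8.
    static String contentDisposition(String fileName) {
        StringBuilder sb = new StringBuilder("attachment; filename=\"");
        for (byte b : fileName.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF; // treat the byte as unsigned
            boolean safe = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9') || v == '.' || v == '-' || v == '_';
            if (safe) {
                sb.append((char) v); // periods stay literal, as in the Gmail header
            } else {
                sb.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // \u3042 is the hiragana character used in the question's file names
        System.out.println(contentDisposition("\u30422008.10.1\u3042.doc"));
    }
}
```

Encoding the whole name as UTF-8 bytes first, rather than one `char` at a time as in the question's method, also keeps characters outside the Basic Multilingual Plane intact.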

score 2 · Accepted Answer

As mentioned above, Content-Disposition and Unicode cannot be made to work in all major browsers without browser sniffing and returning a different header for each browser.

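My solution was to avoid the Content-Disposition header entirely and append the file name to the end of the URL, to trick the browser into thinking it is fetching the file directly. For example: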

http://www.xyz.com/cgi-bin/dynamic.php/あああああああああああああああ2008.10.1あ.doc

This naturally assumes you know the file name when you create the link, although a quick redirect header could set it on demand.
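A rough sketch of generating such a link in Java, for illustration only. The answer shows the raw Japanese name in the URL; percent-encoding the path segment as UTF-8 is an assumption made here, and the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DownloadLink {
    // Hypothetical helper: appends the file name as an extra path segment
    // after the script URL, as in the example link above.
    static String downloadUrl(String scriptUrl, String fileName) {
        try {
            // URLEncoder produces form encoding, where a space becomes '+'.
            // A literal '+' in the name is already escaped as %2B, so any
            // remaining '+' is a space and can be rewritten as %20.
            String segment = URLEncoder.encode(fileName, "UTF-8").replace("+", "%20");
            return scriptUrl + "/" + segment;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(downloadUrl("http://www.xyz.com/cgi-bin/dynamic.php",
                "\u30422008.10.1\u3042.doc"));
    }
}
```

The server-side script would then have to ignore the trailing path segment, or use it to look up the file, and serve the content without a Content-Disposition header.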

score 1 · Accepted Answer

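The main issue here is that IE does not support the relevant RFC, which is RFC 2231. See the pointers and test cases. Furthermore, the workaround for IE (using just percent-escaped UTF-8) has several additional problems. It may not work in all locales (as far as I recall, the method fails in Korea unless IE is configured to always use UTF-8 in URLs, which is not the default), and, as mentioned earlier, there are length limits (I have heard this is fixed in IE8, but I have not tried it yet).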
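For reference, a sketch of the extended parameter form that RFC 2231 defines (`filename*=` followed by the charset, an optional language between two single quotes, and the percent-encoded value). The class and method names are hypothetical, and the set of unescaped characters is deliberately conservative. As the answer says, IE at the time did not understand this form, so it would only help other browsers.

```java
import java.nio.charset.StandardCharsets;

public class Rfc2231FileName {
    // Punctuation left unescaped; a conservative subset of what the RFC permits.
    private static final String ATTR_CHARS = "!#$&+-.^_`|~";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Hypothetical helper: returns the filename*= parameter with an empty language tag.
    static String extendedFileName(String fileName) {
        StringBuilder sb = new StringBuilder("filename*=UTF-8''");
        for (byte b : fileName.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF; // treat the byte as unsigned
            boolean alnum = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9');
            if (alnum || (v < 128 && ATTR_CHARS.indexOf(v) >= 0)) {
                sb.append((char) v);
            } else {
                sb.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Content-Disposition: attachment; "
                + extendedFileName("\u30422008.10.1\u3042.doc"));
    }
}
```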

score -2 · Accepted Answer

I think this issue is fixed in IE8; I have seen it work in IE 8.

Answered 2010-08-16T06:52:02.180
