character-encoding - Passing special characters through Apache HttpClient

Question

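I have a servlet that accepts HTML content as part of a request parameter. The HTML is localized and may contain French, Spanish, and other content. I am also using Apache HTTP client to send requests to this servlet for testing, with the following header definitions: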

HttpClient client = new HttpClient();

PostMethod method = new PostMethod("<URL>");
String html = FileUtils.readFileToString(inputHTMLFile, "UTF-8");
method.addParameter("html", html);

method.addRequestHeader("Accept", "*/*");    
method.setRequestHeader("accept-charset", "UTF-8");

Any HTML that is read has UTF-8 character encoding. Sample text:

Télécharger un fichier

However, when I read the html from the request parameter, the text becomes:

T?l?charger un fichier

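I went through some links, such as http://www.oracle.com/technetwork/articles/javase/httpcharset-142283.html, which discuss character sets and how browsers typically encode special characters. If I URL-encode the html using UTF-8 and then decode it in the servlet using the same charset, I get the expected HTML.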

Is this the only thing I can do to preserve the charset? Am I missing something?

Thanks.
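The URL-encoding round trip described in the question can be sketched with the JDK alone. This is only an illustration of why both sides must agree on the charset, not the servlet's or HttpClient's actual code path; the class name `FormEncodingDemo` and its helper methods are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; not part of HttpClient or the servlet API.
class FormEncodingDemo {

    // Percent-encode a form parameter value using the named charset.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Decode a percent-encoded value using the named charset.
    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep the example independent of the source file encoding.
        String text = "T\u00e9l\u00e9charger un fichier";
        String wire = encode(text, "UTF-8");
        System.out.println(wire); // T%C3%A9l%C3%A9charger+un+fichier
        // Same charset on both sides: the text survives.
        System.out.println(decode(wire, "UTF-8").equals(text)); // true
        // Mismatched charset on the receiving side: the text is corrupted.
        System.out.println(decode(wire, "ISO-8859-1").equals(text)); // false
    }
}
```

The wire form is pure ASCII, so the only thing that can go wrong is the two sides choosing different charsets for the percent-encoded bytes.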

score 6 · Accepted Answer

Now that the problem with the file itself has been resolved, try modifying your code as follows:

 HttpClient client = new HttpClient();
 PostMethod postMethod = new PostMethod("<URL>");
 postMethod.getParams().setContentCharset("utf-8"); //The line I added

 ...

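Note that the client will now need to decode the request as UTF-8. French and Spanish worked fine because their characters are included in the default ISO-8859-1 charset. Chinese characters are not. If French and Spanish were decoded correctly on the client, then the client is decoding the request as ISO-8859-1, and sending UTF-8 will probably fail.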

So you can also try adding this:

postMethod.setRequestHeader("Content-Type", "application/x-www-form-urlencoded; charset=utf-8");
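The point about ISO-8859-1 versus UTF-8 can be checked with plain JDK string conversions. The sketch below is a standalone illustration, not HttpClient code; `CharsetMismatchDemo` and `transcode` are hypothetical names. It encodes text with one charset and decodes it with another to show which combinations survive.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only uses JDK string/charset conversions.
class CharsetMismatchDemo {

    // Encode text to bytes with one charset, then decode those bytes with another.
    static String transcode(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        Charset utf8 = StandardCharsets.UTF_8;
        String french = "T\u00e9l\u00e9charger";   // contains e-acute (U+00E9)
        String chinese = "\u6c49\u5b57";           // two CJK characters

        // ISO-8859-1 can represent e-acute, so French text survives.
        System.out.println(transcode(french, latin1, latin1).equals(french));   // true
        // ISO-8859-1 cannot represent CJK; each character is replaced by '?'.
        System.out.println(transcode(chinese, latin1, latin1).equals("??"));    // true
        // UTF-8 on both sides preserves everything.
        System.out.println(transcode(chinese, utf8, utf8).equals(chinese));     // true
        // UTF-8 bytes read as ISO-8859-1: each e-acute becomes two characters.
        System.out.println(transcode(french, utf8, latin1).length());           // 13
    }
}
```

A `?` in the received text usually means a character could not be represented in the charset used for encoding, while pairs of odd Latin characters usually mean UTF-8 bytes were decoded with a single-byte charset.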

score 4 · Accepted Answer

Try this POST approach.

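HttpPost request = new HttpPost(webServiceUrl);
// Build the body once, encoding the string as UTF-8
StringEntity entity = new StringEntity(YourData, HTTP.UTF_8);
entity.setContentType("application/json");
// setEntity is an instance method, so call it on the request object
request.setEntity(entity);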

score 2 · Accepted Answer

You would be better off converting the string to Base64 and then sending it.

answered 2019-02-09T12:40:54.673
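A minimal sketch of the Base64 idea, assuming Java 8 or later and that both sender and receiver agree on UTF-8 for the underlying bytes. The class `Base64ParamDemo` and its methods are hypothetical. It uses the URL-safe alphabet without padding, an assumption made here so that the encoded value contains no `+`, `/`, or `=` characters that would themselves need escaping in a form parameter.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; it shows the encoding step only, not the HTTP call.
class Base64ParamDemo {

    // Sender side: turn text into an ASCII-only token.
    static String toBase64(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(utf8Bytes);
    }

    // Receiver side: recover the original text from the token.
    static String fromBase64(String token) {
        byte[] utf8Bytes = Base64.getUrlDecoder().decode(token);
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String html = "<p>T\u00e9l\u00e9charger un fichier</p>";
        String token = toBase64(html);
        // The token is plain ASCII, so the transport charset no longer matters.
        System.out.println(token.matches("[A-Za-z0-9_-]+")); // true
        System.out.println(fromBase64(token).equals(html));  // true
    }
}
```

The trade-off is a payload roughly one third larger and a receiver that must know to decode it, so this works around the charset mismatch rather than fixing it.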

score 1 · Accepted Answer

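I was unable to send Hebrew over an HttpClient socket connection. It turns into garbage in transit. I have tried every point mentioned above, and the problem still persists.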

score 0 · Accepted Answer

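I think I found the cause by inspecting the decompiled EntityBuilder code: EntityBuilder ignores the contentEncoding field when it comes to the parameters and uses the charset from the contentType field instead. Looking at org.apache.http.entity.ContentType, the only predefined value that has UTF-8 is org.apache.http.entity.ContentType.APPLICATION_JSON.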

So in my case

HttpPost method = new HttpPost("<URL>");
EntityBuilder builder = EntityBuilder.create();
builder.setContentType(ContentType.APPLICATION_JSON);
builder.setContentEncoding(StandardCharsets.UTF_8.name());
...
method.setEntity(builder.build());

did the job (although I think setting contentType here is redundant).

I am using httpclient-osgi version 4.5.4.

score -1 · Accepted Answer

PostMethod method = new PostMethod("URL");
method.setRequestHeader("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8");
