URL url = new URL("http://www.example.com/data.php?q=%FD");
logger.info("url: " + url);
URI uri = url.toURI();
logger.info("uri ASCII: " + uri.toASCIIString());
logger.info("uri str : " + uri.toString());
logger.info("query : " + uri.getQuery());
logger.info("decoded : " + URLDecoder.decode(uri.getRawQuery(), "WINDOWS-1252"));
String scheme = uri.getScheme();
String auth = uri.getAuthority();
String path = uri.getPath();
String query = uri.getQuery();
URI cleanedURI = new URI(scheme, auth, path, query, null);
logger.info("cleaned uri ASCII: " + cleanedURI.toASCIIString());
logger.info("cleaned uri str : " + cleanedURI.toString());
The output is:
url: http://www.example.com/data.php?q=%FD
uri ASCII: http://www.example.com/data.php?q=%FD
uri str : http://www.example.com/data.php?q=%FD
query : q=�
decoded : q=ý
cleaned uri ASCII: http://www.example.com/data.php?q=%EF%BF%BD
cleaned uri str : http://www.example.com/data.php?q=�
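So when I split the URI into its parts and then build it again, I don't get the original URL back. How can I get the original URL back, which is a valid, correctly percent-encoded URL?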
I need to get the original %FD, not %EF%BF%BD.
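(What I'm actually trying to do is manipulate parts of the URL in a clean way, for example removing the fragment, but that isn't really relevant to my question.)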
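
For reference, here is a minimal sketch of one possible way to keep the original escapes, assuming the goal is only to drop the fragment. `getQuery()` decodes escaped octets as UTF-8, so the lone `%FD` byte becomes U+FFFD, and the multi-argument `URI` constructors then quote that character as `%EF%BF%BD`. Those constructors also always quote `%`, so passing the raw query to them would yield `%25FD`. The sketch instead joins the raw components into a string and uses the single-argument constructor, which parses existing escapes without re-encoding them. The class name `RawUriRebuild` and the helper `withoutFragment` are hypothetical names chosen for illustration, and opaque URIs such as `mailto:` are not handled.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RawUriRebuild {

    // Hypothetical helper: rebuilds a hierarchical URI from its raw (still
    // percent-encoded) components, leaving out the fragment.
    static URI withoutFragment(URI uri) throws URISyntaxException {
        StringBuilder sb = new StringBuilder();
        if (uri.getScheme() != null) {
            sb.append(uri.getScheme()).append(':');
        }
        if (uri.getRawAuthority() != null) {
            sb.append("//").append(uri.getRawAuthority());
        }
        if (uri.getRawPath() != null) {
            sb.append(uri.getRawPath());
        }
        if (uri.getRawQuery() != null) {
            sb.append('?').append(uri.getRawQuery());
        }
        // The single-argument constructor parses the string as-is,
        // so the existing %FD escape is preserved.
        return new URI(sb.toString());
    }

    public static void main(String[] args) throws Exception {
        URI original = new URI("http://www.example.com/data.php?q=%FD#top");
        URI cleaned = withoutFragment(original);
        // prints: http://www.example.com/data.php?q=%FD
        System.out.println(cleaned.toASCIIString());
    }
}
```

The raw getters are used throughout because the decoded getters are lossy for any escape sequence that is not valid UTF-8.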