I know this question has been asked many times, but I'm stuck on it and nothing I've read has helped.
I have this code:
BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
String line;
while((line = reader.readLine()) != null)content += line+"\r\n";
reader.close();
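I'm trying to fetch the content of this page, http://www.garazh.com.ua/tires/catalog/Marangoni/E-COMM/description/, and all non-Latin characters are displayed incorrectly.
I tried setting the encoding explicitly, like this: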
BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream(), "WINDOWS-1251"));
With that, everything works fine! But I can't hard-code the encoding for every site I try to parse, so I need a general solution.
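To make the symptom concrete, here is a minimal sketch of why the explicit charset matters. The class name `DecodeDemo` and the sample word are made up for illustration; the point is only that the same bytes decode correctly when the decoder matches the encoding the server used, and turn into garbage otherwise.

```java
import java.nio.charset.Charset;

class DecodeDemo {
    // Decode raw bytes using the named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Hypothetical sample: a Cyrillic word, written with escapes so the
        // source file's own encoding cannot interfere.
        String original = "\u0428\u0438\u043d\u044b";
        // Bytes as a windows-1251 server would send them.
        byte[] raw = original.getBytes(Charset.forName("windows-1251"));

        System.out.println(decode(raw, "windows-1251").equals(original)); // true
        System.out.println(decode(raw, "UTF-8").equals(original));        // false
    }
}
```

`new InputStreamReader(stream)` without a charset argument uses the platform default, which is why the result depends on the machine rather than on the page.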
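So guys, I know that detecting an encoding is not as easy as it looks, but I really need it. If anyone has had this problem, please explain how you solved it!
Any help appreciated!
Here is the full code of the function I use to fetch the content: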
protected Map<String, String> getFromUrl(String url){
    Map<String, String> mp = new HashMap<String, String>();
    String newCookie = "", redirect = null;
    try{
        String host = this.getHostName(url), content = "", header = "", UA = this.getUA(), cookie = this.getCookie(host, UA), referer = "http://"+host+"/";
        URL U = new URL(url);
        URLConnection conn = U.openConnection();
        // Request headers imitating a browser
        conn.setRequestProperty("Host", host);
        conn.setRequestProperty("User-Agent", UA);
        conn.setRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        conn.setRequestProperty("Accept-Language", "ru-ru,ru;q=0.8,en-us;q=0.5,en;q=0.3");
        conn.setRequestProperty("Accept-Encoding", "gzip,deflate");
        conn.setRequestProperty("Accept-Charset", "utf-8;q=0.7,*;q=0.7");
        conn.setRequestProperty("Keep-Alive", "115");
        conn.setRequestProperty("Connection", "keep-alive");
        if(referer != null)conn.setRequestProperty("Referer", referer);
        if(cookie != null && !cookie.contentEquals(""))conn.setRequestProperty("Cookie", cookie);
        // Collect the response headers, remembering cookies and any redirect target
        for(int i=0; ; i++){
            String name = conn.getHeaderFieldKey(i);
            String value = conn.getHeaderField(i);
            if(name == null && value == null)break;
            else if(name != null){
                if(name.contentEquals("Set-Cookie"))newCookie += value + " ";
                else if(name.toLowerCase().trim().contentEquals("location"))redirect = value;
            }
            header += name + ": " + value + "\r\n";
        }
        if(!newCookie.contentEquals("") && !newCookie.contentEquals(cookie))this.setCookie(host, UA, newCookie.trim());
        try{
            // The body is decoded with the platform default charset here
            BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
            String line;
            while((line = reader.readLine()) != null)content += line+"\r\n";
            reader.close();
        }
        catch(Exception e){/*System.out.println(url+"\r\n"+e);*/}
        mp.put("url", url);
        mp.put("header", header);
        mp.put("content", content);
    }
    catch(Exception e){
        mp.put("url", "");
        mp.put("header", "");
        mp.put("content", "");
    }
    // Follow a redirect if one was seen, up to the redirect limit
    if(redirect != null && this.redirectCount < 3){
        mp = getFromUrl(redirect);
        this.redirectCount++;
    }
    return mp;
}
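One possible direction, offered as a sketch rather than a complete answer: many pages declare their charset either in the `Content-Type` response header or in a `<meta>` tag near the top of the HTML. The hypothetical helper below (the class name `CharsetSniffer`, its method names, the 4096-byte peek window, and the UTF-8 fallback are all assumptions, not part of the code above) checks the header first, then peeks at the start of the raw bytes, and only then falls back to a default. It assumes the body has already been read as bytes instead of through a `Reader`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharsetSniffer {
    // Matches charset=NAME in a header value or in a <meta> tag.
    private static final Pattern CHARSET =
        Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);

    // Returns the charset declared in the text, or null if none is found
    // or the declared name is not supported by this JVM.
    static Charset find(String text) {
        if (text == null) return null;
        Matcher m = CHARSET.matcher(text);
        if (!m.find()) return null;
        try {
            return Charset.forName(m.group(1));
        } catch (IllegalArgumentException e) { // illegal or unsupported name
            return null;
        }
    }

    // Decodes the body using the header charset, then a declaration in the
    // first bytes of the HTML, then UTF-8 as an assumed last resort.
    static String decode(String contentType, byte[] body) {
        Charset cs = find(contentType);
        if (cs == null) {
            // ISO-8859-1 maps every byte to one char, so the ASCII markup
            // survives even if the real encoding is something else.
            int n = Math.min(body.length, 4096);
            cs = find(new String(body, 0, n, StandardCharsets.ISO_8859_1));
        }
        if (cs == null) cs = StandardCharsets.UTF_8;
        return new String(body, cs);
    }

    public static void main(String[] args) {
        byte[] body = "<meta charset=\"windows-1251\">\u0428\u0438\u043d\u044b"
            .getBytes(Charset.forName("windows-1251"));
        System.out.println(decode("text/html", body).endsWith("\u0428\u0438\u043d\u044b"));
    }
}
```

In `getFromUrl`, this would mean copying `conn.getInputStream()` into a `ByteArrayOutputStream` and calling `CharsetSniffer.decode(conn.getContentType(), bytes)`. Pages that declare no charset at all, or declare the wrong one, are not handled by this sketch and would need statistical detection.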