I'm trying to read HTML from a URLConnection. In one case, the HTML file I'm trying to read includes five line breaks before the actual doctype declaration. In this case, the input reader throws an EOFException.
URL pageUrl =
new URL(
URLConnection getConn = pageUrl.openConnection();
DataInputStream dis = new DataInputStream(getConn.getInputStream());
//some read method here
Has anyone run into a problem like this?
URL pageUrl = new URL("http://www.nytimes.com/2011/03/15/sports/basketball/15nbaround.html");
URLConnection getConn = pageUrl.openConnection();
DataInputStream dis = new DataInputStream(getConn.getInputStream());
String urlData = "";
while ((urlData = dis.readUTF()) != null)
//exception thrown
java.io.EOFException
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323)
    at java.io.DataInputStream.readUTF(DataInputStream.java:572)
    at java.io.DataInputStream.readUTF(DataInputStream.java:547)
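For context on the trace above: readUTF() only understands data produced by DataOutput.writeUTF(), that is, a two-byte length prefix followed by modified UTF-8 bytes. It never returns null; it reports end of stream by throwing EOFException. The sketch below is not the code from the question. It uses in-memory byte arrays (the class and method names are made up for illustration) to show what readUTF() does with HTML-like bytes, with an empty body, and with data that really was written by writeUTF().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class ReadUtfDemo {

    // Hypothetical helper: returns true if readUTF() fails with EOFException on these bytes.
    static boolean readUtfHitsEof(byte[] body) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(body))) {
            dis.readUTF();
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    // Hypothetical helper: writes a string with writeUTF() and reads it back with readUTF().
    static String roundTrip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(buffer)) {
            dos.writeUTF(text);
        }
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return dis.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample: five line breaks, then a doctype, as described in the question.
        byte[] html = "\n\n\n\n\n<!DOCTYPE html>\n<html></html>\n".getBytes(StandardCharsets.UTF_8);

        // The first two bytes (0x0A 0x0A) are taken as a length of 2570,
        // which is more than the sample holds, so readUTF() throws EOFException.
        System.out.println("HTML-like bytes hit EOF: " + readUtfHitsEof(html));

        // With an empty body the failure happens while reading the length prefix itself.
        System.out.println("Empty body hits EOF: " + readUtfHitsEof(new byte[0]));

        // Data written by writeUTF() reads back fine.
        System.out.println("Round trip: " + roundTrip("hello"));
    }
}
```

The frame readUnsignedShort in the trace suggests the stream ran out while the length prefix was being read, i.e. fewer than two bytes were left at that point. Either way, DataInputStream is meant for binary data written by its DataOutputStream counterpart, so a Reader is the usual choice for text such as HTML.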
In the case of BufferedReader, readLine() just returns null and doesn't continue:
pageUrl = new URL("http://www.nytimes.com/2011/03/15/sports/basketball/15nbaround.html");
URLConnection getConn = pageUrl.openConnection();
BufferedReader br = new BufferedReader(new InputStreamReader(getConn.getInputStream()));
String urlData = "";
urlData = br.readLine();
This outputs null.
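A null from the very first readLine() means the reader saw end of stream before any character, so the body that came back was empty. Leading blank lines alone would give empty strings, not null. One way to narrow this down, as a sketch only: print the HTTP status and any Location header before reading, since a redirect or error response is one possible cause of an empty body (an assumption, not something confirmed for this URL). PageFetcher, readAll and fetch are hypothetical names, and UTF-8 is assumed rather than taken from the response headers.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PageFetcher {

    // Reads every line until readLine() returns null (end of stream).
    // Blank lines come back as empty strings and are kept.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Hypothetical helper for http/https URLs: logs the status line details, then reads the body.
    static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        try {
            int status = conn.getResponseCode();
            // A 3xx status with a Location header would point to a redirect.
            System.err.println("HTTP " + status + ", Location: " + conn.getHeaderField("Location"));
            try (InputStream in = conn.getInputStream()) {
                // Assumption: the page is UTF-8; a real client would check Content-Type.
                return readAll(in, StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Offline demonstration with made-up bytes: five line breaks before the doctype.
        byte[] sample = "\n\n\n\n\n<!DOCTYPE html>\n<html></html>\n".getBytes(StandardCharsets.UTF_8);
        System.out.print(readAll(new ByteArrayInputStream(sample), StandardCharsets.UTF_8));
    }
}
```

Calling fetch(...) with the URL from the question would show whether the server answered with the page itself or with something else.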