java - Java: fetched web page source starts with "null"

Question

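For some strange reason, when I try to fetch a web page's source using URLConnection, I get a "null" at the start of the output. Can anyone shed some light on this?

My method: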

public String getPageSource()
        throws IOException
{
    URL url = new URL( this.getUrl().contains( "http://" ) ? this.getUrl() : "http://" + this.getUrl() );
    URLConnection urlConnection = url.openConnection();

    BufferedReader br = new BufferedReader( new InputStreamReader( urlConnection.getInputStream(), "UTF-8" ) );

    String source = null;
    String line;

    while ( ( line = br.readLine() ) != null )
    {
        source += line;
    }

    return source;
}

How I call it:

public static void main( String[] args )
        throws IOException
{
    WebPageUtil wpu = new WebPageUtil( "www.something.com" );

    System.out.println( wpu.getPageSource() );
}

The WebPageUtil constructor:

public WebPageUtil( String url )
{
    this.url = url;
}

The output always looks like this:

null<html><head>... //and then the rest of the source code, which is scraped correctly

Nothing difficult, right? But where does that damned "null" come from?!

Thanks for any advice!

score 2 · Accepted Answer

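You are initializing the String source to null, so on the first concatenation in the while loop its value is converted to the literal String "null".

Use an empty String instead: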

String source = "";
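
To make the conversion concrete, here is a minimal stand-alone sketch. The class name NullConcatDemo and the helper join are made up for illustration and are not part of the question's code; the loop simply mirrors the += pattern from getPageSource().

```java
// Minimal demo of why the output starts with "null".
// NullConcatDemo and join(...) are hypothetical names used only for this sketch.
class NullConcatDemo {

    // Appends each line to the starting value with +=, like the loop in the question.
    static String join(String initial, String[] lines) {
        String source = initial;
        for (String line : lines) {
            // String concatenation turns a null reference into the text "null".
            source += line;
        }
        return source;
    }

    public static void main(String[] args) {
        String[] lines = { "<html>", "<head>" };
        System.out.println(join(null, lines)); // prints null<html><head>
        System.out.println(join("", lines));   // prints <html><head>
    }
}
```

Starting from "" instead of null is the only difference between the two calls, which is why the one-line change above is enough.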

Or, better, use a StringBuilder.
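
A StringBuilder version might look roughly like the sketch below. This is an assumption about how the method could be restructured, not the asker's actual class: PageSourceSketch, readAll, and the address parameter are hypothetical names, and the reading loop is split out so it can be tried without a network connection. Like the original loop, it drops line terminators.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names are illustrative and not taken from the question.
class PageSourceSketch {

    // Reads everything from the reader into one String, dropping line
    // terminators (the same behaviour as the loop in the question).
    static String readAll(Reader reader) throws IOException {
        StringBuilder source = new StringBuilder(); // starts empty, never null
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                source.append(line);
            }
        }
        return source.toString();
    }

    // Stand-alone variant of getPageSource() that takes the address as a parameter.
    // The "http://" check is kept exactly as in the question.
    static String getPageSource(String address) throws IOException {
        URL url = new URL(address.contains("http://") ? address : "http://" + address);
        URLConnection urlConnection = url.openConnection();
        return readAll(new InputStreamReader(urlConnection.getInputStream(), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the page, so no network is needed here.
        System.out.println(readAll(new StringReader("<html>\n<head>\n"))); // prints <html><head>
    }
}
```

Besides avoiding the null prefix, append does not create a new String on every iteration the way += does, which matters for large pages.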
