
I am reading the content of a web page like this:

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings from malformed HTML instead of printing them
$doc->loadHTMLFile($url);
libxml_clear_errors();
$xpath = new DOMXPath($doc);      // Please don't suggest file_get_contents($url); I can't use it due to some issue
foreach($xpath->query("//script") as $script) {
    $script->parentNode->removeChild($script);
}
$textContent = $doc->textContent;

Then:

echo $textContent;

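I put this page's response into a div to display $textContent.

But the whole text is shown as a single paragraph, not laid out as it is on the original page. I tried wrapping it in a <pre> tag, but that displays everything on a single line.

So how can I display it in the same form as on the original web page?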

Current output:

Investor Herd Dynamics Want to start a startup? Get funded by Y Combinator. August 2013The biggest component in most investors' opinion of you is the opinion of other investors. Which is of course a recipe for exponential growth. When one investor wants to invest in you, that makes other investors want to, which makes others want to, and so on.Sometimes inexperienced founders mistakenly conclude that manipulating these forces is the essence of fundraising. They hear stories about stampedes to invest in successful startups, and think it's therefore the mark of a successful startup to have this happen. But actually the two are not that highly correlated. Lots of startups that cause stampedes end up flaming out (in extreme cases, partly as a result of the stampede), and lots of very successful startups were only moderately popular with investors the first time they raised money.So the point of this essay is not to explain how to create a stampede, but merely to explain the forces that generate them. These forces are always at work to some degree in fundraising, and they can cause surprising situations. If you understand them, you can at least avoid being surprised.One reason investors like you more when other investors like you is that you actually become a better investment. Raising money decreases the risk of failure. Indeed, although investors hate it, you are for this reason justified in raising your valuation for later investors. The investors who invested when you had no money were taking more risk, and are entitled to higher returns. Plus a company that has raised money is literally more valuable. After you raise the first million dollars, the company is at least a million dollars more valuable, because it's the same company as before, plus it has a million dollars in the bank. [1]Beware, though, because later investors so hate to have the price raised on them that they resist even this self-evident reasoning. 
Only raise the price on an investor you're comfortable with losing, because some will angrily refuse. [2]The second reason investors like you more when you've had some success at fundraising is that it makes you more confident, and an investors' opinion of you is the foundation of their opinion of your company. Founders are often surprised how quickly investors seem to know when they start to succeed at raising money. And while there are in fact lots of ways for such information to spread among investors, the main vector is probably the founders themselves. Though they're often clueless about technology, most investors are pretty good at reading people. When fundraising is going well, investors are quick to sense it in your increased confidence. (This is one case where the average founder's inability to remain poker-faced works to your advantage.)But frankly the....

1 Answer


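You are only getting the inner text of the HTML because that is how textContent works: it returns the concatenated text of all descendant nodes, with the markup (including the paragraph tags) stripped out. What you want instead is the node's innerHTML, which you can get by changing your code to something like this: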

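$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings from malformed HTML instead of printing them
$doc->loadHTMLFile($url);

libxml_clear_errors();

$xpath = new DOMXPath($doc);      // Please don't suggest file_get_contents($url); I can't use it due to some issue

// Remove every <script> element from the document
foreach($xpath->query("//script") as $script) {
    $script->parentNode->removeChild($script);
}

$innerHTML = '';

// Serialize each top-level node back to HTML, keeping the tags
foreach ($doc->childNodes as $child)
{
    $innerHTML .= $child->ownerDocument->saveHTML($child);
}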

Now you can echo $innerHTML wherever you like, and it will contain all of the HTML.
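To see the difference without fetching anything, here is a minimal self-contained sketch. The inline $html string is a hypothetical stand-in for the fetched page, and serializing only the children of <body> (rather than every child of the document) is my own assumption about what fits best inside a div; adjust as needed.

```php
<?php
// Hypothetical inline markup standing in for the fetched page.
$html = '<html><head><script>var x = 1;</script></head>'
      . '<body><p>First paragraph.</p><p>Second paragraph.</p></body></html>';

libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Remove every <script> element, as in the code above.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//script') as $script) {
    $script->parentNode->removeChild($script);
}

// textContent drops the markup, so the two paragraphs run together.
$text = $doc->documentElement->textContent;

// saveHTML() on each child of <body> keeps the <p> tags.
$body = $doc->getElementsByTagName('body')->item(0);
$innerHTML = '';
foreach ($body->childNodes as $child) {
    $innerHTML .= $doc->saveHTML($child);
}

echo $text, "\n";
echo $innerHTML, "\n";
```

The first echo prints the text with no paragraph boundaries, which is why neither a div nor a <pre> tag can restore the layout. The second still contains the <p> elements, so the browser renders separate paragraphs. Note that passing a node to saveHTML() requires PHP 5.3.6 or later.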

Answered 2013-09-30T09:45:22.080