php - How to extract hidden content from a remote web page using PHP?

Question

I want to use PHP (file_get_contents?) to read a website that obfuscates its text with hidden span tags.

Four examples:

Uwsebvrfahr
Zeinhvöhrdorf
Babeneinhvberg
Ksddbnvbhgaweaoihvwsaoirasudasuirchdorf/Kr.

The result should be:

Urfahr
Zöhrdorf
Babenberg
Kirchdorf/Kr.

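Two possible ways to solve the problem (but I don't know how to implement either of them):
A) Remove all span tags together with their content
B) Programmatically read only the VISIBLE content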
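
Approach A can be sketched with PHP's built-in DOM extension. This is only a minimal sketch: the span markup in `$samples` is an assumption (the page's real attributes are not shown in the question), and `removeSpans` is a hypothetical helper name.

```php
<?php
// Assumed markup: the real page's span attributes are not shown in the question.
$samples = [
    'U<span style="display:none">wsebv</span>rfahr',
    'K<span style="display:none">sddbnvbhgaweaoihvwsaoirasudasu</span>irchdorf/Kr.',
];

/**
 * Approach A: drop every <span> element together with its content
 * and return the text that is left.
 */
function removeSpans(string $fragment): string
{
    $doc = new DOMDocument();
    // The meta tag tells libxml to read the fragment as UTF-8.
    @$doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div>' . $fragment . '</div>'
    );

    // Copy the live node list first; removing nodes while iterating it skips elements.
    $spans = iterator_to_array($doc->getElementsByTagName('span'));
    foreach ($spans as $span) {
        $span->parentNode->removeChild($span);
    }

    return $doc->getElementsByTagName('div')->item(0)->textContent;
}

foreach ($samples as $sample) {
    echo removeSpans($sample), "\n";
}
```

Note that this removes every span, visible or not, so it only fits pages where spans are used for nothing but the hidden filler text.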

Thank you very much for your help!!!

score 1 · Accepted Answer

http://sourceforge.net/projects/simplehtmldom/files/latest/download?source=files

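// Requires the Simple HTML DOM library from the link above.
include('simple_html_dom.php');

$html = file_get_html('http://www.fussballoesterreich.at/netzwerk/datenservice/379402779304830775_O~733830065019629299~744933674800963515~0~1.htm');

$i = 1;
// Each entry is a link inside an element with the class "mannschaft".
foreach($html->find('.mannschaft a') as $e)
{
    // Decode HTML entities, then strip the markup left inside the link.
    $x = html_entity_decode($e->innertext, ENT_QUOTES, 'UTF-8');
    // Note: (.*) is greedy, so this removes everything from the first '<' to the last '>'.
    $x = preg_replace('#<(.*)>#', '', $x);
    echo $i, '. ', $x, '<br />';
    $i++;
}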

Result:

1. Garsten
2. S. Valent.ASK
3. Bumgartenberg
4. Neuhofen/Krems
5. Admira
6. Asten
7. Enns
8. Pasching 1b
9. S. Florian 1b
10. SValentin SC
11. Hörsching
12. S Ulrich
13. Wdischgarsten
14. Doppl-Hart

My work here is done.
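
A few entries in the list above look shortened (for example "Bumgartenberg" and "Wdischgarsten"). One possible cause is the greedy pattern `#<(.*)>#`: when a name contains more than one span, it also removes the visible text between them. The sketch below illustrates this with a hypothetical input string. The span markup is an assumption, since the page source is not shown here.

```php
<?php
// Hypothetical input: two hidden spans with one visible letter between them.
$innertext = 'B<span style="display:none">xq</span>a<span style="display:none">zt</span>umgartenberg';

// Greedy: (.*) runs from the first '<' to the last '>', so the visible 'a' is lost.
$greedy = preg_replace('#<(.*)>#', '', $innertext);

// Lazy and span-specific: each <span>...</span> pair is removed on its own.
$lazy = preg_replace('#<span\b[^>]*>.*?</span>#s', '', $innertext);

echo $greedy, "\n"; // Bumgartenberg
echo $lazy, "\n";   // Baumgartenberg
```

The lazy pattern still assumes that spans are not nested; for nested markup a DOM parser is the safer choice.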

score 0 · Accepted Answer

The fact that styles are applied makes no difference. To PHP, it is just a bunch of text.

Try:

<?php
$url = 'http://....';  // URL you're scraping.
$html = file_get_contents($url);
$text = strip_tags($html);
echo "<PRE>$text</PRE>";
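
One caveat: `strip_tags()` removes only the tags and keeps the text inside them, so the hidden filler text stays in the output. A rough sketch of reading only the visible text follows. It assumes the text is hidden with an inline `display:none` style (the real page may use a CSS class instead, which this would not catch), and `visibleText` is a hypothetical helper name.

```php
<?php
// Assumed markup; the real page may hide the text differently (for example via a CSS class).
$fragment = 'Baben<span style="display:none">einhv</span>berg';

// strip_tags() drops the tags but keeps the hidden text.
echo strip_tags($fragment), "\n"; // Babeneinhvberg

/**
 * Remove every element whose inline style contains display:none,
 * then return the remaining text.
 */
function visibleText(string $fragment): string
{
    $doc = new DOMDocument();
    // The meta tag tells libxml to read the fragment as UTF-8.
    @$doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div id="root">' . $fragment . '</div>'
    );

    $xpath = new DOMXPath($doc);
    // translate() strips spaces so that "display: none" matches as well.
    $hidden = $xpath->query('//*[contains(translate(@style, " ", ""), "display:none")]');
    foreach (iterator_to_array($hidden) as $node) {
        $node->parentNode->removeChild($node);
    }

    return $xpath->query('//div[@id="root"]')->item(0)->textContent;
}

echo visibleText($fragment), "\n"; // Babenberg
```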
