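I have a script that parses 23 pages of a site and grabs the names from the directory listing, but the connection gets reset and the script times out. Is it because of the foreach loop nested inside the for loop?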

<?php

header('Content-type: text/html; charset=utf-8'); // make sure the output encoding is right
include('simple_html_dom.php'); // the parser library

// walk the paginated listing, pages 1 through 23
for ($i = 1; $i <= 23; $i++) {
    $html = file_get_html("http://www.ratemyprofessors.com/SelectTeacher.jsp?sid=834&pageNo=$i"); // one page of the listing
    if (!$html) {
        continue; // file_get_html() returns false when the page could not be loaded
    }

    // each name on the page sits in a <div class="profName">
    foreach ($html->find("div[class=profName]") as $content) {
        echo $content->plaintext;
        echo "<br>";
    }
}

?>

1 Answer

I'd suggest using YQL: it is very fast, and it keeps your IP from being blacklisted...

Using it from PHP is fairly simple; see here.
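
To give a rough idea of what that could look like, here is a minimal sketch that only builds the request URL for one listing page. The helper name `buildYqlUrl` is hypothetical, and the endpoint (`https://query.yahooapis.com/v1/public/yql`) and the `html` table with an `xpath` filter are assumptions based on how YQL's public API was exposed. Yahoo has since retired the service, so treat this as an illustration rather than something guaranteed to work.

```php
<?php
// Hypothetical helper: builds a YQL request URL for one page of the listing.
// Assumes YQL's public endpoint and its "html" table; neither is confirmed by the question.
function buildYqlUrl(int $page): string
{
    // Same page URL the question's script requests directly
    $target = "http://www.ratemyprofessors.com/SelectTeacher.jsp?sid=834&pageNo=$page";

    // Ask YQL to fetch the page and return only the profName divs
    $query = sprintf(
        "select * from html where url=\"%s\" and xpath='//div[@class=\"profName\"]'",
        $target
    );

    // http_build_query() URL-encodes the query for us
    return 'https://query.yahooapis.com/v1/public/yql?' . http_build_query([
        'q'      => $query,
        'format' => 'json',
    ]);
}

echo buildYqlUrl(1), "\n";
```

The resulting URL would then be fetched with something like `file_get_contents()` and decoded with `json_decode()`, so the remote service does the scraping instead of your own server.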

Good luck!

Answered 2013-05-11T11:02:08.400