
I'm trying to scrape HTML from a website into an associative array. I tried it with Zend_Dom_Query.

Example:

<div class="job">
    <div class="jobTitle">
        <a href="http://website.com/Job-Title-1">Job-Title-1</a>
    </div>
    <div class="company">
        <a href="http://website.com/Company-1">Company-1</a>
    </div>
    <div class="city">
        <a href="http://website.com/City-1">City-1</a>
    </div>
</div>
<div class="job">
    <div class="jobTitle">
        <a href="http://website.com/Job-Title-2">Job-Title-2</a>
    </div>
    <div class="company">
        <a href="http://website.com/Company-2">Company-2</a>
    </div>
    <div class="city">
        <a href="http://website.com/City-2">City-2</a>
    </div>
</div>

How can I get an associative array from the HTML above?

 $dom = new Zend_Dom_Query($html);
 $links = $dom->query('div.jobTitle a');
 $companies = $dom->query('div.company');
 $cities = $dom->query('div.city');

 // result needed
 $result_array = array(
     array(
         'link'    => 'http://website.com/Job-Title-1',
         'Company' => 'Company-1',
         'City'    => 'City-1',
     ),
     array(
         'link'    => 'http://website.com/Job-Title-2',
         'Company' => 'Company-2',
         'City'    => 'City-2',
     ),
 );

1 Answer

    $dom = new Zend_Dom_Query($html);
    $links = $dom->query('div.jobTitle a');
    $companies = $dom->query('div.company');
    $cities = $dom->query('div.city');

    $data = [];
    // Walk the three result sets in lockstep: the n-th link is paired with
    // the n-th company and the n-th city.
    foreach ($links as $link) {
        $data[] = [
            'link'    => $link->getAttribute('href'),
            'Company' => trim($companies->current()->textContent),
            'City'    => trim($cities->current()->textContent),
        ];
        // Advance the company and city iterators by hand.
        $companies->next();
        $cities->next();
    }
    var_dump($data);
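
The loop above pairs links, companies and cities purely by position, so a single job block that lacks a company or a city would shift every later row. As a hedged alternative, here is a minimal sketch that reads each field relative to its own `div.job` container. It swaps Zend_Dom_Query for PHP's built-in `DOMDocument` and `DOMXPath`; `extractJobs()` is a hypothetical helper name, and the XPath expressions assume each `class` attribute holds exactly one class name, as in the sample markup.

```php
<?php
// Sketch only: uses PHP's built-in DOM extension, not Zend_Dom_Query.
// extractJobs() is a hypothetical helper name introduced for illustration.
function extractJobs(string $html): array
{
    $doc = new DOMDocument();
    // Scraped markup is rarely valid; keep libxml warnings internal.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $jobs = [];

    // Assumption: class="job" is the whole attribute value (no extra classes).
    foreach ($xpath->query('//div[@class="job"]') as $job) {
        // Each query is scoped to the current job node via the second argument.
        $link    = $xpath->query('.//div[@class="jobTitle"]//a', $job)->item(0);
        $company = $xpath->query('.//div[@class="company"]', $job)->item(0);
        $city    = $xpath->query('.//div[@class="city"]', $job)->item(0);

        // A missing field becomes null instead of shifting later rows.
        $jobs[] = [
            'link'    => $link instanceof DOMElement ? $link->getAttribute('href') : null,
            'Company' => $company !== null ? trim($company->textContent) : null,
            'City'    => $city !== null ? trim($city->textContent) : null,
        ];
    }

    return $jobs;
}

// Usage with a shortened version of the sample markup.
$sample = '<div class="job">'
    . '<div class="jobTitle"><a href="http://website.com/Job-Title-1">Job-Title-1</a></div>'
    . '<div class="company"><a href="http://website.com/Company-1">Company-1</a></div>'
    . '<div class="city"><a href="http://website.com/City-1">City-1</a></div>'
    . '</div>';
var_dump(extractJobs($sample));
```

If the real page puts several classes on one element, the `@class="job"` tests would need to be loosened, for example with `contains(concat(" ", normalize-space(@class), " "), " job ")`.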
answered 2013-05-15T13:49:00.317