
I am trying to extract some information from each URL in a list using XPath and PHP. It is important that something is printed for every URL, even if the XPath query returns nothing. To that end, I tried to set up my script to print N/A when XPath returns no results. However, the else clause is never entered and N/A is never printed.

The file scrape.txt contains 50 URLs. Results are returned for 47 of the 50 URLs. I am not concerned about my XPath query itself, but about making the script return some value for every URL attempted.

Can someone help me identify why this is happening and help me come up with a way to guarantee some string is printed regardless of whether or not there are results returned from the XPath query?

I'd appreciate any suggestions. Many thanks in advance!

$file = fopen('scrape.txt', "r");

$output = array();

while(!feof($file)){
    $line = fgets($file);

    $doc = new DOMDocument();
    $doc->loadHTMLFile($line);

    $XPath = new DOMXPath($doc);

    $elements = $XPath->query("//ul/li[1]/a[@class='geMain']");

    if (!is_null($elements)) {
        foreach ($elements as $element) {
            $nodes = $element->childNodes;
            foreach ($nodes as $node) {
                if(strcmp($node->nodeValue, "")!=0){
                    $output[] = trim($node->nodeValue);
                }
            }
        }
    }else{
        $output[] = "N/A";
    }   
}
array2csv($output);

2 Answers


You could try the following, although I am not sure I fully understand the problem. It uses a flag to track whether anything was added for the current URL and appends N/A if not:

$file = fopen('scrape.txt', "r");

$output = array();

while(!feof($file)){
    $line = fgets($file);

    $doc = new DOMDocument();
    $doc->loadHTMLFile($line);

    $XPath = new DOMXPath($doc);

    $elements = $XPath->query("//ul/li[1]/a[@class='geMain']");

    // Reset the flag for every URL; it is set only when a value is stored.
    $haveOutput = false;
    if (!is_null($elements)) {
        foreach ($elements as $element) {
            $nodes = $element->childNodes;
            foreach ($nodes as $node) {
                // Keep only child nodes that carry some text.
                if(strcmp($node->nodeValue, "")!=0){
                    $output[] = trim($node->nodeValue);
                    $haveOutput = true;
                }
            }
        }
    }

    // Nothing was stored for this URL, so record a placeholder instead.
    if (!$haveOutput) {
        $output[] = "N/A";
    }
}
array2csv($output);
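
One caveat with the loop above: fgets() returns each line with its trailing newline still attached, and a blank line in scrape.txt (for example at the end of the file) would be treated as a URL and get its own N/A entry. A minimal sketch of reading the list more defensively is shown below; the helper name readUrlList and the example.com URLs are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the
// non-empty, whitespace-trimmed lines of $path, one URL per entry.
function readUrlList(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return []; // unreadable file: nothing to scrape
    }

    $urls = [];
    foreach ($lines as $line) {
        $line = trim($line); // also strips a stray "\r" from Windows line endings
        if ($line !== '') {
            $urls[] = $line;
        }
    }
    return $urls;
}

// Demo with a temporary file standing in for scrape.txt.
$tmp = tempnam(sys_get_temp_dir(), 'urls');
file_put_contents($tmp, "http://example.com/a\r\nhttp://example.com/b\n\n");
$urls = readUrlList($tmp);
unlink($tmp);

print_r($urls);
```

Iterating over the returned array with foreach would then replace the feof()/fgets() loop, so each entry in the output corresponds to exactly one real URL.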
answered 2013-06-29T18:35:22.633

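DOMXPath->query returns a DOMNodeList object whether or not there are any results (it returns false only when the expression itself is malformed), so the is_null() check never fails. Test the list's length property instead: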

if ($elements->length == 0) {
  // No results found
} else {
  foreach ($elements as $element) {
    // for each result
  } 
}
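
To make the length check concrete, here is a small self-contained sketch that runs against in-memory HTML instead of live URLs. The function name extractOrNA and the two sample HTML strings are assumptions made for illustration. It also reads textContent of each matched element rather than walking childNodes as the question does, which is a simplification.

```php
<?php
// Hypothetical helper (not from the original post): returns the trimmed text
// of every node matching $query in $html, or ["N/A"] when nothing matches.
function extractOrNA(string $html, string $query): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally so imperfect HTML does not print notices.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $elements = $xpath->query($query);

    // query() yields false for a malformed expression and an empty
    // DOMNodeList when nothing matches; both cases map to "N/A".
    if ($elements === false || $elements->length === 0) {
        return ["N/A"];
    }

    $values = [];
    foreach ($elements as $element) {
        $text = trim($element->textContent);
        if ($text !== "") {
            $values[] = $text;
        }
    }

    // Matches that contain only whitespace still count as "no result".
    return $values !== [] ? $values : ["N/A"];
}

$query = "//ul/li[1]/a[@class='geMain']";
$withMatch = '<html><body><ul><li><a class="geMain">First</a></li></ul></body></html>';
$withoutMatch = '<html><body><p>No list here</p></body></html>';

print_r(extractOrNA($withMatch, $query));
print_r(extractOrNA($withoutMatch, $query));
```

Because the function always returns at least one entry, merging its result into $output for each URL would guarantee that every URL contributes something to the CSV.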
answered 2013-06-29T18:32:57.890