
Hello, I'm writing a PHP script to extract video URLs from YouTube search results. I have this:

<?php
    error_reporting(1);

    function conseguir_codigo_url($url) {
        $dwnld = curl_init();
        curl_setopt($dwnld, CURLOPT_URL, $url);
        curl_setopt($dwnld, CURLOPT_HEADER, 0);
        // Define the user agent before it is passed to cURL below.
        $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.01; Windows NT 6.0)';
        curl_setopt($dwnld, CURLOPT_USERAGENT, $userAgent);
        curl_setopt($dwnld, CURLOPT_RETURNTRANSFER, true);

        $fuente_url = curl_exec($dwnld);
        curl_close($dwnld);
        return $fuente_url;
    }

    function extraer_atributo_elemento($fuente) {
        $file = new DOMDocument;

        if($file->loadHTML($fuente) and $file->validate()){

            echo "DOCUMENTO";

            $file->getElementById("search-results");

        }
    }

    $codigo_url = conseguir_codigo_url("http://www.youtube.com/results?search_sort=video_date_uploaded&uni=3&search_type=videos&search_query=humor");
    extraer_atributo_elemento($codigo_url);
?>

The trouble is that I can't get getElementById to work; I think it may be because the page is HTML5. Do you have any suggestions to solve this? I need to parse the source and I don't know regex, so DOMDocument is my only option.


1 Answer


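Why are you using $file->validate()? If you just want to extract an element by ID, you don't need to call it. Also, setting DOMDocument::recover to true before calling loadHTML may help with parsing broken HTML from the web.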
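
As a rough illustration of that suggestion, here is a minimal sketch that drops the validate() call and sets recover before loadHTML. The sample markup in $fuente is hypothetical and only stands in for the downloaded page (the real YouTube markup is not shown here and will differ), and the call to libxml_use_internal_errors() is an extra assumption, added only to keep parser warnings quiet.

```php
<?php
// Hypothetical sample markup standing in for the page source; note the
// missing closing tags, which the HTML parser is expected to tolerate.
$fuente = '<html><body><div id="search-results">'
        . '<a href="/watch?v=abc123">First</a>'
        . '<a href="/watch?v=def456">Second</a>';

// Assumption: collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$file = new DOMDocument;
$file->recover = true;      // set before loadHTML, as suggested above
$file->loadHTML($fuente);   // no validate() call afterwards
libxml_clear_errors();

$urls = array();
$resultados = $file->getElementById("search-results");
if ($resultados !== null) {
    // Collect the href of every link inside the container.
    foreach ($resultados->getElementsByTagName("a") as $enlace) {
        $urls[] = $enlace->getAttribute("href");
    }
}

print_r($urls);
```

With the real page, $fuente would be the string returned by conseguir_codigo_url(), and the links found inside the container would depend on whatever markup YouTube actually serves.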

Answered 2013-01-13T15:48:21.097