0

I used a simple file_get_contents call, but it does not return the actual content (output).

I can't figure out what is wrong!

Code:

<?php 

// $url = $_GET['url'];

// $flv_http_path = urlencode($url);

 $flv_http_path = 'http://r12.bhartibb-maa1.c.youtube.com/videoplayback?ip=0.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor%2Coc%3AU0dXSlBSUl9FSkNNN19ITFZB&algorithm=throttle-factor&itag=34&ipbits=0&burst=40&sver=3&expire=1285074000&key=yt1&signature=3E1E4994130745C392FA479F6ACCE5F40E703A2C.A87325A1DCB178B04FD89A9DEEE811CDCB08157C&factor=1.25&id=8b2fd4fd9ac2f09f&st=lc';

 echo "----$flv_http_path------";


 $data = file_get_contents($flv_http_path);

 echo "$data";

 if($data)
    echo "data is avail";
 else
    echo "data not available";

// $new_flv_path = dirname(__FILE__).'/flvs/sample.flv' ;

 $new_flv_path = '/home/public_html/temp/sample.flv' ;

 if(file_put_contents($new_flv_path, $data))
    return $new_flv_path ;
 else
 {
    echo "else part ";
    return false;
 }

?>

I got this URL from the response headers of a YouTube video.

The headers I got are:

http://v3.lscache1.c.youtube.com/videoplayback?ip=0.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor%2Coc%3AU0dXSlBTVl9FSkNNN19ITVpF&algorithm=throttle-factor&itag=34&ipbits=0&burst=40&sver=3&expire=1285088400&key=yt1&signature=536A81F10AA43A4E015BB05FA182A9A966047C3C.C22269E2E1ECFC2C2DE7A8A45BA2C3DF7CF1EC08&factor=1.25&id=fd61d32bbbd1be5e&

GET /videoplayback?ip=0.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor%2Coc%3AU0dXSlBTVl9FSkNNN19ITVpF&algorithm=throttle-factor&itag=34&ipbits=0&burst=40&sver=3&expire=1285088400&key=yt1&signature=536A81F10AA43A4E015BB05FA182A9A966047C3C.C22269E2E1ECFC2C2DE7A8A45BA2C3DF7CF1EC08&factor=1.25&id=fd61d32bbbd1be5e& HTTP/1.1
Host: v3.lscache1.c.youtube.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1) Gecko/20090616 Firefox/3.5
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: VISITOR_INFO1_LIVE=9CH-GUrsSEQ; __utma=27069237.1455305642.1275034254.1279868001.1280568792.6; __utmz=27069237.1279868001.5.2.utmcsr=google.com|utmccn=(referral)|utmcmd=referral|utmcct=/landing/youtube/lifeinaday/; watched_video_id_list_kvijayhari=7b1d7ce3852b9aca07a985813b83aaa6WxMAAABzCwAAADFuNzRnSExwU0M4cwsAAAB2ajgxNXlQNDFMQXMLAAAARWNjZ0lLdHVDM1lzCwAAAHFHZFo5elhoQ0ZvcwsAAAB0WXMwTXhvbTRjSXMLAAAAYUdBdDZwNGh0c2NzCwAAAGR2V25wMjdBSGZvcwsAAABtNDBhbG1SQzNzSXMLAAAANjhVT1BhTUtwOTBzCwAAADZnaFUxWDBqdVM4cwsAAABiRy0xYTRsUnlEMHMLAAAAWjh5OFFDRFNUQ29zCwAAADY0T0w3NzhBeUlFcwsAAABzQkl1OWpnSWtwQXMLAAAASllYM08wWEEteWdzCwAAAF95WGxpc0g4dkF3cwsAAABzcXZCSXdDMWxtWXMLAAAAaEMzd09EU0U5MHdzCwAAAGZaODhxaHduTVow; auto_translation=b901c47ed36700682e23d64062529856cwQAAAB0cnVl; PREF=f1=50000000&f2=2000&emt=iceberg&ftuc=32&ems=hd720&HIDDEN_MASTHEAD_ID=brO_JIa6RTI; use_hitbox=72c46ff6cbcdb7c5585c36411b6b334edAEAAAAw; GEO=489e10e70a42c0dfed7513e1895ffe1bcwsAAAAzSU56spxTTJhEAw==; watched_video_id_list=2aa4a241cbdc35137f13b3513ea3e653WwQAAABzCwAAAF9XSFRLN3ZSdmw0cwsAAABpeV9VX1pyQzhKOHMLAAAAd3ZsTUFKLVU2SEVzCwAAAENaQmpoVGQ0WjlN

HTTP/1.0 200 OK
Last-Modified: Sun, 20 Jun 2010 03:59:10 GMT
Content-Type: video/x-flv
Date: Tue, 21 Sep 2010 10:05:34 GMT
Expires: Tue, 21 Sep 2010 16:55:00 GMT
Cache-Control: public, max-age=24566
Content-Length: 4077907
Accept-Ranges: bytes
X-Content-Type-Options: nosniff
Server: gvs 1.0
X-Cache: MISS from localhost.localdomain
X-Cache-Lookup: MISS from localhost.localdomain:3128
Via: 1.0 localhost.localdomain:3128 (squid/2.6.STABLE6)
Connection: keep-alive

6 Answers

1

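Check your URL.

When I open your URL in a browser it returns nothing, so file_get_contents returns an empty string.

You need to check the return value of file_get_contents with a strict comparison, because false signals a failure while an empty string is merely an empty response:

if($data !== false)

instead of

if($data)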
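
To illustrate the difference, here is a minimal sketch. The helper name describeFetchResult is made up for this example and is not part of PHP. It only shows that false, an empty string, and a short body such as "0" are three different outcomes that a loose if($data) check would lump together.

```php
<?php
// Hypothetical helper (not part of PHP): classifies the value
// returned by file_get_contents().
function describeFetchResult($data): string
{
    if ($data === false) {
        // file_get_contents() returns false when the request itself failed.
        return 'request failed';
    }
    if ($data === '') {
        // The request worked, but the server sent no body.
        return 'request succeeded, empty body';
    }
    // Any other string is real content, even "0", which a loose check treats as false.
    return 'request succeeded, ' . strlen($data) . ' bytes';
}

// Usage (network request, so it is left commented out):
// echo describeFetchResult(file_get_contents($flv_http_path));
```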
answered 2010-09-21T09:57:11.433
1

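I also got an HTTP 500 response. To scrape YouTube you will probably have to spoof the User-Agent of your request, and take other measures, to keep YouTube from identifying you as a crawler.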
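
A rough sketch of how a User-Agent can be set for file_get_contents through a stream context. The helper name buildHttpContextOptions and the 10-second timeout are assumptions made for this example. Another answer in this thread reports that a spoofed user agent alone still produced a 403, so treat this as a way to send the header, not as a guaranteed fix.

```php
<?php
// Hypothetical helper: builds HTTP stream-context options that send
// a browser-like User-Agent header with the request.
function buildHttpContextOptions(string $userAgent, float $timeoutSeconds = 10.0): array
{
    return [
        'http' => [
            'method'        => 'GET',
            'user_agent'    => $userAgent,
            'timeout'       => $timeoutSeconds,
            // Return the response body even for 4xx/5xx status codes.
            'ignore_errors' => true,
        ],
    ];
}

// Usage (network request, so it is left commented out):
// $ua      = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1) Gecko/20090616 Firefox/3.5';
// $context = stream_context_create(buildHttpContextOptions($ua));
// $data    = file_get_contents($flv_http_path, false, $context);
```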

answered 2010-09-21T09:58:56.017
0

I get an HTTP 403 at:

http://r12.bhartibb-maa1.c.youtube.com/videoplayback?ip=0.0.0.0&sparams=id,expire,ip,ipbits,itag,algorithm,burst,factor,oc:U0dXSlBSUl9FSkNNN19ITFZB&algorithm=throttle-factor&itag=34&ipbits=0&burst=40&sver=3&expire=1285074000&key=yt1&signature=3E1E4994130745C392FA479F6ACCE5F40E703A2C.A87325A1DCB178B04FD89A9DEEE811CDCB08157C&factor=1.25&id=8bdacstfdf4

Response headers:

Content-Type: text/plain
Date: Tue, 21 Sep 2010 09:59:13 GMT
Proxy-Connection: close
Server: gvs 1.0
Via: 1.0 proxy3@XXXXX.sch.uk:8080 (squid/2.6.STABLE19), 1.0 wcsproxy.XXXX.org.uk:8080 (squid/2.6.STABLE19)
X-Cache: MISS from proxy3@XXX.sch.uk, MISS from wcsproxy.XXX.org.uk
X-Content-Type-Options: nosniff

answered 2010-09-21T10:01:53.940
0

Well, when I try to load the URL you reference in $flv_http_path, I get:

HTTP/1.1 403 Forbidden
Content-Type: text/plain
Connection: close
X-Content-Type-Options: nosniff
Date: Tue, 21 Sep 2010 09:57:19 GMT
Server: gvs 1.0

in return.

That should give you a clue :)
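
To see that status from inside the script, one option is to read the status line that PHP's HTTP wrapper records for file_get_contents. This is only a sketch: parseStatusCode is a hypothetical helper name, and the commented usage assumes a single response with no redirects, so the first recorded header line is the relevant status line.

```php
<?php
// Hypothetical helper: extracts the numeric status code from a status
// line such as "HTTP/1.1 403 Forbidden". Returns null if the line is
// not an HTTP status line.
function parseStatusCode(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

// Usage (network request, so it is left commented out):
// $context = stream_context_create(['http' => ['ignore_errors' => true]]);
// $body    = file_get_contents($flv_http_path, false, $context);
// $status  = parseStatusCode($http_response_header[0] ?? '');
// if ($status !== 200) {
//     echo "Server answered with HTTP $status\n";
// }
```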

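If that isn't the actual file you are trying to open, and you aren't really trying to scrape YouTube, you should try wrapping the URL in urlencode(). Edit: but the URL is already urlencoded (doh!)

"If you're opening a URI with special characters, such as spaces, you need to encode the URI with urlencode()." -- http://www.php.net/manual/en/function.file-get-contents.php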
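
A small sketch of why this matters: urlencode() applied to a complete URL also encodes the delimiters, so it is normally applied to individual values only. The example.com URLs and the helper name buildUrl are placeholders invented for this illustration.

```php
<?php
// Encoding a complete URL also encodes ":", "/", "?" and "=",
// so the result can no longer be used as a URL.
$wholeUrlEncoded = urlencode('http://example.com/video play?id=1');
// $wholeUrlEncoded is 'http%3A%2F%2Fexample.com%2Fvideo+play%3Fid%3D1'

// Hypothetical helper: encodes only the query values and leaves the
// structure of the URL (scheme, host, path, separators) intact.
function buildUrl(string $base, array $params): string
{
    return $base . '?' . http_build_query($params, '', '&');
}

$url = buildUrl('http://example.com/videoplayback', [
    'itag'    => 34,
    'sparams' => 'id,expire,ip',
]);
// $url is 'http://example.com/videoplayback?itag=34&sparams=id%2Cexpire%2Cip'
```

A URL that is already encoded, like the one in the question, should not be encoded a second time.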

answered 2010-09-21T10:02:13.903
0

The link is empty. Open the link in a browser and check the page source. There is no data.

answered 2010-09-21T10:03:36.243
0

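This is YouTube's way of preventing you from fetching their FLV files automatically.

You can't fetch the file from your server because the download link (which you got from your browser, or however you found the FLV link) is locked to your browser.

That is why everyone other than you who tries to call that link gets an HTTP 403 Forbidden, even with a spoofed user agent.

Try using cURL and displaying the headers, and you'll see what I mean.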
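
Something along these lines would do it. This is a sketch that assumes the cURL extension is installed. The helper names fetchWithHeaders and splitResponse and the 30-second timeout are made up for this example.

```php
<?php
// Hypothetical helper: splits a raw response fetched with CURLOPT_HEADER
// enabled into its header text and its body.
function splitResponse(string $raw, int $headerSize): array
{
    return [
        'headers' => substr($raw, 0, $headerSize),
        'body'    => (string) substr($raw, $headerSize),
    ];
}

// Hypothetical helper: fetches a URL with cURL and returns the status code,
// the raw response headers and the body, or null if the transfer failed.
function fetchWithHeaders(string $url): ?array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the response instead of printing it
        CURLOPT_HEADER         => true, // include the response headers in the output
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 30,
    ]);

    $raw = curl_exec($ch);
    if ($raw === false) {
        return null;
    }

    $status     = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);

    return ['status' => $status] + splitResponse($raw, $headerSize);
}

// Usage (network request, so it is left commented out):
// $response = fetchWithHeaders($flv_http_path);
// if ($response !== null) {
//     echo "HTTP {$response['status']}\n{$response['headers']}";
// }
```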

answered 2010-09-21T10:12:16.740