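I am working on a video analysis project that requires downloading videos from YouTube and uploading them to Google Cloud Storage (GCS). I could not figure out a way to upload them to GCS directly, so I tried downloading them to my local machine first and then uploading them to GCS.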
I went through several Stack Overflow posts, such as "python: get all youtube video urls of a channel", and with their help I came up with the following script.
import urllib.request
import json
from pytube import YouTube
import pickle
def get_all_video_in_channel(channel_id):
    api_key = 'AIzaSyCK9eQlD1ptx0SKMsmL0srmL2ua9_EuwSs'
    base_video_url = 'https://www.youtube.com/watch?v='
    base_search_url = 'https://www.googleapis.com/youtube/v3/search?'
    first_url = base_search_url+'key={}&channelId={}&part=snippet,id&order=date&maxResults=25'.format(api_key, channel_id)
    video_links = []
    url = first_url
    while True:
        inp = urllib.request.urlopen(url)
        resp = json.load(inp)
        for i in resp['items']:
            # Search results can also contain channels and playlists; keep videos only
            if i['id']['kind'] == "youtube#video":
                video_links.append(base_video_url + i['id']['videoId'])
        try:
            next_page_token = resp['nextPageToken']
            url = first_url + '&pageToken={}'.format(next_page_token)
        except KeyError:
            # No nextPageToken in the response means this was the last page
            break
    return video_links
# Collect the URLs of all videos in the channel
load_url = get_all_video_in_channel(channel_id)
# Download every video in the list to the local machine.
# Need to figure out if there is a way to upload them to GCS directly.
for url in load_url:
    YouTube(url).streams.first().download('C:/Users/Tushar/Documents/Serato_Video_Intelligence/youtube_videos')
It works only for the first two video URLs and then fails with the following error:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "C:\Python37\lib\site-packages\pytube\streams.py", line 217, in download
    bytes_remaining = self.filesize
  File "C:\Python37\lib\site-packages\pytube\streams.py", line 164, in filesize
    headers = request.get(self.url, headers=True)
  File "C:\Python37\lib\site-packages\pytube\request.py", line 21, in get
    response = urlopen(url)
  File "C:\Python37\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python37\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Python37\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python37\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "C:\Python37\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Python37\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
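To find out which URLs trigger the 403 instead of stopping at the first failure, the download loop could record each error and carry on. The following is only a minimal sketch: `download_all` and its `download_one` parameter are hypothetical names introduced here, and it assumes the failure surfaces as `urllib.error.HTTPError`, as it does in the traceback above.

```python
import urllib.error


def download_all(urls, download_one):
    """Try every URL and return (succeeded, failed) instead of stopping early.

    download_one is a hypothetical callable that downloads a single URL,
    e.g. lambda u: YouTube(u).streams.first().download(target_dir)
    """
    succeeded, failed = [], []
    for url in urls:
        try:
            download_one(url)
        except urllib.error.HTTPError as err:
            # Keep the HTTP status (e.g. 403) together with the URL that caused it
            failed.append((url, err.code))
        else:
            succeeded.append(url)
    return succeeded, failed
```

Calling `download_all(load_url, ...)` with a suitable `download_one` would then give a list of failing URLs that can be retried or inspected by hand. It does not remove the cause of the 403; it only makes the failures visible.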
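I would appreciate it if someone could help me understand what is going wrong here and how to fix it. I need this urgently and have been stuck on it for a while.
Thanks a lot in advance!
P.S. If possible, is there a way to upload the videos to GCS directly?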
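For context, the second half of the two-step approach described at the top (uploading the already downloaded files) could look roughly like the sketch below. It does not upload directly from YouTube. It assumes the `google-cloud-storage` client library and working credentials; `upload_directory`, `open_bucket`, and the bucket name and prefix in the usage note are hypothetical names chosen for illustration.

```python
import os


def open_bucket(bucket_name):
    """Return a bucket handle; requires the google-cloud-storage package."""
    # Imported lazily so upload_directory can be used with any bucket-like object
    from google.cloud import storage
    return storage.Client().bucket(bucket_name)


def upload_directory(bucket, directory, prefix=""):
    """Upload every regular file in `directory`; return the object names used."""
    uploaded = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        object_name = prefix + name
        # blob() and upload_from_filename() are google-cloud-storage calls
        bucket.blob(object_name).upload_from_filename(path)
        uploaded.append(object_name)
    return uploaded
```

A call such as `upload_directory(open_bucket("my-video-bucket"), "youtube_videos", prefix="raw/")` would then push the local files to the bucket under the `raw/` prefix.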