
I'm using the pafy module to retrieve URLs for the audio of YouTube videos. Pafy itself uses the youtube-dl module to connect to YouTube. The video ID is the part of the URL after 'v=': '/watch?v=videoID'
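
For reference, a minimal sketch of pulling the video ID out of such a URL with the standard library. The helper name `extract_video_id` is made up for illustration, and it assumes the ID always sits in the `v` query parameter, as in the '/watch?v=videoID' form above.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the value of the 'v' query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get('v')
    return values[0] if values else None


print(extract_video_id('https://www.youtube.com/watch?v=tRqCOIsTx8M'))
```

Other URL shapes (short links, embed URLs) are not handled by this sketch.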

I store the video IDs of those videos, but from time to time a video is no longer available, so I need to check for that. The checks I tried to implement fail to catch some edge cases, such as a video that is blocked in my country for copyright reasons.

I have already tried two things: the YouTube oEmbed API and the official YouTube Data API.

import requests


YT_API_KEY='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'

answers = []
answers2 = []

def isValidVid(vId):

    add = f'https://www.youtube.com/oembed?format=json&url=https://www.youtube.com/watch?v={str(vId)}'
    answer = requests.get(add)

    answers.append(answer)

    state = answer.ok
    if not state:
        print(f'{vId} has become invalid ')
    return state



def isValidVid2(vId):

    add = f'https://www.googleapis.com/youtube/v3/videos?part=id&id={vId}&key={YT_API_KEY}'
    answer = requests.get(add)

    answers2.append(answer)

    state = answer.ok
    if not state:
        print(f'{vId} has become invalid ')
    return state

vIds=['x26LZrX_vuI', #copyright blocked in my country
    'tRqCOIsTx8M', #works fine
    ]

for vid in vIds:
    isValidVid(vid)
    isValidVid2(vid)

for a,b in zip(answers,answers2):
    # .content is bytes, so convert it to str before splitting on ','
    print(str(a.content).split(','),'\n')
    print(str(b.content).split(','),'\n'*2)

Running this snippet produces the following output (sorry, it looks ugly):

['b\'{"provider_name":"YouTube"', '"thumbnail_url":"https:\\\\/\\\\/i.ytimg.com\\\\/vi\\\\/x26LZrX_vuI\\\\/hqdefault.jpg"', '"type":"video"', '"thumbnail_width":480', '"width":459', '"version":"1.0"', '"thumbnail_height":360', '"author_url":"https:\\\\/\\\\/www.youtube.com\\\\/user\\\\/whiteassboy9"', '"author_name":"whiteassboy9"', '"height":344', '"provider_url":"https:\\\\/\\\\/www.youtube.com\\\\/"', '"html":"\\\\u003ciframe width=\\\\"459\\\\" height=\\\\"344\\\\" src=\\\\"https:\\\\/\\\\/www.youtube.com\\\\/embed\\\\/x26LZrX_vuI?feature=oembed\\\\" frameborder=\\\\"0\\\\" allow=\\\\"accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture\\\\" allowfullscreen\\\\u003e\\\\u003c\\\\/iframe\\\\u003e"', '"title":"Cradle Of Filth Hallowed Be Thy Name Lyrics"}\'']

['b\'{\\n "kind": "youtube#videoListResponse"', '\\n "etag": "\\\\"Bdx4f4ps3xCOOo1WZ91nTLkRZ_c/0OApMzmvJ5MISdLQorUKdgn8QAI\\\\""', '\\n "pageInfo": {\\n  "totalResults": 1', '\\n  "resultsPerPage": 1\\n }', '\\n "items": [\\n  {\\n   "kind": "youtube#video"', '\\n   "etag": "\\\\"Bdx4f4ps3xCOOo1WZ91nTLkRZ_c/Sdc-dVIem9K8Yo5a4co2ZdNy_b8\\\\""', '\\n   "id": "x26LZrX_vuI"\\n  }\\n ]\\n}\\n\'']


['b\'{"version":"1.0"', '"provider_name":"YouTube"', '"thumbnail_url":"https:\\\\/\\\\/i.ytimg.com\\\\/vi\\\\/tRqCOIsTx8M\\\\/hqdefault.jpg"', '"author_url":"https:\\\\/\\\\/www.youtube.com\\\\/user\\\\/Kikuku94"', '"width":459', '"thumbnail_height":360', '"provider_url":"https:\\\\/\\\\/www.youtube.com\\\\/"', '"html":"\\\\u003ciframe width=\\\\"459\\\\" height=\\\\"344\\\\" src=\\\\"https:\\\\/\\\\/www.youtube.com\\\\/embed\\\\/tRqCOIsTx8M?feature=oembed\\\\" frameborder=\\\\"0\\\\" allow=\\\\"accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture\\\\" allowfullscreen\\\\u003e\\\\u003c\\\\/iframe\\\\u003e"', '"height":344', '"type":"video"', '"thumbnail_width":480', '"author_name":"Kikuku94"', '"title":"Metallica - Disposable Heroes (Studio Version)"}\''] 

['b\'{\\n "kind": "youtube#videoListResponse"', '\\n "etag": "\\\\"Bdx4f4ps3xCOOo1WZ91nTLkRZ_c/HP3fEMhAlfsUHWjDdwxAE4mcxLg\\\\""', '\\n "pageInfo": {\\n  "totalResults": 1', '\\n  "resultsPerPage": 1\\n }', '\\n "items": [\\n  {\\n   "kind": "youtube#video"', '\\n   "etag": "\\\\"Bdx4f4ps3xCOOo1WZ91nTLkRZ_c/h2d05hbW0K56lbhpO-q8FN9qVdU\\\\""', '\\n   "id": "tRqCOIsTx8M"\\n  }\\n ]\\n}\\n\'']
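
Note that both requests succeed for the blocked video as well, so `answer.ok` alone cannot tell the two videos apart. Below is a hedged sketch of a stricter check on the Data API response. It assumes the request is changed to `part=contentDetails` (instead of `part=id`) and that region blocks show up under `contentDetails.regionRestriction` with an `allowed` or `blocked` list of country codes, as described in the YouTube Data API v3 video resource documentation. The function name `is_playable_in`, the country code 'DE', and the sample dictionaries are made up for illustration; this has not been verified against the blocked video above.

```python
def is_playable_in(response_json, country):
    """Guess whether a videos.list response describes a video viewable in `country`.

    `response_json` is the parsed JSON of a request made with part=contentDetails.
    """
    items = response_json.get('items', [])
    if not items:
        # The request succeeded but returned no video for this ID.
        return False

    restriction = items[0].get('contentDetails', {}).get('regionRestriction', {})
    if 'allowed' in restriction:
        # Viewable only in the listed countries.
        return country in restriction['allowed']
    # Viewable everywhere except the listed countries (if any).
    return country not in restriction.get('blocked', [])


# Made-up sample responses, not real API output.
blocked = {'items': [{'contentDetails': {'regionRestriction': {'blocked': ['DE']}}}]}
unrestricted = {'items': [{'contentDetails': {}}]}

print(is_playable_in(blocked, 'DE'))
print(is_playable_in(unrestricted, 'DE'))
```

With `requests`, the parsed dictionary would come from `answer.json()`. There may be kinds of unavailability that this field does not cover.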

If I then use pafy to retrieve the URL of the audio resource, after using either of the validation functions, it raises an error within youtube-dl, which I recognize as the same error you'd get when providing an invalid video ID:

import pafy

mrl = pafy.new(vIds[0]).url

OSError: ERROR: HTTP is not supported.
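
As a fallback, the failing call itself can serve as the check. The following is only a sketch: `try_get_url` and `fake_resolve` are hypothetical names, and it assumes every unavailable video surfaces as an `OSError` like the one above, which may not hold for all cases. The resolver is passed in as a function so the example runs without network access.

```python
def try_get_url(v_id, resolve):
    """Return resolve(v_id), or None if the resolver raises OSError."""
    try:
        return resolve(v_id)
    except OSError as err:
        print(f'{v_id} could not be resolved: {err}')
        return None


def fake_resolve(v_id):
    # Stand-in for the real lookup; always fails like the blocked video does.
    raise OSError('ERROR: simulated failure')


print(try_get_url('x26LZrX_vuI', fake_resolve))

# Intended use with pafy (not executed here, needs network access):
#   import pafy
#   mrl = try_get_url('x26LZrX_vuI', lambda v: pafy.new(v).url)
```

This costs a full lookup per video, so it is slower than an API check.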

I assume that there are more pitfalls than copyright bans, like private videos. If you've got some tricks, I'm listening!

Thanks for helping!
