我想提取 YouTube 视频的标题。我怎样才能做到这一点?
谢谢。
Easiest way to obtain information about a youtube video afaik is to parse the string retrieved from: http://youtube.com/get_video_info?video_id=XXXXXXXX
Using something like PHP's parse_str(), you can obtain a nice array of nearly anything about the video:
$content = file_get_contents("http://youtube.com/get_video_info?video_id=".$id);
parse_str($content, $ytarr);
echo $ytarr['title'];
That will print the title for the video using $id as the video's id.
你好在python3中我建立了2种方式
1) 没有 API KEY
import urllib.request
import json
import urllib
import pprint
#change to yours VideoID or change url inparams
VideoID = "SZj6rAYkYOg"
params = {"format": "json", "url": "https://www.youtube.com/watch?v=%s" % VideoID}
url = "https://www.youtube.com/oembed"
query_string = urllib.parse.urlencode(params)
url = url + "?" + query_string
with urllib.request.urlopen(url) as response:
response_text = response.read()
data = json.loads(response_text.decode())
pprint.pprint(data)
print(data['title'])
示例结果:
{'author_name': 'Google Developers',
'author_url': 'https://www.youtube.com/user/GoogleDevelopers',
'height': 270,
'html': '<iframe width="480" height="270" '
'src="https://www.youtube.com/embed/SZj6rAYkYOg?feature=oembed" '
'frameborder="0" allow="autoplay; encrypted-media" '
'allowfullscreen></iframe>',
'provider_name': 'YouTube',
'provider_url': 'https://www.youtube.com/',
'thumbnail_height': 360,
'thumbnail_url': 'https://i.ytimg.com/vi/SZj6rAYkYOg/hqdefault.jpg',
'thumbnail_width': 480,
'title': 'Google I/O 101: Google APIs: Getting Started Quickly',
'type': 'video',
'version': '1.0',
'width': 480}
Google I/O 101: Google APIs: Getting Started Quickly
2) 使用 Google API - 需要 APIKEY
import urllib.request
import json
import urllib
import pprint
APIKEY = "YOUR_GOOGLE_APIKEY"
VideoID = "YOUR_VIDEO_ID"
params = {'id': VideoID, 'key': APIKEY,
'fields': 'items(id,snippet(channelId,title,categoryId),statistics)',
'part': 'snippet,statistics'}
url = 'https://www.googleapis.com/youtube/v3/videos'
query_string = urllib.parse.urlencode(params)
url = url + "?" + query_string
with urllib.request.urlopen(url) as response:
response_text = response.read()
data = json.loads(response_text.decode())
pprint.pprint(data)
print("TITLE: %s " % data['items'][0]['snippet']['title'])
示例结果:
{'items': [{'id': 'SZj6rAYkYOg',
'snippet': {'categoryId': '28',
'channelId': 'UC_x5XG1OV2P6uZZ5FSM9Ttw',
'title': 'Google I/O 101: Google APIs: Getting '
'Started Quickly'},
'statistics': {'commentCount': '36',
'dislikeCount': '20',
'favoriteCount': '0',
'likeCount': '418',
'viewCount': '65783'}}]}
TITLE: Google I/O 101: Google APIs: Getting Started Quickly
使用 JavaScript 数据 API:
var loadInfo = function (videoId) {
var gdata = document.createElement("script");
gdata.src = "http://gdata.youtube.com/feeds/api/videos/" + videoId + "?v=2&alt=jsonc&callback=storeInfo";
var body = document.getElementsByTagName("body")[0];
body.appendChild(gdata);
};
var storeInfo = function (info) {
console.log(info.data.title);
};
然后你只需要打电话loadInfo(videoId)
。
API 文档中提供了更多信息。
我将按照YouTube API v3 文档中的概述来布置流程。
在https://console.developers.google.com/apis/credentials创建一个新项目。
制作一个新的 API 密钥。您需要它来访问 v3 下的视频信息。
https://www.googleapis.com/youtube/v3/videos?id=<YOUR VIDEO ID HERE>&key=<YOUR API KEY HERE>%20&part=snippet
无尖括号)
fields
和part
参数是这里的关键。嗯,你可以通过URL
浏览器访问哪个 URL 来查看它。作为回报,您应该得到API response:
.
URL: https://www.googleapis.com/youtube/v3/videos?id=7lCDEYXw3mM&key=YOUR_API_KEY
&fields=items(id,snippet(channelId,title,categoryId),statistics)&part=snippet,statistics
Description: This example modifies the fields parameter from example 3
so that in the API response, each video resource's snippet
object only includes the channelId, title,
and categoryId properties.
API response:
{
"videos": [
{
"id": "7lCDEYXw3mM",
"snippet": {
"channelId": "UC_x5XG1OV2P6uZZ5FSM9Ttw",
"title": "Google I/O 101: Q&A On Using Google APIs",
"categoryId": "28"
},
"statistics": {
"viewCount": "3057",
"likeCount": "25",
"dislikeCount": "0",
"favoriteCount": "17",
"commentCount": "12"
}
}
]
}
.json
这将为您提供文件格式的视频信息。如果您的项目是通过 JavaScript 访问此信息,那么您接下来可能会去这里:如何在 Javascript 中从 URL 获取 JSON?.
我相信最好的方法是使用youTube的gdata,然后从返回的XML中获取信息
http://gdata.youtube.com/feeds/api/videos/6_Ukfpsb8RI
更新:现在有一个更新的 API,你应该使用它
https://developers.google.com/youtube/v3/getting-started
URL: https://www.googleapis.com/youtube/v3/videos?id=7lCDEYXw3mM&key=YOUR_API_KEY
&fields=items(id,snippet(channelId,title,categoryId),statistics)&part=snippet,statistics
Description: This example modifies the fields parameter from example 3 so that in the API response, each video resource's snippet object only includes the channelId, title, and categoryId properties.
API response:
{
"videos": [
{
"id": "7lCDEYXw3mM",
"snippet": {
"channelId": "UC_x5XG1OV2P6uZZ5FSM9Ttw",
"title": "Google I/O 101: Q&A On Using Google APIs",
"categoryId": "28"
},
"statistics": {
"viewCount": "3057",
"likeCount": "25",
"dislikeCount": "0",
"favoriteCount": "17",
"commentCount": "12"
}
}
]
}
// This is the youtube video URL: http://www.youtube.com/watch?v=nOHHta68DdU
$code = "nOHHta68DdU";
// Get video feed info (xml) from youtube, but only the title | http://php.net/manual/en/function.file-get-contents.php
$video_feed = file_get_contents("http://gdata.youtube.com/feeds/api/videos?v=2&q=".$code."&max-results=1&fields=entry(title)&prettyprint=true");
// xml to object | http://php.net/manual/en/function.simplexml-load-string.php
$video_obj = simplexml_load_string($video_feed);
// Get the title string to a variable
$video_str = $video_obj->entry->title;
// Output
echo $video_str;
使用 bash、wget 和 lynx:
#!/bin/bash
read -e -p "Youtube address? " address
page=$(wget "$address" -O - 2>/dev/null)
title=$(echo "$page" | grep " - ")
title="$(lynx --dump -force-html <(echo "<html><body>
$title
</body></html>")| grep " - ")"
title="${title/* - /}"
echo "$title"
如果赞赏 python 批处理脚本:我使用BeautifulSoup轻松解析 HTML 中的标题,使用urllib下载 HTML 和unicodecsv库以保存 Youtube 标题中的所有字符。
您唯一需要做的就是将带有单个(命名)列 url的 csv 与Youtube 视频的 URL 放在与脚本相同的文件夹中,并将其命名为yt-urls.csv并运行脚本。您将获得包含 URL 及其标题的文件yt-urls-titles.csv 。
#!/usr/bin/python
from bs4 import BeautifulSoup
import urllib
import unicodecsv as csv
with open('yt-urls-titles.csv', 'wb') as f:
resultcsv = csv.DictWriter(f, delimiter=';', quotechar='"',fieldnames=['url','title'])
with open('yt-urls.csv', 'rb') as f:
inputcsv = csv.DictReader(f, delimiter=';', quotechar='"')
resultcsv.writeheader()
for row in inputcsv:
soup = BeautifulSoup(urllib.urlopen(row['url']).read(), "html.parser")
resultcsv.writerow({'url': row['url'],'title': soup.title.string})
使用 python 我明白了import pafy url = "https://www.youtube.com/watch?v=bMt47wvK6u0" video = pafy.new(url) print(video.title)
以下是 ColdFusion 的一些剪切和粘贴代码:
http://trycf.com/gist/f296d14e456a7c925d23a1282daa0b90
它使用需要 API 密钥的 YouTube API v3 在 CF9(可能还有更早的版本)上运行。
我在其中留下了一些评论和诊断内容,供任何想要深入挖掘的人使用。希望它可以帮助某人。
您可以使用 Json 获取有关视频的所有信息
$jsonURL = file_get_contents("https://www.googleapis.com/youtube/v3/videos?id={Your_Video_ID_Here}&key={Your_API_KEY}8&part=snippet");
$json = json_decode($jsonURL);
$vtitle = $json->{'items'}[0]->{'snippet'}->{'title'};
$vdescription = $json->{'items'}[0]->{'snippet'}->{'description'};
$vvid = $json->{'items'}[0]->{'id'};
$vdate = $json->{'items'}[0]->{'snippet'}->{'publishedAt'};
$vthumb = $json->{'items'}[0]->{'snippet'}->{'thumbnails'}->{'high'}->{'url'};
我希望它能解决你的问题。
如果您熟悉 java,请尝试使用 Jsoup 解析器。
Document document = Jsoup.connect("http://www.youtube.com/ABDCEF").get();
document.title();
我在这里对波尔图的优秀答案进行了一点改造,并在 Python 上写了这个片段:
import urllib, urllib.request, json
input = "C:\\urls.txt"
output = "C:\\tracks.csv"
urls=[line.strip() for line in open(input)]
for url in urls:
ID = url.split('=')
VideoID = ID[1]
params = {"format": "json", "url": "https://www.youtube.com/watch?v=%s" % VideoID}
url = "https://www.youtube.com/oembed"
query_string = urllib.parse.urlencode(params)
url = url + "?" + query_string
with urllib.request.urlopen(url) as response:
response_text = response.read()
try:
data = json.loads(response_text.decode())
except ValueError as e:
continue # skip faulty url
if data is not None:
author = data['author_name'].split(' - ')
author = author[0].rstrip()
f = open(output, "a", encoding='utf-8')
print(author, ',', data['title'], sep="", file=f)
它吃一个带有 Youtube URL 列表的纯文本文件:
https://www.youtube.com/watch?v=F_Vfgdfgg
https://www.youtube.com/watch?v=RndfgdfN8
...
并返回一个包含 Artist-Title 对的 CSV 文件:
Beyonce,Pretty hurts
Justin Timberlake,Cry me a river
有两个模块可以帮助您,它们是pafy和 y outube-dl。首先使用 pip 安装这个模块。Pafy 在后台使用 youtube-dl 来获取视频信息,您也可以使用 pafy 和 youtube-dl 下载视频。
pip install youtube_dl
pip install pafy
现在您需要遵循此代码,我假设您拥有任何 youtube 视频的 URL。
import pafy
def fetch_yt_video(link):
video = pafy.new(link)
print('Video Title is: ',video.title)
fetch_yt_video('https://youtu.be/CLUsplI4xMU')
输出是
Video Title is: Make the perfect resume | For freshers & experienced | Step by step tutorial with free format
与 Matej M 类似,但更简单:
import requests
from bs4 import BeautifulSoup
def get_video_name(id: str):
"""
Return the name of the video as it appears on YouTube, given the video id.
"""
r = requests.get(f'https://youtube.com/watch?v={id}')
r.raise_for_status()
soup = BeautifulSoup(r.content, "lxml")
return soup.title.string
if __name__ == '__main__':
js = get_video_name("RJqimlFcJsM")
print('\n\n')
print(js)
试试这个,我在播放列表中获取每个视频的名称和网址,您可以根据需要修改此代码。
$Playlist = ((Invoke-WebRequest "https://www.youtube.com/watch?v=HKkRbc6W6NA&list=PLz9M61O0WZqSUvHzPHVVC4IcqA8qe5K3r&
index=1").Links | Where {$_.class -match "playlist-video"}).href
$Fname = ((Invoke-WebRequest "https://www.youtube.com/watch?v=HKkRbc6W6NA&list=PLz9M61O0WZqSUvHzPHVVC4IcqA8qe5K3r&ind
ex=1").Links | Where {$_.class -match "playlist-video"}).outerText
$FinalText=""
For($i=0;$i -lt $playlist.Length;$i++)
{
Write-Output("'"+($Fname[$i].split("|")[0]).split("|")[0]+"'+"+"https://www.youtube.com"+$Playlist[$i])
}