I have this HTML file, downloaded from Projekktor:
<!DOCTYPE HTML>
<html>
<head>
<title>Projekktor Version 8 Test</title>
<link rel="stylesheet" href="theme/style.css" type="text/css" media="screen" />
<script type="text/javascript" src="projekktor/jquery.min.js"></script> <!-- Load jquery -->
<script type="text/javascript" src="projekktor/projekktor.min.js"></script> <!-- load projekktor -->
</head>
<body>
<video id="player_a" class="projekktor" poster="intro.png" title="this is projekktor" width="640" height="360" controls>
<source src="" />
</video>
<script type="text/javascript">
$(document).ready(function() {
projekktor('#player_a', {
volume: 0.8,
playerFlashMP4: 'http://www.localhost:8000/StrobeMediaPlayback.swf',
playerFlashMP3: 'http://www.localhost:8000/StrobeMediaPlayback.swf'
});
});
</script>
</body>
</html>
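Then I get the URL of a YouTube video through an API call (I have credentials), so that I can replace the empty src='' with the result, using the following code: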
import lxml.html as LH

def parse_html(link):
    filename = 'projekktor.html'
    f = LH.parse(filename)
    for el in f.iter('video'):
        el.attrib['src'] = link
        # have also tried
        # el.attrib['src'] = link.replace('amp;', '')
    new_html = LH.tostring(f, pretty_print=True)
    print(new_html)

# youtube_call is my own API helper (not shown); it returns the video URL
link = youtube_call(id)
parse_html(link)
But when I print it, a pesky amp; gets added to src=, and access to the link is denied. (For readability, I have split the link across several lines here.)
https://r3---sn-oxunxg8pjvn-bpbs.googlevideo.com/videoplayback?expire=1485418386&
amp;mv=m&
amp;mt=1485396620&
amp;ms=au&
amp;clen=13475559&
amp;mn=sn-oxunxg8pjvn-bpbs&
amp;mm=31&
amp;ipbits=0&
amp;requiressl=yes&
amp;itag=18&id=o-AG-dux-Jvtia_DsWZcyRfNpbMlzulsNn6I3SXyi0SI1B&
amp;lmt=1458188966300704&
amp;signature=BDC946187F74386CE00C5452CD703F9B13E4E30F.766549AB6A7C1811899CCC04742353B5BD0153D7&dur=266.448&key=yt6&
amp;ip=177.142.138.140&
amp;sparams=clen%2Cdur%2Cei%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Cratebypass%2Crequiressl%2Csource%2Cupn%2Cexpire&
amp;ei=MluJWO_aEIr_-AXHx6GwDA&
amp;mime=video%2Fmp4&
amp;upn=aFGwEwwIS1o&pl=20&source=youtube&
amp;ratebypass=yes&initcwndbps=1178750&
amp;gir=yes
With every amp; removed the link is valid, but link.replace('amp;', '') did not work.
Is there a workaround?
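
For reference, here is a minimal sketch of what seems to be going on. It rests on some assumptions: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml (both escape & as &amp; when serializing attribute values), and the URL is a shortened, hypothetical placeholder rather than the real googlevideo link. It suggests that the raw link never contains amp; in the first place, which would explain why the replace has no effect, and that the escaping only appears in the serialized markup.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical shortened URL standing in for the real video link.
link = "https://example.com/videoplayback?expire=1485418386&mv=m&itag=18"

# The raw link contains no 'amp;', so this replace changes nothing.
assert link.replace("amp;", "") == link

# Build a <source> element and set its src to the raw link.
source = ET.Element("source")
source.set("src", link)

# Serializing escapes '&' as '&amp;' inside the attribute value.
serialized = ET.tostring(source, encoding="unicode")
print(serialized)

# Parsing the markup back (as a browser would) restores the original URL.
round_tripped = ET.fromstring(serialized).get("src")
print(round_tripped == link)

# A URL copied by hand out of the printed markup can be unescaped explicitly.
copied = "https://example.com/videoplayback?expire=1485418386&amp;mv=m&amp;itag=18"
print(html.unescape(copied) == link)
```

If this reading is right, the printed HTML is valid as it stands and a browser loading the file would see the unescaped URL. Only a link copied manually out of the printed text would need html.unescape before being opened.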