The site's HTML is as follows:
<html>
<head>
<title>FAS-ESS web server</title>
</head>
<body>
<body bgcolor="#00336677" link="#FFFF00" vlink="#00FFFF" alink="#00FF00" text="#FFFFFF">
<h1><center>FAS-ESS web server</center></h1>
<p><center>(provided by the <a href="http://genes.mit.edu/burgelab/">Burge Lab</a>) </center></p>
<form action="http://genes.mit.edu/cgi-bin/fas-ess.pl" method="post">
<input type="radio" name="set" value="FAS-hex2" checked>FAS-hex2
(<a href="fas-hex2.txt">set</a>)<br />
<input type="radio" name="set" value="FAS-hex3">FAS-hex3
(<a href="fas-hex3.txt">set</a>)
<p>Sequence(s):<br />
<textarea name="sequence" rows="12" cols="72"></textarea><br />
<input type="reset" value="Clear">
<input type="submit" value="Submit">
</p>
</form>
<p>Notes:</p>
<ul>
<li>You can enter a single sequence or multiple sequences in FASTA format.</li>
<li>Non-letters will be removed from sequences.</li>
<li>Lowercase letters will be converted to uppercase.</li>
<li>T and U are considered the same.</li>
</ul>
<p>Reference:<br />
Wang, Z., Rolish, M. E., Yeo, G., Tung, V., Mawson, M. and
Burge, C. B. (2004). Systematic identification and analysis of exonic
splicing silencers. <i>Cell</i> <b>119</b>, 831-845.</p>
<p>Please send feedback to Mike Rolish (merolish at mit dot edu).</p>
<p><a href="http://genes.mit.edu/burgelab/">Burge Lab home</a></p>
</body>
</html>
Here is my Python code:
import os
import sys
import urllib
import urllib.parse
import urllib.request
site = "http://genes.mit.edu/fas-ess/"

def getinfo(info):
    form_data = {'sequence': info}
    params = urllib.parse.urlencode(form_data)
    request = urllib.request.Request(site, bytes(params, encoding='UTF-8'))
    response = urllib.request.urlopen(request)
    print(response.read().decode('utf-8'))

if __name__ == "__main__":
    info = '>NM_000015\nATGGACATTGAAGCATATTTTGAAAGAATTGGCTATAAGAACTCTAGGAACAAATTGGACTTGGAAACATTAACTGACATTCTTGAGCACCAGATCCGGGCTGTTCCCTTTGAGAACCTTAACATGCATTGTGGGCAAGCCATGGAGTTGGGCTTAGAGGCTATTTTTGATCACATTGTAAGAAGAAACCGGGGTGGGTGGTGTCTCCAGGTCAATCAACTTCTGTACTGGGCTCTGACCACAATCGGTTTTCAGACCACAATGTTAGGAGGGTATTTTTACATCCCTCCAGTTAACAAATACAGCACTGGCATGGTTCACCTTCTCCTGCAGGTGACCATTGACGGCAGGAATTACATTGTCGATGCTGGGTCTGGAAGCTCCTCCCAGATGTGGCAGCCTCTAGAATTAATTTCTGGGAAGGATCAGCCTCAGGTGCCTTGCATTTTCTGCTTGACAGAAGAGAGAGGAATCTGGTACCTGGACCAAATCAGGAGAGAGCAGTATATTACAAACAAAGAATTTCTTAATTCTCATCTCCTGCCAAAGAAGAAACACCAAAAAATATACTTATTTACGCTTGAACCTCGAACAATTGAAGATTTTGAGTCTATGAATACATACCTGCAGACGTCTCCAACATCTTCATTTATAACCACATCATTTTGTTCCTTGCAGACCCCAGAAGGGGTTTACTGTTTGGTGGGCTTCATCCTCACCTATAGAAAATTCAATTATAAAGACAATACAGATCTGGTCGAGTTTAAAACTCTCACTGAGGAAGAGGTTGAAGAAGTGCTGAGAAATATATTTAAGATTTCCTTGGGGAGAAATCTCGTGCCCAAACCTGGTGATGGATCCCTTACTATTTAG'
    getinfo(info)
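Full explanation:
What I want to do is enter a given string into the site's form, submit it, and then scrape the results page. I tried porting code from another thread that used urllib2 (Python before 3.0), but the only thing I get back is the HTML of the original page.
Thanks for taking a look.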
I'd invite you to try the site, http://genes.mit.edu/fas-ess/, with this query:
">NM_000015GTGATGGATCCCTTACTATTTAG"
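
One thing visible in the HTML above: the form's action attribute points at http://genes.mit.edu/cgi-bin/fas-ess.pl rather than at the page URL, and the form also sends a "set" radio field alongside "sequence". Below is a minimal sketch of a request that mirrors the form, assuming the CGI script expects exactly those two fields and that the response decodes as UTF-8 (neither is confirmed here). The names FORM_ACTION, build_request, and fetch_results are made up for illustration.

```python
import urllib.parse
import urllib.request

# Taken from the form's action attribute in the HTML above, not from the page URL.
FORM_ACTION = "http://genes.mit.edu/cgi-bin/fas-ess.pl"


def build_request(sequence, hex_set="FAS-hex2"):
    """Build a POST request carrying the same fields as the HTML form."""
    form_data = {
        "set": hex_set,        # radio button value: "FAS-hex2" or "FAS-hex3"
        "sequence": sequence,  # textarea contents (FASTA or a bare sequence)
    }
    body = urllib.parse.urlencode(form_data).encode("utf-8")
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(FORM_ACTION, data=body)


def fetch_results(sequence, hex_set="FAS-hex2"):
    """Submit the form and return the result page as text (needs network access)."""
    with urllib.request.urlopen(build_request(sequence, hex_set)) as response:
        # Assumption: the server replies in UTF-8; undecodable bytes are replaced.
        return response.read().decode("utf-8", errors="replace")
```

With this in place, print(fetch_results(info)) would stand in for getinfo(info) in my script. Building the request is split from sending it so the URL and encoded fields can be inspected without touching the network.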