I am new to Python programming. I am using the following code in my Python file:
import gethtml
import articletext
url = "http://www.thehindu.com/news/national/india-calls-for-resultoriented-steps-at-asem/article5339414.ece"
result = articletext.getArticle(url)
text_file = open("Output.txt", "w")
text_file.write(result)
text_file.close()
The file articletext.py contains the following code:
from bs4 import BeautifulSoup
import gethtml

def getArticleText(webtext):
    articletext = ""
    soup = BeautifulSoup(webtext)
    for tag in soup.findAll('p'):
        articletext += tag.contents[0]
    return articletext

def getArticle(url):
    htmltext = gethtml.getHtmlText(url)
    return getArticleText(htmltext)
But I am getting the following error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 473: ordinal not in range(128)
What code should I write to save the result to the output file?
The output `result` is text in the form of a paragraph.
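
One possible direction, sketched under the assumption that `result` is a Unicode string (the error message suggests it contains characters such as U+201C): open the output file with an explicit UTF-8 encoding instead of relying on the default ASCII codec. The helper name `save_text`, the sample string, and the temporary output path below are hypothetical and only for illustration; `io.open` is a standard-library call that accepts an `encoding` argument on both Python 2 and Python 3.

```python
import io
import os
import tempfile


def save_text(text, path):
    """Write Unicode text to path, encoded as UTF-8 (hypothetical helper)."""
    # io.open encodes the text with the given codec as it is written,
    # so non-ASCII characters do not go through the ASCII codec.
    with io.open(path, "w", encoding="utf-8") as text_file:
        text_file.write(text)


# Illustrative sample containing U+201C, the character named in the error.
sample = u"\u201cIndia calls for result-oriented steps\u201d"

# Assumption: a temporary directory is used here only to keep the sketch tidy.
output_path = os.path.join(tempfile.gettempdir(), "Output.txt")
save_text(sample, output_path)
```

In the script above, this would roughly correspond to calling `save_text(result, "Output.txt")` in place of the `open` / `write` / `close` lines.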