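I want to run sentiment analysis on a list of tweets fetched for a particular keyword. The incoming tweets are mostly in Dutch, and TextBlob needs them converted to English before it can calculate polarity and subjectivity values. How can I translate the tweets to English? I basically need a free API for translation. I ran into problems with the MS Bing Translator. I have tried the goslate, langdetect, translate and translation libraries, but none of them work. This is the code I am using: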
#!/usr/bin/env python
import tweepy
import goslate
from langdetect import detect
from translation import baidu, google, youdao, iciba
from translate import Translator
import os
import time
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
t=time.time()
#karan's api keys
consumer_key = 'xxx'
consumer_secret = 'xxx'
access_key = 'xxx'
access_secret = 'xxx'
gs=goslate.Goslate()
translator= Translator(to_lang="en")
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
search_results = api.search(q="football", count=2, geocode="52.132633,5.2912659999999505,300km")
f=open('tweets_football.txt','wb')
for i in range(0,len(search_results)):
    try:
        print search_results[i].text
        print search_results[i].id
        print search_results[i].user.screen_name
        trans=search_results[i].text
        #print(gs.translate(trans,'en'))
        print(translator.translate(trans))
        if search_results[i].text not in search_results:
            f.write(search_results[i].text)
            f.write("\n")
            print "Written to file!"
    except Exception as e:
        print str(e)
f.close()
print time.time()-t
Please point me in the right direction. If there is a simpler way to do this, please suggest that as well. Thanks in advance.
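
For reference, here is a minimal sketch of how the loop above might be restructured so that the translation backend can be swapped without touching the rest of the script. It is written for Python 3 and uses only the standard library. The names `translate_and_store` and `translate_fn` are hypothetical and not part of any of the libraries mentioned; `translate_fn` stands for whatever callable ends up doing the translation (for example a thin wrapper around one of those libraries). The sketch also tracks already-seen texts in a set, since the `not in search_results` check in the code above compares a string against a list of status objects and so appears to be always true.

```python
import io


def translate_and_store(texts, translate_fn, out_path):
    """Translate each unique text and write the results to out_path, one per line.

    translate_fn is an assumed callable taking a string and returning its
    English translation; any backend can be plugged in here.
    """
    seen = set()        # texts already processed, to skip exact duplicates
    translated = []
    with io.open(out_path, "w", encoding="utf-8") as out:
        for text in texts:
            if text in seen:
                continue
            seen.add(text)
            try:
                english = translate_fn(text)
            except Exception as exc:
                # A failing backend should not abort the whole run.
                print("translation failed: %s" % exc)
                continue
            out.write(english + u"\n")
            translated.append(english)
    return translated
```

With the tweets from the search, this could be called as `translate_and_store([s.text for s in search_results], my_translate, 'tweets_football.txt')`, where `my_translate` is a placeholder for the chosen backend. The returned list of English strings can then be passed on to TextBlob for the polarity and subjectivity values.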