
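I am trying to run a twint search to retrieve a list of tweets, on which I then perform sentiment analysis. I wrote a for loop that iterates over a pandas DataFrame of dates and runs a twint search with the given date parameters.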

Here is my code:

import twint
import pandas
from textblob import TextBlob

# Functions
def twint_to_pandas(columns): #Creds to Favio Vazques
    return twint.output.panda.Tweets_df[columns]

def getTweets(st, startDate, endDate): #runs a twint search and returns a pandas df
    c = twint.Config()
    c.Search= str(st)
    c.Limit = 20
    c.Lang = "en"
    c.Since = startDate
    c.Until = endDate
    #c.Verified = True
    c.Hide_output = True
    c.Pandas = True
    
    twint.run.Search(c)
    
    df = twint_to_pandas(["date", "username", "tweet"])
    
    return df

def getSentiScore(string):
    t = TextBlob(str(string)) #create a textblob class instance
    score = t.sentiment.polarity # get sentiment
    return score #pass it to next function

def getAverageScore(st, startDate, endDate):
    df = getTweets(st, startDate, endDate) #establish a variable for the fetched tweets
    
    results = [getSentiScore(str(x)) for x in df['tweet']] #list comprehension
    
    resultsDf = pandas.DataFrame(results, columns=['sentiScore']).dropna() #create dataframe for it
    
    mean = resultsDf['sentiScore'].mean() #get a mean sentiment score
    #median = resultsDf['sentiScore'].median()
    #mode = resultsDf['sentiScore'].mode()
    
    print("Mean" + str(mean)) # print the mean
    #print("Median" + str(median))
    #print("Mode" + str(mode))

    
def weeklyScoreToCSV(st, startDate, days):
    datetime = pandas.date_range(start=(str(startDate)), freq='D', periods=days, closed='left')
    datetimeDf = datetime.to_frame(index=False, name='date')
    datesDf = [i for i in (datetimeDf['date'])]
    dateLength = int(len(datesDf)-1)
    for i in range(0, dateLength):
        sentiScore = getAverageScore(st, str(datesDf[i]), str(datesDf[i+1]))
        #print(str(datesDf[i]) + str(datesDf[i+1]))
    
# Execution
#getAverageScore("Obama")
weeklyScoreToCSV("a", '01/01/2019', 10)

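In the weeklyScoreToCSV function, whenever I type the date arguments for the getAverageScore call in by hand, the function works perfectly. However, when I run the code as given, I get the following error: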

KeyError: "None of [Index(['date', 'username', 'tweet'], dtype='object')] are in the [columns]"

I can't figure out where I'm going wrong.


1 Answer


I had a similar problem. I got around it by using Twitter's advanced search syntax and changing the search to searchstr = "(search string) until:2021-02-19 since:2021-02-17"

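I suggest writing the whole query with the advanced search syntax from the Twitter website and passing it as c.Search = searchstr, rather than setting the other parameters on c.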
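As a rough sketch of that idea applied to the daily loop in the question: the helper names build_search and daily_searches below are made up for illustration, and the dates are formatted as YYYY-MM-DD to match the example search string above. I have not verified this against every twint version, so treat it as a starting point.

```python
from datetime import date, timedelta


def build_search(term, since, until):
    """Return an advanced-search string with the date bounds inlined.

    Hypothetical helper: mirrors the "(search string) until:... since:..."
    pattern from this answer.
    """
    return f"({term}) until:{until.isoformat()} since:{since.isoformat()}"


def daily_searches(term, start, days):
    """Yield one search string per consecutive one-day window."""
    for offset in range(days):
        since = start + timedelta(days=offset)
        yield build_search(term, since, since + timedelta(days=1))


queries = list(daily_searches("Obama", date(2019, 1, 1), 3))
for q in queries:
    print(q)
    # With twint, the string would take the place of c.Since / c.Until:
    #   c = twint.Config()
    #   c.Search = q
```

Each string can then be assigned to c.Search inside getTweets, leaving c.Since and c.Until unset.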

Answered 2021-03-21T10:35:33.793