Hi, I have created a Python script that uses tweepy to stream tweets matching a keyword array into MongoDB via pymongo. Each tweet is stored in a collection named after the keyword that matched it (e.g. apple tweets are saved to an Apple collection). The script saves the tweets as JSON, and I now want to perform sentiment analysis on these saved tweets.

I have been reading a few tutorials on this and have decided to use the NaiveBayesClassifier built into the TextBlob module. I have created some training data and passed it to the classifier (just a plain list of strings, with each text followed by its sentiment), but I am unsure how to apply this classifier to my already saved tweets. I think it is something like the code below, but that does not work and throws this error:

Traceback (most recent call last):
  File "C:/Users/Philip/PycharmProjects/FinalYearProject/TrainingClassification.py", line 25, in <module>
    cl = NaiveBayesClassifier(train)
  File "C:\Python27\lib\site-packages\textblob\classifiers.py", line 192, in __init__
    self.train_features = [(self.extract_features(d), c) for d, c in self.train_set]
ValueError: too many values to unpack

Here is my code so far:

from textblob.classifiers import NaiveBayesClassifier
import pymongo

train = [
    'I love this sandwich.', 'pos',
    'I feel very good about these beers.', 'pos',
    'This is my best work.', 'pos',
    'What an awesome view', 'pos',
    'I do not like this restaurant', 'neg',
    'I am tired of this stuff.', 'neg',
    "I can't deal with this", 'neg',
    'He is my sworn enemy!', 'neg',
    'My boss is horrible.', 'neg'
]

cl = NaiveBayesClassifier(train)
conn = pymongo.MongoClient('localhost', 27017)
db = conn.TwitterDB

appleSentiment = cl.classify(db.Apple)
print ("Sentiment of Tweets about Apple is " + appleSentiment)

Any help would be greatly appreciated.

2 Answers

Quoting the documentation:

classify: Classifies a string of text.

Instead, you are passing it a collection. db.Apple is a collection, not a string of text.

appleSentiment = cl.classify(db.Apple)
                              ^

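You need to write a query and pass the result of that query to classify. For example, to fetch a single tweet you can use find_one. For more information, the documentation is your friend.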
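
As a rough sketch of the idea (not tested against your database): query the collection, pull the text out of each returned document, and classify that string. This assumes each stored tweet document has a `text` field, as in Twitter's JSON, and that the training data is a list of (text, label) pairs, which is the shape the `for d, c in self.train_set` line in the traceback tries to unpack. `pair_up`, `classify_tweets` and `main` are names made up for this example, and the training set is cut down to two items.

```python
def pair_up(flat):
    """Turn ['text', 'label', 'text', 'label', ...] into [('text', 'label'), ...]."""
    if len(flat) % 2:
        raise ValueError("expected an even number of items (text, label, ...)")
    return list(zip(flat[0::2], flat[1::2]))


def classify_tweets(classifier, tweets, text_field="text"):
    """Yield (text, label) for every tweet document that has a text field."""
    for tweet in tweets:
        text = tweet.get(text_field)  # assumption: tweets are stored as dicts with 'text'
        if text:
            yield text, classifier.classify(text)


def main():
    # Imported here so the helpers above work without these packages installed.
    import pymongo
    from textblob.classifiers import NaiveBayesClassifier

    # The flat list from the question, reshaped into (text, label) pairs.
    train = pair_up([
        'I love this sandwich.', 'pos',
        'My boss is horrible.', 'neg',
    ])
    cl = NaiveBayesClassifier(train)

    conn = pymongo.MongoClient('localhost', 27017)
    db = conn.TwitterDB
    # find() returns a cursor over documents; limit(10) just keeps the run short.
    for text, label in classify_tweets(cl, db.Apple.find().limit(10)):
        print(label, text)
```

Calling `main()` runs this against the local TwitterDB database from the question. Passing `db.Apple.find_one()` wrapped in a list would classify a single tweet instead.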

answered 2015-03-10T17:30:57.353

Here is how to do sentiment analysis with TextBlob and PyMongo:

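from textblob import TextBlob
import pymongo
import re

def clean_tweet(tweet):
    # Replace @mentions, non-alphanumeric characters and URLs with spaces,
    # then collapse runs of whitespace.
    return ' '.join(re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+://\S+)", " ", tweet).split())


def tweet_sentiment(tweet):
    # TextBlob polarity ranges from -1.0 (negative) to 1.0 (positive).
    tweet_analysis = TextBlob(clean_tweet(tweet))
    if tweet_analysis.polarity > 0:
        return 'positive'
    elif tweet_analysis.polarity == 0:
        return 'neutral'
    else:
        return 'negative'

# Database and collection names are taken from the question.
tweets = pymongo.MongoClient('localhost', 27017).TwitterDB.Apple.find()
for tweet in tweets:
    print(tweet_sentiment(tweet['text']), " sentiment for the tweet: ", tweet['text'])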
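
If you want one overall figure per keyword rather than a line per tweet, a possible sketch is to tally the labels. This is only an illustration: `label_from_polarity`, `tally_sentiment` and `main` are made-up names, the `text` field is assumed to exist on each stored tweet, and the database and collection names are the ones from the question.

```python
from collections import Counter


def label_from_polarity(polarity):
    """Map a polarity score in [-1.0, 1.0] to a coarse label."""
    if polarity > 0:
        return 'positive'
    if polarity < 0:
        return 'negative'
    return 'neutral'


def tally_sentiment(tweets, polarity_of):
    """Count labels over tweet documents; polarity_of maps text to a float."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.get('text')  # assumption: tweets are dicts with a 'text' field
        if text:
            counts[label_from_polarity(polarity_of(text))] += 1
    return counts


def main():
    # Imported here so the helpers above work without these packages installed.
    import pymongo
    from textblob import TextBlob

    conn = pymongo.MongoClient('localhost', 27017)
    tweets = conn.TwitterDB.Apple.find()
    counts = tally_sentiment(tweets, lambda text: TextBlob(text).sentiment.polarity)
    print("Sentiment of tweets about Apple:", dict(counts))
```

Calling `main()` prints the three counts for the Apple collection. Keeping the polarity function as a parameter means the same tally works with a trained classifier's score in place of TextBlob's default analyzer.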
answered 2017-12-03T01:10:49.517