
I'm new to exploratory analysis, but I have computed a sentiment polarity score for each comment:

from textblob import TextBlob

# Polarity ranges from -1 (negative) to 1 (positive)
df['polarity'] = df['Comment'].apply(lambda x: TextBlob(x).sentiment.polarity)

I also built n-grams of the most common words in the data frame:

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

def get_top_n_words(corpus, n=None):
    # Fit a bag-of-words model and total each word's count across the corpus
    vec = CountVectorizer().fit(corpus)
    bag_of_words = vec.transform(corpus)
    sum_words = bag_of_words.sum(axis=0)
    words_freq = [(word, sum_words[0, idx]) for word, idx in vec.vocabulary_.items()]
    # Most frequent words first
    words_freq = sorted(words_freq, key=lambda x: x[1], reverse=True)
    return words_freq[:n]

common_words = get_top_n_words(df['Comment'], 20)
for word, freq in common_words:
    print(word, freq)

# .iplot() is provided by cufflinks (Plotly bindings for pandas)
df1 = pd.DataFrame(common_words, columns=['Comment', 'count'])
df1.groupby('Comment').sum()['count'].sort_values(ascending=False).iplot(
    kind='bar', yTitle='Count', color='blue', title='Top 20 Words in Comments Before Removing Stop Words')

How can I isolate the text with negative polarity (< 0) and build n-grams from only that negative-sentiment text?
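
One possible approach, as a minimal sketch rather than a definitive answer: select the rows with a boolean mask on the polarity column, then pass only those comments to the counting function. The helper name `get_top_n_ngrams`, its `ngram_range` argument, and the toy data frame are assumptions made for illustration; the polarity values are invented and stand in for the TextBlob scores computed above.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer


def get_top_n_ngrams(corpus, n=None, ngram_range=(1, 1)):
    """Return the n most frequent n-grams in corpus as (ngram, count) pairs.

    Hypothetical variant of get_top_n_words; ngram_range is forwarded to
    CountVectorizer, so (2, 2) counts bigrams only.
    """
    vec = CountVectorizer(ngram_range=ngram_range).fit(corpus)
    bag_of_words = vec.transform(corpus)
    sum_words = bag_of_words.sum(axis=0)
    words_freq = [(word, int(sum_words[0, idx]))
                  for word, idx in vec.vocabulary_.items()]
    return sorted(words_freq, key=lambda x: x[1], reverse=True)[:n]


# Toy data: polarity values are made up, not real TextBlob output.
df = pd.DataFrame({
    "Comment": ["bad service bad food", "great place", "bad service again"],
    "polarity": [-0.7, 0.8, -0.4],
})

# Keep only the comments whose polarity is below zero.
negative_comments = df.loc[df["polarity"] < 0, "Comment"]

# Count bigrams in the negative comments only.
top_negative_bigrams = get_top_n_ngrams(negative_comments, n=5, ngram_range=(2, 2))
for ngram, freq in top_negative_bigrams:
    print(ngram, freq)
```

With the real data, the same mask (`df['polarity'] < 0`) can be applied to the existing data frame and the result plotted exactly as before. Note that `CountVectorizer()` with default settings counts single words only, so `ngram_range` has to be set to get actual bigrams or trigrams.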
