186

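So I'm trying to make a program that asks the user for input and stores the values in an array/list.
Then, when a blank line is entered, it tells the user how many of those values are unique.
I'm building this for real-life reasons, not as a problem set.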

enter: happy
enter: rofl
enter: happy
enter: mpg8
enter: Cpp
enter: Cpp
enter:
There are 4 unique words!

My code is as follows:

# ask for input
ipta = raw_input("Word: ")

# create list 
uniquewords = [] 
counter = 0
uniquewords.append(ipta)

a = 0   # loop thingy
# while loop to ask for input and append in list
while ipta: 
  ipta = raw_input("Word: ")
  new_words.append(input1)
  counter = counter + 1

for p in uniquewords:

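...and that's all I've got so far.
I'm not sure how to count the number of unique words in a list.
If someone could post a solution so I can learn from it, or at least show me how it would be done, that would be great. Thanks!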


16 Answers

339

In addition, you can refactor your code with collections.Counter:

from collections import Counter

words = ['a', 'b', 'c', 'a']

Counter(words).keys() # same elements as list(set(words))
Counter(words).values() # counts the elements' frequency

Output:

['a', 'c', 'b']
[2, 1, 1]
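
For the original question, the number of distinct words is just the length of the Counter. Here is a minimal Python 3 sketch, assuming the words from the question's sample session as input (unique_count is a name I made up for illustration):

```python
from collections import Counter

# Sample words taken from the question's example session.
words = ["happy", "rofl", "happy", "mpg8", "Cpp", "Cpp"]

counts = Counter(words)

# In Python 3, keys() and values() return views, so wrap them in list().
unique_words = list(counts.keys())
frequencies = list(counts.values())

# The number of distinct words is the number of keys in the Counter.
unique_count = len(counts)
print("There are %d unique words!" % unique_count)
```

On Python 3.7 and later the keys come back in first-seen order, which the Python 2 output shown above does not guarantee.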
answered 2012-09-05T19:04:36.810
282

You can use a set to remove the duplicates, and then the len function to count the elements in the set:

len(set(new_words))
answered 2012-09-05T13:14:32.520
59

values, counts = np.unique(words, return_counts=True)

More details:

import numpy as np

words = ['b', 'a', 'a', 'c', 'c', 'c']
values, counts = np.unique(words, return_counts=True)

The function numpy.unique returns the sorted unique elements of the input list together with their counts:

['a', 'b', 'c']
[2, 1, 3]
answered 2018-07-08T04:06:56.987
19

Use a set:

words = ['a', 'b', 'c', 'a']
unique_words = set(words)             # == set(['a', 'b', 'c'])
unique_word_count = len(unique_words) # == 3

Armed with this, your solution could be as simple as:

words = []
ipta = raw_input("Word: ")

while ipta:
  words.append(ipta)
  ipta = raw_input("Word: ")

unique_word_count = len(set(words))

print "There are %d unique words!" % unique_word_count
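
The code above is Python 2 (raw_input and the print statement). A rough Python 3 sketch of the same loop might look like this; count_unique_words and its read_line parameter are hypothetical names, added only so the loop can be run without typing at a prompt:

```python
def count_unique_words(read_line=input, prompt="Word: "):
    """Read words until a blank line, then return the number of distinct words.

    read_line is a hypothetical hook (it defaults to the built-in input) so
    the loop can be exercised without a keyboard.
    """
    words = []
    word = read_line(prompt)
    while word:
        words.append(word)
        word = read_line(prompt)
    return len(set(words))


# Simulate the session from the question instead of blocking on real input.
fake_session = iter(["happy", "rofl", "happy", "mpg8", "Cpp", "Cpp", ""])
total = count_unique_words(read_line=lambda _prompt: next(fake_session))
print("There are %d unique words!" % total)
```

Calling count_unique_words() with no arguments reads from the real prompt.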
answered 2012-09-05T13:15:30.907
10
aa="XXYYYSBAA"
bb=dict(zip(list(aa),[list(aa).count(i) for i in list(aa)]))
print(bb)
# output:
# {'X': 2, 'Y': 3, 'S': 1, 'B': 1, 'A': 2}
answered 2019-08-14T18:05:56.187
5

For an ndarray there is a numpy function called unique:

np.unique(array_name)

Example:

>>> np.unique([1, 1, 2, 2, 3, 3])
array([1, 2, 3])
>>> a = np.array([[1, 1], [2, 3]])
>>> np.unique(a)
array([1, 2, 3])

For a pandas Series there is a method called value_counts():

Series_name.value_counts()
answered 2017-11-08T13:21:55.877
3

If you want a histogram of the unique values, here is a one-liner:

import numpy as np    
unique_labels, unique_counts = np.unique(labels_list, return_counts=True)
labels_histogram = dict(zip(unique_labels, unique_counts))
answered 2020-11-24T17:22:29.157
2

How about:

import pandas as pd
#List with all words
words=[]

#Code for adding words
words.append('test')


#When Input equals blank:
pd.Series(words).nunique()

This returns how many unique values the list contains.

answered 2020-07-31T10:58:55.623
1
ipta = raw_input("Word: ") ## asks for input
words = [] ## creates list
unique_words = set(words)
answered 2012-09-05T13:20:51.353
1

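Although a set is the easiest way, you could also use a dict and check some_dict.has_key(key) (Python 2 only; key in some_dict works in both Python 2 and 3) to populate a dictionary with only unique keys and values.

Assuming you have already populated words[] with input from the user, create a dict that maps each unique word in the list to a number:

word_map = {}
i = 1
for word in words:
    # "in" works in Python 2 and 3; has_key() exists only in Python 2
    if word not in word_map:
        word_map[word] = i
        i += 1
num_unique_words = len(word_map) # or num_unique_words = i - 1, whichever you prefer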
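
A more compact way to build the same kind of mapping in Python 3 is sketched below. It is only an illustration: the sample words are assumed from the question, and it relies on dicts keeping insertion order (Python 3.7+).

```python
# Assumed sample input, taken from the question's example session.
words = ["happy", "rofl", "happy", "mpg8", "Cpp", "Cpp"]

# dict.fromkeys keeps one key per distinct word, in first-seen order.
# enumerate(..., start=1) then numbers those distinct words 1, 2, 3, ...
word_map = {word: number for number, word in enumerate(dict.fromkeys(words), start=1)}

num_unique_words = len(word_map)
print(word_map)
print(num_unique_words)
```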
answered 2016-06-10T17:54:33.967
1

You can use the get method:

lst = ['a', 'b', 'c', 'c', 'c', 'd', 'd']

dictionary = {}
for item in lst:
    dictionary[item] = dictionary.get(item, 0) + 1
    
print(dictionary)

Output:

{'a': 1, 'b': 1, 'c': 3, 'd': 2}
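
If all you need is the number of distinct items, the length of the dictionary gives it. A small sketch of the same tally using collections.defaultdict, which does the job of the get(item, 0) call (tally and unique_count are names I picked for illustration):

```python
from collections import defaultdict

lst = ['a', 'b', 'c', 'c', 'c', 'd', 'd']

# defaultdict(int) supplies 0 for a missing key, so no get(item, 0) is needed.
tally = defaultdict(int)
for item in lst:
    tally[item] += 1

# One key per distinct item, so the length is the unique count.
unique_count = len(tally)
print(dict(tally))
print(unique_count)
```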
answered 2021-07-13T18:00:59.387
1

Another approach, using pandas:

import pandas as pd

LIST = ["a","a","c","a","a","v","d"]
counts,values = pd.Series(LIST).value_counts().values, pd.Series(LIST).value_counts().index
df_results = pd.DataFrame(list(zip(values,counts)),columns=["value","count"])

Then you can export the results in whatever format you want.

answered 2019-08-14T13:52:36.767
0

I'd use a set myself, but here's yet another way:

uniquewords = []
while True:
    ipta = raw_input("Word: ")
    if ipta == "":
        break
    if not ipta in uniquewords:
        uniquewords.append(ipta)
print "There are", len(uniquewords), "unique words!"
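
One caveat: "not ipta in uniquewords" scans the whole list on every input, which gets slow for long lists. If you want to keep the first-seen order and still have fast lookups, a helper set can do the membership test. A sketch in Python 3 (unique_in_order is a hypothetical helper name):

```python
def unique_in_order(words):
    """Return the distinct words in first-seen order."""
    seen = set()      # fast membership checks
    ordered = []      # preserves the order of first appearance
    for word in words:
        if word not in seen:
            seen.add(word)
            ordered.append(word)
    return ordered


uniquewords = unique_in_order(["happy", "rofl", "happy", "mpg8", "Cpp", "Cpp"])
print("There are", len(uniquewords), "unique words!")
```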
answered 2012-09-05T13:31:41.717
0
ipta = raw_input("Word: ") ## asks for input
words = [] ## creates list

while ipta: ## while loop to ask for input and append in list
  words.append(ipta)
  ipta = raw_input("Word: ")
#Create a set, sets do not have repeats
unique_words = set(words)

print "There are " +  str(len(unique_words)) + " unique words!"
answered 2012-09-05T13:31:57.910
0

The following should work. The lambda function filters out the duplicate words.

inputs=[]
input = raw_input("Word: ").strip()
while input:
    inputs.append(input)
    input = raw_input("Word: ").strip()
uniques=reduce(lambda x,y: ((y in x) and x) or x+[y], inputs, [])
print 'There are', len(uniques), 'unique words'
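
Note that in Python 3 reduce is no longer a built-in and has to be imported from functools. A sketch of the same fold, with the and/or trick written as a conditional expression and the question's sample words assumed as input:

```python
from functools import reduce

# Assumed sample input, taken from the question's example session.
inputs = ["happy", "rofl", "happy", "mpg8", "Cpp", "Cpp"]

# Keep the accumulated list unchanged if the word is already in it,
# otherwise return a new list with the word appended.
uniques = reduce(
    lambda seen, word: seen if word in seen else seen + [word],
    inputs,
    [],
)
print("There are", len(uniques), "unique words")
```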
answered 2012-09-05T13:26:55.623
0

Here is my own version:

def unique_elements():
    elem_list = []
    dict_unique_word = {}
    for i in range(5):# say you want to check for unique words from five given words
        word_input = input('enter element: ')
        elem_list.append(word_input)
        if word_input not in dict_unique_word:
            dict_unique_word[word_input] = 1
        else:
            dict_unique_word[word_input] += 1
    return elem_list, dict_unique_word
result_1, result_2 = unique_elements() 
# result_1 holds the list of all inputted elements
# result_2 contains unique words with their count
print(result_2)
answered 2021-09-02T12:44:25.603