python - How do I count the number of words in a sentence, ignoring numbers, punctuation, and whitespace?

Question

How would I go about counting the words in a sentence? I'm using Python.

For example, I might have the following string:

string = "I     am having  a   very  nice  23!@$      day. "

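That should be 7 words. I'm having trouble with the random number of spaces before and after each word, and with cases where numbers or symbols are involved.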

score 103 · Accepted Answer

str.split() without any arguments splits on runs of whitespace characters:

>>> s = 'I am having a very nice day.'
>>> 
>>> len(s.split())
7

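From the linked documentation:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.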
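To make the quoted behaviour concrete, here is a small sketch that compares split() with split(" ") on the string from the question. Note that split() on its own still treats the 23!@$ chunk as a token, so for that exact input it gives 8 rather than 7; the variable names are only illustrative.

```python
s = "I     am having  a   very  nice  23!@$      day. "

# No argument: each run of whitespace acts as one separator,
# and leading/trailing whitespace produces no empty strings.
tokens = s.split()
print(tokens)
print(len(tokens))  # 8, because "23!@$" is still a token

# Explicit single-space separator: every extra space yields an empty string.
naive = s.split(" ")
print(len(naive) > len(tokens))  # True
```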

score 61 · Accepted Answer

You can use re.findall():

import re
line = " I am having a very nice day."
count = len(re.findall(r'\w+', line))
print (count)
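One caveat worth illustrating: \w also matches digits and the underscore, so on the string from the question this pattern counts 23 as a word. The sketch below shows that, along with one possible letters-only variant. The [^\W\d_] pattern is my own suggestion rather than part of the answer, and it is only roughly "letters only".

```python
import re

s = "I     am having  a   very  nice  23!@$      day. "

# \w matches letters, digits and the underscore, so "23" is counted too.
with_digits = re.findall(r"\w+", s)

# [^\W\d_] means "a word character that is neither a digit nor an underscore",
# which is roughly "a letter" (non-ASCII letters included for str patterns).
letters_only = re.findall(r"[^\W\d_]+", s)

print(len(with_digits))   # 8
print(len(letters_only))  # 7
```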

score 7 · Accepted Answer

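import string

s = "I     am having  a   very  nice  23!@$      day. "
sum([i.strip(string.punctuation).isalpha() for i in s.split()])

The statement above goes through each chunk of text and strips the leading and trailing punctuation before checking whether the chunk really is a string of letters.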
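As a rough sketch, the same idea can be wrapped in a function (count_alpha_words is a hypothetical name, not from the answer). The second call shows a limitation: strip() only removes punctuation at the ends of a chunk, so a word with inner punctuation such as an apostrophe is not counted.

```python
import string

def count_alpha_words(text):
    """Count whitespace-separated chunks that are purely alphabetic
    once leading and trailing punctuation has been stripped."""
    return sum(chunk.strip(string.punctuation).isalpha() for chunk in text.split())

print(count_alpha_words("I     am having  a   very  nice  23!@$      day. "))  # 7
print(count_alpha_words("don't stop"))  # 1, "don't" keeps its inner apostrophe
```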

score 5 · Accepted Answer

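Here is a simple word counter that uses a regular expression. The script runs in a loop, which you can terminate by entering Done when you are finished.

# Word counter using regex
import re

while True:
    line = input("Enter the string: ")  # use raw_input() on Python 2
    if line == "Done":  # command to terminate the loop
        break
    count = len(re.findall("[a-zA-Z_]+", line))
    print(count)
print("Terminated")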

score 4 · Accepted Answer

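OK, here is my version of doing this. I noticed that you want the output to be 7, which means you don't want to count special characters and numbers. So here is the regex pattern: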

re.findall("[a-zA-Z_]+", string)

Here [a-zA-Z_] matches any character between a-z (lowercase) or A-Z (uppercase), as well as the underscore.
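A quick sketch of what this pattern does and does not match, using the string from the question plus a made-up second input. Because the underscore is part of the character class, a lone _ counts as a "word", and digits split a token.

```python
import re

pattern = re.compile(r"[a-zA-Z_]+")

question = "I     am having  a   very  nice  23!@$      day. "
print(len(pattern.findall(question)))  # 7

# Hypothetical input showing the underscore and digit behaviour.
print(pattern.findall("snake_case _ x2"))  # ['snake_case', '_', 'x']
```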

About the spaces: if you want to remove all the extra spaces, just do this:

string = string.strip()  # Remove all extra spaces at the start and at the end of the string
while "  " in string:  # While there are 2 consecutive spaces between words in our string...
    string = string.replace("  ", " ")  # ... replace them with one space!
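An alternative sketch, not taken from the answer: joining the result of split() with single spaces collapses the whitespace in one step. Unlike the loop, it also normalises tabs and newlines. The sample text is made up for illustration.

```python
text = "  I     am having  a \t very  nice  day.  "

# split() drops leading/trailing whitespace and treats each whitespace run
# as one separator; join() puts exactly one space between the pieces.
collapsed = " ".join(text.split())
print(collapsed)  # I am having a very nice day.
```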

score 4 · Accepted Answer

def wordCount(mystring):
    tempcount = 0
    count = 1

    try:
        for character in mystring:
            if character == " ":
                tempcount += 1
                # Only the first space in a run starts a new word
                if tempcount == 1:
                    count += 1
                else:
                    tempcount += 1
            else:
                tempcount = 0

        return count

    except Exception:
        error = "Not a string"
        return error

mystring = "I   am having   a    very nice 23!@$      day."

print(wordCount(mystring))

The output is 8, because the chunk 23!@$ is counted as a word.

score 3 · Accepted Answer

How about using a simple loop to count the occurrences of spaces!?

txt = "Just an example here move along"
count = 1
for i in txt:
    if i == " ":
        count += 1
print(count)
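A caveat with this approach, sketched below under the assumption that the input may contain repeated spaces, as the string in the question does: every extra space is counted as another word, so the result overshoots. The comparison against split() is only there to show the gap.

```python
txt = "I     am having  a   very  nice  23!@$      day. "

# Counting single spaces treats every space in a run as a word boundary.
space_based = txt.count(" ") + 1

# Splitting on whitespace runs gives the number of actual chunks.
split_based = len(txt.split())

print(space_based > split_based)  # True
```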

score 0 · Accepted Answer

import string 

sentence = "I     am having  a   very  nice  23!@$      day. "
# Remove all punctuation
sentence = sentence.translate(str.maketrans('', '', string.punctuation))
# Remove all digits
sentence = ''.join([char for char in sentence if not char.isdigit()])
# Count each transition from a non-space character to a space, i.e. each word end.
# This relies on the sentence ending with whitespace, as it does here.
count = 0
for index in range(len(sentence) - 1):
    if sentence[index + 1].isspace() and not sentence[index].isspace():
        count += 1
print(count)
