python - Average number of phrases per sentence in Python

Question

These two functions are given.

def split_on_separators(original, separators):
    """ (str, str) -> list of str

    Return a list of non-empty, non-blank strings from the original string
    determined by splitting the string on any of the separators.
    separators is a string of single-character separators.

    >>> split_on_separators("Hooray! Finally, we're done.", "!,")
    ['Hooray', ' Finally', " we're done."]
    """

    # To do: Complete this function's body to meet its specification.
    # You are not required to keep the two lines below but you may find
    # them helpful. (Hint)
    for i in separators:
        original = original.replace(i,"<*)))>{")
        ret = original.split("<*)))>{")
    return ret

def clean_up(s):
    """ (str) -> str

    Return a new string based on s in which all letters have been
    converted to lowercase and punctuation characters have been stripped
    from both ends. Inner punctuation is left untouched.

    >>> clean_up('Happy Birthday!!!')
    'happy birthday'
    >>> clean_up("-> It's on your left-hand side.")
    " it's on your left-hand side"
    """

    punctuation = """!"',;:.-?)([]<>*#\n\t\r"""
    result = s.lower().strip(punctuation)
    return result

I am supposed to return the average number of phrases per sentence. This is the function I wrote:

def avg_sentence_complexity(text):
    """ (list of str) -> float

    Return the average number of phrases per sentence.

    A sentence is defined as a non-empty string of non-terminating
    punctuation surrounded by terminating punctuation
    or beginning or end of file. Terminating punctuation is defined as !?.
    Phrases are substrings of sentences, separated by one or more of the
    following delimiters ,;:

    >>> text = ['The time has come, the Walrus said\n',
         'To talk of many things: of shoes - and ships - and sealing wax,\n',
         'Of cabbages; and kings.\n',
         'And why the sea is boiling hot;\n',
         'and whether pigs have wings.\n']
    >>> avg_sentence_complexity(text)
    3.5
    """

    huge_str = ''
    clean_sentences = []
    for lines in text:
        huge_str += lines
    list_of_sentences = split_on_separators(huge_str, '?!.')
    for strings in list_of_sentences:
        cleaned = clean_up(strings)
        clean_sentences.append(cleaned)
        if '' in clean_sentences:
            clean_sentences.remove('')
    num_sentences = len(clean_sentences)

    large = ''
    for phrases in text:
        large += phrases
    list_of_phrases = split_on_separators(large, ',;:')
    num_phrases = len(list_of_phrases)

    asc =  num_phrases / num_sentences
    return asc

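This only gives me 3.0, which is the total number of phrases divided by the total number of sentences. My question is how to compute (total phrases in the first sentence) / (total sentences) + (total phrases in the second sentence) / (total sentences) + ...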

score 1 · Accepted Answer

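I mean, technically, the way you describe it you are just computing 1/total_sentences * num_phrases, which equals num_phrases/total_sentences, since each phrase simply counts as 1, as far as I understand it.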
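A quick check of that identity, as a sketch. The per-sentence counts 5 and 2 are hypothetical values picked for illustration, not something taken from the code in the question:

```python
# Hypothetical per-sentence phrase counts, chosen only for illustration.
phrase_counts = [5, 2]
num_sentences = len(phrase_counts)

# Sum of (phrases in sentence i) / (number of sentences) ...
term_by_term = sum(count / num_sentences for count in phrase_counts)
# ... is the same as (total phrases) / (number of sentences).
all_at_once = sum(phrase_counts) / num_sentences

print(term_by_term, all_at_once)  # 3.5 3.5
```

So the two formulas agree whenever the total is really the sum of the per-sentence counts.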

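What you really want to do is count the number of phrases in each sentence. You can then call numpy.mean on the list of phrase counts to find the average phrase count.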

I won't be more specific than that, since this is obviously homework :p
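For later readers, here is one way the per-sentence counting could look. It is a sketch under assumptions, not the assignment's intended solution: the helper name phrase_counts_per_sentence is made up here, re.split stands in for the question's split_on_separators, and statistics.mean from the standard library stands in for numpy.mean.

```python
import re
from statistics import mean


def phrase_counts_per_sentence(text):
    """Return a list with the number of phrases in each sentence.

    text is a list of str (the lines of the file). Hypothetical helper,
    not part of the original assignment.
    """
    whole = ''.join(text)
    counts = []
    # First split into sentences on terminating punctuation.
    for sentence in re.split(r'[!?.]', whole):
        if not sentence.strip():
            continue  # skip blank leftovers such as the trailing '\n'
        # Then split this one sentence into phrases, ignoring blank pieces.
        phrases = [p for p in re.split(r'[,;:]', sentence) if p.strip()]
        counts.append(len(phrases))
    return counts


text = ['The time has come, the Walrus said\n',
        'To talk of many things: of shoes - and ships - and sealing wax,\n',
        'Of cabbages; and kings.\n',
        'And why the sea is boiling hot;\n',
        'and whether pigs have wings.\n']

counts = phrase_counts_per_sentence(text)
print(counts)        # [5, 2]
print(mean(counts))  # 3.5
```

Splitting the whole text on ,;: in one go yields only 6 pieces, because the piece ' and kings.\nAnd why the sea is boiling hot' straddles a sentence boundary and is counted once. That is where the 3.0 in the question comes from; counting inside each sentence gives 7 phrases over 2 sentences.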
