python - How do I return a null value or None in a pandas DataFrame?

Question

Sample data: https://docs.google.com/spreadsheets/d/1s6MzBu5lFcc-uUZ9B6CI1YR7P1fDSm4cByFwKt3ckgc/edit?usp=sharing

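I have this function that uses textacy to extract source attributions. It automatically returns the speaker, cue, and content of each quotation. In my dataset, some paragraphs contain several quotations, but I only need the first one, which is why I put a break inside the for loop.

My problem now is that some of the raw data has no quotations, so I want the function not just to skip those rows but to return something for them. I believe the problem is in the code after except: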

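It returns something like this:

But it should have skipped the first row, because the first row raises an error, so I want it to look like this: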

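import pandas as pd
import spacy
from textacy import extract as ex  # assumption: `ex` in the post refers to textacy.extract

# Assumption: the post does not say which spaCy pipeline is loaded as `nlp`.
nlp = spacy.load("en_core_web_sm")

def extract_direct(text):
    extracted = pd.DataFrame()
    for i in text:
        try:
            doc = nlp(i)
            a = ex.direct_quotations(doc)
            for item in a:
                mined = {'speaker': item.speaker, 'cue': item.cue, 'content': item.content}
                extracted = extracted.append(mined, ignore_index=True)
                break  # keep only the first quotation in each text
        except ValueError:
            continue  # the row is skipped and nothing is recorded for it
    # `news_only` is not defined in the post; it comes from the asker's own code.
    contents = news_only['index']
    extracted = pd.concat([extracted, contents], ignore_index=True)
    return extracted

extract_direct(dataframe['Body'])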
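
The effect of `continue` in the function above can be illustrated without textacy. The following is a minimal sketch in which `fake_first_quote` is a hypothetical stand-in for the real extraction call; treating an unbalanced quotation mark as the cause of the `ValueError` is an assumption made only for this illustration. It shows that skipped rows leave the result shorter than the input, so the output no longer lines up with the original rows.

```python
def fake_first_quote(text):
    """Hypothetical stand-in for the textacy call: return the first quoted span."""
    # Assumption for illustration: unbalanced quotation marks raise ValueError.
    if text.count('"') % 2 != 0:
        raise ValueError("unbalanced quotation marks")
    parts = text.split('"')
    return parts[1] if len(parts) > 2 else None


def extract_skipping(texts):
    """Mimic the question's loop: rows that raise are silently dropped."""
    results = []
    for text in texts:
        try:
            quote = fake_first_quote(text)
        except ValueError:
            continue  # nothing is recorded for this row
        if quote is not None:
            results.append(quote)
    return results


texts = ['He said "hello" to her.', 'A stray " mark.', 'She replied "fine".']
skipped = extract_skipping(texts)
print(len(texts), len(skipped))  # the two lengths differ
```

Because the second text is dropped, the third quotation ends up in position 1 of the result rather than position 2, which is why a placeholder row is needed.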

score 0 · Accepted Answer

I did this to solve the problem. I had to append a row in two places: in the try block and in the except block.

def extract_direct(text):
    extracted = pd.DataFrame()
    for i in text:
        try:
            doc = nlp(i)
            a = ex.direct_quotations(doc)
            for item in a:
                mined = {'speaker': item.speaker, 'cue': item.cue, 'content': item.content}
                extracted = extracted.append(mined, ignore_index=True)
                break  # keep only the first quotation in each text
        except ValueError:
            # Record a placeholder row so the failed text still produces output.
            mined = {'speaker': 'None', 'cue': 'None', 'content': 'None'}
            extracted = extracted.append(mined, ignore_index=True)
    return extracted
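
One gap remains in the approach above: a text that parses without error but contains no quotation never enters the inner loop, so no row is appended for it. The placeholder is also the string 'None' rather than a real missing value, and `DataFrame.append` was removed in pandas 2.0. The following is a sketch of one possible alternative, not the answer's method. It collects rows in a list, uses actual `None` values, and builds the DataFrame once. `first_quote_rows`, `fake_quotes`, and `EMPTY` are hypothetical names, and `fake_quotes` only imitates the extraction step so the example runs without spaCy or textacy.

```python
import pandas as pd

# Placeholder used when a text yields no quotation or raises an error.
EMPTY = {"speaker": None, "cue": None, "content": None}


def first_quote_rows(texts, quote_fn):
    """Return one row per input text, holding its first quotation or None values."""
    rows = []
    for text in texts:
        try:
            # Take only the first quotation; None if there are none at all.
            first = next(iter(quote_fn(text)), None)
        except ValueError:
            first = None
        rows.append(dict(EMPTY) if first is None else first)
    return pd.DataFrame(rows, columns=["speaker", "cue", "content"])


def fake_quotes(text):
    """Hypothetical stand-in for the nlp + direct_quotations step."""
    # Assumption for illustration: unbalanced quotation marks raise ValueError.
    if text.count('"') % 2 != 0:
        raise ValueError("unbalanced quotation marks")
    # Every odd-indexed part sits between a pair of quotation marks.
    quoted = text.split('"')[1::2]
    return [{"speaker": "someone", "cue": "said", "content": q} for q in quoted]


texts = ['He said "hello" and then "bye".', 'A stray " mark.', 'No quotes here.']
df = first_quote_rows(texts, fake_quotes)
print(df)
```

Since the result has exactly one row per input text, it can be joined back to the source data by position, and the missing rows can be found with `df['content'].isna()`.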
