I currently have a list of dictionaries containing legal decisions that looks like this (391 decisions in total).

data = [{'text': "ECLI:NL:GHLEE:2002:AL8039   Instantie  Gerechtshof Leeuwarden  Datum uitspraak  18-10-2002  Datum publicatie  08-10-2003  Zaaknummer   BK 866/98 Vennootschapsbelasting   Rechtsgebieden   Belastingrecht   Bijzondere kenmerken"},
{'text': "ECLI:NL:GHARL:2014:5893   Instantie  Gerechtshof Arnhem-Leeuwarden  Datum uitspraak  15-07-2014  Datum publicatie  01-08-2014  Zaaknummer   14/00030   Formele relaties  Eerste aanleg: ECLI:NL:RBGEL:2013:4925 , Bekrachtiging/bevestiging   Rechtsgebieden   Belastingrecht   Bijzondere kenmerken"},
{'text': "ECLI:NL:GHARL:2015:7518   Instantie  Gerechtshof Arnhem-Leeuwarden  Datum uitspraak  06-10-2015  Datum publicatie  16-10-2015  Zaaknummer   14/01259   Formele relaties  Eerste aanleg: ECLI:NL:RBGEL:2014:6894 , Bekrachtiging/bevestiging Cassatie: ECLI:NL:HR:2016:2736    **Rechtsgebieden   Strafrecht** Bijzondere kenmerken"}]

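For this project, I want to remove every element whose string contains "Rechtsgebieden   Strafrecht". So I have to loop over all elements and then delete the whole entry at that index, for example this one: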

{'text': "ECLI:NL:GHARL:2015:7518   Instantie  Gerechtshof Arnhem-Leeuwarden  Datum uitspraak  06-10-2015  Datum publicatie  16-10-2015  Zaaknummer   14/01259   Formele relaties  Eerste aanleg: ECLI:NL:RBGEL:2014:6894 , Bekrachtiging/bevestiging Cassatie: ECLI:NL:HR:2016:2736    **Rechtsgebieden   Strafrecht** Bijzondere kenmerken"}

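I came up with something like this, but I can't seem to find a way to get the correct index (data[d] does not work, of course, because d is the dictionary itself, not its position):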

    substring = "Rechtsgebieden   Strafrecht"
    for d in data:
        if substring in str(d):
            del data[d]

1 Answer

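Please note the following points:

  1. You should test against d['text'] rather than str(d), since str(d) also contains the key name and the dictionary's repr formatting.
  2. You should not modify data while you are iterating over it. For more details on why, see this question.

Here is an example of the incorrect behavior: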

>>> data
[{'text': 'some text'}, {'text': 'text with value to be removed'}, {'text': 'text with value to be removed BUT WILL NOT BE REMOVED'}, {'text': 'some other text'}, {'text': 'this text is also fine'}]
>>> VALUE_TO_REMOVE
'value to be removed'
>>> for i, item in enumerate(data):
...     if VALUE_TO_REMOVE in item['text']:
...             del data[i]
... 
>>> data
[{'text': 'some text'}, {'text': 'text with value to be removed BUT WILL NOT BE REMOVED'}, {'text': 'some other text'}, {'text': 'this text is also fine'}]
>>> 
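The skip happens because deleting data[i] shifts every later element one position to the left, while the loop's internal index still advances by one. A minimal sketch that records which entries the loop actually visits (the short sample texts and the name visited are made up for illustration):

```python
# Hypothetical miniature data set; 'x1' and 'x2' stand in for entries to remove.
data = [{'text': 'a'}, {'text': 'x1'}, {'text': 'x2'}, {'text': 'b'}]

visited = []
for i, item in enumerate(data):
    visited.append((i, item['text']))
    if 'x' in item['text']:
        # After this deletion, 'x2' slides into position i and is never visited.
        del data[i]

print(visited)  # 'x2' does not appear among the visited entries
print(data)     # 'x2' is still in the list
```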
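  3. Since you cannot safely delete entries from data while iterating over it, create a new list that contains only the dictionaries that meet your condition.

The preferred, correct way: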

>>> data
[{'text': 'some text'}, {'text': 'text with value to be removed'}, {'text': 'text with value to be removed BUT WILL NOT BE REMOVED'}, {'text': 'some other text'}, {'text': 'this text is also fine'}]
>>> VALUE_TO_REMOVE
'value to be removed'
>>> new_data = [x for x in data if VALUE_TO_REMOVE not in x['text']]
>>> new_data
[{'text': 'some text'}, {'text': 'some other text'}, {'text': 'this text is also fine'}]
>>> 
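If other parts of the program hold a reference to the same list object, one option is slice assignment: it still builds a filtered list, but stores the result back into the existing list instead of binding a new name. A minimal sketch using the substring from the question; the two entries are abbreviated stand-ins for the real decisions, and alias is a hypothetical second reference:

```python
substring = "Rechtsgebieden   Strafrecht"

# Abbreviated stand-ins for the real decision texts.
data = [
    {'text': "ECLI:NL:GHLEE:2002:AL8039   Rechtsgebieden   Belastingrecht"},
    {'text': "ECLI:NL:GHARL:2015:7518   Rechtsgebieden   Strafrecht"},
]
alias = data  # hypothetical second reference to the same list object

# Replace the contents of the existing list with the filtered entries.
data[:] = [d for d in data if substring not in d['text']]

print(len(data))
print(alias is data)
```

Note that the dictionaries themselves are not copied in either variant; only the list of references is rebuilt.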
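  4. If your data is large and you don't want to build a second list, you can mark the individual dictionaries with a 'DO NOT USE' key and then check for it during processing. See below: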
>>> data
[{'text': 'some text'}, {'text': 'text with value to be removed'}, {'text': 'text with value to be removed BUT WILL NOT BE REMOVED'}, {'text': 'some other text'}, {'text': 'this text is also fine'}]
>>> for item in data:
...     if VALUE_TO_REMOVE in item['text']:
...             item['DO NOT USE'] = True
... 
>>> data
[{'text': 'some text'}, {'text': 'text with value to be removed', 'DO NOT USE': True}, {'text': 'text with value to be removed BUT WILL NOT BE REMOVED', 'DO NOT USE': True}, {'text': 'some other text'}, {'text': 'this text is also fine'}]
>>> 
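Later processing then has to check the flag. A minimal sketch, assuming the 'DO NOT USE' key is only present on entries that should be skipped; the name processed and the sample entries are made up for illustration:

```python
# Sample entries shaped like the flagged data above.
data = [
    {'text': 'some text'},
    {'text': 'text with value to be removed', 'DO NOT USE': True},
    {'text': 'some other text'},
]

processed = []
for item in data:
    # dict.get returns False here when the key is absent, so unflagged entries pass.
    if item.get('DO NOT USE', False):
        continue
    processed.append(item['text'])

print(processed)
```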
answered 2021-04-08T10:08:55.997