
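I have 2 lists: one contains many components, and the other contains components together with their descriptions. I need a way to filter out all the useless information while keeping the description list in the same order as the component list.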

I tried using a list comprehension, but it did not give me the result I expected.

lst = [] 
for i in range (len(components)):
   lst.append([x for x in description if components[i] in x])

Here is a short version of the 2 variables:

components = ['INVALID' , 'R100' , 'R101' , 'C100' , 'R100' , 'R100']
description = [
'  30_F "30_F";',
'  POWER_IN1 Supply   2 At     5 Volts, 0.8 Amps;',
'  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";',
'  R101          100     5     5 f PN"66151002538" "CH-WID_ 100R -5-RR 0603 (B)";',
'  C100          100n    10    10 f PN"10210616" "CFCAP X7R S 100nF 50V (T)";',
'  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";',
'  R100       CLOSED PN"10057609" "RES S 5mOhm 1% 2512_H6_1 (T)"      VERSION 12046547;']

My expected output is:

'  INVALID    No description'
'  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";'
'  R101       100     5     5 f PN"66151002538" "CH-WID_ 100R -5-RR 0603 (B)";'
'  C100       100n    10    10 f PN"10210616" "CFCAP X7R S 100nF 50V (T)";'
'  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";'
'  R100       CLOSED PN"10057609" "RES S 5mOhm 1% 2512_H6_1 (T)"      VERSION 12046547;'

5 Answers


Using str.startswith, an auxiliary list of already-seen positions, and Python's for/else construct:

import pprint

...  # your input data variables

seen_pos = []
res = []
for comp in components:
    for i, desc in enumerate(description):
        if i not in seen_pos and desc.strip().startswith(comp):
            seen_pos.append(i)
            res.append('{:<10}{}'.format(comp, desc.strip().replace(comp, '', 1).strip()))
            break
    else:
        res.append('{:<10}{}'.format(comp, 'No description'))

pprint.pprint(res, width=100)

Output:

['INVALID   No description',
 'R100      OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";',
 'R101      100     5     5 f PN"66151002538" "CH-WID_ 100R -5-RR 0603 (B)";',
 'C100      100n    10    10 f PN"10210616" "CFCAP X7R S 100nF 50V (T)";',
 'R100      OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";',
 'R100      CLOSED PN"10057609" "RES S 5mOhm 1% 2512_H6_1 (T)"      VERSION 12046547;']
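
One caveat with startswith is that it also accepts a longer name sharing the same prefix (a line for R1000 would match a lookup for R100). The following is only a sketch of a stricter variant, assuming the component name is always the first whitespace-delimited token of a line; match_descriptions and its missing parameter are hypothetical names, not part of the answer above.

```python
def match_descriptions(components, description, missing="No description"):
    """Pair each component with the first unused line whose first token equals it."""
    seen = set()   # indices of description lines that were already consumed
    result = []
    for comp in components:
        for i, desc in enumerate(description):
            parts = desc.split(None, 1)   # [name, rest], or [] for a blank line
            if i not in seen and parts and parts[0] == comp:
                seen.add(i)
                rest = parts[1].strip() if len(parts) > 1 else ""
                result.append("{:<10}{}".format(comp, rest))
                break
        else:
            # The inner loop ended without break, so no line matched.
            result.append("{:<10}{}".format(comp, missing))
    return result
```

Using a set for the seen positions keeps the membership test cheap when the description list is long.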
answered 2019-08-15T14:14:40.810
[x for x in description if x.split()[0] in components]
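
Note that the one-liner above returns lines in description order, drops components that have no description, and raises IndexError on an empty or whitespace-only line, because split() returns an empty list there. A minimal sketch with a guard for that case follows; filter_known is a hypothetical helper name, and the sketch assumes the component name is the first token of each line.

```python
def filter_known(description, components):
    """Keep the description lines whose first token is a known component."""
    wanted = set(components)      # set lookup instead of scanning the list
    kept = []
    for line in description:
        tokens = line.split()
        if tokens and tokens[0] in wanted:   # skip blank lines safely
            kept.append(line)
    return kept
```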
answered 2019-08-15T14:17:41.100

A solution using re. It preserves the order defined in the components list:

components = ['R100' , 'R101' , 'C100' , 'R100' , 'R100']
description = [
'  30_F "30_F";',
'  POWER_IN1 Supply   2 At     5 Volts, 0.8 Amps;',
'  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";',
'  R101          100     5     5 f PN"66151002538" "CH-WID_ 100R -5-RR 0603 (B)";',
'  C100          100n    10    10 f PN"10210616" "CFCAP X7R S 100nF 50V (T)";',
'  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";',
'  R100       CLOSED PN"10057609" "RES S 5mOhm 1% 2512_H6_1 (T)"      VERSION 12046547;']

import re

c = iter(components)

filtered = []
current = next(c)
for line in description:
    if current and re.findall(r'^\s*{}\s*'.format(re.escape(current)), line):
        filtered.append(line)
        current = next(c, None)

from pprint import pprint
pprint(filtered, width=150)

Prints:

['  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";',
 '  R101          100     5     5 f PN"66151002538" "CH-WID_ 100R -5-RR 0603 (B)";',
 '  C100          100n    10    10 f PN"10210616" "CFCAP X7R S 100nF 50V (T)";',
 '  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";',
 '  R100       CLOSED PN"10057609" "RES S 5mOhm 1% 2512_H6_1 (T)"      VERSION 12046547;']
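
The components list in this answer omits 'INVALID'. With it included, current would stay at 'INVALID' and no line would ever match. Below is a rough sketch of one way to extend the same sequential idea so that a component without a description gets a placeholder; filter_in_order, missing, and the placeholder layout are assumptions made for illustration, not part of the answer above.

```python
import re

def filter_in_order(components, description, missing="No description"):
    """Walk description once per component, never re-using an earlier line."""
    result = []
    pos = 0   # index of the first description line not yet consumed
    for comp in components:
        # The name must be followed by whitespace or the end of the line.
        pattern = re.compile(r"^\s*{}(?:\s|$)".format(re.escape(comp)))
        for i in range(pos, len(description)):
            if pattern.match(description[i]):
                result.append(description[i])
                pos = i + 1
                break
        else:
            # No remaining line matched: emit a placeholder, keep pos as is.
            result.append("  {:<11}{}".format(comp, missing))
    return result
```

Like the original loop, this assumes the matching lines appear in description in the same relative order as in components.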
answered 2019-08-15T14:23:37.140

Just use a simple list comprehension with basic filtering:

>>> res = [d for d in description if d.strip().split(' ', 1)[0] in components]
>>> pprint(res)
['  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";',
 '  R101          100     5     5 f PN"66151002538" "CH-WID_ 100R -5-RR 0603 (B)";',
 '  C100          100n    10    10 f PN"10210616" "CFCAP X7R S 100nF 50V (T)";',
 '  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";',
 '  R100       CLOSED PN"10057609" "RES S 5mOhm 1% 2512_H6_1 (T)"      VERSION 12046547;']
answered 2019-08-15T14:29:09.167

Update: the OP changed the question. Checking for 'INVALID' adds an extra layer of complexity that this answer does not cover.


Loop through the strings in description and add each one to a list if it contains any of the strings in components.

comp_set = set(components)
filtered = [d for d in description if any(c in d for c in comp_set)]

for x in filtered:
    print(x)

Output:

  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";
  R101          100     5     5 f PN"66151002538" "CH-WID_ 100R -5-RR 0603 (B)";
  C100          100n    10    10 f PN"10210616" "CFCAP X7R S 100nF 50V (T)";
  R100       OPEN PN"10057609" "RES S 5mOhm 1% 2512_H6_1(T)";
  R100       CLOSED PN"10057609" "RES S 5mOhm 1% 2512_H6_1 (T)"      VERSION 12046547;
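
The substring test c in d matches a component name anywhere in the line, and the result follows description order rather than components order. For the updated question, one possible approach (a sketch only, not something this answer proposed) is to group the lines by their first token and then hand them out in components order. The names align and missing, and the placeholder layout, are hypothetical.

```python
from collections import defaultdict, deque

def align(components, description, missing="No description"):
    """Return one line per component, in components order."""
    by_name = defaultdict(deque)
    for line in description:
        tokens = line.split()
        if tokens:                       # ignore blank lines
            by_name[tokens[0]].append(line)

    aligned = []
    for comp in components:
        queue = by_name.get(comp)        # .get avoids creating empty entries
        if queue:
            aligned.append(queue.popleft())   # oldest unused line first
        else:
            aligned.append("  {:<11}{}".format(comp, missing))
    return aligned
```

Each line is visited once while grouping, so this avoids rescanning description for every component.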
answered 2019-08-15T14:40:02.280