python - Pythonic way to insert a comma before capital letters [regex]

Question

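My regex-fu is seriously lacking and I can't get my head around this... any help is greatly appreciated.

I'm looking for a Pythonic way to parse a string spat out by a legacy piece of software (which I don't have source code access to):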

,Areas for further improvement,The school’s leaders are rightly seeking to improve the following areas:,,========2========,,3/5,Continue to focus on increasing performance at the higher levelsPupils’,literacy and numeracy skills across the curriculumStandards,in science throughout the schoolPupils’,numerical reasoning skills

What I want to do is:

(1) Remove all existing , : = / characters to form a single continuous string:

Areas for further improvementThe school’s leaders are rightly seeking to improve the following areas23/5Continue to focus on increasing performance at the higher levelsPupils’literacy and numeracy skills across the curriculumStandardsin science throughout the schoolPupils’numerical reasoning skills

(2) Then put a , in front of each capital letter so that I can use the string as sensible csv input...

,Areas for further improvement,The school’s leaders are rightly seeking to improve the following areas23/5,Continue to focus on increasing performance at the higher levels,Pupils’literacy and numeracy skills across the curriculum,Standardsin science throughout the school,Pupils’numerical reasoning skills

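I appreciate this will give me a leading comma, but I can strip that out when I write the file.

Is this possible with a re.sub() and some regex-fu?

(I'm happy for this to be a two-step process: remove the existing junk characters, then add a comma in front of each capital letter.)

Can anyone come to my regex rescue?

Cheers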

score 3 · Accepted Answer

import re
re.sub(r'([A-Z])', r',\1', re.sub(r'[,:=/]', '', input_))  # input_ is the raw string from the question

Output:

',Areas for further improvement,The school’s leaders are rightly seeking to improve the following areas235,Continue to focus on increasing performance at the higher levels,Pupils’literacy and numeracy skills across the curriculum,Standardsin science throughout the school,Pupils’numerical reasoning skills'
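The one-liner above can be wrapped in a small helper that also drops the leading comma the question mentions. This is only a minimal sketch: the function name `insert_commas` and the shortened `sample` string are made up for illustration and do not come from the original post.

```python
import re

def insert_commas(raw: str) -> str:
    """Strip the junk characters, then put a comma before each capital letter."""
    cleaned = re.sub(r'[,:=/]', '', raw)                # step 1: drop , : = /
    with_commas = re.sub(r'([A-Z])', r',\1', cleaned)   # step 2: comma before A-Z
    return with_commas.lstrip(',')                      # remove the leading comma

# Shortened, hypothetical sample in the same shape as the question's input.
sample = ',Areas for improvement:,,====2====,,3/5,Continue to focusPupils’,literacy skills'
print(insert_commas(sample))
```

Note that `[A-Z]` only matches ASCII capitals, so accented capital letters would not get a comma with this pattern.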

score 1 · Accepted Answer

You can apply re.sub twice:

import re
s = ',Areas for further improvement,The school’s leaders are rightly seeking to improve the following areas:,,========2========,,3/5,Continue to focus on increasing performance at the higher levelsPupils’,literacy and numeracy skills across the curriculumStandards,in science throughout the schoolPupils’,numerical reasoning skills'
new_s = re.sub(r'[A-Z]', lambda x: f',{x.group()}', re.sub(r'[,:=]+', '', s))  # raw strings avoid the invalid '\=' escape; '/' is not removed here

Output:

',Areas for further improvement,The school’s leaders are rightly seeking to improve the following areas23/5,Continue to focus on increasing performance at the higher levels,Pupils’literacy and numeracy skills across the curriculum,Standardsin science throughout the school,Pupils’numerical reasoning skills'
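If the end goal is a csv file, one option is to split the comma-marked string into a list of fields and let the standard `csv` module write the row, which also avoids the leading empty field. The following is a rough sketch under that assumption; `split_fields` and the short input string are hypothetical, and the character class is the same one used in this answer, so `/` is kept.

```python
import csv
import io
import re

def split_fields(raw: str) -> list:
    """Return the text chunks that start at each capital letter."""
    cleaned = re.sub(r'[,:=]+', '', raw)   # same class as the answer: '/' stays
    marked = re.sub(r'[A-Z]', lambda m: f',{m.group()}', cleaned)
    # Splitting on the inserted commas leaves an empty first item; filter it out.
    return [field for field in marked.split(',') if field]

# Write one row to an in-memory buffer; a real script would open a file instead.
buffer = io.StringIO()
fields = split_fields(',Areas:,,==2==,,3/5,Continue to focusPupils’,skills')
csv.writer(buffer).writerow(fields)
print(buffer.getvalue())
```

Using `csv.writer` rather than joining by hand means any field that happens to contain a quote or a comma later on is quoted correctly.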
