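I am using a speech-to-text application that produces a transcript file as output. The transcribed text contains some tags, such as (s) for the start of a sentence, (/s) for the end of a sentence, and (VOCAL_NOISE) for an unrecognized word. However, the text also contains unwanted tags such as (VOCAL_N), (VOCAL_NOISED), (VOCAL_SOUND), and (UNKNOWN). I am processing the text with sed, but I have not been able to write a regular expression that replaces every tag other than (s), (/s), and (VOCAL_NOISE) with ~NS. I would appreciate it if someone could help me.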
Sample text:
(s) Hi Stacey , this is Stanley (/s) (s) I would (VOCAL_N) appreciate if you could call (UNKNOWN) and let him know I want an appointment (VOCAL_NOISE) with him (/s)
The output should be:
(s) Hi Stacey , this is Stanley (/s) (s) I would ~NS appreciate if you could call ~NS and let him know I want an appointment (VOCAL_NOISE) with him (/s)
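
For reference, here is a minimal sketch of one way this might be done, not a definitive solution. sed's regular expressions have no negative lookahead, so the sketch works in three passes: it temporarily rewrites the three tags to keep into an angle-bracket form, replaces every remaining parenthesized tag with ~NS, then restores the kept tags. The function name clean_tags is hypothetical, and the sketch assumes that the transcript never contains the literal strings <s>, </s>, or <VOCAL_NOISE>, that tags are never nested, and that the sed in use supports -E (GNU and BSD sed both do).

```shell
# Hypothetical helper: reads transcript text on stdin, writes cleaned text on stdout.
# Assumption: the input never contains the literal placeholders <s>, </s>, <VOCAL_NOISE>.
clean_tags() {
  # Pass 1: protect the allowed tags by rewriting (TAG) as <TAG>.
  # Pass 2: replace any remaining parenthesized tag with ~NS.
  # Pass 3: restore the protected tags from <TAG> back to (TAG).
  sed -E \
    -e 's#\((s|/s|VOCAL_NOISE)\)#<\1>#g' \
    -e 's#\([^()]*\)#~NS#g' \
    -e 's#<(s|/s|VOCAL_NOISE)>#(\1)#g'
}
```

It could be used as `clean_tags < transcript.txt > cleaned.txt`, where both file names are placeholders. Because the first pass requires the closing parenthesis right after the tag name, near-misses such as (VOCAL_N) and (VOCAL_NOISED) are not protected and fall through to the ~NS replacement.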