
I want to split this text. I am trying to do it with a JavaScript regular expression.

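(1) Really not. (2) Uh huh. (3) Behold Prince (4) are key in his natural element, cowering at the mercy of the women in his life. (5) See me perhaps you'd like to spout with my daughters and teach them some combination. (6) No doubt you are the best teacher, your majesty. (7) It is my daughter's that teach me in the languages of the modern world, for instance.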

I want to parse it into groups of fragments. I am looking for one of these results.

[
  [1, "Really not."],
  [2, "Uh huh."],
  [3, "Behold Prince"],
]


[
  {id: 1, text: "Really not."},
  {id: 2, text: "Uh huh."},
  {id: 3, text: "Behold Prince"},
]

I am using this pattern.

/\(([0-9])\){1,3}(.+?)\(/g

Can you help me, please? What pattern should I use to split the text correctly?

Thanks in advance!


3 Answers


You can use a regex together with the string matchAll method in JavaScript to do what you want.

const str = `(1) Really not. (2) Uh huh. (3) Behold Prince (4) are key in his natural element, cowering at the mercy of the women in his life. (5) See me perhaps you'd like to spout with my daughters and teach them some combination. (6) No doubt you are the best teacher, your majesty. (7) It is my daughter's that teach me in the languages of the modern world, for instance.`;

let array = [...str.matchAll(/\(([0-9]+)\)\s*(.*?)\s*(?=$|\()/g)].map(a=>[+a[1],a[2]])

console.log(array)
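
One detail that may be worth knowing here: matchAll only accepts a regex that carries the g flag and throws a TypeError otherwise. The sketch below shows that behaviour and, for environments without matchAll, an equivalent loop built on RegExp.prototype.exec. The names sample, pattern, pairs and threwTypeError are illustrative and not part of the original answer.

```javascript
// Shortened sample text, assumed for illustration.
const sample = "(1) Really not. (2) Uh huh. (3) Behold Prince";
const pattern = /\(([0-9]+)\)\s*(.*?)\s*(?=$|\()/g;

// exec-based loop: with the g flag, each call resumes at pattern.lastIndex.
const pairs = [];
let match;
while ((match = pattern.exec(sample)) !== null) {
  pairs.push([Number(match[1]), match[2]]);
}
console.log(pairs);

// matchAll rejects a regex without the g flag.
let threwTypeError = false;
try {
  sample.matchAll(/\(([0-9]+)\)/);
} catch (err) {
  threwTypeError = err instanceof TypeError;
}
console.log(threwTypeError);
```

The loop terminates because every match consumes at least the numbered marker, so lastIndex always advances.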

I updated my answer to use The fourth bird's regex, since it is much cleaner than the one I wrote.

answered 2021-06-02T12:20:12.983

Instead of matching the final (, you can assert it, or assert the end of the string.

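The part \){1,3} in your pattern means: repeat the closing parenthesis 1-3 times. The quantifier applies to \), not to the digit.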
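
A small check can make the quantifier placement concrete. This is only a sketch: the anchored patterns and the names original, digits and checks are made up for illustration, and the variant with [0-9]{1,3} is one possible way to limit the id to 1-3 digits.

```javascript
// The question's quantifier, anchored for testing: {1,3} repeats the ")".
const original = /^\(([0-9])\){1,3}$/;
// Hypothetical variant: the quantifier moved onto the digit class.
const digits = /^\(([0-9]{1,3})\)$/;

const checks = {
  originalMatchesRepeatedParens: original.test("(1)))"), // true
  originalMatchesThreeDigits: original.test("(123)"),    // false
  digitsMatchesThreeDigits: digits.test("(123)"),        // true
  digitsMatchesFourDigits: digits.test("(1234)"),        // false
};
console.log(checks);
```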

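To match the digits (the pattern below accepts one or more; use [0-9]{1,3} in place of [0-9]+ to allow exactly 1-3):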

\(([0-9]+)\)\s*(.*?)\s*(?=$|\()
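  • \( matches (
  • ([0-9]+) captures 1+ digits in group 1 (referenced as m[1] in the code)
  • \) matches )
  • \s* matches optional whitespace characters
  • (.*?) captures as few characters as possible in group 2 (referenced as m[2] in the code)
  • \s* matches optional whitespace characters
  • (?=$|\() asserts either the end of the string or ( directly to the right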
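
To illustrate why asserting the next ( rather than matching it matters, the sketch below runs the pattern from the question and the pattern explained above against a shortened sample string. The string and the names consuming, asserting, consumedIds and assertedPairs are assumptions made for this example.

```javascript
// Shortened sample text, assumed for illustration.
const sample = "(1) a (2) b (3) c (4) d";

// Pattern from the question: the trailing \( consumes the next opening
// parenthesis, so the following marker can no longer be matched.
const consuming = /\(([0-9])\){1,3}(.+?)\(/g;
// Pattern from this answer: the lookahead checks for "(" without consuming it.
const asserting = /\(([0-9]+)\)\s*(.*?)\s*(?=$|\()/g;

const consumedIds = Array.from(sample.matchAll(consuming), m => m[1]);
const assertedPairs = Array.from(sample.matchAll(asserting), m => [m[1], m[2]]);

console.log(consumedIds);   // ["1", "3"]
console.log(assertedPairs); // [["1","a"],["2","b"],["3","c"],["4","d"]]
```

With the consuming pattern every second marker is skipped, and the last fragment is lost as well because no ( follows it.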

Regex demo

const regex = /\(([0-9]+)\)\s*(.*?)\s*(?=$|\()/g;
const str = `(1) Really not. (2) Uh huh. (3) Behold Prince (4) are key in his natural element, cowering at the mercy of the women in his life. (5) See me perhaps you'd like to spout with my daughters and teach them some combination. (6) No doubt you are the best teacher, your majesty. (7) It is my daughter's that teach me in the languages of the modern world, for instance.`;
console.log(Array.from(str.matchAll(regex), m => [m[1], m[2]]));

answered 2021-06-02T12:20:57.313

... an approach based on matchAll as well as on a RegExp which uses named capture groups and a positive lookahead ... /\((?<id>\d+)\)\s*(?<text>.*?)\s*(?=$|\()/g ...

// see ... [https://regex101.com/r/r39BoJ/1]
const regX = (/\((?<id>\d+)\)\s*(?<text>.*?)\s*(?=$|\()/g);

const text = "(1) Really not. (2) Uh huh. (3) Behold Prince (4) are key in his natural element, cowering at the mercy of the women in his life. (5) See me perhaps you'd like to spout with my daughters and teach them some combination. (6) No doubt you are the best teacher, your majesty. (7) It is my daughter's that teach me in the languages of the modern world, for instance."

console.log([
  ...text.matchAll(regX)
  ].map(
    ({groups: { id, text }}) => ({ id: Number(id), text })
  )
);

Note

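The above approach does not cover an opening parenthesis ( occurring inside a text fragment, which is allowed: the lookahead stops at the first ( it sees, so the fragment gets cut off there. Thus, in order to always be on the safe side, the OP should consider a split/reduce-based approach ...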

const text = "  (1) Really not. (2) Uh (huh). (3) Behold Prince (4) are key in his natural element, cowering at the mercy of the women in his life. (5) See me perhaps you'd like to spout with my daughters and teach them some combination. (6) No doubt you are the best teacher, your majesty. (7) It is my daughter's that teach me in the languages of the modern world, (for instance).  "

console.log(
  text
    .split(/\s*\((\d+)\)\s*/)
    .slice(1)
    .reduce((list, item, idx) => {
      if (idx % 2 === 0) {
        list.push({ id: Number(item) });
      } else {
        // list.at(-1).text = item;
        list[list.length - 1].text = item.trim();
      }
      return list;
    }, [])
);

// test / check ...
console.log(
  'text.split(/\\s*\\((\\d+)\\)\\s*/) ...',
  text.split(/\s*\((\d+)\)\s*/)
);
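
The difference described in the note can be checked directly. The sketch below compares the lookahead used earlier in this answer with a variation that only stops at a numbered marker such as (3). That variation, the helper toRecords and all names used here are assumptions for illustration, not part of the original answer.

```javascript
// Shortened sample with a parenthesised word inside fragment 2 (assumed input).
const tricky = "(1) Really not. (2) Uh (huh). (3) Behold Prince";

// Lookahead that stops at any "(" (the pattern used earlier in this answer).
const anyParen = /\((?<id>\d+)\)\s*(?<text>.*?)\s*(?=$|\()/g;
// Hypothetical variation: only stop at a numbered marker such as "(3)".
const numberedMarker = /\((?<id>\d+)\)\s*(?<text>.*?)\s*(?=$|\(\d+\))/g;

// Hypothetical helper: turn every match into an { id, text } record.
const toRecords = (input, regex) =>
  Array.from(input.matchAll(regex), ({ groups: { id, text } }) => ({
    id: Number(id),
    text,
  }));

const truncated = toRecords(tricky, anyParen);
const complete = toRecords(tricky, numberedMarker);

console.log(truncated[1].text); // "Uh"
console.log(complete[1].text);  // "Uh (huh)."
```

Like the split-based version, the stricter lookahead still treats any parenthesised number inside a fragment, for example (2020), as a marker, so neither variant is safe for such input.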

answered 2021-06-02T12:57:55.817