
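I have the following comment lines from a workflow application. I need to extract some parts of the first line, plus the rest of the comment.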

Here is an example:

Nelly Thomas (Approve) 12/27/2012 8:50 PM - 12/27/2012 8:52 PM
(Nelly Thomas) LazyApproval by nelly.thomas@joshworld.local Approved

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, 
when an unknown printer took a galley of type and scrambled it to make a type specimen book

Now I need to extract it like this:

Nelly Thomas 12/27/2012 8:50 PM

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book

I need a regular expression to achieve this.

1 Answer

OK, here you go:

var s = "Nelly Thomas (Approve) 12/27/2012 8:50 PM - 12/27/2012 8:52 PM\n\
(Nelly Thomas) LazyApproval by nelly.thomas@joshworld.local Approved\n\
\n\
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, \n\
when an unknown printer took a galley of type and scrambled it to make a type specimen book";
// Strings are immutable, so keep the value that replace() returns.
var result = s.replace(/(.+)\(.+\)\s((\d\d\/){2}\d{4}\s\d{1,2}:\d\d\s\w\w)\s-\s.+[\n|\r].+[\n|\r]{2}([^]+)/gi, '$1$2\n\n$4');

//Result: 
"Nelly Thomas 12/27/2012 8:50 PM

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, 
when an unknown printer took a galley of type and scrambled it to make a type specimen book"

This is a working regex, but it isn't particularly pretty:

# /                   --> Regex start:
# (.+)                --> any characters up to the "(" (group #1, includes the trailing space)
# \(.+\)\s            --> followed by text in () and a space.
# ((\d\d\/){2}\d{4}\s --> followed by a date (the inner \d\d\/ is group #3) and
# \d{1,2}:\d\d\s\w\w) --> time (group #2)
# \s-\s               --> followed by ` - `
# .+                  --> followed by any characters except newlines. (The 2nd date)
# [\n|\r]             --> followed by a newline. (Inside [] the | is a literal character, so [\r\n] would be enough.)
# .+                  --> followed by any characters except newlines. (The 2nd line)
# [\n|\r]{2}          --> followed by 2 newlines.
# ([^]+)              --> followed by _any_ characters, including newlines (group #4)
# /gi                 --> Regex end, (g)lobal flag, case (i)nsensitive flag.

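Then output groups 1 and 2, followed by group 4, with a double newline between groups 2 and 4. That is the replacement string '$1$2\n\n$4'.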
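To see what each capture group holds, the same pattern can be run with `match` instead of `replace`. This is only a sketch: the `sample` string uses a shortened, made-up body, and the variable names are illustrative.

```javascript
// Shortened sample in the same layout as the comment above (the body text is made up).
var sample = "Nelly Thomas (Approve) 12/27/2012 8:50 PM - 12/27/2012 8:52 PM\n" +
  "(Nelly Thomas) LazyApproval by nelly.thomas@joshworld.local Approved\n" +
  "\n" +
  "Lorem Ipsum is simply dummy text.";

// Same pattern as in the answer, without the flags (they are not needed for a single match).
var pattern = /(.+)\(.+\)\s((\d\d\/){2}\d{4}\s\d{1,2}:\d\d\s\w\w)\s-\s.+[\n|\r].+[\n|\r]{2}([^]+)/;
var groups = sample.match(pattern);

console.log(JSON.stringify(groups[1])); // "Nelly Thomas " (note the trailing space)
console.log(JSON.stringify(groups[2])); // "12/27/2012 8:50 PM"
console.log(JSON.stringify(groups[3])); // "27/" (last repetition of the inner date group)
console.log(JSON.stringify(groups[4])); // "Lorem Ipsum is simply dummy text."
```

Group 3 only exists because the date part is wrapped in parentheses so it can be repeated, which is why the replacement skips from `$2` to `$4`.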

So it's ugly, but it works, as long as the text follows this format:

W (W) DD/DD/DDDD D(?D):DD LL - W
W

W

Where D is a single digit, L is a letter, W is any number of words and spaces, excluding newlines, and D(?D) means one or two digits.
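The format described above can also be checked explicitly before extracting anything. The following is a rough sketch, not the answer's original code: `extractComment` is a hypothetical helper name, the pattern is a stricter variant written for this example, and it assumes the name itself contains no parentheses. It also joins the wrapped body lines into one line, as in the output the question asks for.

```javascript
// extractComment is a hypothetical helper, not part of the original answer.
// It returns the reformatted comment, or null if the text does not follow the layout.
function extractComment(text) {
  var pattern = new RegExp(
    "^([^(\\r\\n]+?)\\s*\\([^)\\r\\n]*\\)\\s+" +                 // name, then "(status)"
    "(\\d{1,2}/\\d{1,2}/\\d{4}\\s+\\d{1,2}:\\d\\d\\s+[AP]M)" +   // first date and time
    "\\s+-\\s+[^\\r\\n]*\\r?\\n" +                               // " - " and the rest of line 1
    "[^\\r\\n]*\\r?\\n\\r?\\n" +                                 // line 2 and the blank line
    "([\\s\\S]+)$"                                               // body, newlines included
  );
  var m = pattern.exec(text);
  if (m === null) {
    return null;
  }
  // Collapse line breaks inside the body into single spaces.
  var body = m[3].replace(/\s*\r?\n\s*/g, " ").trim();
  return m[1] + " " + m[2] + "\n\n" + body;
}

var input = "Nelly Thomas (Approve) 12/27/2012 8:50 PM - 12/27/2012 8:52 PM\n" +
  "(Nelly Thomas) LazyApproval by nelly.thomas@joshworld.local Approved\n" +
  "\n" +
  "first line, \n" +
  "second line";

console.log(extractComment(input));
console.log(extractComment("not a workflow comment")); // null
```

Returning null for text that does not match makes a format change visible, whereas `replace` would silently return the input unchanged.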

Answered 2013-01-09T07:51:19.747