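(If you are interested in an example that supports nested parentheses, I have added one at the bottom of this answer.)
This implementation is not pure regex, but in my opinion it is easy to follow. It loops over the string and does exactly what you specified in a very simple way.
Suppose we have our string: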
var str="and something here ( something else here and something else or something else) and something here or something here ( something else here and something else or something else)";
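We can tokenize it on the relevant punctuation (spaces and parentheses). The capture group in the pattern keeps the delimiters in the result:
var tokens = str.split(/( |\(|\))/g);
This results in: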
["and", " ", "something", " ", "here", " ", "", "(", "", " ", "something", " ", "else", " ", "here", " ", "and", " ", "something", " ", "else", " ", "or", " ", "something", " ", "else", ")", "", " ", "and", " ", "something", " ", "here", " ", "or", " ", "something", " ", "here", " ", "", "(", "", " ", "something", " ", "else", " ", "here", " ", "and", " ", "something", " ", "else", " ", "or", " ", "something", " ", "else", ")", ""]
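The empty strings in that output come from adjacent delimiters, such as a space directly followed by a parenthesis. They do no harm here, because joining the tokens with "" reproduces the original string exactly. As a small sketch (the names `sample`, `kept`, and `dropped` are made up for illustration and are not part of the question), this compares splitting with and without the capture group:

```javascript
// Illustrative sample string; not taken from the original question.
var sample = "a (b and c) or d";

// With a capture group, split() keeps the delimiters in the result.
var kept = sample.split(/( |\(|\))/g);

// Without a capture group, the delimiters are discarded.
var dropped = sample.split(/ |\(|\)/g);

console.log(kept);
console.log(dropped);
console.log(kept.join("") === sample); // the kept tokens reassemble the input
```

Keeping the delimiters is what lets the loop later rebuild each sentence with a plain `join("")`.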
Now we can iterate over the tokens and collect the sentences:
var str = "and something here ( something else here and something else or something else) and something here or something here ( something else here and something else or something else)";
var tokens = str.split(/( |\(|\))/g);
var inParans = false; // true while we are between "(" and ")"
var sentences = [];
var lastIndex = 0; // token index where the current sentence starts
for (var i = 0; i < tokens.length; i++) {
    if (tokens[i] === "(") {
        inParans = true;
    } else if (tokens[i] === ")") {
        inParans = false;
    } else if ((tokens[i] === "and" || tokens[i] === "or") && !inParans && i > 0) {
        // i > 0 avoids pushing an empty first sentence when the string starts with "and"/"or"
        sentences.push(tokens.slice(lastIndex, i).join("")); // add sentence
        lastIndex = i;
    }
}
sentences.push(tokens.slice(lastIndex).join("")); // add the final sentence
document.body.innerHTML = sentences.join("<br />");
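One limitation of the boolean flag is worth seeing concretely: it is cleared by the first closing parenthesis, even if an outer one is still open. The sketch below wraps the same loop in a helper (the name `splitFlat` and the sample input are hypothetical, chosen only for illustration) and runs it on nested input:

```javascript
// Hypothetical helper wrapping the boolean-flag loop so it can be reused.
function splitFlat(input) {
    var tokens = input.split(/( |\(|\))/g);
    var inParens = false;
    var sentences = [];
    var lastIndex = 0;
    for (var i = 0; i < tokens.length; i++) {
        if (tokens[i] === "(") {
            inParens = true;
        } else if (tokens[i] === ")") {
            inParens = false; // cleared even if an outer "(" is still open
        } else if ((tokens[i] === "and" || tokens[i] === "or") && !inParens) {
            sentences.push(tokens.slice(lastIndex, i).join(""));
            lastIndex = i;
        }
    }
    sentences.push(tokens.slice(lastIndex).join(""));
    return sentences;
}

// Illustrative nested input: "or e" is still inside the outer parentheses.
var nested = "a and (b or (c and d) or e)";
console.log(splitFlat(nested));
```

The last split lands on "or e", which is inside the outer parentheses, so a single flag is not enough once parentheses can nest.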
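If you want to match nested parentheses
Regular expressions in the CS-theory sense cannot correctly match nested data; this follows from the pumping lemma (a finite automaton has no unbounded memory, so it cannot count arbitrarily deep nesting). With our tokenizer, however, we never restricted ourselves to RegExp in the first place, so adding this is easy: we just count the parentheses. Unlike a strict regular expression, we can keep track of the depth in a variable. Here is such code: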
var tokens = str.split(/( |\(|\))/g);
var inParans = 0; // current nesting depth
var sentences = [];
var lastIndex = 0; // token index where the current sentence starts
for (var i = 0; i < tokens.length; i++) {
    if (tokens[i] === "(") {
        inParans++;
    } else if (tokens[i] === ")") {
        inParans--;
        if (inParans < 0) { // more ")" than "(" so far: invalid syntax
            throw new Error("Invalid syntax");
        }
        // If you don't want this to be an error, you can do what Scott suggested
        // and replace the decrement and the check above with:
        // inParans = Math.max(inParans - 1, 0);
    } else if ((tokens[i] === "and" || tokens[i] === "or") && inParans === 0 && i > 0) {
        // only split at nesting depth 0; i > 0 avoids an empty first sentence
        sentences.push(tokens.slice(lastIndex, i).join("")); // add sentence
        lastIndex = i;
    }
}
sentences.push(tokens.slice(lastIndex).join("")); // add the final sentence
document.body.innerHTML = sentences.join("<br />");
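If you would rather call this from other code than write to the page, the counting loop can be wrapped in a function. This is only a sketch under a few assumptions of mine: the name `splitTopLevel` is hypothetical, the check for a parenthesis left open at the end is an extra that the snippet above does not perform, and trimming the pieces is a convenience you may not want.

```javascript
// Hypothetical helper: splits on "and"/"or" only at nesting depth 0.
function splitTopLevel(input) {
    var tokens = input.split(/( |\(|\))/g);
    var depth = 0; // number of currently open parentheses
    var sentences = [];
    var lastIndex = 0;
    for (var i = 0; i < tokens.length; i++) {
        if (tokens[i] === "(") {
            depth++;
        } else if (tokens[i] === ")") {
            depth--;
            if (depth < 0) {
                throw new Error("Invalid syntax: unmatched ')'");
            }
        } else if ((tokens[i] === "and" || tokens[i] === "or") && depth === 0) {
            sentences.push(tokens.slice(lastIndex, i).join(""));
            lastIndex = i;
        }
    }
    if (depth !== 0) {
        // Assumption: input that leaves a "(" open should also be rejected.
        throw new Error("Invalid syntax: unmatched '('");
    }
    sentences.push(tokens.slice(lastIndex).join(""));
    // Trim each piece and drop empty ones (e.g. when the input starts with "and").
    return sentences
        .map(function (s) { return s.trim(); })
        .filter(function (s) { return s.length > 0; });
}

console.log(splitTopLevel("a and (b or (c and d) or e) or f"));
```

With the depth counter, the "or e" inside the outer parentheses is left alone and only the top-level "and" and "or" start new pieces.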