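First off, a descent parser will always give you better control.
A complete regex solution, on the other hand, lets you skip over errors, or at least carry on
instead of choking on them.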
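For comparison, here is a rough sketch of the descent-parser route. It is not the code used further down; the helper names `parse_value` and `parse_arg_list` are made up for illustration, and the grammar (a call is `name(arg, arg, ...)`, an argument is a call, a quoted string, a bare constant, or empty) is only my reading of the sample input.

```perl
use strict;
use warnings;
use Data::Dumper;

# Parse one value at the current pos() of the string referenced by $src.
# Hypothetical helper; \G plus /gc keeps the position between attempts.
sub parse_value {
    my ($src) = @_;
    if ($$src =~ /\G \s* (\w+) \s* [(] /xgc) {              # a call: name(
        my $name = $1;
        return { $name => parse_arg_list($src) };
    }
    return $2 if $$src =~ /\G \s* (["']) (.*?) \1 /xgcs;    # quoted string
    return $1 if $$src =~ /\G \s* ([\w*&^+-]+) /xgc;        # bare constant
    return '';                                              # empty parameter
}

# Parse the arguments up to and including the closing parenthesis.
sub parse_arg_list {
    my ($src) = @_;
    my @args;
    return \@args if $$src =~ /\G \s* [)] /xgc;             # name( ) has no arguments
    while (1) {
        push @args, parse_value($src);
        next if $$src =~ /\G \s* , /xgc;
        last if $$src =~ /\G \s* [)] /xgc;
        die "expected ',' or ')' near offset " . (pos($$src) // 0) . "\n";
    }
    return \@args;
}

my $text = 'isNumeric(right(trim ( ,contract_id,),-1, j( ) ," ", bob))';
my $tree = parse_value(\$text);
print Dumper($tree);
```

The control mentioned above shows up in the `die` line: a hand-written parser knows exactly where and why the input went wrong, whereas a regex just fails to match.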
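Your format is pretty simple, and a regex engine that can do internal recursion
can at least give you the outer match. Using language-level recursion, you can then re-enter that regex,
which lets you parse the cores (the text between each pair of parentheses).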
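A stripped-down sketch of that two-level idea, assuming nothing but word characters for names and ignoring parentheses inside quoted strings: the regex recurses internally with `(?2)` to grab one balanced call, and an ordinary Perl sub (the hypothetical `call_names`) re-enters the same regex on the core.

```perl
use strict;
use warnings;

my $call = qr~
    (\w+) \s*               # 1: function name
    (                       # 2: balanced parentheses, the recursion target
        [(]
        (?: [^()]++         #    a run of non-parenthesis characters
          | (?2)            #    or a nested parenthesised group
        )*
        [)]
    )
~x;

# Return the function names found in $text, indented by nesting depth.
sub call_names {
    my ($text, $depth) = @_;
    my @found;
    while ($text =~ /$call/g) {
        my ($name, $parens) = ($1, $2);
        push @found, ('  ' x $depth) . $name;
        # Language-level recursion: re-enter the regex on the core,
        # i.e. the text between the outer parentheses.
        push @found, call_names(substr($parens, 1, -1), $depth + 1);
    }
    return @found;
}

print "$_\n" for call_names('isNumeric(right(trim(contract_id),1)) asdf("asg")', 0);
```

The full sample below follows the same shape, only with a stricter pattern and a callback that emits array source text instead of names.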
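I'm no PHP expert, but if it supports regex recursion and a language-level eval(), you
will be able to inject array constructs into the source text.
Then eval the string to create a nested array image of the calls, parameters included.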
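Actually, I converted your text into an array in about 12 lines of Perl, but
added to it as it got interesting.
This is a Perl sample. It's dumbed down so it stays readable. It might give you some inspiration to try it in PHP (if it can do these things). Like I said, I'm no PHP expert.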
use Data::Dumper;
my $str = '
asdf("asg")
isNumeric(right(trim(contract_id),1))
var = \'aqfbasdn\'
isNumeric(right(trim ( ,contract_id,),-1, j( ) ," ", bob, george(five(four, two))))
';
my $func = '\w+'; # Allowed characters (very watered down)
my $const = '[\w*&^+-]+';
my $wspconst = '[\w*&^+\s-]+';
my $GetRx = qr~
    \s*
    (                                   # 1 Recursion group
        (?:
            \s* ($func) \s*
            [(]
            (?:   (?> (?: (?!\s*$func\s*[(] | [)] ) . )+ )
                | (?1)
            )*
            [)]
        )
    )
~xs;
my $ParseRx = qr~
    (                                   # 1 Recursion group
        (?:
            \s* ($func) \s*             # 2 Function name
            [(]
            (                           # 3 Function core
                (?:   (?> (?: (?!\s*$func\s*[(] | [)] ) . )+ )
                    | (?1)
                )*
            )
            [)]
            # OR..other stuff
            # Note that this block of |'s is where
            # to put code to parse constants, strings,
            # delimiters, etc ... Not much done, but
            # here is where that goes.
            # -----------------------------------------
          | \s*["'] ($wspconst) ["']\s*   # 4,5 Variable constants
          | \s* ($const) \s*
            # Lastly, accept empty parameters, if
          | (?<=,)                      # a comma behind us,
          | (?<=^)(?!\s*$)              # or beginning of a new 'core' if actually a parameter.
        )
    )
~xs;
##
print "Source string:\n$str\n";
print "=======================================\n";
print "Searching string for functions ...\n";
print "=======================================\n\n";
while ($str =~ /$GetRx/g) {
    print "------------------\nParsing:\n$1\n\n";
    my $res = parse_func($1);
    print "String to be eval()'ed:\n$res\n\n";
    my $hashref = eval $res.";";
    print "Hash from eval()'ed string:\n", Dumper( $hashref ), "\n\n";
}
###
# Rewrite every call and constant in $core as Perl hash/array source text.
sub parse_func
{
    my ($core) = @_;
    $core =~ s/$ParseRx/ parse_callback($2, $3, "$4$5") /eg;
    return $core;
}
# A call (defined $fbody) becomes {'name'=>[...]}; anything else becomes a quoted constant.
sub parse_callback
{
    my ($fname, $fbody, $fconst) = @_;
    if (defined $fbody) {
        return "{'$fname'=>[" . (parse_func( $fbody )) . "]}";
    }
    return "'$fconst'";
}
Output:
Source string:
asdf("asg")
isNumeric(right(trim(contract_id),1))
var = 'aqfbasdn'
isNumeric(right(trim ( ,contract_id,),-1, j( ) ," ", bob, george(five(four, two))))
=======================================
Searching string for functions ...
=======================================
------------------
Parsing:
asdf("asg")
String to be eval()'ed:
{'asdf'=>['asg']}
Hash from eval()'ed string:
$VAR1 = {
'asdf' => [
'asg'
]
};
------------------
Parsing:
isNumeric(right(trim(contract_id),1))
String to be eval()'ed:
{'isNumeric'=>[{'right'=>[{'trim'=>['contract_id']},'1']}]}
Hash from eval()'ed string:
$VAR1 = {
'isNumeric' => [
{
'right' => [
{
'trim' => [
'contract_id'
]
},
'1'
]
}
]
};
------------------
Parsing:
isNumeric(right(trim ( ,contract_id,),-1, j( ) ," ", bob, george(five(four, two))))
String to be eval()'ed:
{'isNumeric'=>[{'right'=>[{'trim'=>['' ,'contract_id','']},'-1',{'j'=>[ ]} ,' ','bob',{'george'=>[{'five'=>['four','two']}]}]}]}
Hash from eval()'ed string:
$VAR1 = {
'isNumeric' => [
{
'right' => [
{
'trim' => [
'',
'contract_id',
''
]
},
'-1',
{
'j' => []
},
' ',
'bob',
{
'george' => [
{
'five' => [
'four',
'two'
]
}
]
}
]
}
]
};
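Once you have the nested structure, walking it is plain recursion. As a small illustration (the `render` helper is hypothetical, not part of the sample above), this turns a hash of the shape shown in the dumps back into call syntax:

```perl
use strict;
use warnings;

# Hypothetical helper: render the nested structure back into call syntax.
sub render {
    my ($node) = @_;
    return "'$node'" unless ref($node) eq 'HASH';   # plain constant
    my ($name) = keys %$node;                       # each hash holds exactly one call
    return $name . '(' . join(', ', map { render($_) } @{ $node->{$name} }) . ')';
}

my $tree = {
    'isNumeric' => [
        { 'right' => [ { 'trim' => ['contract_id'] }, '1' ] },
    ],
};
print render($tree), "\n";    # isNumeric(right(trim('contract_id'), '1'))
```

The same walk is where you would hang whatever you actually want to do with the calls, such as validating names or evaluating them.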