awk supports this:
awk '{print $(NF-1);}'
But it does not work for user-defined variables:
awk '{a=123; b="a"; print $($b);}'
By the way, the shell supports this:
a=123;
b="a";
eval echo \${$b};
How can I achieve this in awk?
OK, since some of us like to eat spaghetti through their nose, here is some actual code that I wrote in the past :-)
First of all, getting self-modifying code to work in a language that does not support it is extremely non-trivial.
The idea behind allowing dynamic variables and function names in a language that does not support them is very simple. At some point in the program, you want something dynamic to modify your own code and then resume execution from where you left off. An eval(), that is.
This is all trivial if the language supports eval() or an equivalent. However, awk does not have such a function. Therefore you, the programmer, have to provide an interface to it.
To allow all this to happen, you have three main problems
Here is some example code, suitable for direct execution. This is the infrastructure that I inject for environments running gawk, as it requires PROCINFO:
echo ""| awk '
function push(d){stack[stack[0]+=1]=d;}
function pop(){if(stack[0])return stack[stack[0]--];return "";}
function dbg_printarray(ary , x , s,e, this , i ){
x=(x=="")?"A":x;for(i=((s)?s:1);i<=((e)?e:ary[0]);i++){print x"["i"]=["ary[i]"]"}}
function dbg_argv(A ,this,p){
A[0]=0;p="/proc/"PROCINFO["pid"]"/cmdline";push(RS);RS=sprintf("%c",0);
while((getline v <p)>0)A[A[0]+=1]=v;RS=pop();close(p);}
{
print "foo";
dbg_argv(A);
dbg_printarray(A);
print "bar";
}'
Result:
foo
A[1]=[awk]
A[2]=[
function push(d){stack[stack[0]+=1]=d;}
function pop(){if(stack[0])return stack[stack[0]--];return "";}
function dbg_printarray(ary , x , s,e, this , i ){
x=(x=="")?"A":x;for(i=((s)?s:1);i<=((e)?e:ary[0]);i++){print x"["i"]=["ary[i]"]"}}
function dbg_argv(A ,this,p){
A[0]=0;p="/proc/"PROCINFO["pid"]"/cmdline";push(RS);RS=sprintf("%c",0);
while((getline v <p)>0)A[A[0]+=1]=v;RS=pop();close(p);}
{
print "foo";
dbg_argv(A);
dbg_printarray(A);
print "bar";
}]
bar
As you can see, as long as the OS does not play with our args and /proc/ is available, it is possible to read ourselves. This may appear useless at first, but we need it for the push/pop of our stack, so that our execution state can be embedded within the code and we can save/resume and survive OS shutdowns/reboots.
I have left out the OS detection function and the bootloader (written in awk) because, if I publish that, kids can build platform-independent polymorphic code, and it is easy to cause havoc with it.
Now, normally you have push() and pop() for registers, so you can save your state, modify yourself, and resume from where you left off. A call followed by reading your stack is a typical way to get the memory address.
Unfortunately, in awk, under normal circumstances we cannot use pointers (without a lot of dirty work) or registers (unless you can inject other stuff along the way). However, you still need a way to suspend and resume your code.
The idea is simple. Instead of letting awk control your loops, your while and if/else conditions, your recursion depth, and which functions you are in, your code should do it.
Keep a stack, a list of variable names, and a list of function names, and manage them yourself.
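A minimal sketch of the "manage it yourself" idea, reusing the push() and pop() helpers from the earlier listing. Instead of a recursive function, the pending work is kept on the explicit stack, so all of the execution state lives in one array. The factorial task and the variable names n and result are made up for illustration only.

```shell
result=$(awk '
function push(d){stack[stack[0]+=1]=d;}
function pop(){if(stack[0])return stack[stack[0]--];return "";}
BEGIN {
    # Compute 5! without recursion: the pending factors live on our own stack.
    for (n = 5; n > 1; n--) push(n);
    result = 1;
    # stack[0] holds the current depth, so the loop ends when the stack is empty.
    while (stack[0]) result *= pop();
    print result;
}')
echo "$result"
```

Because the state is just the contents of stack[], it can be inspected or written out at any point between two pop() calls.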
Just make sure that your code calls self_modify( bool ) constantly, so that even after a sudden failure, as soon as the script is re-run, we can enter self_modify( bool ) and resume our state.
When you want to self-modify your code, you must provide custom write_stack() and read_stack() routines: one writes the state of the stack out as a string, and the other reads the values back from the string embedded in the code itself and resumes the execution state.
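One possible shape for write_stack() and read_stack(), as a sketch only. The function names come from the text, but the bodies, the SEP separator, and the sample stack entries are assumptions. The sketch only covers turning the stack into a string and back; embedding that string into the script text is a separate step.

```shell
result=$(awk '
function push(d){stack[stack[0]+=1]=d;}
function pop(){if(stack[0])return stack[stack[0]--];return "";}
# Serialize the stack into one string; SEP is assumed never to occur in the data.
function write_stack(   i, s){
    s = "";
    for (i = 1; i <= stack[0]; i++) s = s (i > 1 ? SEP : "") stack[i];
    return s;
}
# Rebuild the stack from a string produced by write_stack().
function read_stack(s,   parts, n, i){
    for (i = stack[0]; i >= 1; i--) delete stack[i];
    stack[0] = 0;
    if (s == "") return;
    n = split(s, parts, SEP);
    for (i = 1; i <= n; i++) push(parts[i]);
}
BEGIN {
    SEP = "\034";                 # same byte as the default SUBSEP
    push("loop_i=3"); push("in_function=main");
    saved = write_stack();        # this string is what would be embedded in the code
    pop(); pop();                 # simulate losing the in-memory state
    read_stack(saved);
    print pop();
    print pop();
}')
echo "$result"
```

The entries come back in the same last-in, first-out order they had before the state was dropped.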
Here is a small piece of code that demonstrates the whole flow
echo ""| awk '
function push(d){stack[stack[0]+=1]=d;}
function pop(){if(stack[0])return stack[stack[0]--];return "";}
function dbg_printarray(ary , x , s,e, this , i ){
x=(x=="")?"A":x;for(i=((s)?s:1);i<=((e)?e:ary[0]);i++){print x"["i"]=["ary[i]"]"}}
function _(s){return s}
function dbg_argv(A ,this,p){
A[0]=0;p="/proc/"PROCINFO["pid"]"/cmdline";push(RS);RS=sprintf("%c",0);
while((getline v <p)>0)A[A[0]+=1]=v;RS=pop();close(p);}
{
_(BEGIN_MODIFY"|");print "#foo";_("|"END_MODIFY)
dbg_argv(A);
sub( \
"BEGIN_MODIFY\x22\x5c\x7c[^\x5c\x7c]*\x5c\x7c\x22""END_MODIFY", \
"BEGIN_MODIFY\x22\x7c\x22);print \"#"PROCINFO["pid"]"\";_(\x22\x7c\x22""END_MODIFY" \
,A[2])
print "echo \x22\x22\x7c awk \x27"A[2]"";
print "function bar_"PROCINFO["pid"]"_(s){print \x22""doe\x22}";
print "\x27"
}'
Result:
Exactly the same as our original code, except for
_(BEGIN_MODIFY"|");print "#65964";_("|"END_MODIFY)
and
function bar_56228_(s){print "doe"}
at the end of the code.
Now, this may seem useless, as we are only replacing the code print "#foo"; with our pid.
But it becomes useful when there are multiple _() calls with separate MAGIC strings to identify BLOCKS, and a custom multi-line string replacement routine instead of sub().
You must provide BLOCKS for the stack, the function list, and the execution point, as a bare minimum.
And notice that the last line contains bar.
This in itself is just a string, but when this code is executed repeatedly, notice that
function bar_56228_(s){print "doe"}
function bar_88128_(s){print "doe"}
...
and it keeps growing. While the example is intentionally made so that it does nothing useful, if we provide a routine that calls bar_pid_(s) instead of that print "foo" code, suddenly it means we have eval() on our hands :-)
Now, isn't eval() useful :-)
Don't forget to provide a custom remove_block() function so that the code maintains a reasonable size, instead of growing every time you execute it.
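A hedged sketch of what a remove_block() could look like. The marker format (#BEGIN_BLOCK name and #END_BLOCK name on their own lines) is an assumption chosen for this example and is not the MAGIC string format used in the listing above. The function cuts everything from the begin marker through the end marker out of a string holding the code.

```shell
result=$(awk '
# Remove everything from the begin marker through the end marker (inclusive).
# Markers are matched as plain strings with index(), so no regex escaping is needed.
function remove_block(code, name,   b, e, bi, ei){
    b = "#BEGIN_BLOCK " name "\n";
    e = "#END_BLOCK " name "\n";
    bi = index(code, b);
    if (!bi) return code;                       # no such block
    ei = index(substr(code, bi), e);
    if (!ei) return code;                       # unterminated block: leave it alone
    return substr(code, 1, bi - 1) substr(code, bi + ei - 1 + length(e));
}
BEGIN {
    code = "print 1\n#BEGIN_BLOCK old\nfunction bar_56228_(s){}\n#END_BLOCK old\nprint 2\n";
    printf "%s", remove_block(code, "old");
}')
echo "$result"
```

Using index() and substr() rather than sub() sidesteps the special meaning of & and backslashes in replacement strings, which matters when the text being edited is itself awk code.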
Normally, calling a binary is trivial. However, when doing so from within awk, it becomes difficult. You may say system() is the way.
There are two problems with that.
If you must use system(), ensure that it does not block.
A normal call to system("sleep 20 && echo from-sh & ") will not work.
The solution is simple:
echo ""|awk '{print "foo";E="echo ep ; sleep 20 && echo foo & disown ; "; E | getline v;close(E);print "bar";}'
Now you have an async system() call that does not block :-)
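Not right now. However, if you provide a wrapper, it is possible (in a somewhat hacky and dirty way). The idea is to use the @ operator, introduced in recent versions of gawk.
This @ operator is normally used to call a function by name. So if you had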
function foo(s){print "Called foo "s}
function bar(s){print "Called bar "s}
{
var = "";
if(today_i_feel_like_calling_foo){
var = "foo";
}else{
var = "bar";
}
@var( "arg" ); # This calls function foo(), or function bar() with "arg"
}
Now, this is useful on its own. Assuming we know the var names beforehand, we can write a wrapper to modify and read vars indirectly.
function get(varname, this, call){call="get_"varname;return @call();}
function set(varname, arg, this, call){call="set_"varname; @call(arg);}
So now, for each var name you want to make accessible by name, you declare these two functions
function get_my_var(){return my_var;}
function set_my_var(arg){my_var = arg;}
and perhaps, somewhere in your BEGIN{} block,
BEGIN{ my_var = ""; }
to declare it for global access. Then you can use
get("my_var");
set("my_var", "whatever");
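At first this may seem useless, but there are some really good use cases, such as keeping a linked list of vars by holding a var's name in another var's array, and so on. It works for arrays too, and to be honest, I use this for nesting and linking arrays within arrays, so I can walk through multiple arrays as if I were using pointers.
You can also write configuration scripts that refer to var names inside awk this way, in effect getting an interpreter-inside-an-interpreter type of thing as well...
Not the best way to do things, but it gets the job done, and I do not have to worry about null pointer exceptions or GC and such :-)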
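For reference, the wrapper pieces above can be assembled into one runnable script, sketched here under the assumption that gawk with indirect function call support (the @ syntax) is installed. The name variable and the fallback message are made up for this example, and the unused this parameter from the original wrappers is dropped.

```shell
if command -v gawk >/dev/null 2>&1; then
    result=$(gawk '
    function get(varname,   call){call="get_"varname; return @call();}
    function set(varname, arg,   call){call="set_"varname; @call(arg);}
    function get_my_var(){return my_var;}
    function set_my_var(arg){my_var = arg;}
    BEGIN {
        my_var = "";
        name = "my_var";          # the variable is chosen by name at run time
        set(name, "whatever");
        print get(name);
    }')
else
    result="gawk not installed"   # plain awk or mawk cannot parse the @ calls
fi
echo "$result"
```

Every variable that should be reachable by name still needs its own get_ and set_ pair, so the set of names is fixed when the script is written.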
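The $ sign is not a marker for variables as in shell, PHP, Perl, etc. It is an operator that takes an integer value n and returns the n-th column (field) of the input. So what you did in your first example is not dynamically setting/getting a variable, but calling an operator/function.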
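A small illustration of $ acting as an operator on a numeric expression. The sample input line and the variable n are made up for this example. It also shows why using a variable that holds the string "a" does not help: the string converts to the number 0, so the result is $0, the whole record.

```shell
result=$(echo "10 20 30" | awk '{
    n = 2;
    print $n;          # field 2
    print $(n + 1);    # field 3: the operand can be any numeric expression
    b = "a";
    print $b;          # "a" converts to 0, so this is $0, the whole record
}')
echo "$result"
```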
As the commenters said, you can use an array to achieve the behavior you are looking for:
awk '{a=123; b="a"; v[b] = a; print v[b];}'
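Taking the array idea one step further, the name itself can live in the array, which gives the same double lookup as the shell eval in the question. This is only a sketch: the array v stands in for the variable namespace, and the keys "a" and "b" mirror the variables from the question.

```shell
result=$(awk 'BEGIN {
    v["a"] = 123;          # plays the role of a=123
    v["b"] = "a";          # plays the role of b="a"
    print v[v["b"]];       # look up the name stored in b, then its value
    v[v["b"]] = 456;       # assignment through the stored name works too
    print v["a"];
}')
echo "$result"
```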
I had a similar problem to solve: loading settings from a ".ini" file. I used an array to set the variables dynamically.
It works with Awk or Gawk, on Linux or Windows (GnuWin32)
gawk -v Settings_File="my_settings_file.ini" -f awk_script.awk <processing_file>
[my_settings_file.ini]
#comment
first_var=foo
second_var=bar
[awk_script.awk]
BEGIN{
FS="=";
while((getline < Settings_File)>0) {
if($0 !~ /^[#;]|^[ \t]*$/) {
var_array[$1] = $2;
}
}
print var_array["first_var"];
print var_array["second_var"];
if (var_array["second_var"] == "bar") {
print "works!";
}
}
{
#more processing
}
END {
#finish processing
}
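A self-contained variant of the loader, sketched under a few assumptions: the settings file is a temporary file created just for the demo, the url key and its value are invented, and the line is split at the first "=" with index() instead of setting FS, so that values containing "=" are kept whole.

```shell
ini=$(mktemp)
printf '%s\n' '#comment' 'first_var=foo' 'url=http://example.com/?q=1' > "$ini"
result=$(awk -v Settings_File="$ini" '
BEGIN {
    while ((getline line < Settings_File) > 0) {
        if (line ~ /^[#;]/ || line ~ /^[ \t]*$/) continue;   # comments, blank lines
        eq = index(line, "=");
        if (!eq) continue;                                    # not a key=value line
        var_array[substr(line, 1, eq - 1)] = substr(line, eq + 1);
    }
    close(Settings_File);
    print var_array["first_var"];
    print var_array["url"];
}' < /dev/null)
rm -f "$ini"
echo "$result"
```

Reading into a separate line variable also leaves $0 and the fields untouched, and FS stays free for the main processing rules.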