I wrote a shell command with multiple pipes in it, and it works well. I now want to put it into a (tidy) shell script. Here is the script:

#!/bin/bash
for number in `cat xmlEventLog_2010-03-23T* | sed -nr "/<event eventTimestamp/,/<\/event>/ {/event /{s/^.*$/\n/; p};/payloadType / {h; /protocol/ {s/.*protocol=\"([^\"]*)?\".*/protocol: \1/}; p; x; /type/ {s/.*type=\"([^\"]+)\".*/payload: \1/g}; /type/! {s/.*protocol=\"([^\"]+)\".*/payload: \1/g}; p};/sender / {/sccpAddress/ {s/.*sccpAddress=\"([^\"]*)?\".*/sccpAddress: \1/}; /sccpAddress/! {s/.*/sccpAddress: Unknown/}; p};/result /{s/.*value=\"([^\"]+)\".*/result: \1/g; p};/filter code/{s/.*type=\"([^\"]+)\".*/type: \1/g; p};}"| tee checkThis.txt| awk 'BEGIN{FS="\n"; RS=""; OFS=";"; ORS="\n"} $1~/result: Blocked|Modified/ && $2~/sccpAddress: 353201000001/ && $4~/payload: SMS-MO-FSM-INFO|SMS-MO-FSM/ {$1=$1 ""; print}' | sort | uniq -c| egrep "NUMBER_BLACKLIST|USER_BLACKLIST|NUMBER_WALLEDGARDEN|USER_WALLED_GARDEN|SERVICE_RESTRICTION|BLOCK_VOICE_TO_SMS|PEP_Blacklist_Whitelist" | awk '{print $1}'`; do fil="$fil+$number"
done
echo "fil is $fil"

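I want to tidy it up so that it is readable. The pipeline feeding the for loop through sed and awk is ugly. Does anyone have suggestions for tidying up this pipe monster? Do the pipes prevent me from splitting it across separate lines?

Thanks

A

If you copy the line above into Notepad, you will see what I mean by ugly (but functional).

OK, folks. Here is the final cleaned-up version.

Someone mentioned that the event_structure function could be done entirely in awk. I wonder whether anyone could show me an example of how to do that. The record separator would be set to /event, which would separate the events, but what I am interested in is the structure in events.txt (see below). The numeric result does not matter.

The core of the code is in the event_structure function. I want to parse the data and put it all into a data structure so that I can check it as situations arise. The following works fine. On lines beginning with payloadType I need to parse out 2 values, or set any missing value to Unknown. Is this entirely awk-able, or is my sed/awk combination the best approach here?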

#!/bin/bash

event_structure() {
      sed -nr "/<event eventTimestamp/,/<\/event>/ {
            /event /{s/^.*$/\n/; p}
            /payloadType / {h; /protocol/ {s/.*protocol=\"([^\"]*)?\".*/protocol: \1/}; p; x; /type/ {s/.*type=\"([^\"]+)\".*/payload: \1/g}; /type/! {s/.*protocol=\"([^\"]+)\".*/payload: \1/g}; p}
            /sender / {/sccpAddress/ {s/.*sccpAddress=\"([^\"]*)?\".*/sccpAddress: \1/}; /sccpAddress/! {s/.*/sccpAddress: Unknown/}; p}
            /result /{s/.*value=\"([^\"]+)\".*/result: \1/g; p}
            /filter code/{s/.*type=\"([^\"]+)\".*/type: \1/g; p};}" xmlEventLog_2010-03-23T* |
      tee events.txt|
      awk 'BEGIN{FS="\n"; RS=""; OFS=";"; ORS="\n"}
      $1~/result: Blocked|Modified/ && $2~/sccpAddress: 353201000001/ && $4~/payload: SMS-MO-FSM-INFO|SMS-MO-FSM/ {$1=$1 ""; print}'
}

numbers=$(event_structure | sort | uniq -c | egrep "NUMBER_BLACKLIST|USER_BLACKLIST|NUMBER_WALLEDGARDEN|USER_WALLED_GARDEN|SERVICE_RESTRICTION|BLOCK_VOICE_TO_SMS|PEP_Blacklist_Whitelist" | awk '{print $1}')
addition=`echo $numbers | tr -s ' \n\t' '+' | sed -e '1s/^/fil is /' -e '$s/+$//'`
for number in $numbers
do
      fil="$fil+$number"
done
echo $addition=$(($fil))

Here is part of the generated events.txt file:

result: Blocked
sccpAddress: 353869000000
protocol: SMS
payload: COPS
type: SERVICE_BLACK_LIST
result: Blocked


result: Blocked
sccpAddress: 353869000000
protocol: SMS
payload: COPS
type: SERVICE_BLACK_LIST
result: Blocked

result: Modified
sccpAddress: Unknown
protocol: IM
payload: IM
type: NUMBER_BLACKLIST
result: Modified

result: Allowed
sccpAddress: Unknown
protocol: MM1
payload: MM1

Here is the output:

$ ./bashShell.sh
fil is 2+372+1+1+214+73+1+20=684

Here is the output of the function call:

$ ./bashShell.sh | head -10
result: Blocked;sccpAddress: 353201000001;protocol: SMS;payload: SMS-MO-FSM;type: TEXT_ANALYSIS;result: Blocked
result: Blocked;sccpAddress: 353201000002;protocol: SMS;payload: SMS-MT-FSM;type: TEXT_ANALYSIS;result: Blocked
result: Blocked;sccpAddress: 353201000005;protocol: SMS;payload: SMS-MO-FSM;type: SERVICE_BLACKLIST;result: Blocked
result: Blocked;sccpAddress: 353201000021;protocol: SMS;payload: SMS-MT-FSM;type: NUMBER_BLACKLIST;result: Blocked
result: Blocked;sccpAddress: 353201000033;protocol: IM;payload: IM;type: NUMBER_BLACKLIST;result: Blocked
result: Blocked;sccpAddress: 353401009001;protocol: SMS;payload: SMS-MO-FSM;type: NUMBER_BLACKLIST;result: Blocked
result: Blocked;sccpAddress: 353201000001;protocol: SMS;payload: SMS-MO-FSM;type: NUMBER_BLACKLIST;result: Blocked
result: Blocked;sccpAddress: 353201000005;protocol: SMS;payload: SMS-MO-FSM;type: NUMBER_BLACKLIST;result: Blocked
result: Blocked;sccpAddress: 353401000001;protocol: SMS;payload: SMS-MO-FSM;type: NUMBER_BLACKLIST;result: Blocked
result: Blocked;sccpAddress: 353201000001;protocol: SMS;payload: SMS-MO-FSM;type: NUMBER_BLACKLIST;result: Blocked

P.S. I had no particular reason for naming the script bashShell.sh

A

4 Answers

Pipes do not stop you from splitting the command across multiple lines, but use $( ... ) rather than backticks. Something like this should work:

#!/bin/bash

for number in $(
    cat xmlEventLog_2010-03-23T* |
    sed -nr "/<event eventTimestamp/,/<\/event>/ {/event /{s/^.*$/\n/; p};/payloadType / {h; /protocol/ {s/.*protocol=\"([^\"]*)?\".*/protocol: \1/}; p; x; /type/ {s/.*type=\"([^\"]+)\".*/payload: \1/g}; /type/! {s/.*protocol=\"([^\"]+)\".*/payload: \1/g}; p};/sender / {/sccpAddress/ {s/.*sccpAddress=\"([^\"]*)?\".*/sccpAddress: \1/}; /sccpAddress/! {s/.*/sccpAddress: Unknown/}; p};/result /{s/.*value=\"([^\"]+)\".*/result: \1/g; p};/filter code/{s/.*type=\"([^\"]+)\".*/type: \1/g; p};}"|
    tee checkThis.txt |
    awk 'BEGIN{FS="\n"; RS=""; OFS=";"; ORS="\n"} $1~/result: Blocked|Modified/ && $2~/sccpAddress: 353201000001/ && $4~/payload: SMS-MO-FSM-INFO|SMS-MO-FSM/ {$1=$1 ""; print}' |
    sort |
    uniq -c |
    egrep "NUMBER_BLACKLIST|USER_BLACKLIST|NUMBER_WALLEDGARDEN|USER_WALLED_GARDEN|SERVICE_RESTRICTION|BLOCK_VOICE_TO_SMS|PEP_Blacklist_Whitelist" |
    awk '{print $1}'
  ); do fil="$fil+$number"
done
echo "fil is $fil"

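Of course, the bigger part of the job would be splitting the awk and sed scripts across multiple lines as well...

But I believe that even after that, the result would still be fairly unreadable.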
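As a rough illustration of what splitting the awk part might look like, here is a minimal sketch. The patterns are copied unchanged from the question; the function name `filter_events` and the two sample records are hypothetical stand-ins for the real events.txt data.

```shell
#!/bin/bash

# Hypothetical sample in the events.txt layout: records separated by blank lines.
sample='result: Blocked
sccpAddress: 353201000001
protocol: SMS
payload: SMS-MO-FSM
type: NUMBER_BLACKLIST
result: Blocked

result: Allowed
sccpAddress: Unknown
protocol: MM1
payload: MM1'

# Same awk program as in the question, laid out one clause per line.
# awk allows a newline after "&&", so no backslashes are needed.
filter_events() {
    awk '
        BEGIN {
            FS = "\n"; RS = ""       # paragraph mode: one record per block
            OFS = ";"; ORS = "\n"    # join the fields of a record with ";"
        }
        $1 ~ /result: Blocked|Modified/ &&
        $2 ~ /sccpAddress: 353201000001/ &&
        $4 ~ /payload: SMS-MO-FSM-INFO|SMS-MO-FSM/ {
            $1 = $1 ""               # assignment forces a rebuild using OFS
            print
        }
    '
}

joined=$(printf '%s\n' "$sample" | filter_events)
echo "$joined"
```

Only the first sample record passes all three tests, so it is the only one printed, with its lines joined by semicolons.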

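I would suggest rewriting the script entirely in Perl, Ruby, or any other scripting language that is more readable than Bash. This is only a suggestion from my own experience: every time I start with a shell script, I end up rewriting it in Ruby. I like Bash, but it does not seem to scale.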

answered 2010-09-23T13:36:05.830

Two small notes:

Put the "for list" in a separate function:

number_list() {
    # complete pipe command list
    # divided over multiple lines
}

for number in `number_list`
do
   # ...
done

Try combining some of the commands: the cat is not needed, and the final egrep and awk can be combined.
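A minimal sketch of the second point, assuming the input already has the `uniq -c` layout (count first, then the record). The function name `count_matches` and the sample lines are hypothetical. The cat can be dropped by passing the file names straight to sed, as the asker's cleaned-up version does.

```shell
#!/bin/bash

# Hypothetical stand-in for the output of "sort | uniq -c" in the pipeline.
counted='      2 result: Blocked;sccpAddress: 353201000001;type: NUMBER_BLACKLIST
    372 result: Blocked;sccpAddress: 353201000001;type: TEXT_ANALYSIS
      1 result: Modified;sccpAddress: 353201000001;type: USER_BLACKLIST'

# One awk call replaces "egrep PATTERN | awk '{print $1}'":
# the regex selects the lines, the action prints the count column.
count_matches() {
    awk '/NUMBER_BLACKLIST|USER_BLACKLIST|NUMBER_WALLEDGARDEN|USER_WALLED_GARDEN|SERVICE_RESTRICTION|BLOCK_VOICE_TO_SMS|PEP_Blacklist_Whitelist/ { print $1 }'
}

numbers=$(printf '%s\n' "$counted" | count_matches)
echo $numbers
```

The TEXT_ANALYSIS line is filtered out, so only the counts of the two matching lines remain.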

answered 2010-09-23T13:39:46.897

You can use tr to join the separate tokens and sed to prepend "fil is":

pipeline | tr -s ' \n\t' '+' | sed -e '1s/^/fil is /' -e '$s/+$//'
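To see what that command does, here is a small sketch that feeds it three made-up counts in place of the real pipeline. The function name `join_counts` is hypothetical. Stripping the "fil is " prefix then lets shell arithmetic evaluate the joined string, which would replace the for loop.

```shell
#!/bin/bash

# Turn whitespace-separated tokens into "fil is a+b+c".
# tr maps every blank, newline and tab to "+" and squeezes repeats;
# sed adds the prefix and removes the trailing "+".
join_counts() {
    tr -s ' \n\t' '+' | sed -e '1s/^/fil is /' -e '$s/+$//'
}

# Made-up counts standing in for the real pipeline output.
joined=$(printf '2\n372\n1\n' | join_counts)

# Remove the "fil is " prefix and evaluate the rest as arithmetic.
total=$(( ${joined#fil is } ))

echo "$joined=$total"
```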

The pipeline can be split across multiple lines like this:

first-command \
    | second-command \
    | third-command \
    ...
    | last-command
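An alternative layout, sketched here with placeholder commands, puts the pipe at the end of each line instead. The shell treats a line ending in `|` as incomplete and keeps reading, so no backslashes are needed.

```shell
#!/bin/bash

# Each line ends with "|", so the shell continues onto the next line
# without a backslash. The commands are placeholders for illustration.
result=$(
    printf 'b\na\nb\n' |
        sort |
        uniq -c |
        awk '{ print $1 ":" $2 }'
)

echo "$result"
```

This style also avoids a common failure of the backslash form, where a stray space after the backslash silently breaks the continuation.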
answered 2010-09-23T13:59:48.833

The shell script is actually the easy part. The sed script is a bit scary. The script can be improved by using here documents, but note the comments:

#!/bin/bash

seds=/tmp/seds.$$
awks=/tmp/awks.$$
gres=/tmp/gres.$$

trap "rm -f $seds $awks $gres" 0 1 2 3 15

# this is a noble and hairy attempt to parse xml with sed
# it is extremely fragile and strongly dependent upon
# the form of the source file never changing
# I'm alternately proud or disgusted that I've been able
# to get away with this

cat > $seds <<'EOF'
/<event eventTimestamp/,/<\/event>/ {/event /{s/^.*$/\n/; p};
/payloadType / {h; /protocol/ {s/.*protocol=\"([^\"]*)?\".*/protocol: \1/}; p; x;
/type/ {s/.*type=\"([^\"]+)\".*/payload: \1/g};
/type/! {s/.*protocol=\"([^\"]+)\".*/payload: \1/g}; p};
/sender / {/sccpAddress/ {s/.*sccpAddress=\"([^\"]*)?\".*/sccpAddress: \1/};
/sccpAddress/! {s/.*/sccpAddress: Unknown/}; p};
/result /{s/.*value=\"([^\"]+)\".*/result: \1/g; p};
/filter code/{s/.*type=\"([^\"]+)\".*/type: \1/g; p};}
EOF

cat > $awks <<'EOF'
BEGIN {FS="\n"; RS=""; OFS=";"; ORS="\n"}
$1~/result: Blocked|Modified/ && \
$2~/sccpAddress: 353201000001/ && \
$4~/payload: SMS-MO-FSM-INFO|SMS-MO-FSM/ {$1=$1 ""; print}
EOF

cat > $gres <<EOF
NUMBER_BLACKLIST
USER_BLACKLIST
NUMBER_WALLEDGARDEN
USER_WALLED_GARDEN
SERVICE_RESTRICTION
BLOCK_VOICE_TO_SMS
PEP_Blacklist_Whitelist
EOF

cat xmlEventLog_2010-03-23T* | \
sed -nr -f $seds | \
tee checkThis.txt | \
awk -f $awks | \
sort | uniq -c | \
fgrep -f $gres | \
awk '{print $1}'
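
The script above quotes the delimiter ('EOF') for the sed and awk here documents. A short sketch of why that matters, using a made-up variable `name`: with a quoted delimiter the shell copies the text literally, so `$` and backslashes reach sed and awk untouched, while an unquoted delimiter lets the shell expand variables first.

```shell
#!/bin/bash

name=world   # hypothetical variable, only for this demonstration

# Quoted delimiter: no expansion, the body is taken literally.
quoted=$(cat <<'EOF'
hello $name \1
EOF
)

# Unquoted delimiter: $name is expanded by the shell.
unquoted=$(cat <<EOF
hello $name
EOF
)

echo "$quoted"
echo "$unquoted"
```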
answered 2010-09-23T14:55:29.597