perl - How can I write this sed/bash command in awk or perl (or python, or ...)?

Question

I need to replace instances of the script-language statements Progress (n,m) and Progress label="some text title" (n,m) with new values (N,M), where:

N= integer ((n/m) * normal)
M= integer ( normal )

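A Progress statement can appear anywhere on a script line (and, worse, although not in the current scripts, it could be split across lines).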

The value normal is a given number between 1 and 255; n and m are floating-point numbers.
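To make the arithmetic concrete, here is a minimal Python sketch of the conversion. The function name scale_progress is made up for illustration, and it assumes that "integer" means truncation toward zero. It multiplies before dividing, as the bash version further down does with bc.

```python
def scale_progress(n, m, normal):
    """Return (N, M) for a Progress (n, m) pair rescaled to `normal`."""
    if m == 0:
        raise ValueError("m must be non-zero")
    # int() truncates toward zero; multiply first, then divide.
    return int(normal * n / m), int(normal)


print(scale_progress(20, 50, 100))   # (40, 100)
print(scale_progress(25, 40, 10))    # (6, 10)
```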

My sed implementation so far is below. It only handles the Progress (n,m) form, not the Progress label="Title" (n,m) form, and it is just plain nuts:

#!/bin/bash
normal=$1; 
file=$2
for n in $(sed -rn '/Progress/s/Progress[ \t]+\(([0-9\. \t]+),([0-9\. \t]+)\).+/\1/p' "$file" )
do 
    m=$(sed -rn "/Progress/s/Progress[ \t]+\(${n},([0-9\. \t]+).+/\1/p" "$file")
    N=$(echo "($normal * $n)/$m" | bc)
    M=$normal
    sed -ri "/Progress/s/Progress[ \t]+\($n,$m\)/Progress ($N,$M)/" "$file"
done

Simply put: this works, but is there a better way?

I have sed and bash scripting in my toolbox, but not so much perl or awk, which I suspect are better suited to this problem.

Edit: sample input.

Progress label="qt-xx-95" (0, 50) thermal label "qt-xx-95" ramp(slew=.75,sp=95,closed) Progress (20, 50) Pause  5 Progress (25, 50) Pause  5 Progress (30, 50) Pause  5 Progress (35, 50) Pause  5 Progress (40, 50) Pause  5 Progress (45, 50) Pause  5 Progress (50, 50)
Progress label="qt-95-70" (0, 40) thermal label "qt-95-70" hold(sp=70)        Progress (10, 40) Pause  5 Progress (15, 40) Pause  5 Progress (20, 40) Pause  5 Progress (25, 40) Pause  5

score 1 · Accepted Answer

awk is good at splitting text, so it may be a good choice for this problem.

Here is a solution that works for the supplied input; let's call it update_m_n_n.awk. Run it from bash like this: awk -f update_m_n_n.awk -v normal=$NORMAL input_file.

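#!/usr/bin/awk -f
# Needs an awk that accepts a multi-character RS, such as gawk or mawk.

BEGIN {
  RS = "Progress"             # one record per Progress statement
  ORS = ""                    # records keep their own newlines
  FS = "[)(]"                 # fields are separated by parentheses
  if (normal == 0) normal = 10
}

# Text before the first Progress is copied through unchanged.
NR == 1 { print; next }

{
  # $2 is the text inside the first pair of parentheses, i.e. "n, m".
  if (split($2, A, / *, */) == 2 && A[2] != 0) {
    N = int(normal * A[1] / A[2])
    M = int(normal)
    sub(/\([^)]*\)/, "(" N ", " M ")")
  }
  print "Progress" $0
}

Explanation

RS = "Progress": splits the input into one record per Progress statement. Because ORS = "", each record is written back verbatim, and the main rule prints the word Progress in front of it. (Setting ORS = "Progress" as well would append a stray Progress after the last record.)
FS = "[)(]": separates fields at parentheses.
NR == 1 { print; next }: copies through whatever precedes the first Progress (an empty string for the sample input) and skips the rewrite for it.
split($2, A, / *, */): assuming the first parenthesized item after each Progress is the n, m pair, this puts n and m into the array A. Records where that does not hold, or where m is 0, are left unchanged.
sub(/\([^)]*\)/, "(" N ", " M ")"): substitutes the new values into the current record.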
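Since the question also mentions Python, the same record-splitting idea can be sketched there. This is only an illustration under the same assumption as the awk script (the first parenthesized pair after each Progress is n, m); the function name rescale_by_splitting is hypothetical.

```python
import re

# A number such as 50, 50., 0.5 or .75
NUM = r"(?:\d+(?:\.\d*)?|\.\d+)"
PAIR = re.compile(r"\(\s*(" + NUM + r")\s*,\s*(" + NUM + r")\s*\)")


def rescale_by_splitting(text, normal=10):
    """Split on "Progress" and rewrite the first (n, m) pair in each chunk."""
    head, *chunks = text.split("Progress")

    def repl(match):
        n, m = float(match.group(1)), float(match.group(2))
        if m == 0:
            return match.group(0)          # cannot scale, leave as is
        return "(%d, %d)" % (int(normal * n / m), int(normal))

    out = [head]                           # text before the first Progress
    for chunk in chunks:
        # count=1: only the first pair is assumed to belong to Progress
        out.append("Progress" + PAIR.sub(repl, chunk, count=1))
    return "".join(out)
```

Calling rescale_by_splitting(open(path).read(), normal) processes a whole file at once, so newlines inside a chunk are preserved just as they are in the awk records.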

score 1 · Accepted Answer

This is a bit fragile, but it seems to do the trick. It could be turned into a one-liner with perl -pe, but I think this is clearer:


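use 5.16.0;
my $normal = $ARGV[0];    # scaling value, passed as the first argument
while (<STDIN>) {
    # $1 = optional label clause, $2 = n, $3 = m.
    # Brace delimiters are used because the replacement code contains "/".
    s{Progress +(label=".+?")? *\( *([0-9. ]+) *, *([0-9. ]+) *\)}
     {"Progress " . (defined($1) ? "$1 " : "") . sprintf("(%d,%d)", int(($2 / $3) * $normal), int($normal))}eg;
    print $_;
}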

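The basic idea is to optionally capture the label clause in $1 and to capture n and m in $2 and $3. We use perl's ability, via the "e" modifier, to replace the matched string with the result of an evaluated piece of code. It will fail badly if the label clause contains escaped quotes or a string that looks like a Progress token, so it is not ideal. I agree that what you really need here is an honest parser, although you could tweak this regex to fix some of its obvious shortcomings, such as the weak number matching for n and m.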
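For comparison, Python's counterpart to the "e" modifier is passing a function as the replacement argument of re.sub. The sketch below is an illustration, not a translation of the answer: the names NUM, PROGRESS and rescale are made up, the number pattern is stricter than [0-9. ]+, and escaped quotes inside a label are still not handled.

```python
import re

# A number such as 50, 50., 0.5 or .75
NUM = r"(?:\d+(?:\.\d*)?|\.\d+)"

# group 1: optional label clause, group 2: n, group 3: m
PROGRESS = re.compile(
    r'Progress\s+(label="[^"]*"\s*)?\(\s*(' + NUM + r")\s*,\s*(" + NUM + r")\s*\)"
)


def rescale(text, normal):
    def repl(match):
        label = match.group(1) or ""
        n, m = float(match.group(2)), float(match.group(3))
        if m == 0:
            return match.group(0)          # leave unscalable statements alone
        return "Progress %s(%d,%d)" % (label, int(normal * n / m), int(normal))

    return PROGRESS.sub(repl, text)


print(rescale('Progress label="qt-xx-95" (0, 50) Pause 5 Progress (20, 50)', 100))
# Progress label="qt-xx-95" (0,100) Pause 5 Progress (40,100)
```

Because \s also matches a newline, applying rescale to the whole file contents rather than line by line would tolerate a line break between the keyword, the label and the argument list.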

score 0 · Accepted Answer

My first thought was to try sed with recursive substitution (the t command), but I suspect that would get stuck.

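This perl code should work for statements that are not split across lines. For statements that are split across lines, it probably makes sense to write a separate preprocessor that joins the pieces back together.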

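The code splits each line containing "Progress" statements into separate segments, applies the substitution rules, then rejoins the segments into a single line and prints it. Lines that do not match are simply printed. The matching code uses backreferences and becomes somewhat unreadable. I have assumed that your "normal" parameter can take a floating-point value, since the spec seemed unclear on that.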

#!/usr/bin/perl -w

use strict;

die("Wrong arguments") if (@ARGV != 2);
my ($normal, $file) = @ARGV;
open(FILE, '<', $file) or die("Cannot open $file");

while (<FILE>) {
    chomp();
    my $line = $_;

    # Match on lines containing "Progress"
    if (/Progress/) {

        $line =~ s/(Progress)/\n$1/go;    # Insert newlines on which to split
        my @segs = split(/\n/, $line);    # Split line into segments containing possibly one "Progress" clause

        # Apply text-modification rules
        @segs = map {
            if (/(Progress[\s\(]+)([0-9\.]+)([\s,]+)([0-9\.]+)(.*)/) {
                my $newN = int($2/$4 * $normal);
                my $newM = int($normal);
                $1 . $newN . $3 . $newM . $5;
            } elsif (/(Progress\s+label="[^"]+"[\s\(]+)([0-9\.]+)([\s,]+)([0-9\.]+)(.*)/) {
                my $newN = int($2/$4 * $normal);
                my $newM = int($normal);
                $1 . $newN . $3 . $newM . $5;
            } else {
                $_;    # Segment doesn't contain "Progress"
            }
        } @segs;

        $line = join("", @segs);    # Reconstruct the single line
    }

    print($line,"\n");    # Print all lines
}
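The separate preprocessor mentioned above could look something like the following Python sketch. It is an assumption-laden illustration: the name join_split_statements is hypothetical, a statement is treated as "open" when a line ends after the Progress keyword, after its label, or inside an unclosed argument list, and a break inside a quoted label is not handled. Joined pieces are separated by a single space.

```python
import re

# Progress keyword, optional label, optional unclosed "(" list, then end of line
OPEN_PROGRESS = re.compile(r'Progress(?:\s+label="[^"]*")?\s*(?:\([^)]*)?$')


def join_split_statements(lines):
    """Yield lines, merging a line that ends mid-Progress with its successor."""
    pending = ""
    for line in lines:
        line = pending + line.rstrip("\n")
        if OPEN_PROGRESS.search(line):
            pending = line + " "           # statement continues on the next line
        else:
            pending = ""
            yield line
    if pending:
        yield pending.rstrip()             # file ended inside a statement
```

The joined lines can then be fed to any of the line-oriented solutions, for example by writing them to a temporary file first.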
