tcl - Find a search string in a file and use the matching lines for processing in Tcl

Question

More precisely:

I need to look through a file, abc.txt, whose contents are as follows:

files/f1/atmp.c        98   100  

files/f1/atmp1.c       89   100 

files/f1/atmp2.c  !!   75   100

files/f2/btmp.c        92   100

files/f2/btmp2.c  !!   85   100

files/f3/xtmp.c        92   100

The script needs to find the lines containing "!!" and use them to print the following output:

atmp2.c  75

btmp2.c  85

Can anyone help?

score 1 · Accepted Answer

This should do the trick.

set data {files/f1/atmp.c        98   100  
files/f1/atmp1.c       89   100 
files/f1/atmp2.c  !!   75   100
files/f2/btmp.c        92   100
files/f2/btmp2.c  !!   85   100
files/f3/xtmp.c        92   100}

set lines [split $data \n]
foreach line $lines {
  set match [regexp {(\S+)\s+!!\s+(\d+)} $line -> file num]
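  # file tail drops the directory part, so atmp2.c is printed rather than files/f1/atmp2.c
  if {$match} {puts "[file tail $file] $num"}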
}

Although regexp has an -all switch, I don't think we can use it here: with -all, the match variables only hold the values from the last match.
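As a side note, combining -all with -inline makes regexp return every match as a flat list instead of filling match variables, which sidesteps that limitation. The following is only a minimal sketch, assuming the data is already held in memory as above; the variable names whole, path, num and result are introduced here for illustration, and [ \t]+ is used instead of \s+ so that a match cannot run across a line break.

```tcl
set data {files/f1/atmp.c        98   100
files/f1/atmp2.c  !!   75   100
files/f2/btmp2.c  !!   85   100}

# -all -inline returns a flat list: full match, capture 1, capture 2, repeated per match
set result {}
foreach {whole path num} [regexp -all -inline {(\S+)[ \t]+!![ \t]+(\d+)} $data] {
    # file tail strips the directory part of the captured path
    lappend result "[file tail $path] $num"
}
puts [join $result \n]
```

The line-by-line loop is probably still easier to read; this variant mostly helps when the whole file is already in a single string.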

score 1 · Accepted Answer

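If your file is not very large, you can read the whole thing into memory, split it into a Tcl list of lines, and then iterate over that list looking for matches. For example: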

set fh [open foo]
set lines [read $fh]
close $fh

set lines [split $lines "\n"]
foreach line $lines {
    if { [regexp {.*/(\S+\.c)\s*!!\s*(\d+)} $line match file data] } {
        puts "$file $data"
    }
}

This picks out the lines that have "!!" in them. With the corpus you posted, the result is:

atmp2.c 75
btmp2.c 85

score 0 · Accepted Answer

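The trick is to combine code that reads lines from the file with a regular expression that both detects the matching lines and extracts the relevant parts (a one-step process with regexp). The only tricky part is working out exactly what to use as the regular expression so that you get what you want. I'm guessing that you are after the part of the filename following the last /, that these filenames will not contain spaces, and that the number you want is the whole of the first digit sequence after the double exclamation mark. (Other formats are possible too, and some of them are easier to extract with other tools such as scan.) That gives us this: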

set f [open abc.txt]
while {[gets $f line] >= 0} {
    if {[regexp {([^\s/]+)\s+!!\s+(\d+)} $line -> name value]} {
        # Or do whatever you want with these
        puts "$name $value"
    }
}
close $f

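(The gets command with two arguments returns the length of the line read, or -1 on failure. For ordinary files the only failure mode is EOF, so we can terminate the loop when we get a negative value. Other kinds of channels can be more complicated...)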
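For the scan alternative mentioned above, a rough sketch might look like the following. It assumes each flagged line has the shape "path !! number ..." with whitespace between the fields; the proc name parseFlagged is hypothetical and introduced only for this illustration.

```tcl
# Returns {name value} for a flagged line, or an empty list otherwise.
proc parseFlagged {line} {
    # %s reads one whitespace-delimited token, the literal !! must follow,
    # and %d reads the integer after it; scan returns the number of conversions done
    if {[scan $line {%s !! %d} path value] == 2} {
        return [list [file tail $path] $value]
    }
    return {}
}

foreach line {
    "files/f1/atmp.c        98   100"
    "files/f1/atmp2.c  !!   75   100"
} {
    set parsed [parseFlagged $line]
    if {[llength $parsed]} {puts [join $parsed " "]}
}
```

Inside the gets loop, the regexp test could be swapped for a call to this proc; whether that reads better than the regular expression is a matter of taste.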

score 0 · Accepted Answer

In this case, I would probably be tempted to exec out to awk:

set output [exec awk {$2 == "!!" {print $1, $3}} abc.txt]
puts $output
