bash - Shell script to preprocess Fortran RECORDs to TYPEs?

Question

I'm converting some old F77 code to compile under gfortran. I have a bunch of RECORDs used in the following manner:

RecoRD /TEST/ this
this.field = 1
this.otherfield.sumthin = 2
func = func(%val(ThIs.field,foo.bar,this.other.field))

I am trying to convert all of these to TYPEs:

TYPE(TEST) this
this%field = 1
this%otherfield%sumthin = 2
func = func(%val(ThIs%field,foo.bar,this%other%field))

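I'm OK with sed, and I can process the files to replace the RECORD declarations with TYPE declarations, but is there a way to write a preprocessing script using Linux tools to convert the this.field notation to this%field notation? I believe I would need something that can recognize the declared record name and target it specifically, to avoid munging other variables by accident. Also, any idea how I could deal with included files? I feel like that could get pretty messy, but if anyone has done something similar it would be good to include it in a solution.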

Edit: I have Python 2.4 available.

score 2 · Accepted Answer

You could use Python for that. The following script reads the text from stdin and writes it to stdout with the replacement you asked for:

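import re
import sys

txt = sys.stdin.read()
# Collect the variable names declared as RECORD /TEST/.
names = re.findall(r"RECORD /TEST/\s*\b(.+)\b", txt)
for name in set(names):
    # No flags are passed here: the fourth positional argument of
    # re.sub() is count, not flags, and the pattern has no ^ or $.
    txt = re.sub(r"\b%s\.(.*)\b" % re.escape(name), r"%s%%\1" % name, txt)
sys.stdout.write(txt)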

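EDIT: As for Python 2.4: yes, format should be replaced with %. As for structures with subfields, that can easily be achieved by using a function in the sub() call, as below. I also added case insensitivity: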

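import re
import sys

def replace(match):
    # Turn every "." in the matched field chain into "%".
    return match.group(0).replace(".", "%")

txt = sys.stdin.read()
names = re.findall(r"RECORD /TEST/\s*\b(.+)\b", txt, re.IGNORECASE)
for name in set(names):
    # Compile the pattern with its flags: the fourth positional argument
    # of re.sub() is count, and Python 2.4 has no flags keyword there.
    pattern = re.compile(r"\b%s(\.\w+)+\b" % re.escape(name), re.IGNORECASE)
    txt = pattern.sub(replace, txt)
sys.stdout.write(txt)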
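
The two steps discussed so far (rewriting the declaration and rewriting the field accesses) can be combined in one pass. What follows is only a minimal sketch under stated assumptions, not a tested converter: `convert_records` and `DECL` are hypothetical names, each RECORD statement is assumed to declare a single variable, and any structure name is accepted rather than only TEST. It will also mangle a dotted operator that directly follows a field, such as `this.field.eq.1`, so the output should be reviewed.

```python
import re

# Assumption: one variable per statement, e.g. "RECORD /TEST/ this".
DECL = re.compile(r"^(\s*)RECORD\s*/\s*(\w+)\s*/\s*(\w+)",
                  re.IGNORECASE | re.MULTILINE)

def convert_records(text):
    """Rewrite RECORD declarations as TYPE and '.' field access as '%'."""
    # Names of all variables declared through a RECORD statement.
    names = set([m.group(3).lower() for m in DECL.finditer(text)])
    # RECORD /TEST/ this  ->  TYPE(TEST) this
    text = DECL.sub(r"\1TYPE(\2) \3", text)
    for name in names:
        # The lookbehind skips names that are themselves fields (x.this.y).
        chain = re.compile(r"(?<![\w.%%])%s(?:\.\w+)+" % re.escape(name),
                           re.IGNORECASE)
        text = chain.sub(lambda m: m.group(0).replace(".", "%"), text)
    return text
```

Called on the four sample lines from the question, this returns the TYPE form shown there, leaving foo.bar untouched because foo was never declared as a RECORD.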

score 1 · Accepted Answer

Using GNU awk:

$ cat tst.awk
/RECORD/ { $0 = gensub(/[^/]+[/]([^/]+)[/]/,"TYPE(\\1)",""); name=tolower($NF) }
{
   while ( match(tolower($0),"\\<" name "[.][[:alnum:]_.]+") ) {
      $0 = substr($0,1,RSTART-1) \
           gensub(/[.]/,"%","g",substr($0,RSTART,RLENGTH)) \
           substr($0,RSTART+RLENGTH)
   }
}
{ print }

$ cat file
RECORD /TEST/ tHiS
this.field = 1
THIS.otherfield.sumthin = 2
func = func(%val(ThIs.field,foo.bar,this.other.field))

$ awk -f tst.awk file
TYPE(TEST) tHiS
this%field = 1
THIS%otherfield%sumthin = 2
func = func(%val(ThIs%field,foo.bar,this%other%field))

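Note that I modified your input to show what happens when this.field occurs several times on one line and is mixed in with other "." references (foo.bar). I also added some mixed-case occurrences of "this" to show how that works.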

In response to the question below about how to handle include files, here is one way:

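This script not only expands all the lines that say "include subfile". By writing the result to a tmp file, resetting ARGV[1] (the top-level input file) and not resetting ARGV[2] (the tmp file), it then lets awk do its normal record parsing on the expanded result, since that is now stored in the tmp file. If you don't need that, just "print" to stdout and remove any other references to a tmp file or ARGV[2].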

awk 'function read(file) {
       while ( (getline < file) > 0) {
           if ($1 == "include") {
                read($2)
           } else {
                print > ARGV[2]
           }
       }
       close(file)
   }
   BEGIN{
      read(ARGV[1])
      ARGV[1]=""
      close(ARGV[2])
   }1' a.txt tmp

Given these 3 files in the current directory, running the above:

  a.txt             b.txt              c.txt
  -----             -----              -----
  1                 3                  5
  2                 4                  6
  include b.txt     include c.txt
  9                 7
  10                8

would print the numbers 1 through 10 and save them in a file named "tmp".

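So for this application, you could replace the number "1" at the end of the above script with the contents of the first script posted above, and it would then work on the tmp file, which now holds the contents of the expanded files.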
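
Since the question mentions that Python 2.4 is available, the same include expansion can be sketched in Python. This is an illustration only, not part of the awk answer: `expand_includes` is a hypothetical name, it recognizes the same toy `include <file>` line format used in the example files above (not a quoted Fortran INCLUDE 'file' statement), and it assumes, unlike the awk script, that included paths are relative to the including file's directory. It does not guard against circular includes.

```python
import os

def expand_includes(path):
    """Return the lines of path with 'include <file>' lines expanded recursively."""
    lines = []
    base = os.path.dirname(path)
    f = open(path)
    try:
        for line in f:
            fields = line.split()
            if len(fields) >= 2 and fields[0] == "include":
                # Assumption: the included path is relative to the including file.
                lines.extend(expand_includes(os.path.join(base, fields[1])))
            else:
                lines.append(line)
    finally:
        f.close()
    return lines
```

The returned lines can be joined into one string and handed to whichever conversion script is in use, which avoids the temporary file.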
