I have a test file:

0000 850 1300    Pump  4112 893 2400    Installing sleeve  5910 890 2202    Installing tool 
Testing crankcase and  Protecting oil seal  Installing crankshaft 
carburetor for leaks  (starter side)  5910 890 2208    Installing tool, 8 
0000 855 8106    Sealing plate  4112 893 2401    Press sleeve  Installing hookless 
Sealing exhaust port  Installing oil seal  snap rings in piston 
0000 855 9200    Nipple  (clutch side)  5910 890 2301    Screwdriver, T20 
Testing carburetor for         4118 890 6400    Setting gauge  Separating handle 
leaks  Setting air gap  moldings 
0000 890 1701    Testing tool kit  between ignition  5910 890 2400    Screwdriver, T27x150 
0000 893 2600    Clamping strap  module and flywheel  For all IS screw

I only want to print:

0000 850 1300
4112 893 2400
5910 890 2202
5910 890 2208
0000 855 8106
.
.
.

Thanks for your help.

Edit:

The numbers appear at different positions in the file; they are placed at arbitrary positions in the input. Each number has the format:

xxxx xxx xxxx 

Edit-1:

I tried both approaches, but they do not work with mawk:

pic@pic:~/Pulpit$ mawk --traditional -f script.awk infile
mawk: not an option: --traditional
pic@pic:~/Pulpit$ mawk -f script.awk infile
pic@pic:~/Pulpit$ 

3 Answers


One way using grep (if your version supports the -P flag):

grep -oP "[0-9]{4} [0-9]{3} [0-9]{4}" file.txt

Output:

0000 850 1300
4112 893 2400
5910 890 2202
5910 890 2208
0000 855 8106
4112 893 2401
0000 855 9200
5910 890 2301
4118 890 6400
0000 890 1701
5910 890 2400
0000 893 2600
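
The -P flag is not strictly needed here, since the pattern uses nothing Perl-specific. A minimal sketch with -E (extended regular expressions) instead, assuming your grep supports -o (GNU and BSD grep do; POSIX does not require it). The function name extract_part_numbers is made up for illustration:

```shell
# Hypothetical helper: reads text on stdin and prints every
# "xxxx xxx xxxx" digit group on its own line.
extract_part_numbers() {
    # -o prints only the matched text, -E enables {n} repetition counts.
    grep -oE '[0-9]{4} [0-9]{3} [0-9]{4}'
}

# Example call on one line taken from the question's input.
printf '%s\n' '0000 850 1300    Pump  4112 893 2400    Installing sleeve' |
    extract_part_numbers
```

On the real file this would be called as extract_part_numbers < file.txt.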

HTH

answered 2012-06-25T12:46:56.207

One way using awk.

Assuming infile has the content provided in your question:

Content of script.awk:

{
    ## Traverse all words of the line but the last two. I assume we want to print
    ## three consecutive number fields.
    i = 1
    while ( i <= NF - 2 ) {

        ## Set current word position in line.
        j = i 

        ## Get next word while current one is all digits, and save it to print later.
        ## [0-9] is used instead of [[:digit:]] because some mawk versions lack
        ## POSIX character classes.
        while ( $j ~ /^[0-9]+$/ ) {
            value[j] = $j
            ++j 
        }   

        ## If found three consecutive number fields, print them and update counter of
        ## words in the line.
        if ( i + 3 == j ) { 
            ## Print in field order; "for ( key in value )" has unspecified order.
            for ( k = i; k < j; k++ ) {
                printf "%s ", value[k]
            }
            printf ORS 
            i += 3
        }   
        else {
            ## Failed the search, go to next field and try again.
            ++i 
        }   

        ## Delete array where I save numbers.
        # delete value           <--- Commented for compatibility with older versions.
        for ( key in value ) { 
            delete value[key]
        }
    }   
}

Run it like this:

awk -f script.awk infile

With the following output:

0000 850 1300 
4112 893 2400 
5910 890 2202 
5910 890 2208 
0000 855 8106 
4112 893 2401 
0000 855 9200 
5910 890 2301 
4118 890 6400 
0000 890 1701 
5910 890 2400 
0000 893 2600
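
The script accepts any three consecutive all-digit fields, whatever their lengths. If the stricter xxxx xxx xxxx shape matters, a length check per field can be added. A minimal sketch, assuming whitespace-separated fields as above; the function name strict_triples is hypothetical, and only [0-9] is used so that it should also run under mawk:

```shell
# Hypothetical helper: prints field triples whose lengths are 4, 3 and 4
# digits, one triple per line, separated by single spaces.
strict_triples() {
    awk '{
        for (i = 1; i <= NF - 2; i++) {
            a = $i; b = $(i + 1); c = $(i + 2)
            if (a ~ /^[0-9]+$/ && length(a) == 4 &&
                b ~ /^[0-9]+$/ && length(b) == 3 &&
                c ~ /^[0-9]+$/ && length(c) == 4) {
                print a, b, c
                i += 2   # skip the two fields already printed with this one
            }
        }
    }'
}

# Example call: the first digit group has the wrong lengths and is ignored.
printf '%s\n' 'x 12 345 6789 y 0000 850 1300 z' | strict_triples
```

On the real file this would be called as strict_triples < infile.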
answered 2012-06-25T12:47:50.980

This is shorter and looks for the specific pattern:

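mawk '
    BEGIN {
        d = "[0-9]"
        # Regex for the xxxx xxx xxxx format, built by string concatenation.
        re = d d d d " " d d d " " d d d d
    }
    {
        # Scan the part of the line that has not been examined yet, so no
        # RSTART/RLENGTH values are carried over from the previous line.
        rest = $0
        while (match(rest, re)) {
            print substr(rest, RSTART, RLENGTH)
            rest = substr(rest, RSTART + RLENGTH)
        }
    }' inputfile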
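
One caveat with matching on the raw line: the pattern also matches inside longer digit runs, for example the tail of 10000 850 1300. If such cases should be excluded (an assumption, since the question does not show any), the characters on either side of each match can be checked. A rough sketch of that idea; the function name exact_part_numbers is made up:

```shell
# Hypothetical helper: like the scan above, but a match is printed only if
# it is not directly preceded or followed by another digit.
exact_part_numbers() {
    awk '
    BEGIN { d = "[0-9]"; re = d d d d " " d d d " " d d d d }
    {
        rest = $0
        prev = ""    # character that sits just before "rest" in the line
        while (match(rest, re)) {
            before = (RSTART > 1) ? substr(rest, RSTART - 1, 1) : prev
            after  = substr(rest, RSTART + RLENGTH, 1)
            if (before !~ /[0-9]/ && after !~ /[0-9]/) {
                print substr(rest, RSTART, RLENGTH)
                step = RSTART + RLENGTH - 1   # continue after the match
            } else {
                step = RSTART                 # rejected: retry one char later
            }
            prev = substr(rest, step, 1)
            rest = substr(rest, step + 1)
        }
    }'
}

# Example call: only the middle group has non-digit neighbours.
printf '%s\n' '10000 850 1300 x 0000 855 8106 y 4112 893 24011' |
    exact_part_numbers
```

On the real file this would be called as exact_part_numbers < inputfile.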
answered 2012-06-25T18:27:50.727