
I want to use a batch file to insert a string in place of the blanks in specific columns. Say I have an input.txt like this:

field1      field2           field3
AAAAA       BBBBB            CCCCC
DDDDD                        EEEEE
FFFFF                        
GGGGG       HHHHH 

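I need to insert the string "NULL" into every field that is empty. field1 is guaranteed to be non-empty, while field2 and field3 are sometimes empty. Also, the gap between field1 and field2 is different from the gap between field2 and field3.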

output.txt

field1      field2           field3
AAAAA       BBBBB            CCCCC
DDDDD       NULL             EEEEE
FFFFF       NULL             NULL    
GGGGG       HHHHH            NULL

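I still need this as a batch file script, so I tried writing some code (field2 always starts 12 characters from the left, and field3 always starts 29 characters from the left):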

@echo off

set line= 
for /F in (input.txt)do
if "!line:~12" equ " " 
write "NULL"   >> (i am not sure whether this work)

if "!line:~29" equ " "
write "NULL"  

echo .>> output.txt

Could anyone correct my mistakes? Thanks!


2 Answers


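As promised, here is a solution in Python. The program works with Python 3.x or Python 2.7. If you are very new to programming, I suggest Python 3.x, as I think it is easier to learn. You can get Python for free here: http://python.org/download/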

At the time of writing, the latest version of Python is 3.2.3; I recommend getting that one.

Save the Python code in a file named add_null.py and run it with the following command:

python add_null.py input_file.txt output_file.txt

Here is the code, with lots of comments:

# import brings in "modules" which contain extra code we can use.
# The "sys" module has useful system stuff, including the way we can get
# command-line arguments.
import sys

# sys.argv is an array of command-line arguments.  We expect 3 arguments:
# the name of this program (which we don't care about), the input file
# name, and the output file name.
if len(sys.argv) != 3:
    # If we didn't get the right number of arguments, print a message and exit.
    print("Usage: python add_null.py <input_file> <output_file>")
    sys.exit(1)

# Unpack the arguments into variables.  Use '_' for any argument we don't
# care about.
_, input_file, output_file = sys.argv


# Define a function we will use later.  It takes two arguments, a string
# and a width.
def s_padded(s, width):
    if len(s) >= width:
        # if it is already wide enough, return it unchanged
        return s
    # Not wide enough!  Figure out how many spaces we need to pad it.
    len_padding = width - len(s)
    # Return string with spaces appended.  Use the Python "string repetition"
    # feature to repeat a single space, len_padding times.
    return s + ' ' * len_padding


# These are the column numbers we will use for splitting, plus a width.
# Numbers put together like this, in parentheses and separated by commas,
# are called "tuples" in Python.  These tuples are: (low, high, width)
# The low and high numbers will be used for ranges, where we do use the
# low number but we stop just before the high number.  So the first pair
# will get column 0 through column 11, but will not actually get column 12.
# We use 999 to mean "the end of the line"; if the line is too short, it will
# not be an error.  In Python "slicing", if the full slice can't be done, you
# just get however much can be done.
#
# If you want to cut off the end of lines that are too long, change 999 to
# the maximum length you want the line ever to have.  Longer than
# that will be chopped short by the "slicing".
#
# So, this tells the program where the start and end of each column is, and
# the expected width of the column.  For the last column, the width is 0,
# so if the last column is a bit short no padding will be added.  If you want
# to make sure that the lines are all exactly the same length, change the
# 0 to the width you want for the last column.
columns = [ (0, 12, 12), (12, 29, 17), (29, 999, 0) ]
num_columns = len(columns)

# Open input and output files in text mode.
# Use a "with" statement, which will close the files when we are done.
with open(input_file, "rt") as in_f, open(output_file, "wt") as out_f:
    # read the first line that has the field headings
    line = in_f.readline()
    # write that line to the output, unchanged
    out_f.write(line)

    # now handle each input line from input file, one at a time
    for line in in_f:
        # strip off only the line ending
        line = line.rstrip('\n')

        # start with an empty output line string, and append to it
        output_line = ''
        # handle each column in turn
        for i in range(num_columns):
            # unpack the tuple into convenient variables
            low, high, width = columns[i]
            # use "slicing" to get the columns we want
            field = line[low:high]
            # Strip removes spaces and tabs; check to see if anything is left.
            if not field.strip():
                # Nothing was left after spaces removed, so put "NULL".
                field = "NULL"

            # Append field to output_line.  field is either the original
            # field, unchanged, or else it is a "NULL".  Either way,
            # append it.  Make sure it is the right width.
            output_line += s_padded(field, width)

        # Add a line ending to the output line.
        output_line += "\n"
        # Write the output line to the output file.
        out_f.write(output_line)

Output from running this program:

field1      field2           field3
AAAAA       BBBBB            CCCCC
DDDDD       NULL             EEEEE
FFFFF       NULL             NULL
GGGGG       HHHHH            NULL
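
Two details mentioned in the comments of the program are easy to check in isolation: slicing past the end of a short line gives an empty string rather than an error, and the built-in str.ljust pads in the same way as the s_padded helper. Here is a minimal sketch; the variable names short_line, fields, widths, and rebuilt are only illustrative and are not part of the program above.

```python
# Illustrative line from the sample input: only field1 is present.
short_line = "FFFFF"

# Slicing past the end of a string is not an error; you get whatever exists.
field1 = short_line[0:12]    # the text that is there: 'FFFFF'
field2 = short_line[12:29]   # nothing in these columns, so ''
field3 = short_line[29:999]  # nothing here either, so ''

# An empty or all-whitespace field strips down to '', which counts as false.
fields = [f if f.strip() else "NULL" for f in (field1, field2, field3)]

# str.ljust(width) pads on the right with spaces and never truncates,
# which is the same behavior as s_padded(s, width).
widths = (12, 17, 0)
rebuilt = "".join(f.ljust(w) for f, w in zip(fields, widths))
print(repr(rebuilt))
```

If you prefer the built-in, s_padded(field, width) in the program could be swapped for field.ljust(width) without changing the output.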
answered 2012-06-15T03:45:25.680

I don't think what you want to do is possible in Microsoft "batch" scripting. However, a whole set of string operators is documented here:

http://www.dostips.com/DtTipsStringManipulation.php

But batch files are terrible, and I hope you can use something better. If you would like a Python or AWK solution, I can help with that.

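If I were you and really intended to do this in a "batch" script, I would use the ~x,y column slicing to split each line into three substrings (where x is the starting offset and y is the number of characters to take). Then check whether each substring is nothing but spaces, and replace the ones that are with "NULL". Then join the substrings back into a single string and print it. Do this in a loop and you have your program.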
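
The steps above (slice, test for blanks, substitute, rejoin) can be sketched as follows. This is written in Python rather than batch, purely to illustrate the logic. The function name fill_nulls and the constant names are hypothetical, and the offsets 12 and 29 are taken from the question.

```python
# Column start offsets taken from the question; the names are illustrative.
FIELD2_START = 12
FIELD3_START = 29


def fill_nulls(line):
    """Return the line with blank fixed-width fields replaced by 'NULL'."""
    line = line.rstrip("\r\n")
    # 1. Split the line into three substrings by column position.
    parts = [
        line[:FIELD2_START],
        line[FIELD2_START:FIELD3_START],
        line[FIELD3_START:],
    ]
    # 2. Replace any substring that is empty or only whitespace.
    parts = [p if p.strip() else "NULL" for p in parts]
    # 3. Pad the first two fields back to their column widths and rejoin.
    #    The last field gets width 0, meaning no padding is added.
    widths = [FIELD2_START, FIELD3_START - FIELD2_START, 0]
    joined = "".join(p.ljust(w) for p, w in zip(parts, widths))
    # Drop trailing spaces left over from the input line.
    return joined.rstrip()


if __name__ == "__main__":
    samples = [
        "DDDDD" + " " * 24 + "EEEEE",  # field2 blank
        "FFFFF",                       # field2 and field3 blank
        "GGGGG       HHHHH ",          # field3 blank
    ]
    for sample in samples:
        print(fill_nulls(sample))
```

A batch version would follow the same three steps inside a loop over the input lines, using substring expansion for the slicing.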

answered 2012-06-15T02:19:00.683