
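Ideally, I want to remove whitespace from a specific column of a CSV file, leaving only single spaces and no other unwanted whitespace. I have the following script that does this for a string, but I need help applying it so that it checks the target CSV under a specific column and removes the whitespace there.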

Here is the script:

'Start by trimming leading/trailing spaces
str = Trim(str)

'Now, while we have 2 consecutive spaces, replace them
'with a single space...
Do While InStr(1, str, "  ")
str = Replace(str, "  ", " ")
Loop

Ideally, I would like to call the script like this:

Cscript whitespaceremover.vbs target.csv 'column_name'

3 Answers


I think my example below could be polished, but I hope it is enough to get started.

My demo CSV file, "target.csv":

column_name1,column_name2,column_name3
abc 123, dfr 1145  wse, ht6
axv 358, dgt 2245  ekl, x7r
amn 772, fxw 7633  foo, pmn

A sample "whitespaceremover.vbs":

Const ForReading = 1, ForWriting = 2
Dim fso, file, column
Set fso = CreateObject("Scripting.FileSystemObject")
With WScript.Arguments
    If .Count <> 2 Then
        WScript.Echo "Error: Needs two arguments."
        WScript.Quit(-1)
    End If
    file   = .Item(0)
    column = .Item(1)
End With
If Not fso.FileExists(file) Then
    WScript.Echo "Error: File " & UCase(file) & " not found."
    WScript.Quit(-2)
End If
Dim csvFile, csvHeader, iColumn, idx
Set csvFile = fso.OpenTextFile(file, ForReading)
If Not csvFile.AtEndOfStream Then
    csvHeader = Split(csvFile.ReadLine, ",", -1, 1)
Else
    WScript.Echo "Error: File " & UCase(file) & " is empty."
    csvFile.Close
    WScript.Quit(-3)
End If
iColumn = -1
For idx = 0 To UBound(csvHeader)
    If csvHeader(idx) = column Then
        iColumn = idx
        Exit For
    End If
Next
If iColumn < 0 Then
    WScript.Echo "Error: column " & UCase(column) & " not found."
    csvFile.Close
    WScript.Quit(-4)
End If
Dim csvFile2, arLine, strLine
Set csvFile2 = fso.OpenTextFile(file & ".csv", ForWriting, True)
csvFile2.WriteLine Join(csvHeader, ",")
Do Until csvFile.AtEndOfStream
    strLine = Trim(csvFile.ReadLine)
    arLine  = Split(strLine, ",", -1, 1)
    ' Skip blank or short lines that have no field at iColumn
    If UBound(arLine) >= iColumn Then
        Do While InStr(1, arLine(iColumn), "  ") > 0
            arLine(iColumn) = Replace(arLine(iColumn), "  ", " ")
        Loop
    End If
    strLine = Join(arLine, ",")
    csvFile2.WriteLine strLine
Loop
csvFile.Close
csvFile2.Close
Set csvFile  = Nothing
Set csvFile2 = Nothing
Set fso = Nothing

Result (new file "target.csv.csv"):

column_name1,column_name2,column_name3
abc 123, dfr 1145 wse, ht6
axv 358, dgt 2245 ekl, x7r
amn 772, fxw 7633 foo, pmn

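P.S. A few notes I forgot to post. To keep testing simple, I put the double spaces in the second column. So, to see the script in action, use column_name2 as the command-line argument, i.e.: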

Cscript whitespaceremover.vbs target.csv column_name2
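
Note that the result above still keeps the leading space of each field (for example " dfr 1145 wse"), because the loop only collapses double spaces. If that leading space is unwanted too, the field can be trimmed before collapsing. This is only a minimal sketch under that assumption; NormalizeField is a hypothetical name that does not appear in the script above:

```vbscript
' Hypothetical helper (not part of the original script): trims the
' field, then collapses every run of spaces down to a single space.
Function NormalizeField(ByVal s)
    s = Trim(s)
    Do While InStr(1, s, "  ") > 0
        s = Replace(s, "  ", " ")
    Loop
    NormalizeField = s
End Function

WScript.Echo "[" & NormalizeField(" dfr 1145  wse") & "]"
' Prints: [dfr 1145 wse]
```

Inside the main loop it would be called as arLine(iColumn) = NormalizeField(arLine(iColumn)). Be aware that this also removes the space after each comma for that column in the output file.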

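Edit

After reading Ansgar Wiechers' comment about the Replace function, I finally decided to run some tests. The code above may be slow compared with a regular expression, but it works. Here is my proof: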

str1 = "1" & Space(2) & "2" & Space(4) & "3" _
    & Space(1) & "4" & Space(6) & "5"
WScript.Echo "Original string: ", str1
Do While InStr(1, str1, "  ")
    str1 = Replace(str1, "  ", " ")
Loop
WScript.Echo "New string: ", str1
'Result>>
'Original string:  1  2    3 4      5
'New string:  1 2 3 4 5
answered 2013-06-09T21:07:30.580

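As a supplement to the answer provided by Panayot Karabakalov: simply replacing two spaces with one (in a single Replace call) can produce undesired results when there are sequences of 3 or more consecutive spaces. A line like this: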

foo   bar      baz

would be changed to this:

foo  bar   baz

instead of this:

foo bar baz

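The reason is that Replace continues after the replacement string. For example, running Replace("aaaa", "aa", "a") will first replace the first 2 characters a:

aaaa -> aaa

then replace the next 2 characters a, which follow the replacement string:

aaa -> aa

and then terminate.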
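
The behaviour described above can be checked with a short sketch. It also shows that repeating the replacement until no double space remains (as the loop in the other answer does) eventually reaches the fully collapsed string, at the cost of several passes. The variable name text is only illustrative:

```vbscript
' A single Replace call makes one left-to-right pass and does not
' re-examine the text it has already produced.
WScript.Echo Replace("aaaa", "aa", "a")
' Prints: aa

WScript.Echo Replace("foo   bar      baz", "  ", " ")
' Prints: foo  bar   baz

' Repeating the call until no double space is left does converge.
Dim text
text = "foo   bar      baz"
Do While InStr(1, text, "  ") > 0
    text = Replace(text, "  ", " ")
Loop
WScript.Echo text
' Prints: foo bar baz
```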

A more robust solution for collapsing spaces (or sequences of characters in general) is to replace with a regular expression:

Set re = New RegExp
re.Pattern = " +"  '<-- means "a sequence of one or more spaces"
re.Global = True

text = "foo   bar      baz"

WScript.Echo re.Replace(text, " ")

Output:

foo bar baz
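
To apply this to just one column of a CSV line, the regular expression can be run on a single field after splitting. The following is a rough sketch only: CollapseColumn is a hypothetical helper, and it assumes plain comma-separated fields with no quoted fields that contain commas:

```vbscript
' Hypothetical helper: collapses runs of spaces in one field of a
' comma-separated line. Assumes no quoted fields containing commas.
Function CollapseColumn(ByVal line, ByVal index)
    Dim re, fields
    Set re = New RegExp
    re.Pattern = " +"   ' a sequence of one or more spaces
    re.Global = True

    fields = Split(line, ",")
    ' Leave the line untouched if it has no field at that index
    If index >= 0 And index <= UBound(fields) Then
        fields(index) = re.Replace(fields(index), " ")
    End If
    CollapseColumn = Join(fields, ",")
End Function

WScript.Echo CollapseColumn("abc 123, dfr 1145  wse, ht6", 1)
' Prints: abc 123, dfr 1145 wse, ht6
```

The index is zero-based here, matching the array returned by Split.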
answered 2013-06-10T17:41:49.077

It does not follow all the guidelines, but maybe it will help someone.

' USAGE: CScript WhiteSpaceRemover.vbs Target_File.csv Column_Number

Set oArgs = WScript.Arguments
If oArgs.Count = 2 Then
    strInputFileName = oArgs(0)
    intColumn = oArgs(1) - 1
    strOutputFileName = PrepareOutputPath(strInputFileName, "_new")

    WriteTextFile strOutputFileName, TrimCsv(ReadTextFile(strInputFileName), intColumn)

End If
Set oArgs = Nothing

Function TrimCsv (strFileContent, intColumn)
    ' Removes unnecessary spaces from fields of a CSV table (fields are split on ";")
    strFileContent = Replace(strFileContent, vbCrLf, vbLf)
    arrFileContent = Split(strFileContent, vbLf)
    strFileContent = ""
    For Each strLine in arrFileContent
        If Not Len(strLine) = 0 Then
            arrRecord = Split(strLine, ";")

'           for specified column number
            arrRecord(intColumn) = Trim(arrRecord(intColumn))

'           for all columns
'           For iCount = LBound(arrRecord) To UBound(arrRecord)
'               arrRecord(iCount) = Trim(arrRecord(iCount))
'           Next

            AddToList strFileContent, Join(arrRecord, ";"), vbCrLf
            Erase arrRecord
        End If
    Next
    TrimCsv = strFileContent
    Erase arrFileContent
End Function


Function PrepareOutputPath(strFileName, strSuffix)
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    With objFSO
        strPath = .GetParentFolderName(strFileName)
        strName = .GetBaseName(strFileName)
        strExt = .GetExtensionName(strFileName)
    End With
    PrepareOutputPath = AddToList(strPath, strName & strSuffix, "\")
    PrepareOutputPath = AddToList(PrepareOutputPath, strExt, ".")
    Set objFSO = Nothing
End Function


Function AddToList(strList, strValue, strDelim)
    ' add delimiter between values
    If strList = "" Then
        AddToList = strValue
    Else
        AddToList = strList & strDelim & strValue
    End If
    strList = AddToList
End Function 


Function ReadTextFile(strFileName)
    Set objStream = CreateObject("ADODB.Stream")
    objStream.CharSet = "utf-8"

    objStream.Open
    objStream.LoadFromFile(strFileName)
    ReadTextFile = objStream.ReadText()
    objStream.Close

    Set objStream = Nothing
End Function


Sub WriteTextFile (strFileName, strFileContent)
    adSaveCreateNotExist = 1
    adSaveCreateOverWrite = 2
    adWriteChar = 0
    adWriteLine = 1

    Set objStream = CreateObject("ADODB.Stream")
    objStream.CharSet = "utf-8"

    objStream.Open
    objStream.WriteText strFileContent, adWriteChar
    objStream.SaveToFile strFileName, adSaveCreateOverwrite
    objStream.Close

    Set objStream = Nothing
End Sub

Regards,

--

Pawel L.

answered 2015-10-30T09:29:27.567