I have a set of data like this:

7859 10000:00 7859 10000:00 (xfer#1, to-check=1033/1035)

32768 000:17 22174479 10000:00 (xfer#2, to-check=1032/1035)

The lines are read from a file and passed one at a time to a method in my batch script. All I want to do in that method is extract

7859

22174479

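from each line. So basically: anything up to a "\d+:\d\d\s+", then the number I need, then another "\d\d.*".

Is this possible using only batch script regex and search and replace? I have tried and read a bunch of articles but could not find a solution. I want to add the numbers up.

Thanks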

Edit
Based on Andrei's comment on David Ruhmann's answer, Andrei wants the token 2 positions before (xfer#, not the 3rd token from the beginning.

4 Answers

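Based on your comment on David Ruhmann's answer, you want the token 2 positions before the (xfer# string. I suppose it can be done with native batch commands, but it is a nasty problem.

I assume you are limited to commands native to Windows, with no downloaded executables.

I hope you can use JScript, since it is native to Windows.

I have written a hybrid JScript/batch utility script called "REPL.BAT" that performs a regex search and replace. It does not take much code, yet it is a very useful utility, and it makes the solution very simple.

I use FINDSTR to keep only the lines that have two space-delimited tokens before (xfer#. I pipe those results to my REPL utility and keep only the desired token. The result is sent to stdout.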

findstr /r /c:" [^ ][^ ]* [^ ][^ ]* (xfer#" test.txt | repl ".* ([^ ]+) ([^ ]+) \(xfer#.*" "$1"
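
For anyone who wants to check the pattern outside of cmd, here is a minimal JavaScript sketch of the same capture applied to the two sample lines from the question. It is not part of REPL.BAT: the helper name extractSizes is made up for illustration, and the sum at the end is only there because the question mentions adding the numbers up.

```javascript
// Sample lines copied from the question.
const lines = [
  "7859 10000:00 7859 10000:00 (xfer#1, to-check=1033/1035)",
  "32768 000:17 22174479 10000:00 (xfer#2, to-check=1032/1035)",
];

// Same pattern as the REPL call: group 1 is the token two positions before "(xfer#".
const pattern = /.* ([^ ]+) ([^ ]+) \(xfer#.*/;

// Hypothetical helper: returns the captured token of every matching line as a number.
function extractSizes(input) {
  return input
    .map((line) => pattern.exec(line))
    .filter((match) => match !== null) // plays the role of the FINDSTR filter
    .map((match) => Number(match[1]));
}

const sizes = extractSizes(lines);
const total = sizes.reduce((sum, n) => sum + n, 0);
console.log(sizes.join(" "), "total:", total);
```

With the two sample lines this gives 7859 and 22174479, for a total of 22182338.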

Here is the code for the REPL.BAT utility script. Full documentation is embedded within the script.

@if (@X)==(@Y) @end /* Harmless hybrid line that begins a JScript comment

::************ Documentation ***********
:::
:::REPL  Search  Replace  [Options  [SourceVar]]
:::REPL  /?
:::
:::  Performs a global search and replace operation on each line of input from
:::  stdin and prints the result to stdout.
:::
:::  Each parameter may be optionally enclosed by double quotes. The double
:::  quotes are not considered part of the argument. The quotes are required
:::  if the parameter contains a batch token delimiter like space, tab, comma,
:::  semicolon. The quotes should also be used if the argument contains a
:::  batch special character like &, |, etc. so that the special character
:::  does not need to be escaped with ^.
:::
:::  If called with a single argument of /? then prints help documentation
:::  to stdout.
:::
:::  Search  - By default this is a case sensitive JScript (ECMA) regular
:::            expression expressed as a string.
:::
:::            JScript syntax documentation is available at
:::            http://msdn.microsoft.com/en-us/library/ae5bf541(v=vs.80).aspx
:::
:::  Replace - By default this is the string to be used as a replacement for
:::            each found search expression. Full support is provided for
:::            substitution patterns available to the JScript replace method.
:::            A $ literal can be escaped as $$. An empty replacement string
:::            must be represented as "".
:::
:::            Replace substitution pattern syntax is documented at
:::            http://msdn.microsoft.com/en-US/library/efy6s3e6(v=vs.80).aspx
:::
:::  Options - An optional string of characters used to alter the behavior
:::            of REPL. The option characters are case insensitive, and may
:::            appear in any order.
:::
:::            I - Makes the search case-insensitive.
:::
:::            L - The Search is treated as a string literal instead of a
:::                regular expression. Also, all $ found in Replace are
:::                treated as $ literals.
:::
:::            E - Search and Replace represent the name of environment
:::                variables that contain the respective values. An undefined
:::                variable is treated as an empty string.
:::
:::            M - Multi-line mode. The entire contents of stdin is read and
:::                processed in one pass instead of line by line. ^ anchors
:::                the beginning of a line and $ anchors the end of a line.
:::
:::            X - Enables extended substitution pattern syntax with support
:::                for the following escape sequences:
:::
:::                \\     -  Backslash
:::                \b     -  Backspace
:::                \f     -  Formfeed
:::                \n     -  Newline
:::                \r     -  Carriage Return
:::                \t     -  Horizontal Tab
:::                \v     -  Vertical Tab
:::                \xnn   -  Ascii (Latin 1) character expressed as 2 hex digits
:::                \unnnn -  Unicode character expressed as 4 hex digits
:::
:::                Escape sequences are supported even when the L option is used.
:::
:::            S - The source is read from an environment variable instead of
:::                from stdin. The name of the source environment variable is
:::                specified in the next argument after the option string.
:::

::************ Batch portion ***********
@echo off
if .%2 equ . (
  if "%~1" equ "/?" (
    findstr "^:::" "%~f0" | cscript //E:JScript //nologo "%~f0" "^:::" ""
    exit /b 0
  ) else (
    call :err "Insufficient arguments"
    exit /b 1
  )
)
echo(%~3|findstr /i "[^SMILEX]" >nul && (
  call :err "Invalid option(s)"
  exit /b 1
)
cscript //E:JScript //nologo "%~f0" %*
exit /b 0

:err
>&2 echo ERROR: %~1. Use REPL /? to get help.
exit /b

************* JScript portion **********/
var env=WScript.CreateObject("WScript.Shell").Environment("Process");
var args=WScript.Arguments;
var search=args.Item(0);
var replace=args.Item(1);
var options="g";
if (args.length>2) {
  options+=args.Item(2).toLowerCase();
}
var multi=(options.indexOf("m")>=0);
var srcVar=(options.indexOf("s")>=0);
if (srcVar) {
  options=options.replace(/s/g,"");
}
if (options.indexOf("e")>=0) {
  options=options.replace(/e/g,"");
  search=env(search);
  replace=env(replace);
}
if (options.indexOf("l")>=0) {
  options=options.replace(/l/g,"");
  search=search.replace(/([.^$*+?()[{\\|])/g,"\\$1");
  replace=replace.replace(/\$/g,"$$$$");
}
if (options.indexOf("x")>=0) {
  options=options.replace(/x/g,"");
  replace=replace.replace(/\\\\/g,"\\B");
  replace=replace.replace(/\\b/g,"\b");
  replace=replace.replace(/\\f/g,"\f");
  replace=replace.replace(/\\n/g,"\n");
  replace=replace.replace(/\\r/g,"\r");
  replace=replace.replace(/\\t/g,"\t");
  replace=replace.replace(/\\v/g,"\v");
  replace=replace.replace(/\\x[0-9a-fA-F]{2}|\\u[0-9a-fA-F]{4}/g,
    function($0,$1,$2){
      return String.fromCharCode(parseInt("0x"+$0.substring(2)));
    }
  );
  replace=replace.replace(/\\B/g,"\\");
}
var search=new RegExp(search,options);

if (srcVar) {
  WScript.Stdout.Write(env(args.Item(3)).replace(search,replace));
} else {
  while (!WScript.StdIn.AtEndOfStream) {
    if (multi) {
      WScript.Stdout.Write(WScript.StdIn.ReadAll().replace(search,replace));
    } else {
      WScript.Stdout.WriteLine(WScript.StdIn.ReadLine().replace(search,replace));
    }
  }
}

answered 2013-02-13T18:26:18.733

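Note that batch is not the best language for regular expressions! Cmd processes input one line at a time, whereas regex allows multi-line processing.

It sounds like you just need to grab a token from the line. Assume a more complete regex for the line looks something like this: (\d+\s+\d+:\d\d\s+)+\(xfer#\d+, to-check=\d+/\d+\)

This tells us that the line contains constant delimiters: the colon (:) and whitespace (\s+). From there, just use those anchors to determine the token position.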


Grab the third space-delimited token from the line.

for /f "tokens=3" %%A in ("line") do echo %%A

Grab the second space-delimited token from the second colon-delimited token of the line.

for /f "tokens=2 delims=:" %%A in ("line") do (
    for /f "tokens=2" %%B in ("%%A") do echo %%B
)
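
To make the two token grabs above easier to follow, here is a rough JavaScript imitation of how FOR /F splits a line. The helper forTokens is hypothetical and only approximates FOR /F: runs of delimiter characters count as one separator and empty tokens are skipped. The sample line is the second one from the question.

```javascript
// Hypothetical helper that approximates FOR /F token splitting.
// Any character in `delims` separates tokens; consecutive delimiters collapse.
function forTokens(text, delims) {
  const tokens = [];
  let current = "";
  for (const ch of text) {
    if (delims.includes(ch)) {
      if (current !== "") tokens.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  if (current !== "") tokens.push(current);
  return tokens;
}

const line = "32768 000:17 22174479 10000:00 (xfer#2, to-check=1032/1035)";

// for /f "tokens=3": third token using the default delimiters (space and tab).
const third = forTokens(line, " \t")[2];

// for /f "tokens=2 delims=:" followed by for /f "tokens=2".
const afterFirstColon = forTokens(line, ":")[1];
const nested = forTokens(afterFirstColon, " \t")[1];

console.log(third, nested);
```

Both approaches land on 22174479 for this line, because the wanted number is the third token overall and also the second token after the first colon.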

Update

Grab the second token of the segment that precedes the last colon.

@echo off
setlocal EnableExtensions EnableDelayedExpansion
set "Line=32768 004:47 2686976 2200:03 11707819 10000:01 (xfer#5264, to-check=1020/6975)"

set "Last="
set "This="
for /f "delims=" %%A in ('echo("%Line::="^&echo("%"') do (
    for /f "tokens=2" %%B in ("%%A") do (
        if defined This set "Last=!This!"
        set "This=%%B"
    )
)
echo %Last%

endlocal
pause >nul

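Limitations

  1. A line containing an odd number of double quotes (") will break the script. One way to prevent this is to remove the quotes before the for loop with set Line=%Line:"=%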
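
The bookkeeping in the update script (split at every colon, remember the second token of the previous piece) can be sketched in JavaScript as follows. The function name tokenBeforeLastColon is made up for illustration, and the sketch assumes the same sample line as the batch script.

```javascript
const line =
  "32768 004:47 2686976 2200:03 11707819 10000:01 (xfer#5264, to-check=1020/6975)";

// Hypothetical helper mirroring the batch loop: for each colon-separated piece,
// take its second whitespace token and keep the one seen just before the last.
function tokenBeforeLastColon(text) {
  let last;    // plays the role of the Last variable
  let current; // plays the role of the This variable
  for (const piece of text.split(":")) {
    const tokens = piece.trim().split(/\s+/);
    if (tokens.length < 2) continue; // FOR /F skips pieces without a second token
    last = current;
    current = tokens[1];
  }
  return last;
}

console.log(tokenBeforeLastColon(line));
```

For the sample line the pieces yield 004, 2686976, 11707819 and (xfer#5264, in that order, so the value left in last is 11707819.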
answered 2013-02-13T15:10:53.647

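The easiest and most flexible way to accomplish what you want is to use awk (regex examples) or sed from GnuWin32 (e.g. sed -i -r -e "s/([0-9]+:[0-9][0-9][[:space:]]+)[0-9]+/\1replacementstring/g" filename). Both support POSIX extended regular expressions rather than Perl syntax, so \d has to be written as [0-9]. I think what you are doing is exactly what awk was designed for.

If you would rather use only what is already available, without third-party tools, you can use VBScript to perform the regex match. You can invoke VBScript by echoing the script to a .vbs file, calling cscript vbsfile, and capturing its output. Here is a proof of concept.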

@echo off & setlocal enabledelayedexpansion

:: rxp.bat
:: rxp /? for usage instructions

if #%4==# goto usage
set global=false
set replace=false
for %%I in (%*) do (
    if not #!next!==# (
        if !next!==string set string=%%I
        if !next!==pattern set pattern=%%I
        if !next!==replace set replace=%%I
        set next=
    )
    if #%%I==#/s set next=string
    if #%%I==#/p set next=pattern
    if #%%I==#/r set next=replace
    if #%%I==#/g set global=true
)
if not defined string goto usage
if not defined pattern goto usage

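rem Double every quote for VBScript, then turn each \" escape into a doubled
rem quote without stripping other backslashes such as the one in \d
set string=!string:"=""!
set string=!string:\""=""!
set pattern=!pattern:"=""!
set pattern=!pattern:\""=""!
if #!replace!==#false (
    call :rxp !string:~1,-1! !pattern:~1,-1! !global!
) else (
    set replace=!replace:"=""!
    set replace=!replace:\""=""!
    call :rxp !string:~1,-1! !pattern:~1,-1! !global! !replace:~1,-1!
)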
goto :EOF

:rxp string pattern global replacement
echo Set rxp = New RegExp>regexp.vbs
echo rxp.Pattern = %2>>regexp.vbs
echo rxp.Global = %3>>regexp.vbs
if #%4==# (
    echo Set res = rxp.Execute^(%1^)>>regexp.vbs
    echo For Each match in res>>regexp.vbs
    echo Wscript.Echo match.value>>regexp.vbs
    echo Next>>regexp.vbs
) else (
    echo Wscript.echo rxp.Replace^(%1, %4^)>>regexp.vbs
)
cscript /nologo regexp.vbs
del /q regexp.vbs
goto :EOF

:usage
echo Usage: %~nx0 /s "string" /p "regexp" [/g] [/r "replacement text"]
echo;
echo    /s -- search string
echo;
echo    /p -- regular expression pattern
echo          Example: /p "<[^>]+>" to search for markup tags
echo          matches ^<span class='a'^> or similar
echo;
echo    /r -- replacement text (optional)
echo          If specified, replace the matched text
echo          Example: /p "(<div class=')blue('>)" /r "$1red$2"
echo          matches ^<div class='blue'^>
echo          replaces match with ^<div class='red'^>
echo;
echo    /g -- global match (optional)
echo          match every occurrence (matches only the first by default)
echo;
echo notes: If the regexp pattern includes capturing parentheses, use ^$1-^$9 as
echo backreferences in your replacement text.  If any of your strings include
echo quotation marks, they can be escaped with a backslash (\).
echo;
echo Example:
echo %~nx0 /s "text begin <div id=\"foo\"> text end" /p "(<div)[^>]+(>)"
echo /r "$1 class=\"bar\"$2"
echo;
echo matches ^<div id="foo"^>, replaces match with ^<div class="bar"^>
echo output: text begin ^<div class="bar"^> text end

Sample output:

C:\Users\me\Desktop>rxp /s "7859 10000:00 7849 10000:00 (xfer#1, to-check=1033/1035)" /p "(\d+:\d\d\s+)\d+" /r "$1foo"
7859 10000:00 foo 10000:00 (xfer#1, to-check=1033/1035)

C:\Users\me\Desktop>rxp
Usage: rxp.bat /s "string" /p "regexp" [/g] [/r "replacement text"]

   /s -- search string

   /p -- regular expression pattern
         Example: /p "<[^>]+>" to search for markup tags
         matches <span class='a'> or similar

   /r -- replacement text (optional)
         If specified, replace the matched text

   /g -- global match (optional)
         match every occurrence (matches only the first by default)

notes: If the regexp pattern includes capturing parentheses, use $1-$9 as
backreferences in your replacement text.  If any of your strings include
quotation marks, they can be escaped with a backslash (\).

Example:
rxp.bat /s "text begin <div id=\"foo\"> text end" /p "(<div)[^>]+(>)"
/r "$1 class=\"bar\"$2"

matches <div id="foo">, replaces match with <div class="bar">
output: text begin <div class="bar"> text end
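
The substitution from the first sample run above can also be checked with a short JavaScript sketch. This is only an illustration of the pattern and the $1 backreference, not a replacement for rxp.bat; like the VBScript RegExp object with Global set to false, String.prototype.replace without the g flag changes only the first match.

```javascript
// Input string, pattern and replacement taken from the first sample run.
const input = "7859 10000:00 7849 10000:00 (xfer#1, to-check=1033/1035)";
const pattern = /(\d+:\d\d\s+)\d+/;

// $1 re-inserts the captured "10000:00 " prefix; the number after it becomes "foo".
const result = input.replace(pattern, "$1foo");

console.log(result);
```

The number following the first \d+:\d\d group is replaced, and the rest of the line is left untouched.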
answered 2013-02-13T15:18:04.100
  :: Does %variable% =~ s/old/new/
  setlocal ENABLEDELAYEDEXPANSION     
  for /f "delims=" %%a in ('echo !variable! ^|perl -pe "s/regexp/replace/" ') do set variable=%%a  
answered 2017-04-03T02:53:52.320