
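I use the following script to combine all csv files in csvFolder, subject to these conditions:

  1. Remove the first line (I use 'grep -v "words"' to achieve this)
  2. Extract columns 18-21
  3. Write the output to folder/test.csv

When the csv files are large, it takes a long time to run. Please suggest how I can get better performance.

I am very new to writing bat files, so please explain. Thanks!!

This is the bat script I use.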

for /f "tokens=18-21 delims=," %%a in ('cat csvFolder/*.csv') do echo %%a,%%b,%%c,%%d | grep -v "words" >> folder/test.csv

This is a sample csv file.

"(PDH-CSV 4.0) (GMT words g)(0)","\\s-s\s(c)\% t g","\\s-s\s(I)\% t g","\\s-s\s(Iace)\% t g","\\s-s\s(Nr)\% t g","\\s-s\s(Rface)\% t g","\\s-s\s(c)\y u","\\s-s\s(I)\y u","\\s-s\s(Iace)\y u","\\s-s\s(Nr)\y u","\\s-s\s(Rface)\y u","\\s-s\s(c)\p Bytes","\\s-s\s(I)\p Bytes","\\s-s\s(Iace)\p Bytes","\\s-s\s(Nr)\p Bytes","\\s-s\s(Rface)\p Bytes","\\s-s\s(c)\q Set","\\s-s\s(I)\q Set","\\s-s\s(Iace)\q Set","\\s-s\s(Nr)\q Set","\\s-s\s(Rface)\q Set","\\s-s\Memory\% Committed Bytes In Use","\\s-s\Memory\Available MBytes","\\s-s\t(0)\% j g","\\s-s\t(1)\% j g","\\s-s\t(2)\% j g","\\s-s\t(3)\% j g","\\s-s\t(4)\% j g","\\s-s\t(5)\% j g","\\s-s\t(6)\% j g","\\s-s\t(7)\% j g","\\s-s\t(8)\% j g","\\s-s\t(9)\% j g","\\s-s\t(10)\% j g","\\s-s\t(11)\% j g","\\s-s\t(12)\% j g","\\s-s\t(13)\% j g","\\s-s\t(14)\% j g","\\s-s\t(15)\% j g","\\s-s\t(16)\% j g","\\s-s\t(17)\% j g","\\s-s\t(18)\% j g","\\s-s\t(19)\% j g","\\s-s\t(20)\% j g","\\s-s\t(21)\% j g","\\s-s\t(22)\% j g","\\s-s\t(23)\% j g","\\s-s\t(24)\% j g","\\s-s\t(25)\% j g","\\s-s\t(26)\% j g","\\s-s\t(27)\% j g","\\s-s\t(28)\% j g","\\s-s\t(29)\% j g","\\s-s\t(30)\% j g","\\s-s\t(31)\% j g","\\s-s\t(32)\% j g","\\s-s\t(33)\% j g","\\s-s\t(34)\% j g","\\s-s\t(35)\% j g","\\s-s\t(36)\% j g","\\s-s\t(37)\% j g","\\s-s\t(38)\% j g","\\s-s\t(39)\% j g","\\s-s\t(40)\% j g","\\s-s\t(41)\% j g","\\s-s\t(42)\% j g","\\s-s\t(43)\% j g","\\s-s\t(44)\% j g","\\s-s\t(45)\% j g","\\s-s\t(46)\% j g","\\s-s\t(47)\% j g","\\s-s\t(_Total)\% j g","\\s-s\t(0)\% t g","\\s-s\t(1)\% t g","\\s-s\t(2)\% t g","\\s-s\t(3)\% t g","\\s-s\t(4)\% t g","\\s-s\t(5)\% t g","\\s-s\t(6)\% t g","\\s-s\t(7)\% t g","\\s-s\t(8)\% t g","\\s-s\t(9)\% t g","\\s-s\t(10)\% t g","\\s-s\t(11)\% t g","\\s-s\t(12)\% t g","\\s-s\t(13)\% t g","\\s-s\t(14)\% t g","\\s-s\t(15)\% t g","\\s-s\t(16)\% t g","\\s-s\t(17)\% t g","\\s-s\t(18)\% t g","\\s-s\t(19)\% t g","\\s-s\t(20)\% t g","\\s-s\t(21)\% t g","\\s-s\t(22)\% t g","\\s-s\t(23)\% t g","\\s-s\t(24)\% t g","\\s-s\t(25)\% t 
g","\\s-s\t(26)\% t g","\\s-s\t(27)\% t g","\\s-s\t(28)\% t g","\\s-s\t(29)\% t g","\\s-s\t(30)\% t g","\\s-s\t(31)\% t g","\\s-s\t(32)\% t g","\\s-s\t(33)\% t g","\\s-s\t(34)\% t g","\\s-s\t(35)\% t g","\\s-s\t(36)\% t g","\\s-s\t(37)\% t g","\\s-s\t(38)\% t g","\\s-s\t(39)\% t g","\\s-s\t(40)\% t g","\\s-s\t(41)\% t g","\\s-s\t(42)\% t g","\\s-s\t(43)\% t g","\\s-s\t(44)\% t g","\\s-s\t(45)\% t g","\\s-s\t(46)\% t g","\\s-s\t(47)\% t g","\\s-s\t(_TL)\% t g"
"02/04/2014 02:25:19.850","0","0","173.29978448754693","0","0","122","3357","5634","3279","2933","51122176","1887068160","377069568","1403805696","141160448","7012352","1668734976","282546176","641404928","98045952","43.227033512749721","14578","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","100","100","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","100","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","100","100","100","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","42.233405170817683","49.454183237426584"
"02/04/2014 02:25:20.839","0","0","115.12529196882903","0","6.3082351763741924","122","3357","5634","3279","2933","51122176","1887068160","377069568","1403805696","141160448","7012352","1668734976","282546176","641404928","98045952","43.226920632400869","14578","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1.5770587940935481","0","0","0","1.5770587940935481","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0.065710361866962802","0","2.2223547662000076","0","0","0","0","0","0","0","0","0","0","8.5305899425741956","0","0","0","0","0","0.64529597210645218","5.3764723543870963","5.3764723543870963","0","0","0","10.107648736667752","19.57000150122904","32.18647185397743","19.57000150122904","13.26176632485484","17.992942707135484","0","8.5305899425741956","2.2223547662000076","0.64529597210645218","3.7994135602935519","0.64529597210645218","0","2.2223547662000076","0.64529597210645218","3.7994135602935519","3.7994135602935519","2.2223547662000076","8.5305899425741956","6.9535311484806517","3.7994135602935519","6.9535311484806517","8.5305899425741956","8.5305899425741956","3.8979791030939959"
"02/04/2014 02:25:21.845","0","1.550550710336521","103.8868975925469","0","21.707709944711297","122","3357","5634","3279","2933","51122176","1887068160","377069568","1403805696","141160448","7012352","1668734976","282546176","641404928","98045952","43.226983013646283","14581","0","1.550550710336521","1.550550710336521","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1.550550710336521","0","0","0","1.550550710336521","0","0","0","0","0","0","0","0","1.550550710336521","0","0","1.550550710336521","0","0","0","1.550550710336521","0","0","0.2261205291001715","0.76475453846265307","5.4164066694722184","3.8658559591357","0.76475453846265307","0.76475453846265307","0.76475453846265307","0.76475453846265307","0.76475453846265307","0.76475453846265307","0.76475453846265307","0.76475453846265307","0.76475453846265307","14.719710931491347","0.76475453846265307","0.76475453846265307","0.76475453846265307","0.76475453846265307","0.76475453846265307","3.8658559591357","30.22521803485655","3.8658559591357","0.76475453846265307","0.76475453846265307","0.76475453846265307","17.820812352164385","24.023015193510467","13.169160221154819","16.270261641827865","16.270261641827865","19.371363062500901","2.315305248799171","3.8658559591357","5.4164066694722184","5.4164066694722184","3.8658559591357","0.76475453846265307","5.4164066694722184","3.8658559591357","3.8658559591357","3.8658559591357","3.8658559591357","3.8658559591357","5.4164066694722184","3.8658559591357","2.315305248799171","5.4164066694722184","8.5175080901452649","11.618609510818301","5.5456184003865978"
"02/04/2014 02:25:22.853","0","0","92.848453249913092","0","12.379793766655082","122","3357","5634","3279","2933","51122176","1887068160","377069568","1403805696","141160448","7012352","1668734976","282546176","641404928","98045952","43.227161245776045","14579","0","1.5474742208318852","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1.5474742208318852","0","0","0","1.5474742208318852","0","0","0","1.5474742208318852","0","0","0","0","0","0","0.12895535843241074","0","7.1515467500869008","0.96164986675935094","0","0","0","0.96164986675935094","0","0","0","0","0","7.1515467500869008","2.5091240875912413","0","0","0","0","0","11.79396941258255","0.96164986675935094","0","0","0","14.888917854246319","11.79396941258255","4.056598308423121","16.43639207507821","17.98386629591009","11.79396941258255","2.5091240875912413","2.5091240875912413","0.96164986675935094","0.96164986675935094","0","0","0","0.96164986675935094","2.5091240875912413","0","2.5091240875912413","0.96164986675935094","4.056598308423121","0.96164986675935094","2.5091240875912413","4.056598308423121","5.6040725292550109","8.6990209709187809","2.8315224033152231"
"02/04/2014 02:25:23.848","0","1.5692552674079339","116.12488978818712","0","9.4155316044476045","122","3357","5634","3279","2933","51122176","1887068160","377069568","1403805696","141160448","7012352","1668734976","282546176","641404928","98045952","43.227161245776045","14580","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1.1369181533001704","0","0","0","0","0","0","0","0","0","0","15.260215559971568","0","1.1369181533001704","0","0","0","1.1369181533001704","8.9831944903398409","0","0","0","0","12.121705025155705","18.398726094787442","10.552449757747773","15.260215559971568","16.82947082737951","16.82947082737951","1.1369181533001704","1.1369181533001704","2.7061734207081023","7.4139392229318979","5.8446839555239656","7.4139392229318979","2.7061734207081023","2.7061734207081023","8.9831944903398409","4.2754286881160342","5.8446839555239656","1.1369181533001704","0","2.7061734207081023","7.4139392229318979","12.121705025155705","5.8446839555239656","7.4139392229318979","4.079262977833908"
"02/04/2014 02:25:24.844","0","0","108.05231529511674","0","17.225731423859191","122","3357","5634","3279","2933","51122176","1887068160","377069568","1403805696","141160448","7012352","1668734976","282546176","641404928","98045952","43.22384909869794","14580","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1.5659755839871992","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0.032624282203052524","1.3435382088064496","6.0414649607680504","1.3435382088064496","1.3435382088064496","1.3435382088064496","1.3435382088064496","1.3435382088064496","1.3435382088064496","1.3435382088064496","1.3435382088064496","1.3435382088064496","1.3435382088064496","13.871342880704042","6.0414649607680504","4.4754893767808497","1.3435382088064496","1.3435382088064496","1.3435382088064496","6.0414649607680504","13.871342880704042","1.3435382088064496","1.3435382088064496","1.3435382088064496","1.3435382088064496","10.73939171272964","15.437318464691241","17.003294048678441","18.569269632665641","13.871342880704042","15.437318464691241","6.0414649607680504","4.4754893767808497","4.4754893767808497","4.4754893767808497","2.9095137927936499","2.9095137927936499","4.4754893767808497","2.9095137927936499","2.9095137927936499","9.1734161287424403","6.0414649607680504","2.9095137927936499","2.9095137927936499","7.6074405447552396","4.4754893767808497","4.4754893767808497","6.0414649607680504","15.437318464691241","5.4216035989100515"

2 Answers


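If I had to do this, the obvious solution would be awk. FNR>1 skips the first line of every input file, whereas NR>1 would only skip the first line of the first file:

awk -F , -v OFS=, "FNR>1{print $18,$19,$20,$21}" "csvfolder\*.csv" > "folder\output.csv"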
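
To make the per-file header skipping concrete, here is a minimal Python sketch of the same logic: drop the first line of every file and keep comma-separated fields 18 to 21. It is an illustration only, not part of the original answer. The function name extract_columns is hypothetical, and the naive split on "," assumes, like awk -F , does, that no quoted field contains a comma.

```python
import glob


def extract_columns(pattern, out_path, first=18, last=21):
    """Skip each file's first line and write fields first..last (1-based)."""
    with open(out_path, "w", newline="") as out:
        for path in sorted(glob.glob(pattern)):
            with open(path, newline="") as src:
                next(src, None)  # skip this file's header line
                for line in src:
                    fields = line.rstrip("\r\n").split(",")
                    # Assumption: lines with too few fields are skipped;
                    # awk would print empty fields for them instead.
                    if len(fields) >= last:
                        out.write(",".join(fields[first - 1:last]) + "\n")
```

It could be called as extract_columns(r"csvFolder\*.csv", r"folder\output.csv"). The output file should live outside the input folder so that it is not picked up by the pattern.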

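That said, let's solve it with batch. Either way, it will be slow.

When for /f is used to process the output of a command, it first retrieves all of the data and only then starts processing it. With a large amount of data, this is really slow.

When for /f processes a file on disk, this behavior is less noticeable. The file is still read completely into memory before work starts, but loading it is much faster.

It is counterintuitive, but when handling large files it is faster to generate an intermediate temporary file containing only the needed lines and then process that file with for /f. Of course, it will be much faster if at least the intermediate file is on a local hard disk.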

set "tempFile=%temp%\csv.tmp"
for %%z in (csvFolder\*.csv) do (
    echo %%z
    findstr /v "words" "%%~fz" > "%tempFile%"
    (for /f "usebackq tokens=18-21 delims=," %%a in ("%tempFile%") do echo %%a,%%b,%%c,%%d) >> "folder\test.csv"
)
del "%tempFile%" >nul 2>nul
answered 2014-02-13T10:45:46.900

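awk or sed sounds like your best option.

But I did manage to find a pure native Windows scripting solution that performs reasonably well. It uses my REPL.BAT hybrid JScript/batch utility, which performs a regex search and replace on each line of stdin and writes the result to stdout. Full documentation is embedded in the script.

Assuming REPL.BAT is somewhere in your PATH, the following one-liner should do the trick: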

findstr /v words *.csv | repl ".*?:(?:.*?,){17}((?:.*?,){3}.*?),.*" $1 >folder\output.csv

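I tested the above on 20 CSV files totaling more than 1GB, and it completed successfully in 80 seconds. The run time should scale linearly with the total file size.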

Note that the initial expression .*?: in the regex matches the file name prefix that findstr inserts before each line.
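
To see what the pattern captures, here is a small Python sketch that applies the same regex to one findstr-style line. It is only an illustration: REPL.BAT uses JScript regex syntax with $1 as the replacement, while Python spells the back-reference \1, and the sample line and the function name keep_fields_18_to_21 are made up.

```python
import re

# Same pattern as in the one-liner above.
PATTERN = re.compile(r".*?:(?:.*?,){17}((?:.*?,){3}.*?),.*")


def keep_fields_18_to_21(line):
    """Return fields 18-21 of a findstr-style 'filename:data' line."""
    return PATTERN.sub(r"\1", line)


# Hypothetical input: a file name prefix followed by 25 numbered fields.
sample = "a.csv:" + ",".join("f%d" % i for i in range(1, 26))
print(keep_fields_18_to_21(sample))  # prints f18,f19,f20,f21
```

The lazy .*?: stops at the first colon, so colons inside the timestamp field are not a problem as long as the file name prefix itself contains none. A line that does not match, for example one with too few fields, is passed through unchanged.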

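Also note that it is critical that every source CSV file ends with a newline after its last line. If it does not, the last line of that file will be merged with the first line of the next file.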
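
One way to check this precondition before running the one-liner is sketched below in Python. This is not part of the original answer, and the function name files_missing_final_newline is hypothetical. It only inspects the last byte of each file, so it stays fast on large inputs.

```python
import glob
import os


def files_missing_final_newline(pattern):
    """Return the files matching pattern whose last byte is not a newline."""
    missing = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as f:
            f.seek(0, os.SEEK_END)
            if f.tell() == 0:
                continue  # empty file: there is no last line to merge
            f.seek(-1, os.SEEK_END)
            if f.read(1) != b"\n":
                missing.append(path)
    return missing
```

Calling files_missing_final_newline("*.csv") in the source folder returns an empty list when every file is safe to concatenate.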

answered 2014-02-13T14:04:39.360