bash - Not understanding dd command arguments at all

Question

I'm quite familiar with the dd command, but I've rarely needed to use it myself. Today I do, and the behavior I'm seeing seems very strange.

I want to create a 100M text file in which every line contains the single word "testing". This was my first attempt:

~$ perl -e 'print "testing\n" while 1' | dd of=X bs=1M count=100
0+100 records in
0+100 records out
561152 bytes (561 kB) copied, 0.00416429 s, 135 MB/s

Hmm, that's odd. What about other combinations?

~$ perl -e 'print "testing\n" while 1' | dd of=X bs=100K count=1K
0+1024 records in
0+1024 records out
4268032 bytes (4.3 MB) copied, 0.0353145 s, 121 MB/s

~$ perl -e 'print "testing\n" while 1' | dd of=X bs=10K count=10K
86+10154 records in
86+10154 records out
42524672 bytes (43 MB) copied, 0.35403 s, 120 MB/s

~$ perl -e 'print "testing\n" while 1' | dd of=X bs=1K count=100K
102400+0 records in
102400+0 records out
104857600 bytes (105 MB) copied, 0.879549 s, 119 MB/s

So these four ostensibly equivalent commands each produced a file of a different size, and only one of them is the size I expected. Why is that?

Edit: by the way, I'm a little embarrassed that I didn't think of "yes testing" instead of that longer Perl command.

score 9 · Accepted Answer

To understand what's going on, let's look at the strace output for a similar invocation:

execve("/bin/dd", ["dd", "of=X", "bs=1M", "count=2"], [/* 72 vars */]) = 0
…
read(0, "testing\ntesting\ntesting\ntesting\n"..., 1048576) = 69632
write(1, "testing\ntesting\ntesting\ntesting\n"..., 69632) = 69632
read(0, "testing\ntesting\ntesting\ntesting\n"..., 1048576) = 8192
write(1, "testing\ntesting\ntesting\ntesting\n"..., 8192) = 8192
close(0)                                = 0
close(1)                                = 0
write(2, "0+2 records in\n0+2 records out\n", 31) = 31
write(2, "77824 bytes (78 kB) copied", 26) = 26
write(2, ", 0.000505796 s, 154 MB/s\n", 26) = 26
…

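What happens is that dd makes a single read() call for each block. That is appropriate when reading from tape, which was dd's original main use: on tape, read really does read one block. When reading from a file, you have to take care not to specify too large a block size, or the read will be truncated. When reading from a pipe, it's worse: a read() returns whatever is in the pipe at that moment, so the size of each block you get depends on how fast the command producing the data runs. Each of those short reads still counts toward count, which is why the totals above fall short. In the "records in" line, the number before the + is full blocks and the number after it is partial blocks.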
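The short read can be reproduced on purpose with a deliberately slow producer. This is only a minimal sketch, assuming a POSIX shell; the function name slow_producer, the 4-byte chunks and the one-second pause are arbitrary choices for illustration:

```shell
# A slow producer: 4 bytes, a pause, then 4 more bytes.
slow_producer() {
    printf 'aaaa'
    sleep 1
    printf 'bbbb'
}

# dd asks for one 8-byte block, but its single read() returns as soon as
# the first 4 bytes are in the pipe, and that partial block uses up count=1.
short=$(slow_producer | dd bs=8 count=1 2>/dev/null | wc -c)
echo "bytes copied: $short"
```

The count should come out as 4 rather than 8, because the second chunk had not been written yet when dd issued its one read().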

The moral of the story is not to use dd to copy data, except in safely small blocks. And never from a pipe, except with bs=1.

(GNU dd has a fullblock flag that tells it to behave decently. But other implementations may not have it.)
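For the task in the question, one way to follow that advice is to leave dd out entirely. The following is a sketch rather than the only option, assuming a head that supports -c (the GNU and BSD versions do); the variable name size is just for illustration:

```shell
# Target size in bytes: 100 MiB, the same as bs=1M count=100.
size=$((100 * 1024 * 1024))

# yes repeats "testing" forever; head -c keeps reading until it has
# written exactly $size bytes, however the pipe happens to split the data.
yes testing | head -c "$size" > X

# Print the resulting file size in bytes.
wc -c < X
```

Since "testing" plus a newline is 8 bytes and 104857600 is a multiple of 8, the file also ends on a complete line.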

score 8 · Accepted Answer

I'm not sure why yet, but used this way dd doesn't fill a whole block before writing it out. Try:

perl -e 'print "testing\n" while 1' | dd of=output.txt bs=10K count=10K iflag=fullblock
10240+0 records in
10240+0 records out
104857600 bytes (105 MB) copied, 2.79572 s, 37.5 MB/s

iflag=fullblock seems to force dd to accumulate input until the block is full, though I'm not sure why that isn't the default, or what it actually does by default.
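The effect of the flag is easier to see with a producer that is slow on purpose. This is a small sketch, assuming a dd that accepts iflag=fullblock (GNU dd does); slow_producer and the chunk sizes are made up for the example:

```shell
# A slow producer: 4 bytes, a pause, then 4 more bytes.
slow_producer() {
    printf 'aaaa'
    sleep 1
    printf 'bbbb'
}

# With iflag=fullblock, dd keeps calling read() until the 8-byte block
# is complete, so count=1 now really means one full block.
full=$(slow_producer | dd bs=8 count=1 iflag=fullblock 2>/dev/null | wc -c)
echo "bytes copied with fullblock: $full"
```

Here the count should be 8: dd waits for the second chunk instead of treating the first short read as a finished block.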

score 3 · Accepted Answer

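My best guess is that dd reads from the pipe and, when the pipe is empty, assumes it has read a whole block. The results are very inconsistent: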

$ perl -e 'print "testing\n" while 1' | dd of=X bs=1M count=100
0+100 records in
0+100 records out
413696 bytes (414 kB) copied, 0.0497362 s, 8.3 MB/s

user@andromeda ~
$ perl -e 'print "testing\n" while 1' | dd of=X bs=1M count=100
0+100 records in
0+100 records out
409600 bytes (410 kB) copied, 0.0484852 s, 8.4 MB/s
