Assuming coreutils du and cp.
When cp copies a file, it tries to preserve its "sparseness" using heuristics. From the man page:
By default, sparse SOURCE files are detected by a crude heuristic and the corresponding DEST file is made sparse as well.
So if the heuristic fails, cp will create a plain file, without holes. In that case, the disk usage of the copy will be larger than the disk usage of the source, but the apparent file size and the contents should both be identical (try cmp).
Use stat to see both the apparent size and the disk usage for files (plus lots more info).
$ dd if=/dev/zero of=./sparse bs=1 count=1 seek=10240000
1+0 records in
1+0 records out
1 byte (1 B) copied, 1.4101e-05 s, 70.9 kB/s
$ cp sparse copy1
$ cp --sparse=never sparse copy2
$ ll
-rw-r--r-- 1 me users 10240001 Apr 28 17:59 copy1
-rw-r--r-- 1 me users 10240001 Apr 28 18:00 copy2
-rw-r--r-- 1 me users 10240001 Apr 28 17:59 sparse
$ du sparse copy*
4 sparse
4 copy1
10004 copy2
$ stat sparse copy*
File: `sparse'
Size: 10240001 Blocks: 8 IO Block: 4096 regular file
...
File: `copy1'
Size: 10240001 Blocks: 8 IO Block: 4096 regular file
...
File: `copy2'
Size: 10240001 Blocks: 20008 IO Block: 4096 regular file
$ cmp sparse copy1 && echo identical
identical
$ cmp sparse copy2 && echo identical
identical
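To see the apparent size and the allocated size side by side without reading the full stat output, something like the following sketch may help. It assumes GNU coreutils (stat -c, du --apparent-size, cp --sparse); the file names copy_sparse and copy_full and the temporary directory are only illustrative. The allocated sizes it prints depend on the filesystem, so no expected output is shown.

```shell
#!/bin/sh
# Sketch: compare apparent size and allocated size of a sparse file and two copies.
# Assumes GNU coreutils; file names below are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Same file as in the transcript above: one byte written at offset 10240000.
dd if=/dev/zero of=sparse bs=1 count=1 seek=10240000 2>/dev/null

cp --sparse=always sparse copy_sparse   # turn runs of zero bytes into holes
cp --sparse=never  sparse copy_full     # write every zero byte to disk

# %s = apparent size in bytes, %b = allocated blocks, %B = bytes per block
for f in sparse copy_sparse copy_full; do
    stat -c '%n: size=%s bytes, allocated=%b blocks of %B bytes' "$f"
done

# The same comparison with du: apparent size first, then actual disk usage.
du --apparent-size --block-size=1 sparse copy_sparse copy_full
du --block-size=1 sparse copy_sparse copy_full

# Record the apparent sizes; they should match regardless of sparseness.
size_src=$(stat -c %s sparse)
size_sparse=$(stat -c %s copy_sparse)
size_full=$(stat -c %s copy_full)
```

Using --sparse=always instead of the default skips the heuristic entirely, which is useful when the default copy turns out not to be sparse.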