I have a script that I run nightly to measure the amount of data stored in a specific directory on my server. This is the function that does the core work:

import os

# `size` and `locations_dict` are defined elsewhere in the script (not shown).
def get_size(start_path='.'):
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(start_path):
        for f in filenames:
            try:
                fp = os.path.join(dirpath, f)
                # getsize() returns the apparent file size in bytes
                total_size += os.path.getsize(fp)
                print(str(total_size) + " bytes / " + str(size(total_size)) + " counted" + " <------------ current position: " + start_path + " : " + f)
                for location in locations_dict:
                    if locations_dict[location][1] != "":
                        print(str(location) + ": " + str(size(locations_dict[location][1])))
            except OSError as e:
                print(e)
    return total_size

For some reason, I am getting a different value when I manually run

$ du -hc [path to dir]

With the Python function I get 20551043874445 bytes (about 20.5 TB). With du I get 28 TB (I am re-running it now without -h to get the exact value).

Clearly that Python function is missing something, but I'm not sure what or how. Any ideas?

1 Answer

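du reports disk usage, that is, the space actually allocated to each file, whereas os.path.getsize() returns the file's apparent size in bytes. Space is allocated in whole blocks, so a file whose size isn't a multiple of the block size is rounded up by du. To get the equivalent value in Python, call os.stat() instead of os.path.getsize() and use the st_blocks attribute of the result, which counts allocated 512-byte units: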

total_size += os.stat(fp).st_blocks * 512
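
For reference, here is a minimal sketch of a whole function built around st_blocks. The names get_disk_usage and allocated_bytes are made up for this example. It assumes a POSIX system (st_blocks is not available on Windows), and it also tries to mimic two other things du does by default: counting the directories themselves and counting hard-linked files only once. The result may still differ slightly from du depending on the filesystem.

```python
import os

BLOCK_UNIT = 512  # st_blocks is expressed in 512-byte units on POSIX systems


def allocated_bytes(path, seen):
    """Return the bytes allocated on disk for path, or 0 if already counted."""
    st = os.lstat(path)  # lstat: measure a symlink itself, do not follow it
    key = (st.st_dev, st.st_ino)
    if key in seen:  # another hard link to an inode we already counted
        return 0
    seen.add(key)
    return st.st_blocks * BLOCK_UNIT


def get_disk_usage(start_path='.'):
    """Sum allocated space under start_path (hypothetical helper, not du itself)."""
    seen = set()
    total = allocated_bytes(start_path, seen)  # the top-level directory itself
    for dirpath, dirnames, filenames in os.walk(start_path):
        # Count subdirectories as well as files, since both occupy blocks.
        for name in dirnames + filenames:
            try:
                total += allocated_bytes(os.path.join(dirpath, name), seen)
            except OSError as e:
                print(e)
    return total
```

Calling get_disk_usage('/some/dir') returns a byte count, which can be compared against the apparent-size total from get_size() to see how much of the gap comes from block rounding.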
answered 2016-02-11T20:43:44.187