
I am using Python 2.4. For my example, I have a two-column CSV file.

For example:

HOST, FILE
server1, /path/to/file1
server2, /path/to/file2
server3, /path/to/file3

For each row in the CSV file, I want to get the size of the file at PATH and then add that value to the CSV file in a new column, giving:

 HOST, PATH, FILESIZE
 server1, /path/to/file1, 6546542
 server2, /path/to/file2, 46546343
 server3, /path/to/file3, 87523

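I have tried several approaches without much success.

The code below runs fileSizeCmd (du -b) on PATH and prints the file size correctly, but I have not figured out how to add that data to the CSV file.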

 import datetime
 import csv
 import os, time
 from subprocess import Popen, PIPE, STDOUT

 now = datetime.datetime.now()
 fileSizeCmd = "du -b"
 SP = " "

 # Try to get disk size and append to another row after entry above
 #st = os.stat(row[3])
 #except IOError:
 #print "failed to get information about", file
 #else:
 #print "file size:", st[ST_SIZE]
 #print "file modified:", time.asctime(time.localtime(st[ST_MTIME]))

 incsv = open('my_list.csv', 'rb')
 try:
     reader = csv.reader(incsv)
     outcsv = open('results/results_' + now.strftime("%m-%d-%Y") + '.csv', 'wb')
     try:
         writer = csv.writer(outcsv)

         for row in reader:
             p = Popen(fileSizeCmd + SP + row[1], shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
             stdout, empty = p.communicate()

             print 'Command: %s\nOutput: %s\n' % (fileSizeCmd + SP + row[1], stdout)

             #  Results in bytes example
             #
             #  Output:
             #  8589935104      /path/to/file
             #

             #  Write 8589935104 to new column of csv FILE

     finally:
         outcsv.close()

 finally:
     incsv.close()

3 Answers

1

A sketch without error handling:

#!/usr/bin/env python

import csv
import os

filename = "sample.csv"
# localhost, 01.html.bak
# localhost, 01.htmlbak
# ...

def filesize(filename):
    # no need to shell out for filesize
    return os.stat(filename).st_size

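# note: the with statement needs Python 2.6+ (or 2.5 with
# "from __future__ import with_statement"); on 2.4 use try/finally
with open(filename, 'rb') as handle:
    reader = csv.reader(handle)
    # result is written to sample.csv.updated.csv
    # ('wb' because the Python 2 csv module expects binary mode)
    output = open('%s.updated.csv' % filename, 'wb')
    writer = csv.writer(output)
    for row in reader:
        # need to strip filename, just in case
        writer.writerow(row + [ filesize(row[1].strip()) ])
    output.close()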

# result
# localhost, 01.html.bak,10021
# localhost, 01.htmlbak,218982
# ...
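
The sketch above skips error handling. A hedged variant that tolerates missing or unreadable files might look like the following. It is written for Python 3 (an assumption; the question targets Python 2.4), and the helper names `file_size_or_blank` and `append_sizes` are made up for illustration. Leaving the size column blank for paths that cannot be read is one possible policy, not the only one.

```python
import csv
import os


def file_size_or_blank(path):
    """Return the size of path in bytes, or '' if it cannot be stat'ed."""
    try:
        return os.stat(path).st_size
    except OSError:
        return ""


def append_sizes(in_path, out_path):
    """Copy in_path to out_path, appending a size column to every data row."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if len(row) < 2:
                # Blank or short line: copy it through unchanged.
                writer.writerow(row)
                continue
            writer.writerow(row + [file_size_or_blank(row[1].strip())])
```

Calling `append_sizes("sample.csv", "sample.csv.updated.csv")` would mirror the file names used in the sketch above.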
Answered 2012-04-13T20:52:12.533
0

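You can:

1) Read the CSV contents into a list of (server, filename) tuples

2) Collect the file size for each element in this list

3) Pack the results into (server, filename, filesize) tuples in another list ('result')

4) Write the result to a new file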
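
The four steps above can be sketched roughly as follows. This assumes Python 3 and an input file without a header row; the function names `read_entries`, `with_sizes`, and `write_results` are hypothetical, and `os.path.getsize` will raise `OSError` for any path that does not exist.

```python
import csv
import os


def read_entries(csv_path):
    """Step 1: read the CSV into a list of (server, filename) tuples."""
    with open(csv_path, newline="") as handle:
        return [(row[0].strip(), row[1].strip())
                for row in csv.reader(handle) if len(row) >= 2]


def with_sizes(entries):
    """Steps 2 and 3: look up each size and build (server, filename, size) tuples."""
    return [(server, name, os.path.getsize(name)) for server, name in entries]


def write_results(result, out_path):
    """Step 4: write the result tuples to a new CSV file."""
    with open(out_path, "w", newline="") as handle:
        csv.writer(handle).writerows(result)
```

Keeping the steps as separate functions makes it easy to swap in a different size lookup later, for example one that handles missing files.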

Answered 2012-04-13T20:48:20.957
0

First, getting the file size is much easier with os.stat than with subprocess:

>>> os.stat('/tmp/file').st_size
100
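
As a small self-contained check, the snippet below creates a throwaway 100-byte file (a stand-in for `/tmp/file`, which is only an example path) and reads its size both with `os.stat` and with the standard-library shortcut `os.path.getsize`. It assumes Python 3.

```python
import os
import tempfile

# Create a temporary 100-byte file so the example does not depend on /tmp/file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 100)

# Both calls report the size in bytes of the same file.
size_via_stat = os.stat(tmp.name).st_size
size_via_getsize = os.path.getsize(tmp.name)
print(size_via_stat, size_via_getsize)

# Clean up the temporary file.
os.remove(tmp.name)
```

Either call avoids spawning a `du` process and parsing its output.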

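Second, you are right to have the writer object write to a different file, but all you need to do is append a column to the row list returned by reader and then pass it to the writer's writerow method. Something like this: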

>>> import csv
>>> writerfp = open('out.csv', 'wb')
>>> writer = csv.writer(writerfp)
>>> for row in csv.reader(open('in.csv', 'rb')):
...     row.append('column')
...     writer.writerow(row)
...
>>> writerfp.close()
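
Since the sample file in the question starts with a header row, a variant that labels the new column might look like this. It is a sketch only, assuming Python 3 and that the first row is always a header; the function name `add_size_column` and the default label `FILESIZE` (taken from the desired output in the question) are illustrative choices. With `skipinitialspace=True` the space after each comma is dropped on input, so the output is written without it.

```python
import csv
import os


def add_size_column(in_path, out_path, header_label="FILESIZE"):
    """Append a size column, labelling it in the first (header) row."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, skipinitialspace=True)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            return  # empty input: nothing to write
        writer.writerow(header + [header_label])
        for row in reader:
            if len(row) >= 2:  # skip blank or short lines
                row.append(os.path.getsize(row[1]))
                writer.writerow(row)
```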
Answered 2012-04-13T20:50:19.980