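In Python, I need help searching for files with a specific header type (PST files, header sequence 21 42 44 4E) and then copying them to my saved-files directory.

Here is the relevant excerpt of my code.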

import os

# get the working directory of my program
ori_path = os.getcwd()
# this is where the files are saved (relative to the root directory of my program);
# raw string so the backslashes are never read as escape sequences
temp = r"\etc\Saved Files"
# obtain the value written in the textbox, i.e. the path to search
path_to_copy_from = self.textbox.GetValue()
# create the absolute destination path so that the files don't end up somewhere they shouldn't be
copy_path = "{}{}".format(ori_path, temp)
# change the working directory to the one specified by the user
os.chdir(path_to_copy_from)

I will use shutil to do the copying, like this:

shutil.copy(files,copy_path)

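I came across mentions of doing the search with itertools, but I could not understand the example (which is why I am asking this question). I need help writing code that looks at each file's header and calls shutil if the header matches the PST header format.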

2 Answers

Using junuxx's solution, I came up with a way to do this for all files in a directory. The final code looks like this:

     import wx  # wxPython, used for the GUI that interacts with the user
     import os
     import glob
     import shutil

     # destination folder, relative to the program's root directory (from the question)
     temp = r"\etc\Saved Files"
     # obtain the path to search from the wxPython text box
     path_to_copy_from = self.textbox.GetValue()
     if os.path.exists(path_to_copy_from):
         # working directory of my program
         ori_path = os.getcwd()
         # combine ori_path and temp to create the destination path for the copied files
         copy_path = "{}{}".format(ori_path, temp)
         # change the working directory to the one specified by the user
         os.chdir(path_to_copy_from)
         # copy files based on their header
         pst_header = "21 42 44 4e "
         for self.files in glob.glob("*.*"):
             # start from an empty header for every file, even if the previous one failed
             header = ""
             try:
                 with open(self.files, 'rb') as f:
                     for i in range(4):
                         byte = f.read(1)
                         header += hex(ord(byte))[2:] + " "
                 if header == pst_header:
                     shutil.copy(self.files, copy_path)
                     # tell the user what is happening via a wxPython TextCtrl made earlier
                     self.textbox2.AppendText("Found file with .pst header.\n")
                     self.textbox2.AppendText("Copied {} to {}. \n".format(self.files, copy_path))
                     # set the copied file as read-only (for my program only, not necessary to have)
                     path_to_file = os.path.join(copy_path, self.files)
                     os.chmod(path_to_file, 0o444)
             # simple exception handling, change as you need:
             # TypeError means the file has fewer than 4 bytes, IOError means it could not be read or copied
             except (TypeError, IOError):
                 pass
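
One caveat about the hex string built in the loop above: hex(ord(byte))[2:] does not zero-pad, so a byte such as 0x04 comes out as "4" rather than "04". That makes no difference for the PST signature, where every byte is 0x10 or larger, but it would matter for other signatures. Below is a minimal sketch of a padded formatter. The name format_header is a hypothetical helper, not part of the code above, and its output has no trailing space, unlike pst_header.

```python
def format_header(data):
    """Return the given bytes as space-separated, zero-padded lowercase hex."""
    # Iterating a bytearray yields integers on both Python 2 and Python 3.
    # format_header is a hypothetical helper name used for illustration only.
    return " ".join("{:02x}".format(b) for b in bytearray(data))


print(format_header(b"\x21\x42\x44\x4e"))  # prints: 21 42 44 4e
print(format_header(b"\x04\x0a"))          # prints: 04 0a
```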

Thanks a lot :)

answered 2013-10-08T00:42:57.637

To get the first 4 bytes of a file in hex format, as in your question:

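header = ""
# path_to_copy_from must name a single file here, not a directory
with open(path_to_copy_from, 'rb') as f:
    for i in range(4):
        # read one byte and append its hex value, e.g. "21 "
        byte = f.read(1)
        header += hex(ord(byte))[2:] + " "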

You can then check, for each file in the folder, whether the header string matches the one you are looking for.

answered 2013-10-07T09:53:47.060
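
The per-file check could be sketched roughly as follows. This is not the snippet above but a variant that compares the raw bytes directly instead of building a hex string. The names PST_MAGIC, has_pst_header and copy_pst_files are hypothetical, and the sketch assumes the destination directory already exists.

```python
import os
import shutil

# The signature 21 42 44 4E from the question, i.e. the ASCII bytes "!BDN".
PST_MAGIC = b"\x21\x42\x44\x4e"


def has_pst_header(path):
    """Return True if the file at path starts with the PST signature."""
    try:
        with open(path, "rb") as f:
            # A file shorter than 4 bytes simply yields fewer bytes and fails the comparison.
            return f.read(len(PST_MAGIC)) == PST_MAGIC
    except (IOError, OSError):
        # Unreadable files are treated as non-matching.
        return False


def copy_pst_files(src_dir, dst_dir):
    """Copy every regular file directly inside src_dir that has a PST header into dst_dir.

    Assumes dst_dir already exists. Returns the names of the copied files.
    """
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if os.path.isfile(src) and has_pst_header(src):
            shutil.copy(src, dst_dir)
            copied.append(name)
    return copied
```

With the variables from the question, a call might look like copy_pst_files(path_to_copy_from, copy_path). Because full paths are built with os.path.join, no os.chdir is needed, and os.listdir also picks up files without a dot in their name, which glob.glob("*.*") would skip.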