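A while ago I used the Linux "tar -cf" command to pack an application into an archive. At the time, some of the file names were in a different language.
Now, when I extract it with "tar -xf", the file names in the other language show up as question marks.
Is there a way to keep the original file names when I extract the archive?
Thank you very much for your help.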
Good question! Like most Unix commands, tar can hand its output to another program, file name included. A quick search confirms it: as described in this blog post, GNU tar supports the --to-command option, which pipes each extracted file's contents to a command instead of writing the file into the directory. The member's name reaches that command in the TAR_FILENAME environment variable.
http://osmanov-dev-notes.blogspot.com.br/2010/07/how-to-handle-filename-encodings-in.html
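As a rough illustration of the --to-command route, here is a minimal sketch of a handler script in Python. It relies on GNU tar exporting the member name as TAR_FILENAME and feeding the file contents on standard input. The function name handle_member, the output directory "out", and the GBK source encoding are assumptions made for the example; the question does not say which encoding the names actually use.

```python
import os
import shutil
import sys

# Assumptions for this sketch: names in the archive are GBK-encoded,
# and the converted files should land under ./out.
SOURCE_ENCODING = "gbk"
DEST_DIR = "out"


def handle_member(raw_name, stream, dest_dir, source_encoding):
    """Write one member's content under dest_dir using the decoded name."""
    name = raw_name.decode(source_encoding)
    dest_root = os.path.realpath(dest_dir)
    target = os.path.realpath(os.path.join(dest_root, name))
    # Refuse names such as "../x" or "/etc/x" that would leave dest_dir.
    if os.path.commonpath([dest_root, target]) != dest_root:
        raise ValueError("member name escapes the destination: %r" % name)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as out:
        shutil.copyfileobj(stream, out)
    return target


if __name__ == "__main__":
    # TAR_FILENAME is only present when GNU tar runs this script.
    raw_name = os.environb.get(b"TAR_FILENAME")
    if raw_name is not None:
        handle_member(raw_name, sys.stdin.buffer, DEST_DIR, SOURCE_ENCODING)
```

Saved under a hypothetical name such as to_utf8.py, it would be invoked as tar -xf archive.tar --to-command='python3 to_utf8.py'. File modes and timestamps are not restored by this sketch.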
So it comes down to writing a script that converts each file name to UTF-8, as the cited post does. Another option, also described there and obvious once you have read it, is to extract everything as usual and then run a script that renames every file in the directory. The post includes a trivial PHP script that does this.
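The extract-then-rename option could look roughly like the Python sketch below (the linked post uses PHP; this is not a port of it). The function name reencode_names is hypothetical, and the source encoding has to be supplied by you, since it cannot be detected reliably from the names alone. It is meant to be run once on a freshly extracted tree: running it again on names that are already UTF-8 could garble them.

```python
import os


def reencode_names(root, source_encoding, target_encoding="utf-8"):
    """Rename entries under root from source_encoding to target_encoding.

    Returns a list of (old_path, new_path) byte strings that were renamed.
    """
    renamed = []
    # Work on bytes paths so undecodable names are passed through untouched.
    # Walk bottom-up so a directory is renamed only after its contents.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root), topdown=False):
        for name in dirnames + filenames:
            try:
                new_name = name.decode(source_encoding).encode(target_encoding)
            except UnicodeError:
                continue  # not valid in the assumed source encoding
            if new_name == name:
                continue  # pure ASCII names come out unchanged
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, new_name)
            if os.path.lexists(new_path):
                continue  # never overwrite an existing entry
            os.rename(old_path, new_path)
            renamed.append((old_path, new_path))
    return renamed
```

For example, reencode_names("extracted", "gbk") would convert GBK-encoded names under a directory called extracted; both the directory name and the encoding are placeholders.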
Finally, you can always write your own extractor in a scripting language, which is easy. Python, for example, has the tarfile module built into its standard library:
http://docs.python.org/2/library/tarfile.html#examples
You could use TarFile.extractfile(), shutil.copyfileobj() and str.decode() (bytes.decode() in Python 3) in a loop to extract the files manually while changing the file name encoding.
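A minimal Python 3 sketch of that loop follows. Instead of calling decode() by hand, it passes the encoding to tarfile.open(), which then decodes the member names itself; the file contents are still copied with extractfile() and shutil.copyfileobj(). The function name extract_with_encoding is hypothetical, and the encoding you pass (GBK in the usage note) is an assumption you need to replace with the one actually used when the archive was created.

```python
import os
import shutil
import tarfile


def extract_with_encoding(archive_path, dest_dir, name_encoding):
    """Extract directories and regular files, decoding names from name_encoding.

    Returns the list of file paths that were written.
    """
    dest_root = os.path.realpath(dest_dir)
    written = []
    # The encoding argument tells tarfile how to decode names stored in the headers.
    with tarfile.open(archive_path, encoding=name_encoding) as archive:
        for member in archive:
            target = os.path.realpath(os.path.join(dest_root, member.name))
            # Skip names that would land outside dest_dir.
            if os.path.commonpath([dest_root, target]) != dest_root:
                continue
            if member.isdir():
                os.makedirs(target, exist_ok=True)
            elif member.isfile():
                os.makedirs(os.path.dirname(target), exist_ok=True)
                with archive.extractfile(member) as source, open(target, "wb") as out:
                    shutil.copyfileobj(source, out)
                written.append(target)
            # Links, devices, modes and timestamps are ignored in this sketch.
    return written
```

Usage would be along the lines of extract_with_encoding("app.tar", "app", "gbk"), where all three arguments are placeholders.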
References:
http://www.gnu.org/software/tar/manual/tar.html#SEC84