ruby - find whether a zipped file is text or binary without unzipping it

Question

I'm creating a ruby script which goes through several zip files and validates the content of any xml files within. To optimise my script, I'm using the ruby-zip gem to open the zip files without extracting them.

My initial thought was to use filemagic to determine the MIME-type of the files, but the filemagic gem takes a file path and all I have are these Entry and InputStream classes which are unique to ruby-zip.

Is there a good way to determine the filetype without extracting? Ultimately I need to identify xml files, but I can get away with identifying plain-text files and using a regex to look for the

score 2 · Accepted Answer

the filemagic gem takes a file path

The filemagic gem's file method takes a file path, but file isn't the only method. A quick look at the docs shows that it also has an io method.

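all I have are these Entry and InputStream classes which are unique to ruby-zip

I wouldn't say InputStream is "unique to ruby-zip". From the docs (emphasis mine):

InputStream inherits IOExtras::AbstractInputStream in order to provide an IO-like interface for reading a single zip entry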

So FileMagic has an io method, and Zip::InputStream is IO-like. That leads us to a very simple solution:

require 'filemagic'
require 'zip'

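# Open the archive as a stream; nothing is extracted to disk.
Zip::InputStream.open('/path/to/file.zip') do |io|
  # Only the first entry is inspected here.
  entry = io.get_next_entry

  # :mime makes FileMagic report MIME types rather than textual descriptions.
  FileMagic.open(:mime) do |fm|
    p fm.io(entry.get_input_stream)
  end
end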
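
If filemagic is not available, the fallback the question hints at (treat an entry as plain text, then check whether it looks like XML) could be sketched with the standard library alone. This is not part of the answer's method: `sniff` and `SAMPLE_SIZE` are hypothetical names, and the NUL-byte check is only a rough heuristic that will misclassify UTF-16 XML as binary.

```ruby
require 'stringio'

# Hypothetical sample length; an assumption, not a value from any library.
SAMPLE_SIZE = 512

# Hypothetical helper: guesses the kind of content behind an IO-like object
# by sampling its first bytes. Returns :empty, :binary, :xml or :text.
def sniff(io)
  # read(length) returns nil at end of stream, hence the to_s.
  sample = io.read(SAMPLE_SIZE).to_s.b
  return :empty if sample.empty?

  # Heuristic: a NUL byte almost never appears in single-byte or UTF-8 text.
  return :binary if sample.include?("\x00".b)

  # Skip an optional UTF-8 byte order mark and leading whitespace,
  # then look for an XML declaration.
  head = sample.delete_prefix("\xEF\xBB\xBF".b).lstrip
  head.start_with?('<?xml') ? :xml : :text
end

p sniff(StringIO.new(%(<?xml version="1.0"?><root/>)))
p sniff(StringIO.new("PK\x03\x04\x00\x00".b))
```

The helper only needs an object that responds to read(length), so it should in principle accept the Zip::InputStream yielded above once get_next_entry has been called, though that combination is untested here. XML files that omit the declaration would be reported as :text.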

