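I'm downloading an XZ file with hyper, and I want to save it to disk in decompressed form by extracting as much as possible from each incoming Chunk and writing the results to disk immediately, rather than downloading the whole file first and then decompressing it.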

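There is the xz2 crate, which implements the XZ format. However, its XzDecoder does not seem to support a Python-like decompressobj model, where a caller repeatedly feeds partial input and gets partial output.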

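Instead, XzDecoder receives its input bytes via a Read parameter, and I don't know how to glue these two things together. Is there a way to feed a Response to XzDecoder?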

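The only clue I've found so far is this issue, which contains a reference to a private ReadableChunks type that I could in theory replicate in my code. But maybe there is an easier way?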
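For illustration, the idea of gluing chunked input to a Read-based decoder can be sketched with the standard library alone. This is not the private ReadableChunks type mentioned above; ChunkReader and read_all_chunks are hypothetical names, and the chunk source is assumed to be a plain blocking iterator of byte vectors rather than a hyper Body.

```rust
use std::io::{self, Read};

/// Hypothetical adapter: exposes an iterator of byte chunks as a blocking `Read`.
pub struct ChunkReader<I: Iterator<Item = Vec<u8>>> {
    chunks: I,
    current: Vec<u8>, // chunk currently being drained
    pos: usize,       // read offset into `current`
}

impl<I: Iterator<Item = Vec<u8>>> ChunkReader<I> {
    pub fn new(chunks: I) -> Self {
        ChunkReader { chunks, current: Vec::new(), pos: 0 }
    }
}

impl<I: Iterator<Item = Vec<u8>>> Read for ChunkReader<I> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        // Pull chunks until one has unread bytes; empty chunks are skipped.
        while self.pos == self.current.len() {
            match self.chunks.next() {
                Some(next) => {
                    self.current = next;
                    self.pos = 0;
                }
                None => return Ok(0), // no more chunks: end of stream
            }
        }
        // Copy as much of the current chunk as fits into `out`.
        let n = out.len().min(self.current.len() - self.pos);
        out[..n].copy_from_slice(&self.current[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}

/// Reads everything through the adapter; a decoder that takes a `Read`
/// would wrap the reader here instead of calling `read_to_end` directly.
pub fn read_all_chunks(chunks: Vec<Vec<u8>>) -> io::Result<Vec<u8>> {
    let mut reader = ChunkReader::new(chunks.into_iter());
    let mut out = Vec::new();
    reader.read_to_end(&mut out)?;
    Ok(out)
}
```

An adapter like this blocks the calling thread whenever it waits for the next chunk, so it only fits a setup where the body is consumed synchronously. That is one reason a push-style decoder may be the simpler route.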

2 Answers

Based on @Laney's answer, I came up with the following working code:

extern crate failure;
extern crate hyper;
extern crate tokio;
extern crate xz2;

use std::fs::File;
use std::io::Write;
use std::u64;

use failure::Error;
use futures::future::done;
use futures::stream::Stream;
use hyper::{Body, Chunk, Response};
use hyper::rt::Future;
use hyper_tls::HttpsConnector;
use tokio::runtime::Runtime;

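// Feeds one chunk to the decoder, writing decompressed output to `file` as it is produced.
fn decode_chunk(file: &mut File, xz: &mut xz2::stream::Stream, chunk: &Chunk)
                -> Result<(), Error> {
    // Value of total_in() once the whole chunk has been consumed.
    let end = xz.total_in() as usize + chunk.len();
    // process_vec writes into the spare capacity of buf, so reserve it up front.
    let mut buf = Vec::with_capacity(8192);
    while (xz.total_in() as usize) < end {
        buf.clear();
        // Pass only the part of the chunk that has not been consumed yet.
        xz.process_vec(
            &chunk[chunk.len() - (end - xz.total_in() as usize)..],
            &mut buf,
            xz2::stream::Action::Run)?;
        file.write_all(&buf)?;
    }
    Ok(())
}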

fn decode_response(mut file: File, response: Response<Body>)
                   -> impl Future<Item=(), Error=Error> {
    done(xz2::stream::Stream::new_stream_decoder(u64::MAX, 0)
        .map_err(Error::from))
        .and_then(|mut xz| response
            .into_body()
            .map_err(Error::from)
            .for_each(move |chunk| done(
                decode_chunk(&mut file, &mut xz, &chunk))))
}

fn main() -> Result<(), Error> {
    let client = hyper::Client::builder().build::<_, hyper::Body>(
        HttpsConnector::new(1)?);
    let file = File::create("hello-2.7.tar")?;
    let mut runtime = Runtime::new()?;
    runtime.block_on(client
        .get("https://ftp.gnu.org/gnu/hello/hello-2.7.tar.xz".parse()?)
        .map_err(Error::from)
        .and_then(|response| decode_response(file, response)))?;
    runtime.shutdown_now();
    Ok(())
}
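
The offset arithmetic in decode_chunk can be checked in isolation. The sketch below swaps the real decoder for a stand-in so that it runs with the standard library only: StepDecoder and feed_chunk are hypothetical names, and the stand-in simply copies at most max_step input bytes per call instead of decompressing anything.

```rust
/// Hypothetical stand-in for a push-style decoder: consumes at most
/// `max_step` input bytes per call and copies them to the output unchanged.
/// `max_step` is assumed to be greater than zero.
pub struct StepDecoder {
    total_in: u64,
    max_step: usize,
}

impl StepDecoder {
    pub fn new(max_step: usize) -> Self {
        StepDecoder { total_in: 0, max_step }
    }

    /// Total number of input bytes consumed so far.
    pub fn total_in(&self) -> u64 {
        self.total_in
    }

    /// Consumes a prefix of `input`, appending the "decoded" bytes to `out`.
    pub fn process_vec(&mut self, input: &[u8], out: &mut Vec<u8>) {
        let n = input.len().min(self.max_step);
        out.extend_from_slice(&input[..n]);
        self.total_in += n as u64;
    }
}

/// Feeds one chunk until the decoder has consumed all of it, using the
/// same offset arithmetic as `decode_chunk`.
pub fn feed_chunk(dec: &mut StepDecoder, chunk: &[u8], sink: &mut Vec<u8>) {
    // Value of total_in() once the whole chunk has been consumed.
    let end = dec.total_in() as usize + chunk.len();
    let mut buf = Vec::with_capacity(8192);
    while (dec.total_in() as usize) < end {
        buf.clear();
        // The unconsumed bytes are the last `remaining` bytes of the chunk.
        let remaining = end - dec.total_in() as usize;
        dec.process_vec(&chunk[chunk.len() - remaining..], &mut buf);
        sink.extend_from_slice(&buf);
    }
}
```

The loop terminates only because every call consumes at least one byte. A decoder that stopped consuming input without returning an error would make it spin, so inspecting the status returned by the real process_vec may be worthwhile.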
answered 2019-02-24T19:56:25.037

XzDecoder does not seem to support a Python-like decompressobj model, where a caller repeatedly feeds partial input and gets partial output

There's xz2::stream::Stream, which does exactly what you want. This is very rough, untested code that needs proper error handling, etc., but I hope you'll get the idea:

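use futures::{Future, Stream};
use xz2::stream::{Action, Status};

fn process(body: hyper::body::Body) {
    // A 1000-byte memory limit is too small for real files, so lift it.
    let mut decoder = xz2::stream::Stream::new_stream_decoder(std::u64::MAX, 0).unwrap();
    body.for_each(move |chunk| {
        let mut input: &[u8] = &chunk;
        while !input.is_empty() {
            // process_vec only fills spare capacity, so reserve some.
            let mut buf: Vec<u8> = Vec::with_capacity(8192);
            let before = decoder.total_in();
            let status = decoder.process_vec(input, &mut buf, Action::Run);
            input = &input[(decoder.total_in() - before) as usize..];
            // write buf to disk
            match status { Ok(Status::Ok) => {}, _ => break }
        }
        Ok(())
    }).wait().unwrap();
}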
answered 2019-02-21T15:11:00.473