Goal: upload large files to AWS Glacier without holding the whole file in memory.

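I'm currently uploading to Glacier using fs.readFileSync(), and everything works. However, I need to handle files larger than 4GB, and I'd like to upload several chunks in parallel. That means moving to multipart upload. I can choose the chunk size, but Glacier requires every chunk to be the same size (except the last one).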
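To make the equal-size constraint concrete, here is a minimal sketch of how a file could be divided into byte ranges. `partRanges` is a hypothetical helper, not part of any AWS SDK, and the power-of-two check reflects my understanding of Glacier's documented part-size rule (1 MiB times a power of two, up to 4 GiB), so verify it against the current documentation.

```javascript
// Sketch: split a file of `fileSize` bytes into equal-sized parts.
// `partRanges` is a hypothetical helper, not part of any AWS SDK.
const MiB = 1024 * 1024;

function partRanges(fileSize, partSize) {
  // Assumption: part sizes are 1 MiB times a power of two (1 MiB .. 4 GiB).
  const mib = partSize / MiB;
  if (!Number.isInteger(mib) || mib < 1 || mib > 4096 || (mib & (mib - 1)) !== 0) {
    throw new RangeError('partSize must be 1 MiB times a power of two, up to 4 GiB');
  }
  const ranges = [];
  for (let start = 0; start < fileSize; start += partSize) {
    // `end` is inclusive; only the last part may be shorter than partSize.
    const end = Math.min(start + partSize, fileSize) - 1;
    ranges.push({ start, end, length: end - start + 1 });
  }
  return ranges;
}

// A 25 MiB file with 8 MiB parts gives three full parts and one 1 MiB tail.
const ranges = partRanges(25 * MiB, 8 * MiB);
console.log(ranges.length, ranges[ranges.length - 1].length / MiB);
```

Each range can then be read and uploaded independently, which is what makes parallel uploads possible.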



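Assuming I can do that, I'll use cluster, with several processes pulling from the stream as fast as they can upload to AWS. If that seems like the wrong way to parallelize the work, I'd welcome suggestions.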


3 Answers



var fs = require('fs');

var CHUNK_SIZE = 10 * 1024 * 1024, // 10MB
    buffer = Buffer.alloc(CHUNK_SIZE),
    filePath = '/tmp/foo';

fs.open(filePath, 'r', function(err, fd) {
  if (err) throw err;
  function readNextChunk() {
    // position `null` means "continue from the current file position"
    fs.read(fd, buffer, 0, CHUNK_SIZE, null, function(err, nread) {
      if (err) throw err;

      if (nread === 0) {
        // done reading file, do any necessary finalization steps

        fs.close(fd, function(err) {
          if (err) throw err;
        });
        return;
      }

      var data;
      if (nread < CHUNK_SIZE)
        data = buffer.slice(0, nread); // last, shorter chunk
      else
        data = buffer;

      // do something with `data`, then call `readNextChunk();`
    });
  }
  readNextChunk();
});
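
The snippet above reuses a single buffer, so each chunk has to be fully consumed before the next read starts. If chunks are going to be uploaded in parallel, one option is to allocate a buffer per chunk. The following is a minimal sketch of that idea using `fs.promises`; `readChunks` and the demo file name are hypothetical, and the inner loop is there because a single read is not guaranteed to fill the buffer.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Sketch: yield { index, position, data } for each fixed-size chunk of a file.
// `readChunks` is a hypothetical helper name.
async function* readChunks(filePath, chunkSize) {
  const handle = await fs.promises.open(filePath, 'r');
  try {
    for (let index = 0, position = 0; ; index++) {
      // Fresh buffer per chunk, so earlier chunks stay valid while they upload.
      const buffer = Buffer.alloc(chunkSize);
      let filled = 0;
      // Keep reading until the buffer is full or the file ends.
      while (filled < chunkSize) {
        const { bytesRead } = await handle.read(
          buffer, filled, chunkSize - filled, position + filled);
        if (bytesRead === 0) break;
        filled += bytesRead;
      }
      if (filled === 0) return; // end of file
      yield { index, position, data: buffer.subarray(0, filled) };
      if (filled < chunkSize) return; // a short chunk is the last one
      position += filled;
    }
  } finally {
    await handle.close();
  }
}

// Demo: a 25-byte temporary file read in 10-byte chunks (10, 10, then 5 bytes).
async function main() {
  const demoPath = path.join(os.tmpdir(), 'read-chunks-demo.bin');
  await fs.promises.writeFile(demoPath, Buffer.alloc(25, 1));
  for await (const chunk of readChunks(demoPath, 10)) {
    console.log(chunk.index, chunk.position, chunk.data.length);
  }
  await fs.promises.unlink(demoPath);
}

main().catch(console.error);
```

Because the generator only advances when the consumer asks for the next chunk, memory use stays bounded by however many chunks the consumer keeps in flight.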
answered 2014-08-04T02:36:47.050

You might consider the snippet below, which reads the file in 1024-byte chunks:

var fs = require('fs');

var data = '';

var readStream = fs.createReadStream('/tmp/foo.txt',{ highWaterMark: 1 * 1024, encoding: 'utf8' });

readStream.on('data', function(chunk) {
    data += chunk;
    console.log('chunk Data : ')
    console.log(chunk);// your processing chunk logic will go here

}).on('end', function() {
    // here you see all data processed at end of file
});

Please note: highWaterMark is the parameter that sets the chunk size. Hope this helps!

Web references: https://stackabuse.com/read-files-with-node-js/ and "Changing readstream chunksize"
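
One caveat for the multipart use case: a stream does not promise that every `data` event carries exactly `highWaterMark` bytes. A minimal sketch of one way to regroup whatever the stream emits into exact-size chunks is shown below. `fixedSizeChunks` is a hypothetical helper, and it assumes the stream yields Buffers, so `encoding` must not be set on the read stream.

```javascript
const { Readable } = require('stream');

// Sketch: regroup the buffers from a readable stream into chunks of exactly
// `size` bytes (only the last one may be shorter).
// `fixedSizeChunks` is a hypothetical helper name.
async function* fixedSizeChunks(stream, size) {
  let pending = [];     // buffers collected for the chunk being built
  let pendingBytes = 0; // total bytes currently held in `pending`
  for await (let piece of stream) {
    // Emit as many full chunks as this piece completes.
    while (pendingBytes + piece.length >= size) {
      const take = size - pendingBytes;
      pending.push(piece.subarray(0, take));
      yield Buffer.concat(pending, size);
      pending = [];
      pendingBytes = 0;
      piece = piece.subarray(take);
    }
    if (piece.length > 0) {
      pending.push(piece);
      pendingBytes += piece.length;
    }
  }
  // Whatever is left over is the final, shorter chunk.
  if (pendingBytes > 0) yield Buffer.concat(pending, pendingBytes);
}

// Demo with an in-memory stream that emits pieces of uneven size.
async function main() {
  const source = Readable.from([
    Buffer.from('abc'), Buffer.from('defgh'), Buffer.from('ij'),
  ]);
  for await (const chunk of fixedSizeChunks(source, 4)) {
    console.log(chunk.toString()); // abcd, efgh, ij
  }
}

main().catch(console.error);
```

For a real file, the source would be something like `fs.createReadStream('/tmp/foo.txt', { highWaterMark: 1024 * 1024 })`, passed to `fixedSizeChunks` with the desired part size.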

answered 2019-12-25T00:21:08.617

Based on mscdex's answer, here is a module that uses the synchronous alternative and a StringDecoder to decode UTF-8 correctly.

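The problem with readableStream is that, in order to use it, you have to convert your whole project to async emitters and callbacks. If you are writing something simple, like a small CLI in nodejs, that doesn't make sense.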

// usage:
let file = new UTF8FileReader()
file.open('./myfile.txt', 1024)
while ( file.isOpen ) {
    let stringData=file.readChunk()
    // process stringData here
}

// UTF8FileReader.ts
import * as fs from 'fs';
import { StringDecoder, NodeStringDecoder } from "string_decoder";

export class UTF8FileReader {

    filename: string;
    isOpen: boolean = false;
    private chunkSize: number;
    private fd: number; //file descriptor from fs.openSync
    private readFilePos: number;
    private readBuffer: Buffer;

    private utf8decoder: NodeStringDecoder

    /**
     * open the file | throw
     * @param filename
     * @param chunkSize
     */
    open(filename: string, chunkSize: number = 16 * 1024) {

        this.chunkSize = chunkSize;

        try {
            this.fd = fs.openSync(filename, 'r');
        }
        catch (e) {
            throw new Error("opening " + filename + ", error:" + e.toString());
        }

        this.filename = filename;
        this.isOpen = true;

        this.readBuffer = Buffer.alloc(this.chunkSize);
        this.readFilePos = 0;

        //a StringDecoder is a buffered object that ensures complete UTF-8 multibyte decoding from a byte buffer
        this.utf8decoder = new StringDecoder('utf8')
    }

    /**
     * read another chunk from the file 
     * return the decoded UTF8 into a string
     * (or throw)
     * */
    readChunk(): string {

        let decodedString = '' //return '' by default

        if (!this.isOpen) {
            return decodedString;
        }

        let readByteCount: number;
        try {
            readByteCount = fs.readSync(this.fd, this.readBuffer, 0, this.chunkSize, this.readFilePos);
        }
        catch (e) {
            throw new Error("reading " + this.filename + ", error:" + e.toString());
        }

        if (readByteCount) {
            //some data read, advance readFilePos 
            this.readFilePos += readByteCount;
            //get only the read bytes (if we reached the end of the file)
            const onlyReadBytesBuf = this.readBuffer.slice(0, readByteCount);
            //correctly decode as utf8, and store in decodedString
            //yes, the api is called "write", but it decodes a string - it's a write-decode-and-return the string kind-of-thing :)
            decodedString = this.utf8decoder.write(onlyReadBytesBuf); 
        }
        else {
            //read returns 0 => all bytes read
            this.close();
        }
        return decodedString 
    }

    close() {
        if (!this.isOpen) {
            return;
        }
        fs.closeSync(this.fd);
        this.isOpen = false;
        this.utf8decoder.end();
    }
}


If you don't have TypeScript yet, here is the transpiled .js code:

// UTF8FileReader.js
"use strict";
Object.defineProperty(exports, "__esModule", { value: true });
exports.UTF8FileReader = void 0;
// UTF8FileReader
const fs = require("fs");
const string_decoder_1 = require("string_decoder");
class UTF8FileReader {
    constructor() {
        this.isOpen = false;
    }
    /**
     * open the file | throw
     * @param filename
     * @param chunkSize
     */
    open(filename, chunkSize = 16 * 1024) {
        this.chunkSize = chunkSize;
        try {
            this.fd = fs.openSync(filename, 'r');
        }
        catch (e) {
            throw new Error("opening " + filename + ", error:" + e.toString());
        }
        this.filename = filename;
        this.isOpen = true;
        this.readBuffer = Buffer.alloc(this.chunkSize);
        this.readFilePos = 0;
        //a StringDecoder is a buffered object that ensures complete UTF-8 multibyte decoding from a byte buffer
        this.utf8decoder = new string_decoder_1.StringDecoder('utf8');
    }
    /**
     * read another chunk from the file
     * return the decoded UTF8 into a string
     * (or throw)
     * */
    readChunk() {
        let decodedString = ''; //return '' by default
        if (!this.isOpen) {
            return decodedString;
        }
        let readByteCount;
        try {
            readByteCount = fs.readSync(this.fd, this.readBuffer, 0, this.chunkSize, this.readFilePos);
        }
        catch (e) {
            throw new Error("reading " + this.filename + ", error:" + e.toString());
        }
        if (readByteCount) {
            //some data read, advance readFilePos 
            this.readFilePos += readByteCount;
            //get only the read bytes (if we reached the end of the file)
            const onlyReadBytesBuf = this.readBuffer.slice(0, readByteCount);
            //correctly decode as utf8, and store in decodedString
            //yes, the api is called "write", but it decodes a string - it's a write-decode-and-return the string kind-of-thing :)
            decodedString = this.utf8decoder.write(onlyReadBytesBuf);
        }
        else {
            //read returns 0 => all bytes read
            this.close();
        }
        return decodedString;
    }
    close() {
        if (!this.isOpen) {
            return;
        }
        fs.closeSync(this.fd);
        this.isOpen = false;
        this.utf8decoder.end();
    }
}
exports.UTF8FileReader = UTF8FileReader;
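
To illustrate why the StringDecoder matters when reading fixed-size byte chunks, here is a small standalone sketch (not part of the module above) in which a multibyte character is split across two chunks. The variable names are only for illustration.

```javascript
const { StringDecoder } = require('string_decoder');

// '€' is three bytes in UTF-8 (0xE2 0x82 0xAC); split it across two chunks.
const bytes = Buffer.from('€');
const first = bytes.subarray(0, 2);
const second = bytes.subarray(2);

// Decoding each chunk on its own yields replacement characters.
const naive = first.toString('utf8') + second.toString('utf8');

// StringDecoder holds back the incomplete bytes until the rest arrives.
const decoder = new StringDecoder('utf8');
const decoded = decoder.write(first) + decoder.write(second) + decoder.end();

console.log(naive === '€');   // false
console.log(decoded === '€'); // true
```

The same thing happens whenever a chunk boundary falls inside a multibyte character, which is hard to avoid with a fixed chunk size.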
answered 2020-08-31T10:25:19.313