112

In a Node.js project, I am trying to get data back from S3.

When I use getSignedUrl, everything works:

aws.getSignedUrl('getObject', params, function(err, url){
    console.log(url); 
}); 

My params are:

var params = {
              Bucket: "test-aws-imagery", 
              Key: "TILES/Level4/A3_B3_C2/A5_B67_C59_Tiles.par"
};

If I take the URL printed to the console and paste it into a web browser, it downloads the file I need.

However, if I try to use getObject I get all sorts of odd behavior. I believe I am just using it incorrectly. This is what I have tried:

aws.getObject(params, function(err, data){
    console.log(data); 
    console.log(err); 
}); 

Output:

{ 
  AcceptRanges: 'bytes',
  LastModified: 'Wed, 06 Apr 2016 20:04:02 GMT',
  ContentLength: '1602862',
  ETag: '9826l1e5725fbd52l88ge3f5v0c123a4"',
  ContentType: 'application/octet-stream',
  Metadata: {},
  Body: <Buffer 01 00 00 00  ... > }

  null

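So it appears that this is working properly. However, when I put a breakpoint on one of the console.log calls, my IDE (NetBeans) throws an error and refuses to show the value of data. While this could just be the IDE, I decided to try other ways of using getObject.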

aws.getObject(params).on('httpData', function(chunk){
    console.log(chunk); 
}).on('httpDone', function(data){
    console.log(data); 
});

This does not output anything. Putting a breakpoint in shows that the code never reaches either of the console.log calls. I also tried:

aws.getObject(params).on('success', function(data){
    console.log(data); 
});

However, this also does not output anything, and placing a breakpoint shows that the console.log is never reached.

What am I doing wrong?

4

9 Answers

236

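When doing a getObject() from the S3 API, per the docs the contents of your file are located in the Body property, which you can see in your sample output. You should have code that looks something like the following: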

const aws = require('aws-sdk');
const s3 = new aws.S3(); // Pass in opts to S3 if necessary

var getParams = {
    Bucket: 'abc', // your bucket name,
    Key: 'abc.txt' // path to the object you're looking for
}

s3.getObject(getParams, function(err, data) {
    // Handle any error and exit
    if (err)
        return err;

  // No error happened
  // Convert Body from a Buffer to a String
  let objectData = data.Body.toString('utf-8'); // Use the encoding necessary
});

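You may not need to create a new buffer from the data.Body object, but if you do, you can use the sample above to achieve that.

@aws-sdk/client-s3 (2021 Update)

Since I wrote this answer in 2016, Amazon has released a new JavaScript SDK, @aws-sdk/client-s3. This new version improves on the original getObject() by always returning a promise, instead of opting in by chaining .promise() onto getObject(). In addition, response.Body is no longer a Buffer but one of Readable|ReadableStream|Blob. This changes the handling of response.Body a bit. It should be more performant, since we can stream the returned data instead of holding all of the contents in memory, with the trade-off that it is a bit more verbose to implement.

In the example below, the response.Body data is streamed into an array and then returned as a string. This is the equivalent of my original answer. Alternatively, response.Body could be sent with stream.Readable.pipe() to an HTTP response, a file, or any other kind of stream.Writable for further use; this would be the more performant way when getting large objects.

If you want to use a Buffer, like the original getObject() response, this can be done by wrapping responseDataChunks in Buffer.concat() instead of using Array#join(), which is useful when working with binary data. Note that since Array#join() returns a string, each Buffer instance in responseDataChunks has Buffer.toString() called on it implicitly, and the default encoding of utf8 is used.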

const { GetObjectCommand, S3Client } = require('@aws-sdk/client-s3')
const client = new S3Client() // Pass in opts to S3 if necessary

function getObject (Bucket, Key) {
  return new Promise(async (resolve, reject) => {
    const getObjectCommand = new GetObjectCommand({ Bucket, Key })

    try {
      const response = await client.send(getObjectCommand)
  
      // Store all of data chunks returned from the response data stream 
      // into an array then use Array#join() to use the returned contents as a String
      let responseDataChunks = []

      // Handle an error while streaming the response body
      response.Body.once('error', err => reject(err))
  
      // Attach a 'data' listener to add the chunks of data to our array
      // Each chunk is a Buffer instance
      response.Body.on('data', chunk => responseDataChunks.push(chunk))
  
      // Once the stream has no more data, join the chunks into a string and return the string
      response.Body.once('end', () => resolve(responseDataChunks.join('')))
    } catch (err) {
      // Handle the error or throw
      return reject(err)
    } 
  })
}
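
The Buffer.concat() variant mentioned above could look roughly like the following. This is a minimal sketch, not the answer author's code: streamToBuffer is a hypothetical helper name, and Readable.from() is used as a stand-in for the response.Body that client.send() returns in Node.js, so the snippet runs without AWS credentials.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: collect every chunk of a Readable into one Buffer.
async function streamToBuffer(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    // Chunks may be Buffers or strings depending on the stream; normalize to Buffer.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Stand-in for response.Body: the 3-byte UTF-8 encoding of "€",
// deliberately split across two chunks.
const fakeBody = Readable.from([Buffer.from([0xe2, 0x82]), Buffer.from([0xac])]);

streamToBuffer(fakeBody).then((buffer) => {
  // Decoding happens once, after all bytes are joined.
  console.log(buffer.toString('utf-8')); // €
});
```

Concatenating first and decoding once also avoids a subtle issue with Array#join(): each chunk is decoded on its own, so a multi-byte character that straddles two chunks can be corrupted.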

@aws-sdk/client-s3 documentation links
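
For large objects, the streaming approach mentioned in this answer (sending the body to a Writable instead of collecting it in memory) might be sketched as below. The names saveBodyToFile and the temporary file name are assumptions made for illustration, and Readable.from() again stands in for response.Body. It uses pipeline() from stream/promises rather than a bare pipe(), because pipeline() propagates errors and cleans up both streams.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { Readable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: stream a body to disk without buffering it all in memory.
async function saveBodyToFile(body, filePath) {
  // pipeline() rejects if either stream errors and destroys both streams.
  await pipeline(body, fs.createWriteStream(filePath));
  return filePath;
}

// Stand-in for response.Body (an assumption so the sketch runs offline).
const fakeBody = Readable.from([Buffer.from('hello '), Buffer.from('world')]);
const target = path.join(os.tmpdir(), 's3-object-example.txt');

saveBodyToFile(fakeBody, target).then((written) => {
  console.log(fs.readFileSync(written, 'utf-8')); // hello world
});
```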

answered 2016-04-29T17:40:11.390
51

Based on the answer by @peteb, but using Promises and async/await:

const AWS = require('aws-sdk');

const s3 = new AWS.S3();

async function getObject (bucket, objectKey) {
  try {
    const params = {
      Bucket: bucket,
      Key: objectKey 
    }

    const data = await s3.getObject(params).promise();

    return data.Body.toString('utf-8');
  } catch (e) {
    throw new Error(`Could not retrieve file from S3: ${e.message}`)
  }
}

// To retrieve the object, use `await getObject()` (inside an async function or an ES module) or `getObject().then()`
const myObject = await getObject('my-bucket', 'path/to/the/object.txt');
answered 2018-10-24T19:23:14.750
9

For anyone looking for a NestJS TypeScript version of the above:

    /**
     * to fetch a signed URL of a file
     * @param key key of the file to be fetched
     * @param bucket name of the bucket containing the file
     */
    public getFileUrl(key: string, bucket?: string): Promise<string> {
        var scopeBucket: string = bucket ? bucket : this.defaultBucket;
        var params: any = {
            Bucket: scopeBucket,
            Key: key,
            Expires: signatureTimeout  // const value: 30
        };
        return this.account.getSignedUrlPromise(getSignedUrlObject, params);
    }

    /**
     * to get the downloadable file buffer of the file
     * @param key key of the file to be fetched
     * @param bucket name of the bucket containing the file
     */
    public async getFileBuffer(key: string, bucket?: string): Promise<Buffer> {
        var scopeBucket: string = bucket ? bucket : this.defaultBucket;
        var params: GetObjectRequest = {
            Bucket: scopeBucket,
            Key: key
        };
        var fileObject: GetObjectOutput = await this.account.getObject(params).promise();
        // Return the Buffer as-is; converting it to a string and back corrupts binary data
        return fileObject.Body as Buffer;
    }

    /**
     * to upload a file stream onto AWS S3
     * @param file file buffer to be uploaded
     * @param key key of the file to be uploaded
     * @param bucket name of the bucket 
     */
    public async saveFile(file: Buffer, key: string, bucket?: string): Promise<any> {
        var scopeBucket: string = bucket ? bucket : this.defaultBucket;
        var params: any = {
            Body: file,
            Bucket: scopeBucket,
            Key: key,
            ACL: 'private'
        };
        var uploaded: any = await this.account.upload(params).promise();
        if (uploaded && uploaded.Location && uploaded.Bucket === scopeBucket && uploaded.Key === key)
            return uploaded;
        else {
            throw new HttpException("Error occurred while uploading a file stream", HttpStatus.BAD_REQUEST);
        }
    }
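
When a method like getFileBuffer is meant to return binary content (images, archives), it helps to keep the Body as a Buffer and never pass it through a string. The following sketch, using made-up sample bytes rather than a real S3 response, shows why a round trip through toString() is only safe for valid UTF-8 text.

```javascript
// Sample bytes that are not valid UTF-8 (the first four bytes of a JPEG file).
const original = Buffer.from([0xff, 0xd8, 0xff, 0xe0]);

// Round trip through a UTF-8 string, as Buffer.from(body.toString()) would do.
const roundTripped = Buffer.from(original.toString());

// Invalid byte sequences are replaced with U+FFFD during decoding,
// so the bytes that come back differ from the ones that went in.
console.log(original.equals(roundTripped)); // false

// Copying the Buffer directly preserves every byte.
console.log(original.equals(Buffer.from(original))); // true
```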
answered 2019-10-07T09:16:41.637
6

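A very similar answer to the one by @ArianAcosta above, except that I am using import (for Node 12.x and up), adding the AWS config, sniffing for an image payload, and applying base64 encoding to the returned value.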

// using v2.x of aws-sdk
import aws from 'aws-sdk'

aws.config.update({
  accessKeyId: process.env.YOUR_AWS_ACCESS_KEY_ID,
  secretAccessKey: process.env.YOUR_AWS_SECRET_ACCESS_KEY,
  region: "us-east-1" // or whatever
})

const s3 = new aws.S3();

/**
 * getS3Object()
 * 
 * @param { string } bucket - the name of your bucket
 * @param { string } objectKey - object you are trying to retrieve
 * @returns { string } - data, formatted
 */
export async function getS3Object (bucket, objectKey) {
  try {
    const params = {
      Bucket: bucket,
      Key: objectKey 
    }

    const data = await s3.getObject(params).promise();

    // Check for image payload and formats appropriately
    if( data.ContentType === 'image/jpeg' ) {
      return data.Body.toString('base64');
    } else {
      return data.Body.toString('utf-8');
    }

  } catch (e) {
    throw new Error(`Could not retrieve file from S3: ${e.message}`)
  }
}
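
The content-type check can be pulled out into a small pure function, which makes it easy to try without calling S3. This is a sketch under assumptions: formatBody is a hypothetical name, and treating every image/* type as binary (rather than only image/jpeg, as in the answer) is an extension of my own.

```javascript
// Hypothetical helper: choose an encoding from the object's ContentType.
function formatBody(body, contentType) {
  // Assumption: treat every image/* type as binary, not only image/jpeg.
  const isImage = typeof contentType === 'string' && contentType.startsWith('image/');
  return body.toString(isImage ? 'base64' : 'utf-8');
}

console.log(formatBody(Buffer.from('hello'), 'text/plain')); // hello
console.log(formatBody(Buffer.from([0xff, 0xd8]), 'image/jpeg')); // /9g=
```

Inside getS3Object this would be called as formatBody(data.Body, data.ContentType).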
answered 2021-06-04T17:29:51.073
4

Alternatively, you could use the minio-js client library (get-object.js):

var Minio = require('minio')

// Current minio-js releases export the constructor as Minio.Client
var s3Client = new Minio.Client({
  endPoint: 's3.amazonaws.com',
  accessKey: 'YOUR-ACCESSKEYID',
  secretKey: 'YOUR-SECRETACCESSKEY'
})

var size = 0
// Get a full object.
s3Client.getObject('my-bucketname', 'my-objectname', function(e, dataStream) {
  if (e) {
    return console.log(e)
  }
  dataStream.on('data', function(chunk) {
    size += chunk.length
  })
  dataStream.on('end', function() {
    console.log("End. Total size = " + size)
  })
  dataStream.on('error', function(e) {
    console.log(e)
  })
})

Disclaimer: I work for Minio. It is an open source, S3-compatible object storage written in golang, with client libraries available in Java, Python, JS, and golang.

answered 2016-05-01T06:27:01.587
3

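At first glance it doesn't look like you are doing anything wrong, but you don't show all of your code. The following worked for me when I was first checking out S3 and Node: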

var AWS = require('aws-sdk');

if (typeof process.env.API_KEY == 'undefined') {
    var config = require('./config.json');
    for (var key in config) {
        if (config.hasOwnProperty(key)) process.env[key] = config[key];
    }
}

var s3 = new AWS.S3({accessKeyId: process.env.AWS_ID, secretAccessKey:process.env.AWS_KEY});
var objectPath = process.env.AWS_S3_FOLDER +'/test.xml';
s3.putObject({
    Bucket: process.env.AWS_S3_BUCKET, 
    Key: objectPath,
    Body: "<rss><data>hello Fred</data></rss>",
    ACL:'public-read'
}, function(err, data){
    if (err) console.log(err, err.stack); // an error occurred
    else {
        console.log(data);           // successful response
        s3.getObject({
            Bucket: process.env.AWS_S3_BUCKET, 
            Key: objectPath
        }, function(err, data){
            console.log(data.Body.toString());
        });
    }
});
answered 2016-04-29T16:22:15.113
3

Update (2022)

Node.js v17.5.0 added Readable.toArray. If this API is available in your Node version, the code can be very short:

const buffer = Buffer.concat(
    await (
        await s3Client
            .send(new GetObjectCommand({
                Key: '<key>',
                Bucket: '<bucket>',
            }))
    ).Body.toArray()
)

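If you are using TypeScript, it is safe to cast the .Body part to Readable (the other types, ReadableStream and Blob, are only returned in browser environments; moreover, in the browser, Blob is only used with the legacy fetch API when response.body is not supported):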

(...Body as Readable).toArray()

Please note: Readable.toArray is an experimental (yet handy) feature, so use it with caution.
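
Because Readable.toArray is experimental and missing on older Node versions, one cautious option is to feature-detect it and fall back to async iteration. The sketch below is my own illustration, not part of the answer: readAll is a hypothetical helper, and Readable.from() stands in for the Body returned by the SDK.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: use Readable.toArray() when the runtime provides it,
// otherwise fall back to async iteration over the stream.
async function readAll(stream) {
  if (typeof stream.toArray === 'function') {
    return Buffer.concat(await stream.toArray());
  }
  const chunks = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return Buffer.concat(chunks);
}

// Stand-in for the Body stream (an assumption so the sketch runs offline).
const fakeBody = Readable.from([Buffer.from('ab'), Buffer.from('cd')]);

readAll(fakeBody).then((buffer) => {
  console.log(buffer.toString('utf-8')); // abcd
});
```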

==============

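Original answer

If you are using AWS SDK v3, it returns a Node.js Readable (to be precise, an IncomingMessage, which extends Readable) instead of a Buffer.

Here is a TypeScript version. Note that this is for Node only; if you send the request from a browser, check the longer answer in the blog post mentioned below.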

import {GetObjectCommand, S3Client} from '@aws-sdk/client-s3'
import type {Readable} from 'stream'

const s3Client = new S3Client({
    apiVersion: '2006-03-01',
    region: 'us-west-2',
    credentials: {
        accessKeyId: '<access key>',
        secretAccessKey: '<access secret>',
    }
})
const response = await s3Client
    .send(new GetObjectCommand({
        Key: '<key>',
        Bucket: '<bucket>',
    }))
const stream = response.Body as Readable

return new Promise<Buffer>((resolve, reject) => {
    const chunks: Buffer[] = []
    stream.on('data', chunk => chunks.push(chunk))
    stream.once('end', () => resolve(Buffer.concat(chunks)))
    stream.once('error', reject)
})
// if readable.toArray() is supported
// return Buffer.concat(await stream.toArray())

Why do we have to cast response.Body as Readable? The answer is too long for this post. Interested readers can find more information in my blog post.

answered 2021-12-04T06:34:00.467
-1

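Converting GetObjectOutput.Body to Promise<string> using node-fetch

In aws-sdk-js-v3 (@aws-sdk/client-s3), GetObjectOutput.Body is a subclass of Readable in Node.js (specifically an instance of http.IncomingMessage) instead of a Buffer as it was in aws-sdk v2, so resp.Body.toString('utf-8') will give you the wrong result, "[object Object]". Instead, the easiest way to turn GetObjectOutput.Body into a Promise<string> is to construct a node-fetch Response, which takes a Readable subclass (or a Buffer instance, or the other types from the fetch spec) and has the conversion methods .json(), .text(), .arrayBuffer(), and .blob().

This should also work with the other variants of aws-sdk and platforms (aws-sdk v2 in Node: Buffer; v2 in the browser: Uint8Array subclass; @aws-sdk v3 in Node: Readable; v3 in the browser: ReadableStream or Blob).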

npm install node-fetch
import { Response } from 'node-fetch';
import * as s3 from '@aws-sdk/client-s3';

const client = new s3.S3Client({})
const s3Response = await client.send(new s3.GetObjectCommand({Bucket: '…', Key: '…'}));
const response = new Response(s3Response.Body);

const obj = await response.json();
// or
const text = await response.text();
// or
const buffer = Buffer.from(await response.arrayBuffer());
// or
const blob = await response.blob();
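
If adding a dependency is undesirable, a similar set of conversions is available in Node.js itself through the stream/consumers module (Node 16.7 and later). This is a different technique from the node-fetch Response used in the answer, shown here only as a sketch; the fakeBody function is a stand-in for s3Response.Body so that the snippet runs without AWS.

```javascript
const { Readable } = require('node:stream');
const consumers = require('node:stream/consumers');

// Stand-in for s3Response.Body in Node.js (an assumption for illustration).
// A function is used because each stream can only be consumed once.
const fakeBody = () =>
  Readable.from([Buffer.from('{"greeting":'), Buffer.from('"hello"}')]);

async function main() {
  const obj = await consumers.json(fakeBody());      // parsed JSON
  const text = await consumers.text(fakeBody());     // UTF-8 string
  const buffer = await consumers.buffer(fakeBody()); // Node.js Buffer
  console.log(obj.greeting, text, buffer.length);
}

main();
```

As with Response, a given Body stream can be consumed by only one of these calls.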

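References: GetObjectOutput.Body documentation, node-fetch Response documentation, node-fetch Body constructor source, minipass-fetch Body constructor source

Thanks to kennu's comment on the GetObjectCommand usability issue.

answered 2021-11-20T00:05:25.323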
-1

Here is an async/await version:

var getObjectAsync = async function(bucket,key) {
  try {
    const data = await s3
      .getObject({ Bucket: bucket, Key: key })
      .promise();
      var contents = data.Body.toString('utf-8');
      return contents;
  } catch (err) {
    console.log(err);
    throw err; // rethrow so the caller does not receive undefined
  }
}
var getObject = async function(bucket,key) {
    const contents = await getObjectAsync(bucket,key);
    console.log(contents.length);
    return contents;
}
getObject(bucket,key);
answered 2021-04-27T17:10:50.340