python - AWS uploads files to the wrong bucket

Question

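I am using AWS SageMaker and trying to upload a data folder from SageMaker to S3. I want to upload my data to the s3_train_data location (which already exists in S3). However, instead of uploading to that bucket, it uploads to the default bucket that was created, and then creates a new folder path there named after the s3_train_data variable.

The code I am running: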

import os
import sagemaker
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()
role = get_execution_role()

bucket = <bucket name>
prefix = <folders1/folders2>
key = <input>


s3_train_data = 's3://{}/{}/{}/'.format(bucket, prefix, key)


#path 'data' is the folder in the Jupyter Instance, contains all the training data
inputs = sagemaker_session.upload_data(path= 'data', key_prefix= s3_train_data)

Is this a problem with the code, or more a problem with how I created the notebook?

score 0 · Accepted Answer

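You can look at the sample notebooks to see how to upload data to an S3 bucket; there are many ways to do it. I am only giving you hints toward an answer. You also forgot to create a boto3 session to access the S3 bucket.

Here is one way to do it.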

import os
import urllib.request
import boto3

bucket = '<your_s3_bucket_name_here>'  # target S3 bucket; replace with your own


def download(url):
    # Download the file only if it is not already present locally
    filename = url.split("/")[-1]
    if not os.path.exists(filename):
        urllib.request.urlretrieve(url, filename)


def upload_to_s3(channel, file):
    # Upload a local file to s3://<bucket>/<channel>/<file>
    s3 = boto3.resource('s3')
    key = channel + '/' + file
    with open(file, "rb") as data:  # close the file handle once the upload is done
        s3.Bucket(bucket).put_object(Key=key, Body=data)


# caltech-256
download('http://data.mxnet.io/data/caltech-256/caltech-256-60-train.rec')
upload_to_s3('train', 'caltech-256-60-train.rec')
download('http://data.mxnet.io/data/caltech-256/caltech-256-60-val.rec')
upload_to_s3('validation', 'caltech-256-60-val.rec')

Link: https://buildcustom.notebook.us-east-2.sagemaker.aws/notebooks/sample-notebooks/introduction_to_amazon_algorithms/imageclassification_caltech/Image-classification-fulltraining.ipynb

Another way:
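The question is about a whole folder rather than a single file, so the same boto3 approach can be extended by walking the folder and deriving one S3 key per file. The following is only a minimal sketch: `iter_upload_keys` and `upload_folder` are hypothetical helper names, and the bucket name, local folder, and key prefix are assumptions that you would replace with your own values.

```python
import os


def iter_upload_keys(local_dir, key_prefix):
    """Yield (local_path, s3_key) for every file under local_dir."""
    key_prefix = key_prefix.strip("/")
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            # S3 keys always use forward slashes, whatever the local OS uses
            rel = os.path.relpath(local_path, local_dir).replace(os.sep, "/")
            key = f"{key_prefix}/{rel}" if key_prefix else rel
            yield local_path, key


def upload_folder(local_dir, bucket_name, key_prefix):
    """Upload every file under local_dir to the named bucket (needs AWS credentials)."""
    import boto3  # imported here so the key helper works without boto3 installed

    s3_bucket = boto3.resource("s3").Bucket(bucket_name)
    for local_path, key in iter_upload_keys(local_dir, key_prefix):
        s3_bucket.upload_file(local_path, key)
```

Calling `upload_folder('data', '<bucket name>', 'folders1/folders2/input')` would then place each file under that prefix in the bucket you name explicitly, rather than in a default bucket.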

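import io
import os
import boto3
import sagemaker.amazon.common as smac

bucket = '<your_s3_bucket_name_here>'  # enter your s3 bucket where you will copy data and model artifacts
prefix = 'sagemaker/breast_cancer_prediction'  # place to upload training files within the bucket
# do some processing then prepare to push the data.
# train_X, train_y and train_file are defined during that processing.

# Serialize the arrays into an in-memory buffer
f = io.BytesIO()
smac.write_numpy_to_dense_tensor(f, train_X.astype('float32'), train_y.astype('float32'))
f.seek(0)  # rewind so the upload starts from the beginning of the buffer

boto3.Session().resource('s3').Bucket(bucket).Object(os.path.join(prefix, 'train', train_file)).upload_fileobj(f)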

Link: https://buildcustom.notebook.us-east-2.sagemaker.aws/notebooks/sample-notebooks/introduction_to_applying_machine_learning/breast_cancer_prediction/Breast%20Cancer%20Prediction.ipynb

YouTube link: https://www.youtube.com/watch?v=-YiHPIGyFGo - how to pull data from an S3 bucket.
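For the `upload_data` call in the question itself, a likely cause is that `key_prefix` is treated as a key inside the bucket, and that the session falls back to its default bucket when no `bucket` argument is passed. Passing a full `s3://...` URI as `key_prefix` would therefore create a folder with that name in the default bucket. Below is a minimal sketch of passing the two parts separately; `split_s3_uri` and `upload_folder_to_uri` are hypothetical helper names, not part of the SageMaker SDK.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix')."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    return parsed.netloc, parsed.path.strip("/")


def upload_folder_to_uri(local_path, s3_uri):
    """Upload local_path to the bucket and prefix named by s3_uri."""
    import sagemaker  # imported here so split_s3_uri works without the SDK

    bucket, key_prefix = split_s3_uri(s3_uri)
    session = sagemaker.Session()
    # bucket selects the target bucket; key_prefix is only the path inside it
    return session.upload_data(path=local_path, bucket=bucket, key_prefix=key_prefix)
```

With the variables from the question, this would be `inputs = upload_folder_to_uri('data', s3_train_data)`. Equivalently, the original call could pass `bucket=bucket` and `key_prefix='{}/{}'.format(prefix, key)` directly.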
