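Following this guide, I am running an (Alpine) Docker image on AWS Lambda.
The image contains app.py, a simple .docx -> .pdf document converter. The core is the following code, which works in a Docker container on my local dev box but raises subprocess.CalledProcessError in the actual Lambda deployment: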
import os
import shutil
import subprocess

def handler(event, context):
    src_filename = event['filename']
    filename_body, _ = os.path.splitext(src_filename)

    src_filepath = '/tmp/test-template.docx'
    shutil.copyfile('/home/app/test-template.docx', src_filepath)  # for testing

    print(subprocess.check_output(['ls', '-l', '/tmp']))
    # ^ -rw-rw-r-- 20974 bytes test-template.docx

    LIBRE_BINARY = '/usr/bin/soffice'
    print(subprocess.check_output(['ls', '-l', LIBRE_BINARY]))
    # ^ lrwxrwxrwx /usr/bin/soffice -> /usr/lib/libreoffice/program/soffice

    MAX_TRIES = 3
    success = False
    print(f'Processing file: {src_filepath} with LibreOffice')
    for kTry in range(MAX_TRIES):
        print(f'Conversion Attempt #{kTry}')
        try:
            # https://stackoverflow.com/questions/4256107/running-bash-commands-in-python
            result = subprocess.run(
                [
                    LIBRE_BINARY,
                    '--headless',
                    '--invisible',
                    '--nodefault',
                    '--nofirststartwizard',
                    '--nolockcheck',
                    '--nologo',
                    '--norestore',
                    '--convert-to', 'pdf:writer_pdf_Export',
                    '--outdir', TMP_FOLDER,  # TMP_FOLDER is defined elsewhere in app.py
                    src_filepath
                ],
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                shell=False,
                check=True,
                text=True
            )
        except subprocess.CalledProcessError as e:
            raise RuntimeError(f"\tGot exit code {e.returncode}. Msg: {e.output}") from e
            continue  # never reached: the raise above already leaves the loop
The response string is:
[ERROR] RuntimeError: Got exit code 77. Msg:
Traceback (most recent call last):
File "/home/app/app.py", line 82, in handler
raise RuntimeError(f"\tGot exit code {e.returncode}. Msg: {e.output}") from e
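Note that the Msg: part of this error is empty. In Python's subprocess module, CalledProcessError.output holds only the captured stdout; when stderr is piped separately, as it is here, any diagnostic text ends up in e.stderr instead. Below is a minimal sketch of a helper that reports both streams. The name run_and_report is hypothetical and not part of the original app.py, and the failing child process is only a stand-in for the soffice call.

```python
import subprocess
import sys


def run_and_report(cmd, **kwargs):
    """Run cmd; on failure raise RuntimeError that includes stdout AND stderr.

    Hypothetical helper for illustration, not part of the original app.py.
    """
    try:
        return subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=True,
            text=True,
            **kwargs,
        )
    except subprocess.CalledProcessError as e:
        # e.output (alias of e.stdout) is stdout only; stderr is kept separately.
        raise RuntimeError(
            f"Got exit code {e.returncode}. "
            f"stdout: {e.stdout!r} stderr: {e.stderr!r}"
        ) from e


if __name__ == "__main__":
    # Stand-in for the soffice call: writes to stderr and exits with code 77.
    failing = [sys.executable, "-c",
               "import sys; sys.stderr.write('boom'); sys.exit(77)"]
    try:
        run_and_report(failing)
    except RuntimeError as err:
        print(err)
```

With a helper like this, whatever soffice writes to stderr would show up in the Lambda error message instead of being dropped.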
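How can this succeed on my local machine but fail on AWS?
It is the same container image being executed, and it is completely self-contained. The problem must come from this subprocess.run call.
Here is my aws lambda create-function: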
aws lambda create-function \
    --function-name $AWS_LAMBDAFUNC_NAME \
    --role $role_arn \
    --code ImageUri=$full_url \
    --package-type Image \
    --memory-size 8192 \
    --timeout 300 \
    --publish
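I have used a large memory size and a long timeout.
I have read that writing to the filesystem outside of /home/app and /tmp can cause problems, so I am careful not to do any such writes.
So what could the problem be?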
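One way to test the assumption about writable locations is to have the handler report which directories the Lambda sandbox user can actually write to, including whatever HOME points at. The sketch below is a hypothetical diagnostic (the name writable_report and the candidate list are assumptions, not part of the original app.py). It tries to create a temporary file in each directory rather than trusting permission bits.

```python
import os
import tempfile


def writable_report(paths):
    """Return {path: bool} telling whether a file can be created in each path."""
    report = {}
    for path in paths:
        try:
            # Creating (and auto-deleting) a real file is a stronger check
            # than os.access, which can mislead on read-only mounts.
            with tempfile.TemporaryFile(dir=path):
                report[path] = True
        except OSError:
            # Covers missing directories, permission errors, read-only filesystems.
            report[path] = False
    return report


if __name__ == "__main__":
    candidates = ["/tmp", "/home/app"]
    home = os.environ.get("HOME")
    if home:
        candidates.append(home)
    for path, ok in writable_report(candidates).items():
        print(f"{path!r}: {'writable' if ok else 'NOT writable'}")
```

Running this both in the local container and inside the deployed function would show whether the two environments differ in what they allow.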
It works in Bash
If I do this processing in my entry.sh, it works:
#!/bin/sh
/usr/bin/soffice \
    --headless \
    --invisible \
    --nodefault \
    --nofirststartwizard \
    --nolockcheck \
    --nologo \
    --norestore \
    --convert-to pdf:writer_pdf_Export \
    --outdir /tmp \
    /home/app/test-template.docx \
    &> /home/app/output_and_error_file
ls /tmp >> /home/app/output_and_error_file
exec python -m awslambdaric $1
output_and_error_file:
{"response": "convert /home/app/test-template.docx -> /tmp/test-template.pdf using filter : writer_pdf_Export
hsperfdata_root
test-template.pdf"}
So it must be subprocess misbehaving under the Lambda runtime.
Test: using os.system
os.system(
    f'export HOME=/home/app && {LIBRE_BINARY}'
    f' --headless --invisible --nodefault --nofirststartwizard'
    f' --nolockcheck --nologo --norestore'
    f' --convert-to pdf:writer_pdf_Export'
    f' --outdir {TMP_FOLDER}'
    f' {src_filepath}'
)
This produces a more descriptive error:
START RequestId: f2c18863-977e-46e4-a138-c1db80759406 Version: $LATEST
Executing 'app.handler' in function directory '/home/app'
b'total 24\n-rw-rw-r-- 1 sbx_user 990 20974 Jan 8 11:01 test-template.docx\n'
/usr/bin/soffice
b'lrwxrwxrwx 1 root root 36 Jan 8 03:56 /usr/bin/soffice -> /usr/lib/libreoffice/program/soffice\n'
Processing file: /tmp/test-template.docx with LibreOffice
Conversion Attempt #0
javaldx failed!
Warning: failed to read path from javaldx
LibreOffice 6.4 - Fatal Error: The application cannot be started.
User installation could not be completed.
Unknown error with saving to S3: <class 'FileNotFoundError'>
END RequestId: f2c18863-977e-46e4-a138-c1db80759406
REPORT RequestId: f2c18863-977e-46e4-a138-c1db80759406 Duration: 2698.32 ms Billed Duration: 5585 ms Memory Size: 8192 MB Max Memory Used: 175 MB Init Duration: 2885.69 ms
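The "User installation could not be completed" line suggests that LibreOffice could not create its user profile directory, although that reading is an assumption and is not verified here. The os.system test above points HOME at /home/app, which may not be writable inside the deployed function. One way to test the hypothesis is to build the same command but direct the profile to /tmp through LibreOffice's -env:UserInstallation bootstrap option and set HOME to /tmp for the child process. This is a sketch only: the helper names build_soffice_call and convert and the profile path /tmp/lo_profile are hypothetical.

```python
import os
import subprocess

LIBRE_BINARY = "/usr/bin/soffice"


def build_soffice_call(src_filepath, outdir="/tmp", profile_dir="/tmp/lo_profile"):
    """Return (cmd, env) for a headless conversion with the profile under /tmp.

    profile_dir is a hypothetical location chosen for illustration.
    """
    cmd = [
        LIBRE_BINARY,
        # Tell LibreOffice where to create its user profile.
        f"-env:UserInstallation=file://{profile_dir}",
        "--headless", "--invisible", "--nodefault", "--nofirststartwizard",
        "--nolockcheck", "--nologo", "--norestore",
        "--convert-to", "pdf:writer_pdf_Export",
        "--outdir", outdir,
        src_filepath,
    ]
    # Copy the current environment and override HOME for the child only.
    env = dict(os.environ, HOME=outdir)
    return cmd, env


def convert(src_filepath):
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    cmd, env = build_soffice_call(src_filepath)
    return subprocess.run(
        cmd,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
        text=True,
    )
```

If the conversion succeeds with these settings, the difference between the local run and Lambda would come down to where the profile can be written rather than to subprocess itself.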
soffice --version works
result = subprocess.run(
    [LIBRE_BINARY, '--version'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    shell=False,
    check=True,
    text=True
)
This works fine!