google-cloud-ml - Vertex AI endpoint timeout

Question

I am using a Vertex AI endpoint to serve a deep learning service.

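Depending on the input size, my service takes roughly 30 seconds to 2 minutes to respond on CPU. I noticed that when the input is large enough that the response takes more than a minute, the API fails with this error: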

<!DOCTYPE html>
<html lang=en>
  <meta charset=utf-8>
  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
  <title>Error 502 (Server Error)!!1</title>
  <style>
    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}
  </style>
  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>
  <p><b>502.</b> <ins>That’s an error.</ins>
  <p>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.  <ins>That’s all we know.</ins>

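When I retry, I keep getting the same error. As soon as I reduce the input size, the API starts working again. For these reasons, I believe this is a timeout issue.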
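One way to check the timeout hypothesis is to time each call on the client and see whether failures cluster just past the one-minute mark. The following is a minimal sketch, not part of the original setup: `timed_call` is an illustrative helper, and the function passed to it stands in for whatever code actually sends the request to the endpoint.

```python
import time
from typing import Any, Callable, Optional, Tuple


def timed_call(
    fn: Callable[..., Any], *args: Any, **kwargs: Any
) -> Tuple[Any, Optional[Exception], float]:
    """Call fn and return (result, error, elapsed_seconds).

    Exceptions are captured rather than raised, so the elapsed time
    of a failed request can be inspected as well.
    """
    start = time.perf_counter()
    try:
        result, error = fn(*args, **kwargs), None
    except Exception as exc:  # record the failure instead of propagating it
        result, error = None, exc
    elapsed = time.perf_counter() - start
    return result, error, elapsed
```

If the recorded failures all show an elapsed time slightly above 60 seconds while shorter requests succeed, that supports the timeout explanation.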

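So my question is: how can I change the timeout value of a Vertex AI endpoint? I read through all the documentation, and it does not seem to be mentioned anywhere.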

Thank you.

score 1 · Accepted Answer

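The timeout is capped at roughly 60 seconds, plus some additional overhead. So a request that takes anywhere near 2 minutes is definitely the reason you are getting this error. The limit is also not configurable.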

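Is there any way to reduce the model serving time? For example, deploying on faster hardware, or applying other model optimizations? If you are running a custom container, you might be able to make use of more cores and reduce any external dependencies.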
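As an illustration of the "more cores" suggestion, here is a minimal sketch of spreading independent instances across worker threads inside a custom container. It rests on assumptions the thread does not confirm: that a request contains several instances that can be processed independently, and that the inference library releases the GIL during computation (otherwise a process pool would be needed). `predict_parallel` and `predict_one` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def predict_parallel(
    predict_one: Callable[[T], R],
    instances: Sequence[T],
    max_workers: int = 4,
) -> List[R]:
    """Run predict_one on every instance using a pool of worker threads.

    predict_one is a placeholder for the per-instance inference function.
    Results are returned in the same order as the input instances.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order in its results.
        return list(pool.map(predict_one, instances))
```

Whether this brings a request under the limit depends on the model and the machine type, so it is worth measuring before and after.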
