marklogic - XDMP-TOOBIG error when using xdmp:http-post

Question

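I have an XQuery file that returns more than 2.2 GB of text data. When I request the XQuery file directly in a browser (Chrome), it loads all of the text data.

However, when I try to make a POST call to that XQuery file with xdmp:http-post($url, $options), it raises an XDMP-TOOBIG error. The stack trace is below.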

XDMP-TOOBIG: xdmp:http-post("http://server:8278/services/getText...", <options xmlns="xdmp:http"><timeout>600000</timeout><authentication method="basic"><usernam...</options>) -- Document size exceeds text document size limit of 2048 megabytes
in /services/invoke.xqy, at 20:7 [1.0-ml]
$HTTP_CALL = <configurations xmlns:config="" xmlns=""><credentails><username>admin</username><password>admin</password...</configurations>
$userName = text{"admin"}
$password = text{"admin"}
$timeOut = text{"600000"}
$url = "http://server:8278/services/getText..."
$responseType = "text/plain"
$options = <options xmlns="xdmp:http"><timeout>600000</timeout><authentication method="basic"><usernam...</options>
$response = xdmp:http-post("http://server:8278/services/getText...", <options xmlns="xdmp:http"><timeout>600000</timeout><authentication method="basic"><usernam...</options>)
$set-reponse-type = ()

Is there a limit I can set in the file when using xdmp:http-post, or is there some other solution?

Any help is appreciated.

score 1 · Accepted Answer

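When you call an external server over HTTP from within MarkLogic, the result has to fit in memory, possibly in several copies depending on what you do with it. Text variables are not optimized for very large data. Depending on the details of the remote service, you may be able to accommodate large data by using paged HTTP requests (using Range request headers).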
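A minimal sketch of one paged request, assuming the remote service honors the HTTP Range header. The URL, credentials, and chunk size are hypothetical placeholders, not values from the question.

```xquery
xquery version "1.0-ml";

declare namespace http = "xdmp:http";

(: Hypothetical placeholders: endpoint and chunk size. :)
declare variable $url := "http://example.com/services/large-text";
declare variable $chunk-size := 64 * 1024 * 1024;  (: 64 MB per request :)

(: Request bytes $start .. $start + $chunk-size - 1 of the remote resource. :)
declare function local:fetch-chunk($start as xs:integer) as item()+
{
  let $end := $start + $chunk-size - 1
  return xdmp:http-get($url,
    <options xmlns="xdmp:http">
      <authentication method="basic">
        <username>user</username>
        <password>password</password>
      </authentication>
      <headers>
        <Range>bytes={$start}-{$end}</Range>
      </headers>
    </options>)
};

let $result := local:fetch-chunk(0)
let $status := xs:integer($result[1]/http:code)
return
  if ($status eq 206) then
    (: Partial content: store or process this chunk, then ask for the next range. :)
    $result[2]
  else
    fn:error(xs:QName("local:NO-PARTIAL-CONTENT"),
      fn:concat("Expected 206 Partial Content, got ", $status))
```

A server that ignores the Range header answers 200 with the full body, which would most likely hit the same size limit, so it is worth confirming range support with a small request first. Byte ranges can also cut a multi-byte character in half, so chunk boundaries may need care before the pieces are treated as text.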

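Even if the 2 GB limit were lifted, performance would be poor and unreliable: transferring a large amount of data in a single HTTP request becomes increasingly fragile, because any serious network error forces a complete retry.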

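Alternatively, the service, or a local proxy service, could be extended to store the data in a shared location, such as a mounted file system or S3, and return a reference to the data rather than its body. The data can then be accessed with the xdmp:filesystem-xxx and xdmp:binary-xxx functions.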
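As a rough illustration of the shared-location approach, the sketch below reads only the first slice of a file instead of loading it whole. The path and slice size are hypothetical, and it assumes the file is visible to the MarkLogic host at that path.

```xquery
xquery version "1.0-ml";

(: Hypothetical path on a file system mounted on the MarkLogic host. :)
declare variable $path := "/mnt/shared/large-text.txt";
declare variable $slice-bytes := 1024 * 1024;  (: read 1 MB at a time :)

let $total := xdmp:filesystem-file-length($path)
(: External binary node covering only the first slice; offsets are 1-based. :)
let $slice := xdmp:external-binary($path, 1, fn:min(($slice-bytes, $total)))
return (
  $total,
  (: Decoding assumes the slice ends on a UTF-8 character boundary. :)
  xdmp:binary-decode($slice, "UTF-8")
)
```

Later slices would be read by advancing the starting offset, so that no single value ever has to hold the whole 2.2 GB.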

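Once the data is in memory, manipulating large text as a single string is also problematic. If you need access to a single large object, you can use a binary document (internal or external) for better reliability.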

If you can change the HTTP request to use GET instead of POST, you can use xdmp:document-load to stream the result directly into a document.
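A minimal sketch of the GET-based approach, assuming the service can be reached with a plain GET. The source URL, target document URI, and credentials are hypothetical. Loading as binary is also an assumption on my part, chosen because the error message refers to the text document size limit.

```xquery
xquery version "1.0-ml";

(: Hypothetical placeholders: source URL, target document URI, credentials. :)
xdmp:document-load(
  "http://example.com/services/large-text",
  <options xmlns="xdmp:document-load" xmlns:http="xdmp:http">
    <uri>/staging/large-text.bin</uri>
    <format>binary</format>
    <http:authentication method="basic">
      <http:username>user</http:username>
      <http:password>password</http:password>
    </http:authentication>
  </options>)
```

The call runs as an update and returns nothing, so the calling module would read /staging/large-text.bin in a later statement or request rather than holding the response body in a variable.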

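A comment on the xdmp:document-load documentation suggests that a "rest:" URI prefix can be used with either POST or GET to stream the result directly into the database, although I don't know how to pass a POST this way.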
