
I'm new to AWS and CloudSearch. I wrote a very simple application that uploads a docx document (already converted to JSON format with cs-import-document) to my search domain.

The code is very simple:

using (var searchdomainclient = new AmazonCloudSearchDomainClient("http://search-xxxxx-xysjxyuxjxjxyxj.ap-southeast-2.cloudsearch.amazonaws.com"))
{
    // Test to upload doc
    var uploaddocrequest = new UploadDocumentsRequest()
    {
        FilePath = @"c:\temp\testsearch.sdf",  // docx already converted to JSON
        ContentType = ContentType.ApplicationJson
    };
    var uploadresult = searchdomainclient.UploadDocuments(uploaddocrequest);
}

But I get this exception: "Root element is missing."

Here is the JSON content of the sdf file I'm trying to upload:

[{
    "type": "add",
    "id": "c:_temp_testsearch.docx",
    "fields": {
        "template": "Normal.dotm",
        "application_name": "Microsoft Office Word",
        "paragraph_count": "1",
        "resourcename": "testsearch.docx",
        "date": "2014-07-28T23:52:00Z",
        "xmptpg_npages": "1",
        "page_count": "1",
        "publisher": "",
        "creator": "John Smith",
        "creation_date": "2014-07-28T23:52:00Z",
        "content": "Test5",
        "author": "John Smith",
        "last_modified": "2014-07-29T04:22:00Z",
        "revision_number": "3",
        "line_count": "1",
        "application_version": "15.0000",
        "last_author": "John Smith",
        "character_count": "5",
        "character_count_with_spaces": "5",
        "content_type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
    }
}]

So what is wrong with my approach?

Many thanks!

PS: I can upload the docx document to that search domain manually and run searches against it from C# code.




============= Update 2014-08-04 ===================

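I'm not sure whether this is related, but in the stack trace I noticed that it tries to parse the content as XML rather than JSON. My code sets ContentType = ContentType.ApplicationJson, but that seems to have no effect.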

at System.Xml.XmlTextReaderImpl.ThrowWithoutLineInfo(String res)
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at Amazon.Runtime.Internal.Transform.XmlUnmarshallerContext.Read()
at Amazon.Runtime.Internal.Transform.ErrorResponseUnmarshaller.Unmarshall(XmlUnmarshallerContext context)
at Amazon.Runtime.Internal.Transform.JsonErrorResponseUnmarshaller.Unmarshall(JsonUnmarshallerContext context)
at Amazon.CloudSearchDomain.Model.Internal.MarshallTransformations.UploadDocumentsResponseUnmarshaller.UnmarshallException(JsonUnmarshallerContext context, Exception innerException, HttpStatusCode statusCode)
at Amazon.Runtime.Internal.Transform.JsonResponseUnmarshaller.UnmarshallException(UnmarshallerContext input, Exception innerException, HttpStatusCode statusCode)
at Amazon.Runtime.AmazonWebServiceClient.HandleHttpWebErrorResponse(AsyncResult asyncResult, WebException we)
at Amazon.Runtime.AmazonWebServiceClient.getResponseCallback(IAsyncResult result)
at Amazon.Runtime.AmazonWebServiceClient.endOperation[T](IAsyncResult result)
at Amazon.CloudSearchDomain.AmazonCloudSearchDomainClient.EndUploadDocuments(IAsyncResult asyncResult)
at Amazon.CloudSearchDomain.AmazonCloudSearchDomainClient.UploadDocuments(UploadDocumentsRequest request)



2 Answers


Your document ID contains invalid characters (a period and a colon). From https://aws.amazon.com/articles/8871401284621700:

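The ID must be unique across all of the documents you upload to the domain and can contain the following characters: a-z (lowercase letters), 0-9, and the underscore character (_). Document IDs must begin with a letter or number and can be up to 64 characters long.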
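As a rough illustration of the rules quoted above, here is a minimal sketch of a helper that rewrites an arbitrary string into an ID that satisfies them. `DocumentIdHelper` and `Sanitize` are hypothetical names made up for this example; they are not part of the AWS SDK, and the 64-character limit is simply taken from the quote.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of the AWS SDK.
public static class DocumentIdHelper
{
    // Limit taken from the rules quoted above (an assumption of this sketch).
    private const int MaxLength = 64;

    public static string Sanitize(string rawId)
    {
        if (string.IsNullOrEmpty(rawId))
            throw new ArgumentException("Document ID must not be empty.", "rawId");

        var sb = new StringBuilder(rawId.Length);
        foreach (char c in rawId.ToLowerInvariant())
        {
            // Keep a-z, 0-9 and '_'; replace everything else with '_'.
            bool allowed = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '_';
            sb.Append(allowed ? c : '_');
        }

        // The ID must begin with a letter or digit, so drop leading underscores.
        string id = sb.ToString().TrimStart('_');
        if (id.Length == 0)
            throw new ArgumentException("Document ID has no valid characters.", "rawId");

        // Truncate to the maximum allowed length.
        return id.Length > MaxLength ? id.Substring(0, MaxLength) : id;
    }
}
```

With this sketch, `DocumentIdHelper.Sanitize("c:_temp_testsearch.docx")` yields `c__temp_testsearch_docx`. Note that different inputs can collapse to the same ID after replacement or truncation, so uniqueness still has to be checked separately.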

It's also unclear which endpoint you're posting to, so you may be running into a problem there as well.
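On the endpoint point: a CloudSearch domain exposes separate document and search endpoints, and uploads go to the document endpoint, whose host name normally starts with `doc-` rather than `search-`. The check below is only a sketch under that naming assumption; `EndpointCheck` and `LooksLikeDocumentEndpoint` are hypothetical names, not SDK calls.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the AWS SDK.
public static class EndpointCheck
{
    // Returns true when the URL's host starts with "doc-", the usual
    // prefix of a CloudSearch document endpoint (assumed naming scheme).
    public static bool LooksLikeDocumentEndpoint(string serviceUrl)
    {
        Uri uri;
        if (!Uri.TryCreate(serviceUrl, UriKind.Absolute, out uri))
            return false; // Not a well-formed absolute URL.

        return uri.Host.StartsWith("doc-", StringComparison.OrdinalIgnoreCase);
    }
}
```

Passing the URL from the question (host beginning with `search-`) to this helper returns false, which suggests checking the domain dashboard for the document endpoint before calling `UploadDocuments`.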

answered 2014-08-04T04:05:28.763

I got exactly the same exception with SDK version 2.2.2.0. When I updated the SDK to version 2.2.2.1, the exception went away.

answered 2014-08-06T14:29:40.147