Hi, I deployed the Tika server from https://wiki.apache.org/tika/TikaJAXRS. When I upload a file and call /meta,
I get the following response for a docx file:
u'{"Content-Encoding":"UTF-16LE","Content-Type":"application/json; charset\u003dUTF-16LE","X-Parsed-By":["org.apache.tika.parser.DefaultParser","org.apache.tika.parser.txt.TXTParser"],"language":"bn"}')
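1. The file is written in English, so why does Tika return 'bn' as the language?
2. Is this the only metadata I will get? What about fields such as the file owner?
Code (I am using Python):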
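import requests

# Upload the .docx to the Tika server's /meta endpoint.
with open('/home/Desktop/aws/0.docx', 'rb') as body:
    files = {'upload_file': body}  # files= makes requests encode the body as multipart/form-data
    headers = {'content-type': 'application/octet-stream'}
    r = requests.put('http://xx.xx/meta', files=files, headers=headers)
print('text', r.text)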
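
One thing that may explain both questions: the response lists TXTParser and UTF-16LE, which suggests Tika parsed the request body as plain text instead of as a Word document. That would happen if the server received the multipart envelope produced by files= rather than the raw file bytes. Below is a minimal sketch of sending the raw bytes instead. It assumes the /meta endpoint accepts the document directly as the PUT body and replies with UTF-8 JSON; build_meta_request and fetch_metadata are hypothetical helper names, and it uses only the standard library so it runs without extra packages. With requests, the equivalent change would be passing data=body instead of files=files.

```python
import json
import urllib.request

# Standard MIME type for .docx files.
DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def build_meta_request(url, doc_bytes):
    """Build a PUT request whose body is the raw document bytes (no multipart wrapper)."""
    return urllib.request.Request(
        url,
        data=doc_bytes,
        method="PUT",
        headers={"Content-Type": DOCX_MIME, "Accept": "application/json"},
    )


def fetch_metadata(url, path):
    """Send the file at `path` to a Tika /meta endpoint and return the parsed JSON.

    Assumption: the server answers with a UTF-8 encoded JSON object.
    """
    with open(path, "rb") as fh:
        request = build_meta_request(url, fh.read())
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage would look like fetch_metadata('http://xx.xx/meta', '/home/Desktop/aws/0.docx'). If the file is then detected as a Word document, X-Parsed-By should no longer name TXTParser, and the language guess would be based on the document text rather than on the upload wrapper. Which extra fields come back depends on what is actually stored in the file.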