I had the same experience. First, I imported one table from Oracle into Hadoop 2.7.1, and then queried it through Drill. Here is the storage plugin configuration I set up through the Web UI:
{
  "type": "file",
  "enabled": true,
  "connection": "hdfs://192.168.19.128:8020",
  "workspaces": {
    "hdf": {
      "location": "/user/hdf/my_data/",
      "writable": false,
      "defaultInputFormat": "csv"
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    }
  }
}
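Judging from the query below, the plugin seems to be saved under the name hdfs; if that's the case, the two workspaces should already show up as schemas in the Drill CLI. A quick check under that assumption:

SHOW SCHEMAS;
-- should list hdfs.hdf and hdfs.tmp among the results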
Then, in the Drill CLI, the queries are as follows:
USE hdfs.hdf
SELECT * FROM part-m-00000
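As far as I understand, Drill's SQL parser reads the hyphens in part-m-00000 as minus signs unless the name is enclosed in backquotes, and each statement in the CLI needs a terminating semicolon. A minimal sketch of how I would expect the same query to be written, assuming the plugin and workspace names above:

USE hdfs.hdf;
SELECT * FROM `part-m-00000`;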
Also, in the Hadoop file system, when I cat the contents of 'part-m-00000', the following format is printed on the console:
2015-11-07 17:45:40.0,6,8
2014-10-02 12:25:20.0,10,1
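Since the csv format maps to Drill's text reader, my understanding is that each row comes back as a single columns array rather than named fields, so the individual values would be selected roughly like this (the aliases ts, c1, c2 are just placeholders, and the CAST assumes the first field really is a timestamp):

SELECT CAST(columns[0] AS TIMESTAMP) AS ts,
       CAST(columns[1] AS INT)       AS c1,
       CAST(columns[2] AS INT)       AS c2
FROM hdfs.hdf.`part-m-00000`;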