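Sorry, I can't comment, so this is not an answer, just some thoughts on the issue. I ran into a similar problem, but while testing a local setup with a single Cassandra node. The simplest query against a 10-row table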
cqlsh:db> SELECT * FROM table;
takes less than a second in the CQL shell.
But in Shark it takes about 10 seconds.
shark> USE db; SELECT * FROM table;
...
Time taken: 11.274 seconds
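The Shark directory contains the bin/shark-withinfo
executable, which prints some information about how the request is processed. Maybe it will shed some light on your case. In my case, it showed that a large number of tasks had to be executed to process my request. So I guess the job scheduler eats up most of the time, but I'm not quite sure.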
...
14/07/09 17:35:19 INFO scheduler.TaskSetManager: Starting task 0.0:255 as TID 255 on executor localhost: localhost (PROCESS_LOCAL)
14/07/09 17:35:19 INFO scheduler.TaskSetManager: Serialized task 0.0:255 as 5456 bytes in 0 ms
14/07/09 17:35:19 INFO executor.Executor: Running task ID 255
14/07/09 17:35:19 INFO scheduler.TaskSetManager: Finished TID 254 in 30 ms on localhost (progress: 255/257)
14/07/09 17:35:19 INFO scheduler.DAGScheduler: Completed ResultTask(0, 254)
14/07/09 17:35:19 INFO storage.BlockManager: Found block broadcast_0 locally
14/07/09 17:35:19 INFO rdd.HadoopRDD: Input split: localhost 9160 org.apache.cassandra.dht.Murmur3Partitioner
14/07/09 17:35:19 INFO cql.HiveCqlInputFormat: Validators : null
14/07/09 17:35:19 INFO exec.FileSinkOperator: Initializing Self 260 FS
14/07/09 17:35:19 INFO exec.FileSinkOperator: Operator 260 FS initialized
14/07/09 17:35:19 INFO exec.FileSinkOperator: Initialization Done 260 FS
14/07/09 17:35:19 INFO exec.FileSinkOperator: Final Path: FS file:...
14/07/09 17:35:19 INFO exec.FileSinkOperator: Writing to temp file: ...
14/07/09 17:35:19 INFO exec.FileSinkOperator: New Final Path: ...
14/07/09 17:35:19 INFO executor.Executor: Serialized size of result for 255 is 563
14/07/09 17:35:19 INFO executor.Executor: Sending result for 255 directly to driver
14/07/09 17:35:19 INFO executor.Executor: Finished task ID 255
...
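
To check the guess about the scheduler, one option is to count the tasks and add up their reported durations from the shark-withinfo output. Below is a minimal sketch, assuming the output has been saved as text and that the "Finished TID" lines always look like the one in the excerpt above. The function name summarize_tasks and the sample list are made up for illustration.

```python
import re

# Matches the "Finished TID ..." lines that TaskSetManager logs in the excerpt above.
# Assumption: every such line follows exactly this layout.
FINISHED = re.compile(
    r"Finished TID (?P<tid>\d+) in (?P<ms>\d+) ms on \S+ "
    r"\(progress: (?P<done>\d+)/(?P<total>\d+)\)"
)


def summarize_tasks(lines):
    """Return (finished_count, total_tasks, summed_ms) for the given log lines."""
    finished = 0
    total = 0
    summed_ms = 0
    for line in lines:
        match = FINISHED.search(line)
        if match is None:
            continue  # not a task-completion line
        finished += 1
        summed_ms += int(match.group("ms"))
        total = max(total, int(match.group("total")))
    return finished, total, summed_ms


# Two lines copied from the log excerpt above; only the first one matches.
sample = [
    "14/07/09 17:35:19 INFO scheduler.TaskSetManager: Finished TID 254 in 30 ms on localhost (progress: 255/257)",
    "14/07/09 17:35:19 INFO executor.Executor: Running task ID 255",
]

finished, total, summed_ms = summarize_tasks(sample)
print(finished, total, summed_ms)  # prints: 1 257 30
```

Run against the full log, this would show how much of the 11 seconds the task durations add up to. As a rough estimate only: if each of the 257 tasks took about 30 ms, like the single one visible above, and they ran one after another, that alone would be close to 8 seconds.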