cassandra - YCSB low read throughput with Cassandra

Question

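The YCSB Endpoint benchmarks would have you believe that Cassandra is the golden child of NoSQL databases. However, when recreating the results on our own boxes (8 hyperthreaded cores, 60 GB memory, 2 × 500 GB SSDs), we get very low read throughput for workload b (read mostly, i.e. 95% read, 5% update).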

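The cassandra.yaml settings are identical to the Endpoint settings, apart from the IP addresses and our disk configuration (one SSD for data, one for the commit log). While their throughput is ~38,000 operations per second, ours is ~16,000, more or less regardless of the number of threads or client nodes. That is, one worker node with 256 threads reports ~16,000 ops/sec, while 4 nodes report ~4,000 ops/sec each.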

I have set the readahead value of the SSD data drive to 8 KB. The custom workload file is included below.

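Analyzing disk I/O and CPU usage with iostat, read throughput appears to be consistently ~200,000 KB/s, which seems to suggest that the YCSB cluster throughput should be higher (records are 100 bytes). About 25-30% of CPU time is in %iowait, and 10-25% is in user.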
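To make the mismatch between disk traffic and YCSB throughput concrete, here is a rough calculation using only the figures quoted above. It is a sketch, not a measurement: it assumes iostat's KB means 1024 bytes and that all disk reads are caused by YCSB reads.

```python
# Back-of-the-envelope: how many bytes come off disk per YCSB read?
# Figures are from the question; the 1024-byte KB is an assumption.

DISK_READ_KB_PER_S = 200_000   # iostat read throughput (~200,000 KB/s)
OPS_PER_S = 16_000             # observed YCSB throughput
READ_PROPORTION = 0.95         # workload b: 95% reads
RECORD_BYTES = 10 * 10         # fieldcount * fieldlength

reads_per_s = OPS_PER_S * READ_PROPORTION
disk_bytes_per_read = DISK_READ_KB_PER_S * 1024 / reads_per_s
amplification = disk_bytes_per_read / RECORD_BYTES

print(f"{reads_per_s:.0f} YCSB reads/s")
print(f"{disk_bytes_per_read / 1024:.1f} KiB read from disk per YCSB read")
print(f"roughly {amplification:.0f}x the record payload")
```

Under these assumptions each YCSB read costs on the order of 13 KiB of disk reads, well over 100 times the 100-byte record. That may point to each request pulling in whole blocks or touching several SSTables rather than the disk being idle, though that interpretation is a guess.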

top and nload show no obvious bottleneck (<50% memory usage, and 10-50 Mbit/s on a 10 Gb/s link).
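As a sanity check on the network figures, the payload bandwidth implied by the observed throughput can be estimated as follows. This is a sketch that counts only the 100-byte record payload per operation and ignores keys, protocol framing, and replication traffic.

```python
# Payload bandwidth implied by the observed YCSB throughput.
# Only record payload is counted; protocol overhead is ignored (assumption).

OPS_PER_S = 16_000        # observed YCSB throughput
RECORD_BYTES = 100        # fieldcount * fieldlength
LINK_MBIT_PER_S = 10_000  # 10 Gb/s link

payload_mbit_per_s = OPS_PER_S * RECORD_BYTES * 8 / 1_000_000
link_utilization = payload_mbit_per_s / LINK_MBIT_PER_S

print(f"payload: {payload_mbit_per_s:.1f} Mbit/s")
print(f"link utilization: {link_utilization:.2%}")
```

The result falls inside the 10-50 Mbit/s range reported by nload and is a small fraction of the link capacity, which agrees with the observation that the network is not the limiting factor.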

# The name of the workload class to use
workload=com.yahoo.ycsb.workloads.CoreWorkload

# There is no default setting for recordcount but it is
# required to be set.
# The number of records in the table to be inserted in
# the load phase or the number of records already in the
# table before the run phase.
recordcount=2000000000

# There is no default setting for operationcount but it is
# required to be set.
# The number of operations to use during the run phase.
operationcount=9000000

# The offset of the first insertion
insertstart=0
insertcount=500000000

core_workload_insertion_retry_limit = 10
core_workload_insertion_retry_interval = 1

# The number of fields in a record
fieldcount=10

# The size of each field (in bytes)
fieldlength=10

# Should read all fields
readallfields=true

# Should write all fields on update
writeallfields=false

fieldlengthdistribution=constant

readproportion=0.95

updateproportion=0.05

insertproportion=0

readmodifywriteproportion=0

scanproportion=0

maxscanlength=1000

scanlengthdistribution=uniform

insertorder=hashed

requestdistribution=zipfian
hotspotdatafraction=0.2

hotspotopnfraction=0.8
table=usertable

measurementtype=histogram

histogram.buckets=1000
timeseries.granularity=1000
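For context on what this workload file implies, the raw payload size can be estimated from its properties. The following is a minimal sketch: `parse_properties` is a hypothetical helper written for this example, only the four relevant properties are copied in, and the estimate ignores row keys, column names, Cassandra storage overhead, compression, and replication.

```python
# Estimate the raw payload size described by the workload file above.
# parse_properties is a hypothetical helper, not part of YCSB.

WORKLOAD = """
recordcount=2000000000
insertcount=500000000
fieldcount=10
fieldlength=10
"""

def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

props = parse_properties(WORKLOAD)
record_bytes = int(props["fieldcount"]) * int(props["fieldlength"])
total_gb = int(props["recordcount"]) * record_bytes / 1e9
per_loader_gb = int(props["insertcount"]) * record_bytes / 1e9
loaders = int(props["recordcount"]) // int(props["insertcount"])

print(f"record payload: {record_bytes} bytes")
print(f"total payload: {total_gb:.0f} GB")
print(f"per load client: {per_loader_gb:.0f} GB across {loaders} clients")
```

With 60 GB of memory per machine and a payload of this size, a large share of reads would likely have to go to disk unless the data is spread over enough nodes. The number of Cassandra nodes is not stated in the question, so this remains an estimate.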

score 0 · Accepted Answer

The key was to increase native_transport_max_threads in the cassandra.yaml file.

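Together with the increased settings from the comments (more connections in the YCSB client and higher concurrent reads/writes in Cassandra), Cassandra jumped to ~80,000 ops/sec.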
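One way to see why a thread cap can pin throughput at a fixed level regardless of client count is Little's Law (requests in flight = throughput × average latency). The sketch below assumes the cap was at the stock value of 128 and that all of those threads were busy. Neither is stated in the answer, so treat the numbers as illustrative only.

```python
# Little's Law: in_flight = throughput * average_latency.
# ASSUMED_MAX_THREADS is an assumption (stock native_transport_max_threads);
# the answer does not state the old or new value.

ASSUMED_MAX_THREADS = 128
OBSERVED_OPS_PER_S = 16_000   # throughput before the change
TARGET_OPS_PER_S = 80_000     # throughput reported after the change

# Average time a request occupies a thread if the pool is saturated.
implied_latency_s = ASSUMED_MAX_THREADS / OBSERVED_OPS_PER_S

# Requests that must be in flight to reach the target at the same latency.
in_flight_for_target = TARGET_OPS_PER_S * implied_latency_s

print(f"implied latency: {implied_latency_s * 1000:.1f} ms")
print(f"in-flight requests needed: {in_flight_for_target:.0f}")
```

If per-request latency stays the same, adding client threads beyond the server-side cap cannot raise throughput, which matches the flat ~16,000 ops/sec seen in the question. In practice latency also changes with load, so this is only a first-order picture.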
