
I am new to Big Data, and to HBase in particular. I am trying to use OpenTSDB to store data from sensors.

The setup is a Cloudera VMware image with the latest stable OpenTSDB installed on it. After configuring it, I started the server with:

./build/tsdb tsd --port=4242 --staticroot=build/staticroot/ --cachedir=/tmp/tsd/ --auto-metric

Then I ran a simple netcat client:

#!/bin/bash
set -e
while true; do
  ./run "$1" "$2"
  sleep 1
done | nc -w 30 localhost 4242

With ./run compiled from:

#include <cstdio>
#include <cstdlib>
#include <time.h>       /* time */

int main(int argc, char **argv)
{
  if ( argc <= 2 ) {
    fprintf(stderr, "2 parameters expected: start index and number of sensors\n");
    return 1;
  }

  unsigned long t = time(NULL);
  srand(t);

  int b;   // index of first sensor
  int n;   // number of sensors
  sscanf(argv[1], "%d", &b);
  sscanf(argv[2], "%d", &n);

  for ( int i = b; i < b+n; ++i ) {
    // t is unsigned long, so it needs %lu; %f is the printf specifier for double
    printf("put democ.%d %lu %f host=localhost.localdomain\n", i, t, 1.0 + 0.01 * (rand() % 100));
  }

  return 0;
}

Afterwards I watched the democ.%d metrics through the web interface at localhost:4242.

I am satisfied with its performance, but there are problems when the generator produces a large number of metrics (n).

The first problem is that some data points disappear, and how many depends on n. With n = 10000 there are 29 points per 30 seconds on average, but with n = 75000 there are only 15. This problem is not critical. I think it is caused by disk bandwidth.

After some time, the server sends an error:

put: HBase error: 1000 RPCs waiting on "tsdb,\x00\x98[Q\x96E\xF0\x00\x00\x01\x00\x00\x01,1368809980414.dc6179de43f78eac6c8b745996200664." to come back online

The second problem is an HBase failure after the server has been running for some time. OpenTSDB dies, flooding all clients and its own console with messages like this:

put: HBase error: 10000 RPCs waiting on "-ROOT-,,0" to come back online

What can I do to solve this problem?

I also thought about the possibility of using Cassandra for my project.

What is the best open-source solution for storing time series data? I need to store data from approximately 100,000 sensors for 30 days, with each sensor generating up to 40 bytes of data every second.


1 Answer


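The errors about "RPCs waiting on ..." are caused by HBase not keeping up. OpenTSDB keeps data points in memory and retries up to a certain limit. Past a certain point, however, it starts discarding data and returns this error to you to indicate that there is a problem.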

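Just like with any database (distributed or not), you need to do some basic tuning on HBase. Typically, the two most helpful pieces of advice for newcomers are:

  1. Make sure the maximum region size is large enough that you are not splitting too often.
  2. Pre-create regions to avoid stalling at startup (this was discussed on the mailing list recently).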

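The last problem, waiting on "-ROOT-,,0", is less expected. You mentioned an HBase failure: did you actually see HBase die during the test? If so, check whether it died because it ran out of memory, or because it went through GC pauses long enough to lose its ZooKeeper session (which forces it to kill itself, by design). Since you mention running in a VMware image, I assume you are in a constrained environment for testing, so make sure that HBase (and the VM it runs in) is given enough memory for your write-heavy workload.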

Answered 2013-05-20T04:49:31.747