
I'm running tcollector on one of our servers, and I did set the host in startstop.sh.



./startstop start

This starts all of the stats collectors. I even noticed the following in the tsdb console log:

[id: 0x5fc4bb31, / => /] CONNECTED: /

On my tcollector node, I ran:

ps axl | grep tcollector


0     0 16796 16795  20   0 183712  8000 poll_s Sl   ?          2:17 /usr/bin/python /home/mithralaya/tcollector/tcollector.py -c /home/mithralaya/tcollector/collectors -H -t host=ip-172-31-12-203 -P /var/run/tcollector.pid
4 65534 16806 16796  20   0  39864  3748 poll_s Ss   ?          0:08 /usr/bin/python /home/mithralaya/tcollector/collectors/0/procstats.py
4 65534 16808 16796  39  19  39700  3380 poll_s SNs  ?          0:07 /usr/bin/python /home/mithralaya/tcollector/collectors/0/procnettcp.py
4 65534 16816 16796  20   0  39648  3240 poll_s Ss   ?          0:00 /usr/bin/python /home/mithralaya/tcollector/collectors/0/iostat.py
4 65534 16818 16796  20   0  39648  3400 poll_s Ss   ?          0:01 /usr/bin/python /home/mithralaya/tcollector/collectors/0/ifstat.py
4 65534 16822 16796  20   0  41848  3676 poll_s Ss   ?          0:05 /usr/bin/python /home/mithralaya/tcollector/collectors/0/netstat.py
4 65534 16824 16796  20   0  39648  3524 poll_s Ss   ?          0:00 /usr/bin/python /home/mithralaya/tcollector/collectors/0/dfstat.py
0     0 26617 26171  20   0   8108   940 pipe_w S+   pts/0      0:00 grep --color=auto tcollector

I can't see any major errors in the tcollector log at /var/log/tcollector. The latest log entries:

2014-04-15 08:59:40,630 tcollector[16796] WARNING: haproxy.py: Error: HAProxy is not running
2014-04-15 08:59:55,090 tcollector[16796] INFO: removing redis-stats.py from the list of collectors (by request)
2014-04-15 08:59:55,091 tcollector[16796] INFO: removing nfsstat.py from the list of collectors (by request)
2014-04-15 08:59:55,091 tcollector[16796] WARNING: collector hbase_master.py terminated after 16 seconds with status code 1, marking dead
2014-04-15 08:59:55,091 tcollector[16796] INFO: removing udp_bridge.py from the list of collectors (by request)
2014-04-15 08:59:55,091 tcollector[16796] INFO: removing elasticsearch.py from the list of collectors (by request)
2014-04-15 08:59:55,092 tcollector[16796] INFO: removing zfsiostats.py from the list of collectors (by request)
2014-04-15 08:59:55,092 tcollector[16796] INFO: removing varnishstat.py from the list of collectors (by request)
2014-04-15 08:59:55,092 tcollector[16796] INFO: removing mongo.py from the list of collectors (by request)
2014-04-15 08:59:55,093 tcollector[16796] INFO: removing couchbase.py from the list of collectors (by request)
2014-04-15 08:59:55,093 tcollector[16796] INFO: removing graphite_bridge.py from the list of collectors (by request)
2014-04-15 08:59:55,093 tcollector[16796] INFO: removing zfskernstats.py from the list of collectors (by request)
2014-04-15 08:59:55,094 tcollector[16796] INFO: removing smart-stats.py from the list of collectors (by request)
2014-04-15 08:59:55,094 tcollector[16796] WARNING: collector mysql.py terminated after 16 seconds with status code 1, marking dead
2014-04-15 08:59:55,094 tcollector[16796] WARNING: collector hbase_regionserver.py terminated after 16 seconds with status code 1, marking dead
2014-04-15 08:59:55,095 tcollector[16796] INFO: removing postgresql.py from the list of collectors (by request)
2014-04-15 08:59:55,095 tcollector[16796] INFO: removing haproxy.py from the list of collectors (by request)
2014-04-15 08:59:55,095 tcollector[16796] INFO: removing riak.py from the list of collectors (by request)
2014-04-15 08:59:55,095 tcollector[16796] INFO: removing zookeeper.py from the list of collectors (by request)
2014-04-15 08:59:55,096 tcollector[16796] INFO: removing opentsdb.sh from the list of collectors (by request)
2014-04-15 09:09:40,651 tcollector[16796] INFO: Heartbeat (6 collectors running)
2014-04-15 09:19:41,217 tcollector[16796] INFO: Heartbeat (6 collectors running)
2014-04-15 09:29:41,794 tcollector[16796] INFO: Heartbeat (6 collectors running)
2014-04-15 09:39:43,586 tcollector[16796] INFO: Heartbeat (6 collectors running)

But none of the stats are being collected. In hbase, both tsdb and tsdb-uid are empty.

hbase(main):002:0> scan 'tsdb'
ROW                                                          COLUMN+CELL                                                                                                                                                                      
0 row(s) in 0.2890 seconds




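All Hadoop-based technologies are hard to install and configure. I have spent a week trying to solve this, and tcollector has been running for 24 hours, but there is still no data in TSDB.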




3 Answers


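Well, there are probably a few collectors running, such as procstats.py (which collects basic metrics like CPU and memory); I notice they do not show up as errors in the log.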

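You are probably not getting data into hbase because your opentsdb configuration is still at its default, which requires metrics to be created manually. If so, you have to define each metric yourself.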

Alternatively, to have metrics created automatically, go to your opentsdb server, check its configuration, and set metric creation to automatic.


Then check hbase again to see whether data shows up in 'tsdb-uid'.
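To see whether the TSD itself is rejecting data points, you could send one by hand over its telnet-style `put` interface. The following is only a minimal sketch: the port 4242 (the usual OpenTSDB default) and the metric name `test.tcollector.check` are assumptions, and the helper names `format_put` and `send_test_point` are made up for illustration. As far as I know, the TSD stays silent when a `put` succeeds and writes an error line back when it fails (for example, when the metric is unknown).

```python
import socket
import time


def format_put(metric, value, tags, timestamp=None):
    """Build one line of OpenTSDB's telnet-style `put` command."""
    if not tags:
        # OpenTSDB expects at least one tag on every data point.
        raise ValueError("at least one tag is required")
    if timestamp is None:
        timestamp = int(time.time())
    # Sort tags so the output is deterministic.
    tag_str = " ".join("%s=%s" % (k, v) for k, v in sorted(tags.items()))
    return "put %s %d %s %s\n" % (metric, timestamp, value, tag_str)


def send_test_point(host, port, line, timeout=5.0):
    """Send one put line and return whatever the TSD replies before the timeout.

    An empty string means nothing came back within the timeout.
    """
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        sock.sendall(line.encode("ascii"))
        try:
            return sock.recv(4096).decode("ascii", "replace")
        except socket.timeout:
            return ""
    finally:
        sock.close()
```

For example, `send_test_point("your-tsd-host", 4242, format_put("test.tcollector.check", 1, {"host": "ip-172-31-12-203"}))`, where `your-tsd-host` is a placeholder for your TSD address. If an error about an unknown metric comes back, metric auto-creation is most likely still off.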

Answered 2014-09-12T02:28:05.933

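From the log file output, it looks like none of the tcollector plugins are actually running. Because of errors, they are spawned and then removed immediately afterwards.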
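One way to check this against the log is to tally which collectors were marked dead, which were removed, and what the latest heartbeat reports. The sketch below assumes the line format shown in the question's log excerpt; the function name `summarize_log` is made up for illustration.

```python
import re

# Patterns taken from the log lines quoted in the question.
DEAD_RE = re.compile(
    r"collector (\S+) terminated after \d+ seconds with status code \d+, marking dead"
)
REMOVED_RE = re.compile(r"removing (\S+) from the list of collectors \(by request\)")
HEARTBEAT_RE = re.compile(r"Heartbeat \((\d+) collectors running\)")


def summarize_log(lines):
    """Return dead collectors, removed collectors, and the last heartbeat count."""
    dead, removed, running = [], [], None
    for line in lines:
        match = DEAD_RE.search(line)
        if match:
            dead.append(match.group(1))
            continue
        match = REMOVED_RE.search(line)
        if match:
            removed.append(match.group(1))
            continue
        match = HEARTBEAT_RE.search(line)
        if match:
            # Keep only the most recent heartbeat value.
            running = int(match.group(1))
    return {"dead": dead, "removed": removed, "running": running}
```

You could run it with `summarize_log(open(path))`, where `path` points at the tcollector log file. In the excerpt above, the heartbeat lines report 6 collectors running, which matches the six collector processes in the ps output.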

Answered 2014-04-29T02:27:00.043

Try setting the conf file to create metrics automatically:

# --------- CORE ----------
# Whether or not to automatically create UIDs for new metric types, default
# is False
tsd.core.auto_create_metrics = true
Answered 2014-11-21T09:16:16.597