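I am using apache-flume 1.4.0 with HBase 0.94.10 and Hadoop 1.1.2. The Flume agent uses a spooling directory as the source, HBase as the sink, and a file channel. It runs successfully but is very slow. What should I do to improve HBase write performance?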

The Flume agent configuration is as follows:

agent1.sources = spool
agent1.channels = fileChannel
agent1.sinks = sink

agent1.sources.spool.type = spooldir
agent1.sources.spool.spoolDir = /opt/spoolTest/
agent1.sources.spool.fileSuffix = .completed
agent1.sources.spool.channels = fileChannel
#agent1.sources.spool.deletePolicy = immediate

agent1.sinks.sink.type = org.apache.flume.sink.hbase.HBaseSink
agent1.sinks.sink.channel = fileChannel
agent1.sinks.sink.table = test
agent1.sinks.sink.columnFamily = log
agent1.sinks.sink.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
agent1.sinks.sink.serializer.regex = (.*)^C(.*)^C(.*)^C(.*)^C(.*)^C(.*)^C(.*)^C(.*)^C(.*)^C(.*)^C(.*)^C(.*)^C(.*)^C(.*)
agent1.sinks.sink.serializer.colNames = id,no_fill_reason,adInfo,locationInfo,handsetInfo,siteInfo,reportDate,ipaddress,headerContent,userParaContent,reqParaContent,otherPara,others,others1
agent1.sinks.sink.batchSize = 100

agent1.channels.fileChannel.type = file
agent1.channels.fileChannel.checkpointDir = /usr/flumeFileChannel/chkpointFlume
agent1.channels.fileChannel.dataDirs = /usr/flumeFileChannel/dataFlume
agent1.channels.fileChannel.capacity = 10000000
agent1.channels.fileChannel.transactionCapacity = 100000

What should the capacity and transactionCapacity of the file channel, and the batchSize of the sink, be set to?
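
As far as I understand, these three settings are related: the sink's batchSize should not exceed the channel's transactionCapacity, which in turn should not exceed the channel's capacity. A minimal sketch of that relationship, reusing the agent, sink, and channel names from the configuration above; the numeric values are illustrative assumptions, not tested recommendations:

```properties
# Illustrative values only (assumptions, not measured on this setup).
# Property keys must use the sink name declared in agent1.sinks ("sink" here).
agent1.sinks.sink.batchSize = 1000

# Should be at least as large as the largest batch taken from or put into this channel.
agent1.channels.fileChannel.transactionCapacity = 10000

# Maximum number of events held in the channel; should be >= transactionCapacity.
agent1.channels.fileChannel.capacity = 1000000
```

Larger batches mean fewer round trips to HBase per event, at the cost of more events in flight per transaction, so any values would need to be checked against the actual throughput.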

Please help me. Thanks in advance.
