Tags: flume
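There are plenty of failover examples online, but they take several different approaches. Personally, following the single-responsibility principle, I recommend:
1. Run one Flume agent per machine.
2. Point each downstream sink of an agent at one Flume agent; don't configure multiple ports on a single Flume agent [it hurts performance].
3. Spread the agents across machines, so that if one machine goes down the other is still usable. If everything sits on one machine and is distinguished only by port, a single crash takes the whole setup down.
Now for a concrete example:
First, the configuration of the Flume agent client.
The larger the priority value, the higher the sink's precedence: the failover processor uses the highest-priority available sink first.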
# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /root/dev/biz/logs/bizlogic.log

# Define the sink group
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 5
# Maximum backoff for a failed sink, in milliseconds
a1.sinkgroups.g1.processor.maxpenalty = 10000

# Define sink 1
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 192.168.11.179
a1.sinks.k1.port = 9876

# Define sink 2
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = 192.168.11.178
a1.sinks.k2.port = 9876

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sinks to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1

Here you can see that a sink group is used. It contains two sinks, and each sink points to a different Flume agent.
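To make the priority rule concrete, here is a small shell sketch that reads a Flume properties file and lists the sinks of a failover group from most to least preferred. The function name `list_sinks_by_priority` and the file name in the usage note are my own, hypothetical choices; the script only parses the config text and does not talk to Flume at all.

```shell
# list_sinks_by_priority is a hypothetical helper, not part of Flume.
# It prints "<sink> <priority>" for every *.processor.priority.<sink> line,
# sorted so that the sink the failover processor prefers comes first.
list_sinks_by_priority() {
  sed -n 's/^[[:space:]]*[^#=]*\.processor\.priority\.\([^=[:space:]]*\)[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*$/\1 \2/p' "$1" \
    | sort -k2,2nr
}
```

Run against the client config above (saved, say, as `failover-client.conf`), it should list k1 before k2, which matches the sink order you would expect Flume to try.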
Next, the configuration of the Flume agent servers, i.e. 179 and 178. They are the same, so one is enough:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
# Listen on any address
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 9876
a1.sources.r1.channels = c1

# Describe the sink
a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory = /root/dev/flumeout/file
a1.sinks.k1.sink.rollInterval = 3600

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

As you can see, the Flume agent client and servers exchange data over Avro. Flume ships with an Avro sink and an Avro source, which makes it very convenient to chain Flume agents together.
Start the Flume agent servers first, then the Flume agent client.
Test as follows:
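The original post does not show the startup commands, so the following is only a sketch of how the agents could be launched with the standard `flume-ng agent` command. The wrapper function `start_agent`, the `DRY_RUN` and `FLUME_CONF_DIR` variables, and the config file names are assumptions for illustration; `flume-ng` is assumed to be on the PATH.

```shell
# start_agent is a hypothetical wrapper around the flume-ng launcher.
#   $1 = path to the agent's properties file
#   $2 = agent name used inside that file (defaults to a1, as in this post)
# With DRY_RUN set, the command is only printed instead of executed.
start_agent() {
  set -- flume-ng agent \
    --conf "${FLUME_CONF_DIR:-conf}" \
    --conf-file "$1" \
    --name "${2:-a1}" \
    -Dflume.root.logger=INFO,console
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# On 192.168.11.179 and 192.168.11.178 (file name is an assumption):
#   start_agent conf/failover-server.conf a1
# Then on the client machine:
#   start_agent conf/failover-client.conf a1
```

Logging to the console is handy here because you can watch the client connect to the downstream Avro sources as it comes up.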
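for i in {1..100}; do echo "exec test tail -f $i on terminator 176" >> bizlogic.log; echo $i; sleep 0.1; done

This appends lines to the log file and triggers the Flume agent client's tail -F. The content flows through the client into the memory channel, and the failover mechanism then picks the higher-priority sink to send it on. Where the data finally lands is decided by the sink type (a1.sinks.k1.type) in the last Flume agent of the chain. Here that is file_roll, so events are written to files on disk, which roll over according to the configured rollInterval (3600 seconds in this example).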
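Right after startup, both 178 and 179 create an output file, but once you start generating content, only 179 actually gets content written to its file, because k1, the sink pointing at 179, has the higher priority.
That completes a working Flume failover setup from end to end. Hope it helps!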
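One way to check which server actually received the events is to count the lines in each server's output directory. This is a minimal sketch: the helper name `count_events` is hypothetical, and it assumes file_roll is using its default text serializer, which writes one event body per line.

```shell
# count_events is a hypothetical helper: it prints the total number of lines
# across all files in the given directory (0 if the directory has no files).
count_events() {
  cat "$1"/* 2>/dev/null | wc -l | tr -d ' '
}

# On each server, after running the test loop:
#   count_events /root/dev/flumeout/file
```

With both servers up, the count on 179 should grow while 178 stays at 0. To exercise the failover path itself, stop the agent on 179 and rerun the test loop: the new lines should then show up on 178.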
Original article: http://blog.csdn.net/simonchi/article/details/42494461