flume实例二、监听目录日志上传到HDFS文件系统

时间：2015-07-31 19:52:24 阅读：195 评论：0 收藏：0 [点我收藏+]

标签：

一、概述

接实例一，实例一中server-aget是把日志上传保存到服务器上面，随着日志越来越大，公司启动了hadoop项目，需要把日志直接上传hdfs中保存,配置文件target_hdfs.conf如下：

a2.sources = r2
a2.channels = c2
a2.sinks = k2
#source
a2.sources.r2.type = avro
a2.sources.r2.channels = c2
a2.sources.r2.compression-type = deflate
a2.sources.r2.bind = localhost
a2.sources.r2.port = 5281

a2.sources.r2.interceptors = i1
a2.sources.r2.interceptors.i1.type = com.landray.behavior.interceptor.BehaviorServerSerurityInterceptor$Builder

a2.channels = c2
a2.channels.c2.type = file
a2.channels.c2.checkpointDir = ./checkpoint
a2.channels.c2.dataDirs = ./data
a2.channels.c2.transactionCapacity = 20000


a2.sinks = k2
a2.sinks.k2.type = hdfs
a2.sinks.k2.channel = c2
#文件目录，每个月生成一个目录
a2.sinks.k2.hdfs.path = hdfs://192.168.5.126:9000/logs/%Y-%m/

#设置使用时间
a2.sinks.k2.hdfs..useLocalTimeStamp = true
a2.sinks.k2.hdfs.batchSize = 20000
a2.sinks.k2.hdfs.fileType=DataStream  
#不基于时间创建文件
a2.sinks.k2.hdfs.rollInterval=0 
#不基于大小创建文件
a2.sinks.k2.hdfs.rollSize = 0
#不基于个数创建文件
a2.sinks.k2.hdfs.rollCount = 0
a2.sinks.k2.hdfs.threadsPoolSize=15
#操作超时
a2.sinks.k2.hdfs.callTimeout=30000

二、部署安装

1、先安装hadoop，这里安装的是hadoop-2.6.0，必须和flume安装在同一机器上，因为flume在启动过程中会依赖hadoop的lib包，只有配置安装了hadoop之后，在~flume/bin下的flume-ng命令中会查找hadoop的安装目录。如图

2、启动脚本

bin/flume-ng agent --conf conf --conf-file target_file.conf --name a2 -Dflume.root.logger=INFO,console

flume实例二、监听目录日志上传到HDFS文件系统

标签：

原文地址：http://www.cnblogs.com/nemotan/p/4692813.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年07月29日 (22)
2021年07月28日 (40)
2021年07月27日 (32)
2021年07月26日 (79)
2021年07月23日 (29)
2021年07月22日 (30)
2021年07月21日 (42)
2021年07月20日 (16)
2021年07月19日 (90)
2021年07月16日 (35)

周排行