
Reading a Single HDFS File with the Java API

Date: 2014-08-19 09:23:54


The single file on HDFS:

-bash-3.2$ hadoop fs -ls /user/pms/ouyangyewei/data/input/combineorder/repeat_rec_category
Found 1 items
-rw-r--r--   2 deploy supergroup        520 2014-08-14 17:03 /user/pms/ouyangyewei/data/input/combineorder/repeat_rec_category/repeatRecCategory.txt
File contents:

-bash-3.2$ hadoop fs -cat /user/pms/ouyangyewei/data/input/combineorder/repeat_rec_category/repeatRecCategory.txt | more
8104
960985
5472
971917
5320
971895
971902
971922
958261
972047
972050

Reading the single HDFS file through the Java API's FileSystem class

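import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Returns the categories eligible for repeat recommendation, separated by commas.
 * Each line contributes its first tab-separated field followed by a comma,
 * so a non-empty result ends with a trailing comma.
 *
 * @param filePath HDFS path of the file to read
 * @return the comma-separated categories; if an I/O error occurs it is printed
 *         and whatever was read so far is returned
 */
public String getRepeatRecCategoryStr(String filePath) {
	final String DELIMITER = "\t";
	final String INNER_DELIMITER = ",";
	
	// StringBuilder avoids allocating a new String on every loop iteration
	StringBuilder categoryFilterStrs = new StringBuilder();
	BufferedReader br = null;
	try {
		// Picks up the cluster settings from the Hadoop config files on the classpath
		FileSystem fs = FileSystem.get(new Configuration());
		FSDataInputStream inputStream = fs.open(new Path(filePath));
		// Assumption: the file is UTF-8 text (the original relied on the platform default charset)
		br = new BufferedReader(new InputStreamReader(inputStream, StandardCharsets.UTF_8));
		
		String line = null;
		while (null != (line = br.readLine())) {
			// Keep only the first tab-separated field of each line
			String[] strs = line.split(DELIMITER);
			categoryFilterStrs.append(strs[0]).append(INNER_DELIMITER);
		}
	} catch (IOException e) {
		e.printStackTrace();
	} finally {
		// Closing the reader also closes the underlying HDFS input stream
		if (null != br) {
			try {
				br.close();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}
	}
	
	return categoryFilterStrs.toString();
}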
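
The method above appends a delimiter after every line, so its result ends with a trailing comma. Below is a minimal sketch of the same line-parsing step, separated from the HDFS I/O so it can be tried without a cluster. The class name CategoryJoiner and the method joinFirstColumns are hypothetical names introduced for illustration, not part of the original post; skipping blank lines is also an assumption about the desired behavior.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.stream.Collectors;

// Hypothetical helper class, not from the original post.
public class CategoryJoiner {

    // Takes the first tab-separated field of each non-blank line and joins
    // the fields with commas, without a trailing comma.
    public static String joinFirstColumns(BufferedReader reader) {
        return reader.lines()
                .filter(line -> !line.trim().isEmpty())   // assumption: blank lines are skipped
                .map(line -> line.split("\t")[0].trim())
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // The first three lines of the sample file shown earlier
        String sample = "8104\n960985\n5472\n";
        String joined = joinFirstColumns(new BufferedReader(new StringReader(sample)));
        System.out.println(joined); // prints 8104,960985,5472
    }
}
```

To use it against HDFS, the BufferedReader built from fs.open(new Path(filePath)) in the method above could be passed in place of the StringReader-backed one.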

Original article: http://blog.csdn.net/yeweiouyang/article/details/38677027
