[Big Data Series] MapReduce-Based Data Processing: SequenceFile Serialized Files

Date: 2017-08-01 16:37:07

SequenceFile provides a persistent data structure for key-value pairs.

1. txt plain-text format: a series of line-based records

2. SequenceFile

Key-value format: a series of records, similar to a map
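
To make the "series of key-value records" idea concrete, here is a minimal sketch that round-trips a few records through a binary stream using only the JDK. It is an illustration under stated assumptions, not the real SequenceFile on-disk layout: the class name KeyValueRecordsSketch and the encoding (a 4-byte int key followed by a length-prefixed UTF string value) are hypothetical choices made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for a key-value record file.
// Assumption: this is NOT the actual SequenceFile format, only an analogy.
final class KeyValueRecordsSketch {

    // Encode each entry as a 4-byte int key followed by a length-prefixed UTF string value.
    static byte[] encode(Map<Integer, String> records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (Map.Entry<Integer, String> entry : records.entrySet()) {
                out.writeInt(entry.getKey());
                out.writeUTF(entry.getValue());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // Decode records in their stored order until the input is exhausted.
    static Map<Integer, String> decode(byte[] data) {
        Map<Integer, String> records = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int key = in.readInt();
                records.put(key, in.readUTF());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return records;
    }

    public static void main(String[] args) {
        Map<Integer, String> records = new LinkedHashMap<>();
        records.put(1, "tom1");
        records.put(2, "tom2");
        // Prints 1=tom1 and then 2=tom2, one record per line.
        for (Map.Entry<Integer, String> entry : decode(encode(records)).entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
    }
}
```

Unlike a plain-text file, the records here are stored as typed binary fields, so the reader needs no line splitting or string parsing to recover the key and the value.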

3. Write the code that writes and reads the file

package com.slp;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.SequenceFile.Reader;
import org.apache.hadoop.io.SequenceFile.Writer;
import org.apache.hadoop.io.Text;
import org.junit.Test;
public class TestSequenceFile {

    @Test
    public void write() throws IOException {
        Configuration conf = new Configuration();
        // Point the client at the HDFS NameNode.
        conf.set("fs.defaultFS", "hdfs://www.node1.com:9000/");
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("hdfs://www.node1.com:9000/home/hadoop/seq.seq");
        // Keys are IntWritable, values are Text.
        // try-with-resources closes the writer even if an append fails.
        try (Writer writer = SequenceFile.createWriter(fs, conf, path, IntWritable.class, Text.class)) {
            writer.append(new IntWritable(1), new Text("tom1"));
            writer.append(new IntWritable(2), new Text("tom2"));
            writer.append(new IntWritable(3), new Text("tom3"));
            writer.append(new IntWritable(4), new Text("tom4"));
        }
        System.out.println("over");
    }
    
    @Test
    public void readSeq() throws IOException {
        Configuration conf = new Configuration();
        // Point the client at the HDFS NameNode.
        conf.set("fs.defaultFS", "hdfs://www.node1.com:9000/");
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("hdfs://www.node1.com:9000/home/hadoop/seq.seq");
        // try-with-resources closes the reader even if reading fails.
        try (Reader reader = new SequenceFile.Reader(fs, path, conf)) {
            // The key and value objects are reused for every record.
            IntWritable key = new IntWritable();
            Text value = new Text();
            // next() returns false once the end of the file is reached.
            while (reader.next(key, value)) {
                System.out.println(key + "=" + value);
            }
        }
    }
}

The readSeq test method prints:

1=tom1
2=tom2
3=tom3
4=tom4

4. View the file
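
One quick way to inspect the file is to look at its first bytes: SequenceFile data begins with the ASCII bytes S, E, Q followed by a one-byte format version. The sketch below checks for that header using only the JDK. It assumes the file has already been copied from HDFS to the local disk (for example with hdfs dfs -get), and the class name SeqHeaderCheck and its method names are hypothetical, chosen for this example.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

// Checks whether the first bytes of a file look like a SequenceFile header.
// Assumption: the file is a local copy; this sketch does not talk to HDFS.
final class SeqHeaderCheck {

    // True if the header starts with 'S', 'E', 'Q' and has a fourth (version) byte.
    static boolean looksLikeSequenceFile(byte[] header) {
        return header != null
                && header.length >= 4
                && header[0] == 'S'
                && header[1] == 'E'
                && header[2] == 'Q';
    }

    // The fourth byte holds the format version, read here as an unsigned value.
    static int formatVersion(byte[] header) {
        return header[3] & 0xFF;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SeqHeaderCheck <local-file>");
            return;
        }
        byte[] header = new byte[4];
        // readFully throws EOFException if the file is shorter than 4 bytes.
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            in.readFully(header);
        }
        if (looksLikeSequenceFile(header)) {
            System.out.println("SequenceFile header found, version " + formatVersion(header));
        } else {
            System.out.println("Not a SequenceFile header");
        }
    }
}
```

This only confirms the file type. To see the actual records, use a reader such as the readSeq test above.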

Original post: http://www.cnblogs.com/dream-to-pku/p/7268947.html
