
API brief (Spark for Scala)

Date: 2017-10-01 10:04:08

Tags: roc header group marker tin spark com head flat

org.apache.hadoop.mapred.SequenceFileInputFormat<K,V>

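Obtains FileStatus {block size + group + length + access time + modification time + owner + path + permission + symlink + ACL + checks such as whether the path is a directory + serialization to out + setter functions}.
Obtains a RecordReader from the input. The RecordReader converts the byte-oriented view of the input into a record-oriented view for the following MapReduce step, processing record boundaries and presenting the tasks with keys and values. It can {close the InputSplit + create a key + create a value + return the current position in the input + read the next key-value pair from the input}.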
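The RecordReader operations listed above can be sketched in Scala roughly as follows. This is a minimal sketch, not the original author's code: it assumes the Hadoop client libraries are on the classpath and that the file holds Text keys and IntWritable values. The names `RecordReaderSketch` and `readAll` are hypothetical.

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.hadoop.mapred.{FileInputFormat, JobConf, Reporter, SequenceFileInputFormat}
import scala.collection.mutable.ArrayBuffer

object RecordReaderSketch {
  // Reads every record under `input` through the old mapred API.
  // Assumption: keys are Text and values are IntWritable.
  def readAll(input: String): Seq[(String, Int)] = {
    val job = new JobConf()
    FileInputFormat.setInputPaths(job, new Path(input))

    val format = new SequenceFileInputFormat[Text, IntWritable]()
    val out = ArrayBuffer.empty[(String, Int)]

    // The second argument is only a hint for the number of splits.
    for (split <- format.getSplits(job, 1)) {
      val reader = format.getRecordReader(split, job, Reporter.NULL)
      try {
        val key = reader.createKey()     // reusable key object
        val value = reader.createValue() // reusable value object
        // next() fills key and value, and returns false at the end of the split.
        while (reader.next(key, value)) {
          // Copy the contents, since the same objects are refilled on each call.
          out += ((key.toString, value.get()))
        }
      } finally reader.close()
    }
    out.toSeq
  }
}
```

Inside the loop, `reader.getPos()` would return the current position in the input if progress reporting is needed.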

org.apache.hadoop.io.SequenceFile

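Flat files consisting of binary key-value pairs. Three parts: Writer, Reader, and Sorter.
There are three writers, chosen by CompressionType (NONE, RECORD, BLOCK), and they share a common header. The compression block size and the compression algorithm used are both configurable.
Using the static createWriter methods is recommended.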
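As an illustration of the static createWriter route, here is a minimal Scala sketch. It assumes the Hadoop client libraries are on the classpath, Text keys, and IntWritable values. `SequenceFileWriteSketch` and `write` are hypothetical names, and the default codec is used because none is passed.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, SequenceFile, Text}
import org.apache.hadoop.io.SequenceFile.CompressionType

object SequenceFileWriteSketch {
  // Writes (String, Int) pairs as Text/IntWritable records.
  // `ctype` selects the kind of writer: NONE, RECORD, or BLOCK.
  def write(path: String, records: Seq[(String, Int)], ctype: CompressionType): Unit = {
    val conf = new Configuration()
    val writer = SequenceFile.createWriter(
      conf,
      SequenceFile.Writer.file(new Path(path)),
      SequenceFile.Writer.keyClass(classOf[Text]),
      SequenceFile.Writer.valueClass(classOf[IntWritable]),
      SequenceFile.Writer.compression(ctype))
    try {
      records.foreach { case (k, v) =>
        writer.append(new Text(k), new IntWritable(v))
      }
    } finally writer.close()
  }
}
```

A call such as `SequenceFileWriteSketch.write("/tmp/demo.seq", Seq("a" -> 1), CompressionType.BLOCK)` would produce a block-compressed file. The path here is only an example.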
Format: Header, Record, sync marker. Header: version + class names of key and value + compression flag + block-compression flag + compression codec + metadata + sync marker.
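Most of the header fields listed above can be inspected through SequenceFile.Reader. The following is a minimal Scala sketch, assuming the Hadoop client libraries are on the classpath. `SequenceFileHeaderSketch`, `describe`, and the map keys are hypothetical names chosen for illustration.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.SequenceFile

object SequenceFileHeaderSketch {
  // Opens the file and returns the header fields the Reader exposes.
  def describe(path: String): Map[String, String] = {
    val conf = new Configuration()
    val reader = new SequenceFile.Reader(conf, SequenceFile.Reader.file(new Path(path)))
    try Map(
      "keyClass"        -> reader.getKeyClassName(),
      "valueClass"      -> reader.getValueClassName(),
      "compressed"      -> reader.isCompressed().toString,
      "blockCompressed" -> reader.isBlockCompressed().toString,
      // The codec is null when the file is not compressed.
      "codec"           -> Option(reader.getCompressionCodec()).map(_.getClass.getName).getOrElse("none"),
      "metadata"        -> reader.getMetadata().toString
    ) finally reader.close()
  }
}
```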


Original article: http://www.cnblogs.com/yumanman/p/7616626.html
