sc.parallelize(): creates an RDD from a local collection; xrange is recommended over range (it is lazy)
getNumPartitions(): returns the number of partitions of the RDD
glom(): returns a list of elements grouped by partition
collect(): returns a list (typically returned to the driver program)
Example (as sketched below):
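A minimal sketch of how these operations fit together, assuming a PySpark shell (Python 2) where sc is an existing SparkContext; the printed values are illustrative:

rdd = sc.parallelize(xrange(10), 3)  # create an RDD from a lazy range, split into 3 partitions
print(rdd.getNumPartitions())        # 3
print(rdd.glom().collect())          # elements grouped per partition, e.g. [[0, 1, 2], [3, 4, 5], [6, 7, 8, 9]]
print(rdd.collect())                 # flat list [0, 1, ..., 9] returned to the driver program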
sc.textFile(path): reads a file and returns an RDD
Official signature: textFile(name, minPartitions=None, use_unicode=True)
Supported sources: a text file from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI; the contents are returned as an RDD of Strings.
Example (reading a local file, see the sketch below):
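A minimal sketch, assuming sc is an existing SparkContext and /tmp/sample.txt is a placeholder path for a local text file:

lines = sc.textFile("file:///tmp/sample.txt", minPartitions=2)  # read a local text file as an RDD of strings
print(lines.getNumPartitions())  # at least 2 partitions
print(lines.take(3))             # first three lines returned to the driver program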
Original article: http://www.cnblogs.com/loadofleaf/p/5090134.html