
Spark2 DataSet: Creating New Rows with flatMap

Date: 2016-11-28 20:42:17

Tags: book, log, graph, hbase, mllib, val, rds, oop, scala

val dfList = List(("Hadoop", "Java,SQL,Hive,HBase,MySQL"), ("Spark", "Scala,SQL,DataSet,MLlib,GraphX"))
dfList: List[(String, String)] = List((Hadoop,Java,SQL,Hive,HBase,MySQL), (Spark,Scala,SQL,DataSet,MLlib,GraphX))

case class Book(title: String, words: String)

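// spark-shell imports spark.implicits._ automatically, which is what makes toDS() available
val df = dfList.map { case (title, words) => Book(title, words) }.toDS()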
df: org.apache.spark.sql.Dataset[Book] = [title: string, words: string]

df.show
+------+--------------------+
| title|               words|
+------+--------------------+
|Hadoop|Java,SQL,Hive,HBa...|
| Spark|Scala,SQL,DataSet...|
+------+--------------------+
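As a rough illustration of why flatMap yields more rows than it receives, the same split can be tried on plain Scala collections, with no Spark involved. The object name MapVsFlatMap is made up for this sketch; the two input strings are the words values shown in the table above.

```scala
// Plain Scala collections only; nothing here depends on Spark.
object MapVsFlatMap {
  // The two "words" values from the sample data.
  val words: List[String] = List(
    "Java,SQL,Hive,HBase,MySQL",
    "Scala,SQL,DataSet,MLlib,GraphX"
  )

  // map keeps one result per input element: a list holding two inner lists.
  val mapped: List[List[String]] = words.map(_.split(",").toList)

  // flatMap concatenates the per-element results into one flat list.
  val flattened: List[String] = words.flatMap(_.split(","))

  def main(args: Array[String]): Unit = {
    println(mapped.length)    // 2
    println(flattened.length) // 10
    println(flattened.mkString(" "))
  }
}
```

Dataset.flatMap follows the same idea: each element returned by the function becomes its own row.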

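// flatMap splits each row's words on commas and emits one row per word;
// the result is a Dataset[String] whose single column is named "value"
df.flatMap(_.words.split(",")).show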
+-------+
|  value|
+-------+
|   Java|
|    SQL|
|   Hive|
|  HBase|
|  MySQL|
|  Scala|
|    SQL|
|DataSet|
|  MLlib|
| GraphX|
+-------+
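The flatMap call above drops the title column. A minimal sketch of one way to keep it, assuming a standalone application rather than spark-shell: the object name FlatMapKeepTitle, the helper names withFlatMap and withExplode, the app name, and the local[*] master are all hypothetical choices for illustration. The second helper uses the built-in split and explode column functions as an untyped alternative; it is not the approach taken in the session above.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.{col, explode, split}

// Same shape as the Book case class used in the session above.
case class Book(title: String, words: String)

object FlatMapKeepTitle {
  // Typed API: emit one (title, word) tuple per comma-separated word.
  def withFlatMap(spark: SparkSession, ds: Dataset[Book]): DataFrame = {
    import spark.implicits._ // provides the encoder for (String, String)
    ds.flatMap(b => b.words.split(",").toSeq.map(w => (b.title, w)))
      .toDF("title", "word")
  }

  // Untyped API: split the column into an array, then explode it into rows.
  def withExplode(ds: Dataset[Book]): DataFrame =
    ds.select(col("title"), explode(split(col("words"), ",")).as("word"))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; spark-shell already provides `spark`.
    val spark = SparkSession.builder()
      .appName("flatMap-keep-title")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val ds = Seq(
      Book("Hadoop", "Java,SQL,Hive,HBase,MySQL"),
      Book("Spark", "Scala,SQL,DataSet,MLlib,GraphX")
    ).toDS()

    withFlatMap(spark, ds).show()
    withExplode(ds).show()
    spark.stop()
  }
}
```

For the sample data, both helpers should return ten rows with the columns title and word, one row per word, each paired with the title it came from.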

Original source: http://www.cnblogs.com/wwxbi/p/6110803.html