
Spark: taking the first few rows with sort, then limit

Date: 2021-01-02 11:32:57


scala> val df = sc.parallelize(Seq(
     |   (0,"cat26",30.9), 
     |   (1,"cat67",28.5), 
     |   (2,"cat56",39.6),
     |   (3,"cat8",35.6))).toDF("Hour", "Category", "Value")
df: org.apache.spark.sql.DataFrame = [Hour: int, Category: string ... 1 more field]

scala> df.show
+----+--------+-----+
|Hour|Category|Value|
+----+--------+-----+
|   0|   cat26| 30.9|
|   1|   cat67| 28.5|
|   2|   cat56| 39.6|
|   3|    cat8| 35.6|
+----+--------+-----+


scala> df.sort(col("Hour").asc).limit(1)
res6: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [Hour: int, Category: string ... 1 more field]

scala> df.sort(col("Hour").asc).limit(1).show
+----+--------+-----+
|Hour|Category|Value|
+----+--------+-----+
|   0|   cat26| 30.9|
+----+--------+-----+


scala> df.sort(col("Hour").desc).limit(1).show
+----+--------+-----+
|Hour|Category|Value|
+----+--------+-----+
|   3|    cat8| 35.6|
+----+--------+-----+

// Ascending order is the default
scala> df.sort(col("Hour")).limit(1).show
+----+--------+-----+
|Hour|Category|Value|
+----+--------+-----+
|   0|   cat26| 30.9|
+----+--------+-----+
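
The session above runs in spark-shell, where `sc`, `toDF` and `col` are already in scope. Outside the shell the same sort-then-limit pattern needs explicit imports and a `SparkSession`. The following is a minimal sketch under assumptions that are not part of the original post: the object name `TopRows`, the helper `topN`, the choice to sort by `Value`, and the local master setting are all hypothetical and only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object TopRows {
  // Hypothetical helper (not from the original post):
  // sort by sortCol in descending order, then keep the first n rows.
  def topN(df: DataFrame, sortCol: String, n: Int): DataFrame =
    df.sort(col(sortCol).desc).limit(n)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for experimentation only.
    val spark = SparkSession.builder()
      .appName("TopRows")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // provides toDF on Seq of tuples

    // Same sample data as in the shell session above.
    val df = Seq(
      (0, "cat26", 30.9),
      (1, "cat67", 28.5),
      (2, "cat56", 39.6),
      (3, "cat8", 35.6)
    ).toDF("Hour", "Category", "Value")

    // The two rows with the largest Value: Hour 2 (39.6), then Hour 3 (35.6).
    topN(df, "Value", 2).show()

    spark.stop()
  }
}
```

Putting `sort` before `limit` matters: `limit(n)` without a preceding sort returns n rows with no guaranteed order.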


Original article: https://www.cnblogs.com/v5captain/p/14208557.html
