Spark 2 Dataset Aggregation Operations

Posted by 智慧先行者 on 2016-11-25
// Column-expression form: each aggregate is a Column and can be renamed with .as(...)
data.groupBy("gender").agg(count($"age"), max($"age").as("maxAge"), avg($"age").as("avgAge")).show
+------+----------+------+------+
|gender|count(age)|maxAge|avgAge|
+------+----------+------+------+
|female|         5|  32.0|  29.0|
|  male|         5|  57.0|  39.0|
+------+----------+------+------+


// (column -> function) pair form: result columns keep default names such as max(age)
data.groupBy("gender").agg("age" -> "count", "age" -> "max", "age" -> "avg").show
+------+----------+--------+--------+
|gender|count(age)|max(age)|avg(age)|
+------+----------+--------+--------+
|female|         5|    32.0|    29.0|
|  male|         5|    57.0|    39.0|
+------+----------+--------+--------+
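
Both calls above assume a DataFrame named data with a gender column and a numeric age column, and they rely on the imports that spark-shell provides automatically. As a rough sketch of how the same two aggregations might be run as a standalone program, the following builds a small DataFrame by hand. The object name AggSketch, the local master setting, and the sample rows are all hypothetical and are not the data behind the tables shown above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, count, max}

// Hypothetical standalone wrapper; in spark-shell the session and imports already exist.
object AggSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")          // assumption: run locally for illustration
      .appName("agg-sketch")
      .getOrCreate()
    import spark.implicits._        // enables the $"col" syntax and Seq.toDF

    // Made-up sample rows, NOT the original post's data set.
    val data = Seq(
      ("female", 28.0),
      ("female", 32.0),
      ("male", 21.0),
      ("male", 57.0)
    ).toDF("gender", "age")

    // Column-expression form, with aliases on two of the aggregates.
    data.groupBy("gender")
      .agg(count($"age"), max($"age").as("maxAge"), avg($"age").as("avgAge"))
      .show()

    // Pair form: passing tuples keeps all three aggregates on the same column.
    data.groupBy("gender")
      .agg("age" -> "count", "age" -> "max", "age" -> "avg")
      .show()

    spark.stop()
  }
}
```

One detail worth noting about the pair form: the pairs are passed as separate tuple arguments. Wrapping the same pairs in a Scala Map would keep only the last entry for the repeated "age" key, so only one aggregate would be computed.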

 
