Monday, December 7, 2015

Column renaming after DataFrame.groupBy and agg


In the following code, the aggregated columns get generated names such as "SUM(_1#179)". Is there a way to rename them to something friendlier?

scala> val d = sqlContext.createDataFrame(Seq((1, 2), (1, 3), (2, 10)))

scala> d.groupBy("_1").sum().printSchema
root
 |-- _1: integer (nullable = false)
 |-- SUM(_1#179): long (nullable = true)
 |-- SUM(_2#180): long (nullable = true)


http://apache-spark-user-list.1001560.n3.nabble.com/Column-renaming-after-DataFrame-groupBy-td22586.html




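The simplest way to achieve this is to call the toDF() function, which returns a new DataFrame with the given names assigned to the columns in order. You must supply one name for every column, including the grouping column.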

scala> val d = sqlContext.createDataFrame(Seq((1, 2), (1, 3), (2, 10)))
scala> d.groupBy("_1").sum().toDF("a","b","c").printSchema


root
 |-- a: integer (nullable = false)
 |-- b: long (nullable = true)
 |-- c: long (nullable = true)
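
Because toDF() assigns names purely by position, it can help to name each aggregate where it is computed instead. The following is a minimal sketch, not part of the original thread, that assumes a Spark 1.4 or later spark-shell session (where sqlContext is predefined and grouping columns are kept in the result by default). The names a, b, c and the value renamed are only illustrative.

```scala
// Assumes a Spark 1.x spark-shell session, where sqlContext is predefined.
import org.apache.spark.sql.functions.sum

val d = sqlContext.createDataFrame(Seq((1, 2), (1, 3), (2, 10)))

// Alias each aggregate explicitly instead of relying on column order.
// The grouping column _1 is renamed separately with withColumnRenamed.
val renamed = d.groupBy("_1").
  agg(sum("_1").as("b"), sum("_2").as("c")).
  withColumnRenamed("_1", "a")

renamed.printSchema
```

With this approach the resulting columns should be named a, b and c as before, but adding or reordering aggregates later does not silently shift the names.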
