pyspark.sql.GroupedData.sum

GroupedData.sum(*cols)

Compute the sum for each numeric column for each group.

Parameters

cols – column names (string), passed as variable-length arguments. Non-numeric columns are ignored.

>>> df.groupBy().sum('age').collect()
[Row(sum(age)=7)]
>>> df3.groupBy().sum('age', 'height').collect()
[Row(sum(age)=7, sum(height)=165)]

New in version 1.3.