DataFrameWriter.sortBy(col, *cols)
Sorts the output in each bucket by the given columns on the file system.
col – a name of a column, or a list of names.
cols – additional names (optional). If col is a list it should be empty.
>>> (df.write.format('parquet')
...     .bucketBy(100, 'year', 'month')
...     .sortBy('day')
...     .mode("overwrite")
...     .saveAsTable('sorted_bucketed_table'))
New in version 2.3.
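The effect of combining bucketBy with sortBy can be pictured without a Spark session. The following is a minimal pure-Python sketch, not Spark's implementation: the helper name `bucket_and_sort`, the sample rows, and the CRC32-based bucket assignment are all assumptions made for illustration (Spark uses its own internal hash function). It shows only that rows are first grouped into buckets by the bucketing columns and that each bucket is then ordered independently by the sort columns.

```python
import zlib
from collections import defaultdict


def bucket_and_sort(rows, num_buckets, bucket_cols, sort_cols):
    """Hypothetical helper: group rows into buckets, then sort each bucket."""
    buckets = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in bucket_cols)
        # Stand-in hash for illustration only; Spark's bucket hash differs.
        bucket_id = zlib.crc32(repr(key).encode("utf-8")) % num_buckets
        buckets[bucket_id].append(row)
    # Ordering is applied per bucket, never across buckets.
    return {
        bucket_id: sorted(bucket, key=lambda r: tuple(r[c] for c in sort_cols))
        for bucket_id, bucket in buckets.items()
    }


# Illustrative data, not taken from any real table.
rows = [
    {"year": 2020, "month": 1, "day": 17},
    {"year": 2020, "month": 1, "day": 3},
    {"year": 2021, "month": 6, "day": 9},
    {"year": 2020, "month": 1, "day": 11},
]
result = bucket_and_sort(rows, 4, ["year", "month"], ["day"])
```

Rows that share the same (year, month) always land in the same bucket, and within that bucket they appear in ascending day order. No ordering is implied between rows in different buckets.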