pyspark.sql.streaming.DataStreamWriter

class pyspark.sql.streaming.DataStreamWriter(df)

Interface used to write a streaming DataFrame to external storage systems (e.g. file systems, key-value stores, etc.). Use DataFrame.writeStream to obtain an instance.

Note

Evolving.

New in version 2.0.
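A writer is normally configured by chaining its methods and finishing with start(). The following is a minimal sketch of that pattern, not a definitive recipe: the helper names start_console_query and demo, the query name, the console sink, and the 5-second trigger interval are all illustrative assumptions.

```python
def start_console_query(df, name="demo_query"):
    """Configure a writer on a streaming DataFrame and start the query.

    `df` is assumed to be a streaming DataFrame (df.isStreaming is True).
    The sink, option, and trigger values below are illustrative choices.
    """
    return (
        df.writeStream
        .format("console")                     # print each micro-batch to stdout
        .outputMode("append")                  # emit only newly arrived rows
        .option("truncate", "false")           # console sink: do not shorten long values
        .queryName(name)                       # name reported for the running query
        .trigger(processingTime="5 seconds")   # start a micro-batch every 5 seconds
        .start()                               # returns a StreamingQuery
    )


def demo():
    """Run the sketch against the built-in "rate" test source (needs pyspark)."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[2]")
        .appName("writer-demo")
        .getOrCreate()
    )
    df = spark.readStream.format("rate").option("rowsPerSecond", 5).load()
    query = start_console_query(df)
    query.awaitTermination(10)  # wait up to 10 seconds, then shut down
    query.stop()
    spark.stop()
```

Calling demo() in an environment with Spark installed should print a few micro-batches to the console before stopping. Each configuration method returns the writer itself, which is what makes the chained style possible.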

__init__(df)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(df)

Initialize self.

foreach(f)

Sets the output of the streaming query to be processed using the provided writer f.

foreachBatch(func)

Sets the output of the streaming query to be processed using the provided function.

format(source)

Specifies the underlying output data source.

option(key, value)

Adds an output option for the underlying data source.

options(**options)

Adds output options for the underlying data source.

outputMode(outputMode)

Specifies how data of a streaming DataFrame/Dataset is written to a streaming sink.

partitionBy(*cols)

Partitions the output by the given columns on the file system.

queryName(queryName)

Specifies the name of the StreamingQuery that can be started with start().

start([path, format, outputMode, …])

Streams the contents of the DataFrame to a data source.

trigger([processingTime, once, continuous])

Sets the trigger for the streaming query.
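The foreach and foreachBatch entries above both accept user code, but at different granularities: foreach handles one row at a time, while foreachBatch receives each micro-batch as a regular DataFrame. The sketch below illustrates both under stated assumptions: the class PrintingWriter, the function write_batch, the two attach_* helpers, and OUTPUT_PATH are hypothetical names, and the Parquet destination is only an example.

```python
# Hypothetical output location used by write_batch; adjust for your environment.
OUTPUT_PATH = "/tmp/stream_out"


class PrintingWriter:
    """Row-level writer object for foreach (illustrative example class)."""

    def open(self, partition_id, epoch_id):
        # Called once per partition and epoch; return True to process its rows.
        self.rows_seen = 0
        return True

    def process(self, row):
        # Called once for every row of the partition.
        self.rows_seen += 1
        print(row)

    def close(self, error):
        # Called when the partition is done; error is None on success.
        if error is not None:
            print("partition failed:", error)


def write_batch(batch_df, batch_id):
    """Batch-level function for foreachBatch.

    batch_df is a non-streaming DataFrame holding one micro-batch, so the
    ordinary batch writer (DataFrame.write) can be used on it.
    """
    batch_df.write.mode("append").format("parquet").save(OUTPUT_PATH)


def attach_row_writer(df):
    # df is assumed to be a streaming DataFrame.
    return df.writeStream.foreach(PrintingWriter()).start()


def attach_batch_writer(df):
    # df is assumed to be a streaming DataFrame.
    return df.writeStream.foreachBatch(write_batch).start()
```

As a rough guide, foreachBatch is the better fit when an existing batch sink can be reused, and foreach when each row needs custom handling such as a call to an external service.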