pyspark.sql.streaming.DataStreamReader.load

DataStreamReader.load(path=None, format=None, schema=None, **options)

Loads a data stream from a data source and returns it as a DataFrame.

Note

Evolving.

Parameters
  • path – optional string for file-system backed data sources.

  • format – optional string for format of the data source. Defaults to ‘parquet’.

  • schema – optional pyspark.sql.types.StructType for the input schema, or a DDL-formatted string (for example, col0 INT, col1 DOUBLE).

  • options – all other string options, passed as keyword arguments.

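>>> # Assumes an active SparkSession bound to `spark`; `sdf_schema` is an illustrative schema.
>>> import tempfile
>>> from pyspark.sql.types import StructType, StructField, StringType
>>> sdf_schema = StructType([StructField("data", StringType(), True)])
>>> json_sdf = spark.readStream.format("json") \
...     .schema(sdf_schema) \
...     .load(tempfile.mkdtemp())
>>> json_sdf.isStreaming
True
>>> json_sdf.schema == sdf_schema
True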
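
The same stream can also be requested in a single load() call by passing format, schema, and extra options as keyword arguments. The following is a minimal sketch under stated assumptions: load_json_stream is a hypothetical helper (not part of PySpark), spark is assumed to be an existing SparkSession, and maxFilesPerTrigger is used only as an example of a file-source option forwarded through **options.

```python
def load_json_stream(spark, path):
    """Return a streaming DataFrame over JSON files under `path`.

    Hypothetical helper for illustration; `spark` is assumed to be an
    existing SparkSession.
    """
    return spark.readStream.load(
        path,
        format="json",                   # overrides the default data source
        schema="col0 INT, col1 DOUBLE",  # DDL-formatted schema string
        maxFilesPerTrigger="1",          # example option passed via **options
    )
```

With a running session, load_json_stream(spark, tempfile.mkdtemp()) should return a DataFrame whose isStreaming attribute is True.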

New in version 2.0.