pyspark.sql.functions.schema_of_csv

pyspark.sql.functions.schema_of_csv(csv, options={})

Parses a CSV string and infers its schema in DDL format.

Parameters
  • csv – a CSV string, passed either as a literal column (for example lit('1|a')) or as a plain Python string.

  • options – options to control parsing. Accepts the same options as the CSV datasource.

>>> from pyspark.sql.functions import schema_of_csv, lit
>>> df = spark.range(1)
>>> df.select(schema_of_csv(lit('1|a'), {'sep':'|'}).alias("csv")).collect()
[Row(csv='struct<_c0:int,_c1:string>')]
>>> df.select(schema_of_csv('1|a', {'sep':'|'}).alias("csv")).collect()
[Row(csv='struct<_c0:int,_c1:string>')]
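
The value returned in the examples above is a DDL string of the form struct<name:type,...>. As a rough illustration of how that string can be read, the sketch below splits it into (field name, type) pairs using only the Python standard library. The helper parse_flat_struct_ddl is hypothetical and not part of PySpark; it assumes a top-level struct whose field names contain no commas, colons, or backticks.

```python
def parse_flat_struct_ddl(ddl):
    """Split a DDL string such as 'struct<_c0:int,_c1:string>' into
    (field_name, type) pairs.

    Hypothetical helper for illustration only; not a PySpark API.
    Assumes field names contain no commas, colons, or backticks.
    """
    prefix, suffix = "struct<", ">"
    if not (ddl.startswith(prefix) and ddl.endswith(suffix)):
        raise ValueError("expected a string of the form struct<...>")

    body = ddl[len(prefix):-len(suffix)]
    fields = []
    current = []
    depth = 0  # nesting level inside <...> or (...), e.g. decimal(10,2)
    for ch in body:
        if ch in "<(":
            depth += 1
        elif ch in ">)":
            depth -= 1
        if ch == "," and depth == 0:
            # A top-level comma separates two fields.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        fields.append("".join(current))

    # Split each 'name:type' entry on the first colon only.
    return [tuple(field.split(":", 1)) for field in fields]


print(parse_flat_struct_ddl("struct<_c0:int,_c1:string>"))
# prints [('_c0', 'int'), ('_c1', 'string')]
```

The depth counter keeps commas inside a parameterized type such as decimal(10,2) from being treated as field separators.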

New in version 3.0.