pyspark.sql.functions.schema_of_csv
Parses a CSV string and infers its schema in DDL format.
col – a CSV string or a string literal containing a CSV string.
options – options to control parsing. Accepts the same options as the CSV datasource.
>>> from pyspark.sql.functions import lit, schema_of_csv
>>> df = spark.range(1)
>>> df.select(schema_of_csv(lit('1|a'), {'sep':'|'}).alias("csv")).collect()
[Row(csv='struct<_c0:int,_c1:string>')]
>>> df.select(schema_of_csv('1|a', {'sep':'|'}).alias("csv")).collect()
[Row(csv='struct<_c0:int,_c1:string>')]
New in version 3.0.
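To make the shape of the returned DDL string concrete, the following is a minimal pure-Python sketch of the idea: split one CSV record, guess a type per field, and name the columns positionally. The helper `infer_csv_ddl` is hypothetical and is not part of PySpark. It only distinguishes int, double, and string, so it is a rough approximation under those assumptions. Spark's actual inference covers more types and may differ in edge cases.

```python
import csv
import io


def infer_csv_ddl(line, sep=","):
    """Hypothetical helper: approximate the DDL string for one CSV record."""
    # Split the record with the standard csv module, honoring the separator.
    fields = next(csv.reader(io.StringIO(line), delimiter=sep))

    def infer_type(value):
        # Try the narrowest type first, then fall back to string.
        # Assumption: only int, double, and string are considered here.
        try:
            int(value)
            return "int"
        except ValueError:
            pass
        try:
            float(value)
            return "double"
        except ValueError:
            return "string"

    # Columns get positional names _c0, _c1, ... as in the examples above.
    columns = [f"_c{i}:{infer_type(v)}" for i, v in enumerate(fields)]
    return "struct<" + ",".join(columns) + ">"


print(infer_csv_ddl("1|a", sep="|"))  # prints struct<_c0:int,_c1:string>
```

The `sep` argument plays the role of the `'sep'` entry in the options dictionary shown in the examples. Other CSV datasource options are not modeled in this sketch.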