pyspark.sql.DataFrame.semanticHash

DataFrame.semanticHash()

Returns a hash code of the logical query plan of this DataFrame.

Note

Unlike the standard hash code, this hash is calculated against a simplified query plan that tolerates cosmetic differences such as attribute names.

Note

DeveloperApi

>>> spark.range(10).selectExpr("id as col0").semanticHash()  
1855039936
>>> spark.range(10).selectExpr("id as col1").semanticHash()  
1855039936
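
Because the hash ignores cosmetic differences, it can serve as a grouping key for finding DataFrames whose plans are likely equivalent. The following is a minimal sketch of that idea, not part of PySpark: the helper name group_by_semantic_hash and its frames argument are hypothetical, and the only method it relies on is semanticHash().

```python
from collections import defaultdict


def group_by_semantic_hash(frames):
    """Group labels by the semanticHash() of their DataFrame.

    `frames` maps a label to any object exposing semanticHash(),
    such as a pyspark.sql.DataFrame. This helper is hypothetical
    and is not provided by PySpark.
    """
    groups = defaultdict(list)
    for name, df in frames.items():
        # DataFrames whose simplified plans match share a hash value.
        groups[df.semanticHash()].append(name)
    # Equal hashes only suggest equivalent plans, since distinct plans
    # can collide; DataFrame.sameSemantics() can confirm a match.
    return dict(groups)
```

Given the two DataFrames from the example above, passing them as {"col0": df0, "col1": df1} should place both labels under a single key, since their hashes are equal. The integer itself is best treated as opaque and compared only with other hashes rather than against a literal value.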

New in version 3.1.