Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: 2.4.0
Fix Version/s: None
Description
PCAModel.load() does not seem to be using the configurations set on the current spark session.
Repro:
The following fails to read the data because the storage account credentials config is not used/propagated:
conf.set("fs.azure.account.key.test.blob.core.windows.net","Xosad==")
spark = SparkSession.builder.appName("dharmesh").config(conf=conf).master('spark://spark-master:7077').getOrCreate()
model = PCAModel.load('wasb://test@test.blob.core.windows.net/model')
The following however works:
conf.set("fs.azure.account.key.test.blob.core.windows.net","Xosad==")
spark = SparkSession.builder.appName("dharmesh").config(conf=conf).master('spark://spark-master:7077').getOrCreate()
blah = spark.read.json('wasb://test@test.blob.core.windows.net/somethingelse/')
blah.show()
model = PCAModel.load('wasb://test@test.blob.core.windows.net/model')
It looks like calling spark.read...() once forces the config to be applied, after which PCAModel.load() works correctly.
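Possible workaround (a sketch only, not verified against this cluster and not a confirmed fix): Spark is documented to copy SparkConf entries prefixed with "spark.hadoop." into the Hadoop configuration, so setting the storage key under that prefix may make it visible to PCAModel.load() without a prior spark.read call. The helper names with_hadoop_prefix and build_session below are hypothetical and not part of PySpark; the app name, master URL, and key are taken from the repro above.

```python
# Sketch of a possible workaround, not a confirmed fix for this issue.
# Assumption: SparkConf entries prefixed with "spark.hadoop." are copied
# into the Hadoop configuration that the SparkContext uses.

HADOOP_PREFIX = "spark.hadoop."


def with_hadoop_prefix(options):
    """Return a copy of `options` with every key prefixed by "spark.hadoop.".

    Keys that already carry the prefix are left unchanged.
    (Hypothetical helper, not part of PySpark.)
    """
    return {
        key if key.startswith(HADOOP_PREFIX) else HADOOP_PREFIX + key: value
        for key, value in options.items()
    }


def build_session(options, app_name="dharmesh",
                  master="spark://spark-master:7077"):
    """Build a SparkSession with the Hadoop-level options set at startup.

    (Hypothetical helper; app name and master URL come from the repro.)
    """
    # Imported lazily so with_hadoop_prefix works without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).master(master)
    for key, value in with_hadoop_prefix(options).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Same key and placeholder value as in the repro.
azure_options = {"fs.azure.account.key.test.blob.core.windows.net": "Xosad=="}
prefixed = with_hadoop_prefix(azure_options)
```

With a reachable cluster, spark = build_session(azure_options) followed by PCAModel.load('wasb://test@test.blob.core.windows.net/model') would be the call sequence to try. Whether this avoids the failure on 2.4.0 is untested here.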