Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Invalid
Affects Version/s: 2.4.4
None
None
Environment: Apache Spark 2.4.4, Windows 10
Description
This is the first issue I have posted here. I noticed that when I call RDD.cache() early in my code, the results are all wrong!
If I remove the call to cache(), or call cache() later in the code, after the first map transformation, it works fine.
The graph is created from a data structure that already contains the random.
I have posted versions that work and versions that don't work in this gist:
https://gist.github.com/mitchi/edd9637687cf47fac2616bb72932f8e7
Here is an output that works:
Colors of the graph
3 2 1 3 2 1 1 4 2 3
And here is an output that doesn't work:
Colors of the graph
25 16 36 49 3 1 6 15 10 3
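
For readers without access to the gist, the two cache() placements described above can be sketched roughly as follows. This is only an illustration, not the code from the gist: the object name CachePlacement, the helper names, the input range 1 to 10, and the doubling map are all hypothetical stand-ins, and the sketch assumes spark-core 2.4.x is on the classpath and runs in local mode.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical illustration of the two cache() placements; not the gist's code.
object CachePlacement {

  // Assumption: a local-mode context is enough to show the placement.
  def localContext(): SparkContext =
    new SparkContext(new SparkConf().setMaster("local[*]").setAppName("cache-placement"))

  // Placement A: cache() is called on the source RDD, before the first map.
  def cacheEarly(sc: SparkContext): Seq[(Int, Int)] = {
    val vertices = sc.parallelize(1 to 10).cache()
    vertices.map(v => (v, v * 2)).collect().toSeq
  }

  // Placement B: cache() is called on the result of the first map.
  def cacheLate(sc: SparkContext): Seq[(Int, Int)] = {
    val mapped = sc.parallelize(1 to 10).map(v => (v, v * 2)).cache()
    mapped.collect().toSeq
  }

  def main(args: Array[String]): Unit = {
    val sc = localContext()
    try {
      println(cacheEarly(sc).mkString(" "))
      println(cacheLate(sc).mkString(" "))
    } finally {
      sc.stop()
    }
  }
}
```

With a deterministic map function like the one in this sketch, both placements should produce the same pairs, so a difference between them usually points at something else in the pipeline rather than at cache() itself.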