Description
Hello,
I am learning Spark and trying out examples as part of it. I am using the following lines of code in my file named df.scala:
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().getOrCreate()
val df = spark.read.csv("CitiGroup2006_2008")
df.Head(5)
In my Scala Terminal:
scala> :load df.scala
Loading df.scala...
import org.apache.spark.sql.SparkSession
spark: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@4756e5cc
org.apache.spark.sql.AnalysisException: Path does not exist: file:/C:/Spark/MyPrograms/Scala_and_Spark_Bootcamp_master/SparkDataFrames/CitiGroup2006_2008;
at org.apache.spark.sql.execution.datasources.DataSource$.org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary(DataSource.scala:715)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$15.apply(DataSource.scala:389)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$15.apply(DataSource.scala:389)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:344)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:388)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:594)
at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:473)
... 72 elided
<console>:25: error: not found: value df
df.Head(5)
^
All environment variables are set and pointed correctly. Is this a version issue with Spark 2.3.0? If I should downgrade, please let me know which version is stable for my practice exercises.
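For reference, the AnalysisException above says Spark resolved the relative path "CitiGroup2006_2008" against the shell's working directory and found nothing there, so the load failed and `df` was never defined; the follow-up `not found: value df` error is a consequence, not a separate problem (and note that the DataFrame method is lowercase `head`, not `Head`). A minimal corrected sketch, to run inside spark-shell, is below; the absolute path is taken from the stack trace, and the `header`/`inferSchema` options are assumptions about the file's layout:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Use the absolute path from the error message (or first verify the
// working directory) so Spark can locate the file.
val df = spark.read
  .option("header", "true")       // assume the first line holds column names
  .option("inferSchema", "true")  // assume numeric columns should be typed
  .csv("C:/Spark/MyPrograms/Scala_and_Spark_Bootcamp_master/SparkDataFrames/CitiGroup2006_2008")

// head is lowercase in the DataFrame API; Head(5) would not compile.
df.head(5).foreach(println)
```

If the path is correct but the error persists, check whether the file on disk has an extension (e.g. `.csv`) that the path in the script omits.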