
[SPARK-42932] Spark 3.3.2 with Hadoop 3: java.io.IOException: Mkdirs failed to create file


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.2
    • Fix Version/s: None
    • Component/s: Spark Core, Spark Submit
    • Labels: None

    Description

      We are using Spark 3.3.2 with the Hadoop 3 build that ships with Spark.

      https://www.apache.org/dyn/closer.lua/spark/spark-3.3.2/spark-3.3.2-bin-hadoop3.tgz

      https://www.apache.org/dyn/closer.lua/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.tgz

      Spark is used in standalone mode in our application, and we are not using HDFS.

      Spark writes to the local file system.

      The same Spark 3.3.2 works fine with the Hadoop 2 build, but with the Hadoop 3 build we get the error below.

       

      23/03/18 20:23:24 WARN TaskSetManager: Lost task 4.0 in stage 0.0 (TID 4) (10.64.109.72 executor 0): java.io.IOException: Mkdirs failed to create file:/var/backup/_temporary/0/_temporary/attempt_202301182023173234741341853025716_0005_m_000004_0 (exists=false, cwd=file:/opt/spark-3.3.2/work/app-20230118202317-0001/0)
              at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:515)
              at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:500)
              at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1195)
              at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1081)
              at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:113)
              at org.apache.spark.internal.io.HadoopMapRedWriteConfigUtil.initWriter(SparkHadoopWriter.scala:238)
              at org.apache.spark.internal.io.SparkHadoopWriter$.executeTask(SparkHadoopWriter.scala:126)
              at org.apache.spark.internal.io.SparkHadoopWriter$.$anonfun$write$1(SparkHadoopWriter.scala:88)
              at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
              at org.apache.spark.scheduler.Task.run(Task.scala:136)
              at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
              at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
              at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:750) 
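
      For context, the failing write goes through the old mapred TextOutputFormat and SparkHadoopWriter (visible in the stack trace above), i.e. a plain RDD save to a local path. A minimal sketch of the kind of job that exercises this code path is below; the master URL, partition count, and the exact save call are placeholders for illustration, not the literal job from our application:

          import org.apache.spark.{SparkConf, SparkContext}

          object LocalWriteRepro {
            def main(args: Array[String]): Unit = {
              // Standalone cluster, no HDFS: every executor writes to its own local file system.
              val conf = new SparkConf()
                .setAppName("mkdirs-repro")
                .setMaster("spark://master-host:7077") // placeholder standalone master URL

              val sc = new SparkContext(conf)

              // saveAsTextFile goes through TextOutputFormat and SparkHadoopWriter,
              // the same path as in the stack trace above; each task first creates its
              // _temporary attempt directory under the output path before opening the writer.
              sc.parallelize(1 to 1000, numSlices = 8)
                .map(_.toString)
                .saveAsTextFile("file:///var/backup") // local output path as in the error message

              sc.stop()
            }
          }

      The attempt directory in the error (file:/var/backup/_temporary/0/_temporary/attempt_*) is the per-task staging directory that the Hadoop output committer creates under the output path before any records are written.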

          People

            Assignee: Unassigned
            Reporter: shamim (shamim_er123)