SPARK-23399

Register a task completion listener first for OrcColumnarBatchReader

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

      This is related to SPARK-23390.

      Currently, there is an open file leak in OrcColumnarBatchReader, as the following test log shows:

      [info] - Enabling/disabling ignoreMissingFiles using orc (648 milliseconds)
      15:55:58.673 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 61.0 (TID 85, localhost, executor driver): TaskKilled (Stage cancelled)
      15:55:58.674 WARN org.apache.spark.DebugFilesystem: Leaked filesystem connection created at:
      java.lang.Throwable
      	at org.apache.spark.DebugFilesystem$.addOpenStream(DebugFilesystem.scala:36)
      	at org.apache.spark.DebugFilesystem.open(DebugFilesystem.scala:70)
      	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
      	at org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.open(RecordReaderUtils.java:173)
      	at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:254)
      	at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:633)
      	at org.apache.spark.sql.execution.datasources.orc.OrcColumnarBatchReader.initialize(OrcColumnarBatchReader.java:138)
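
The issue title suggests that the order of two steps matters: registering the task completion listener and initializing the reader. The following is a minimal sketch of why, not Spark's actual code. FakeTaskContext, FakeReader, and ListenerOrderDemo are hypothetical stand-ins written for illustration, and the failure inside initialize() is an assumed way of modelling a task that is killed after the file has been opened.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a task context: runs its listeners when the task ends.
final class FakeTaskContext {
    private final List<Runnable> listeners = new ArrayList<>();

    void addTaskCompletionListener(Runnable listener) {
        listeners.add(listener);
    }

    void markTaskCompleted() {
        listeners.forEach(Runnable::run);
    }
}

// Hypothetical stand-in for a reader whose initialize() opens a stream and may then fail.
final class FakeReader implements AutoCloseable {
    static int openStreams = 0; // counts streams that are open and not yet closed
    private boolean open = false;

    void initialize(boolean failAfterOpen) {
        open = true;
        openStreams++;
        if (failAfterOpen) {
            // Assumption: models the task being killed after the file was opened.
            throw new IllegalStateException("task killed during initialize");
        }
    }

    @Override
    public void close() {
        if (open) {
            open = false;
            openStreams--;
        }
    }
}

final class ListenerOrderDemo {
    // Leaky order: the listener is registered only after initialize() succeeds.
    static int listenerAfterInitialize() {
        FakeReader.openStreams = 0;
        FakeTaskContext context = new FakeTaskContext();
        FakeReader reader = new FakeReader();
        try {
            reader.initialize(true);
            context.addTaskCompletionListener(reader::close); // never reached
        } catch (IllegalStateException e) {
            // The task fails; nothing has been registered to close the reader.
        }
        context.markTaskCompleted();
        return FakeReader.openStreams;
    }

    // Safe order: the listener is registered first, so close() runs even on failure.
    static int listenerBeforeInitialize() {
        FakeReader.openStreams = 0;
        FakeTaskContext context = new FakeTaskContext();
        FakeReader reader = new FakeReader();
        context.addTaskCompletionListener(reader::close);
        try {
            reader.initialize(true);
        } catch (IllegalStateException e) {
            // The task fails, but the listener is already in place.
        }
        context.markTaskCompleted();
        return FakeReader.openStreams;
    }

    public static void main(String[] args) {
        System.out.println("listener after initialize:  " + listenerAfterInitialize() + " leaked");
        System.out.println("listener before initialize: " + listenerBeforeInitialize() + " leaked");
    }
}
```

In this sketch the first ordering leaves one stream open after the task completes and the second leaves none, which is the kind of leak that DebugFilesystem reports in the log above.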
      

            People

              Assignee: Dongjoon Hyun (dongjoon)
              Reporter: Dongjoon Hyun (dongjoon)
              Votes: 0
              Watchers: 5
