Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-30585

scalatest fails for Apache Spark SQL project

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Incomplete
    • 2.4.0
    • None
    • Build
    • None

    Description

      Error logs:-

      23:36:49.039 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 3.0 (TID 6, localhost, executor driver): TaskKilled (Stage cancelled)
      23:36:49.039 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 3.0 (TID 7, localhost, executor driver): TaskKilled (Stage cancelled)
      23:36:51.354 WARN org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor: Current batch is falling behind. The trigger interval is 100 milliseconds, but spent 1854 milliseconds
      23:36:51.381 WARN org.apache.spark.sql.execution.streaming.continuous.ContinuousQueuedDataReader$DataReaderThread: data reader thread failed
      org.apache.spark.SparkException: Exception thrown in awaitResult:
      at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
      at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
      at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
      at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:76)
      at org.apache.spark.sql.execution.streaming.sources.ContinuousMemoryStreamInputPartitionReader.getRecord(ContinuousMemoryStream.scala:195)
      at org.apache.spark.sql.execution.streaming.sources.ContinuousMemoryStreamInputPartitionReader.next(ContinuousMemoryStream.scala:181)
      at org.apache.spark.sql.execution.streaming.continuous.ContinuousQueuedDataReader$DataReaderThread.run(ContinuousQueuedDataReader.scala:143)
      Caused by: org.apache.spark.SparkException: Could not find ContinuousMemoryStreamRecordEndpoint-f7d4460c-9f4e-47ee-a846-258b34964852-9.
      at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:160)
      at org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:135)
      at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:229)
      at org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:523)
      at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:91)
      ... 4 more
      23:36:51.389 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 4.0 (TID 9, localhost, executor driver): TaskKilled (Stage cancelled)
      23:36:51.390 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 8, localhost, executor driver): TaskKilled (Stage cancelled)

      • flatMap
        23:36:51.754 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 5.0 (TID 11, localhost, executor driver): TaskKilled (Stage cancelled)
        23:36:51.754 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 5.0 (TID 10, localhost, executor driver): TaskKilled (Stage cancelled)
        23:36:52.248 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 6.0 (TID 13, localhost, executor driver): TaskKilled (Stage cancelled)
        23:36:52.249 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 6.0 (TID 12, localhost, executor driver): TaskKilled (Stage cancelled)
      • filter
        23:36:52.611 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 7.0 (TID 14, localhost, executor driver): TaskKilled (Stage cancelled)
        23:36:52.611 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 7.0 (TID 15, localhost, executor driver): TaskKilled (Stage cancelled)
      • deduplicate
      • timestamp
        23:36:53.015 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 8.0 (TID 16, localhost, executor driver): TaskKilled (Stage cancelled)
        23:36:53.015 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 8.0 (TID 17, localhost, executor driver): TaskKilled (Stage cancelled)
      • subquery alias
        23:36:53.572 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 9.0 (TID 19, localhost, executor driver): TaskKilled (Stage cancelled)
        23:36:53.572 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 9.0 (TID 18, localhost, executor driver): TaskKilled (Stage cancelled)
        23:36:53.953 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 10.0 (TID 21, localhost, executor driver): TaskKilled (Stage cancelled)
        23:36:53.953 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 10.0 (TID 20, localhost, executor driver): TaskKilled (Stage cancelled)
        23:36:54.552 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 11.0 (TID 23, localhost, executor driver): TaskKilled (Stage cancelled)
        23:36:54.552 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 11.0 (TID 22, localhost, executor driver): TaskKilled (Stage cancelled)
      • repeatedly restart
        23:36:54.591 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 1.0 in stage 12.0 (TID 25, localhost, executor driver): TaskKilled (killed via SparkContext.killTaskAttempt)
        23:36:54.594 ERROR org.apache.spark.util.Utils: Aborting task
        org.apache.spark.sql.execution.streaming.continuous.ContinuousTaskRetryException: Continuous execution does not support task retry
        at org.apache.spark.sql.execution.streaming.continuous.ContinuousDataSourceRDD.compute(ContinuousDataSourceRDD.scala:68)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.sql.execution.streaming.continuous.ContinuousWriteRDD$$anonfun$compute$1.apply$mcV$sp(ContinuousWriteRDD.scala:52)
        at org.apache.spark.sql.execution.streaming.continuous.ContinuousWriteRDD$$anonfun$compute$1.apply(ContinuousWriteRDD.scala:51)
        at org.apache.spark.sql.execution.streaming.continuous.ContinuousWriteRDD$$anonfun$compute$1.apply(ContinuousWriteRDD.scala:51)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
        at org.apache.spark.sql.execution.streaming.continuous.ContinuousWriteRDD.compute(ContinuousWriteRDD.scala:76)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
        23:36:54.594 ERROR org.apache.spark.sql.execution.streaming.continuous.ContinuousWriteRDD: Writer for partition 1 is aborting.
        23:36:54.594 ERROR org.apache.spark.sql.execution.streaming.continuous.ContinuousWriteRDD: Writer for partition 1 aborted.
        23:36:54.595 ERROR org.apache.spark.executor.Executor: Exception in task 1.1 in stage 12.0 (TID 26)
        org.apache.spark.sql.execution.streaming.continuous.ContinuousTaskRetryException: Continuous execution does not support task retry
        at org.apache.spark.sql.execution.streaming.continuous.ContinuousDataSourceRDD.compute(ContinuousDataSourceRDD.scala:68)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.sql.execution.streaming.continuous.ContinuousWriteRDD$$anonfun$compute$1.apply$mcV$sp(ContinuousWriteRDD.scala:52)
        at org.apache.spark.sql.execution.streaming.continuous.ContinuousWriteRDD$$anonfun$compute$1.apply(ContinuousWriteRDD.scala:51)
        at org.apache.spark.sql.execution.streaming.continuous.ContinuousWriteRDD$$anonfun$compute$1.apply(ContinuousWriteRDD.scala:51)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
        at org.apache.spark.sql.execution.streaming.continuous.ContinuousWriteRDD.compute(ContinuousWriteRDD.scala:76)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

       

      SPARK-10849: jdbc CreateTableColumnTypes duplicate columns

      • SPARK-10849: jdbc CreateTableColumnTypes invalid columns
        23:38:30.300 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 76.0 (TID 98)
        org.h2.jdbc.JdbcBatchUpdateException: NULL not allowed for column "NAME"; SQL statement:
        INSERT INTO TEST.PEOPLE1 ("NAME","THEID") VALUES (?,?) [23502-195]
        at org.h2.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:1234)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:672)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:834)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:834)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
        Caused by: org.h2.jdbc.JdbcSQLException: NULL not allowed for column "NAME"; SQL statement:
        INSERT INTO TEST.PEOPLE1 ("NAME","THEID") VALUES (?,?) [23502-195]
        at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
        at org.h2.message.DbException.get(DbException.java:179)
        at org.h2.message.DbException.get(DbException.java:155)
        at org.h2.table.Column.validateConvertUpdateSequence(Column.java:345)
        at org.h2.table.Table.validateConvertUpdateSequence(Table.java:793)
        at org.h2.command.dml.Insert.insertRows(Insert.java:151)
        at org.h2.command.dml.Insert.update(Insert.java:114)
        at org.h2.command.CommandContainer.update(CommandContainer.java:101)
        at org.h2.command.Command.executeUpdate(Command.java:260)
        at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(JdbcPreparedStatement.java:164)
        at org.h2.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:1215)
        ... 15 more
        org.h2.jdbc.JdbcSQLException: NULL not allowed for column "NAME"; SQL statement:
        INSERT INTO TEST.PEOPLE1 ("NAME","THEID") VALUES (?,?) [23502-195]
        at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
        at org.h2.message.DbException.get(DbException.java:179)
        at org.h2.message.DbException.get(DbException.java:155)
        at org.h2.table.Column.validateConvertUpdateSequence(Column.java:345)
        at org.h2.table.Table.validateConvertUpdateSequence(Table.java:793)
        at org.h2.command.dml.Insert.insertRows(Insert.java:151)
        at org.h2.command.dml.Insert.update(Insert.java:114)
        at org.h2.command.CommandContainer.update(CommandContainer.java:101)
        at org.h2.command.Command.executeUpdate(Command.java:260)
        at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(JdbcPreparedStatement.java:164)
        at org.h2.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:1215)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:672)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:834)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:834)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
        23:38:30.305 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 76.0 (TID 98, localhost, executor driver): org.h2.jdbc.JdbcBatchUpdateException: NULL not allowed for column "NAME"; SQL statement:
        INSERT INTO TEST.PEOPLE1 ("NAME","THEID") VALUES (?,?) [23502-195]
        at org.h2.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:1234)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:672)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:834)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:834)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
        Caused by: org.h2.jdbc.JdbcSQLException: NULL not allowed for column "NAME"; SQL statement:
        INSERT INTO TEST.PEOPLE1 ("NAME","THEID") VALUES (?,?) [23502-195]
        at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
        at org.h2.message.DbException.get(DbException.java:179)
        at org.h2.message.DbException.get(DbException.java:155)
        at org.h2.table.Column.validateConvertUpdateSequence(Column.java:345)
        at org.h2.table.Table.validateConvertUpdateSequence(Table.java:793)
        at org.h2.command.dml.Insert.insertRows(Insert.java:151)
        at org.h2.command.dml.Insert.update(Insert.java:114)
        at org.h2.command.CommandContainer.update(CommandContainer.java:101)
        at org.h2.command.Command.executeUpdate(Command.java:260)
        at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(JdbcPreparedStatement.java:164)
        at org.h2.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:1215)
        ... 15 more
        org.h2.jdbc.JdbcSQLException: NULL not allowed for column "NAME"; SQL statement:
        INSERT INTO TEST.PEOPLE1 ("NAME","THEID") VALUES (?,?) [23502-195]
        at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
        at org.h2.message.DbException.get(DbException.java:179)
        at org.h2.message.DbException.get(DbException.java:155)
        at org.h2.table.Column.validateConvertUpdateSequence(Column.java:345)
        at org.h2.table.Table.validateConvertUpdateSequence(Table.java:793)
        at org.h2.command.dml.Insert.insertRows(Insert.java:151)
        at org.h2.command.dml.Insert.update(Insert.java:114)
        at org.h2.command.CommandContainer.update(CommandContainer.java:101)
        at org.h2.command.Command.executeUpdate(Command.java:260)
        at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(JdbcPreparedStatement.java:164)
        at org.h2.jdbc.JdbcPreparedStatement.executeBatch(JdbcPreparedStatement.java:1215)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:672)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:834)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:834)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$28.apply(RDD.scala:935)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2101)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

      23:38:30.305 ERROR org.apache.spark.scheduler.TaskSetManager: Task 0 in stage 76.0 failed 1 times; aborting job

      • SPARK-19726: INSERT null to a NOT NULL column
      • SPARK-23856 Spark jdbc setQueryTimeout option !!! IGNORED !!!
        OuterJoinSuite:

       

       

      Attachments

        Activity

          People

            Unassigned Unassigned
            rashmi_sakhalkar Rashmi
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: