Details
Bug
Status: Closed
Major
Resolution: Fixed
3.1.2
None
Resubmitting the patch because of flaky tests (the failures are unrelated to anything changed in the patch).
Description
Getting an error in my code that does some basic stuff with HiveStreamingConnection:
19/10/23 15:12:44 ERROR yarn_logger.Main: Thread worker-0 uncaught exception
java.lang.RuntimeException: org.apache.hive.streaming.StreamingException: Unable to close
at com.datto.yarn_logger.WorkerThread.run(WorkerThread.java:51)
Caused by: org.apache.hive.streaming.StreamingException: Unable to close
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.close(HiveStreamingConnection.java:973)
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.markDead(HiveStreamingConnection.java:833)
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.<init>(HiveStreamingConnection.java:677)
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.<init>(HiveStreamingConnection.java:596)
at org.apache.hive.streaming.HiveStreamingConnection.createNewTransactionBatch(HiveStreamingConnection.java:485)
at org.apache.hive.streaming.HiveStreamingConnection.beginNextTransaction(HiveStreamingConnection.java:466)
at org.apache.hive.streaming.HiveStreamingConnection.beginTransaction(HiveStreamingConnection.java:507)
at com.datto.yarn_logger.WorkerThread.run(WorkerThread.java:49)
Caused by: java.lang.NullPointerException
at org.apache.hive.streaming.AbstractRecordWriter.logStats(AbstractRecordWriter.java:547)
at org.apache.hive.streaming.AbstractRecordWriter.close(AbstractRecordWriter.java:352)
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.closeImpl(HiveStreamingConnection.java:979)
at org.apache.hive.streaming.HiveStreamingConnection$TransactionBatch.close(HiveStreamingConnection.java:970)
... 7 more
Digging through the stack trace: TransactionBatch catches exceptions in its constructor and calls its close method when one is thrown (which is definitely happening; that's line 677 in HiveStreamingConnection.java). This eventually calls AbstractRecordWriter::close, which calls logStats, which tries to use heapMemoryMonitor. That field is null, presumably because init had not yet been called on the writer. The resulting NullPointerException is what surfaces in the trace, not the exception that originally made the constructor fail.
An easy fix would be to check in logStats whether heapMemoryMonitor is null, and to apply the same null check to the value returned by the method logStats calls on it.
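To make the proposed fix concrete, here is a minimal sketch of the two null checks. It uses hypothetical stand-in classes (NullSafeStatsSketch, MemoryMonitor, Writer, getUsedBytes), not the real Hive code. It only assumes, as described above, that the monitor field stays null until init is called and that the monitor's query method may itself return null.

```java
// Hypothetical stand-ins for illustration; these are NOT the real Hive classes.
final class NullSafeStatsSketch {

    /** Stand-in for a memory monitor whose query may return null. */
    static final class MemoryMonitor {
        private final Long usedBytes; // null models "usage not available"

        MemoryMonitor(Long usedBytes) {
            this.usedBytes = usedBytes;
        }

        Long getUsedBytes() {
            return usedBytes;
        }
    }

    /** Stand-in for a record writer whose monitor is only set in init(). */
    static final class Writer {
        private MemoryMonitor heapMemoryMonitor; // stays null until init() runs

        void init(MemoryMonitor monitor) {
            this.heapMemoryMonitor = monitor;
        }

        /** Builds the stats line, falling back to "NA" instead of throwing. */
        String logStats(String prefix) {
            String usage = "NA";
            if (heapMemoryMonitor != null) {           // first check: init() may not have run
                Long used = heapMemoryMonitor.getUsedBytes();
                if (used != null) {                    // second check: the query may return null
                    usage = used + " bytes";
                }
            }
            return prefix + " heap usage: " + usage;
        }

        /** close() can now be called safely even if init() never ran. */
        String close() {
            return logStats("Stats before close:");
        }
    }

    public static void main(String[] args) {
        Writer neverInitialized = new Writer();
        System.out.println(neverInitialized.close());

        Writer initialized = new Writer();
        initialized.init(new MemoryMonitor(1024L));
        System.out.println(initialized.close());
    }
}
```

With both guards in place, closing a writer that was never initialized reports "NA" for the heap usage rather than throwing, so the exception that made the constructor fail is no longer hidden behind a NullPointerException.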