Details
-
Bug
-
Status: Resolved
-
Normal
-
Resolution: Fixed
-
None
-
Correctness - Consistency
-
Normal
Description
I'm seeing the following exception running Cassandra 3.0.11 on 21 node cluster in two AWS regions when half of the nodes in one region go down, and the load is high on the rest of the nodes:
WARN [SharedPool-Worker-10] 2017-06-14 12:57:15,017 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-10,5,main]: {} java.lang.RuntimeException: java.nio.BufferOverflowException at org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:2549) ~[apache-cassandra-3.0.11.jar:3.0.11] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0-zing_17.03.1.0] at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.11.jar:3.0.11] at java.lang.Thread.run(Thread.java:745) [na:1.8.0-zing_17.03.1.0] Caused by: java.nio.BufferOverflowException: null at org.apache.cassandra.io.util.DataOutputBufferFixed.doFlush(DataOutputBufferFixed.java:52) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:195) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.writeUnsignedVInt(BufferedDataOutputStreamPlus.java:258) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:296) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.db.Columns$Serializer.serialize(Columns.java:405) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.db.SerializationHeader$Serializer.serializeForMessaging(SerializationHeader.java:407) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:120) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.serialize(PartitionUpdate.java:625) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.db.Mutation$MutationSerializer.serialize(Mutation.java:305) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.hints.Hint$Serializer.serialize(Hint.java:141) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.hints.HintsBuffer$Allocation.write(HintsBuffer.java:251) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.hints.HintsBuffer$Allocation.write(HintsBuffer.java:230) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.hints.HintsBufferPool.write(HintsBufferPool.java:61) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.hints.HintsService.write(HintsService.java:154) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.service.StorageProxy$11.runMayThrow(StorageProxy.java:2627) ~[apache-cassandra-3.0.11.jar:3.0.11] at org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:2545) ~[apache-cassandra-3.0.11.jar:3.0.11] ... 5 common frames omitted
Relevant configurations from cassandra.yaml:
-cassandra_hinted_handoff_throttle_in_kb: 1024 cassandra_max_hints_delivery_threads: 4 -cassandra_hints_flush_period_in_ms: 10000 -cassandra_max_hints_file_size_in_mb: 512
When I reduce -cassandra_hints_flush_period_in_ms: 10000 to 5000, the number of exceptions lowers significantly, but they are still present.
Attachments
Issue Links
- duplicates
-
CASSANDRA-13339 java.nio.BufferOverflowException: null
- Resolved
- is related to
-
CASSANDRA-13339 java.nio.BufferOverflowException: null
- Resolved