Details
-
Improvement
-
Status: Open
-
Not a Priority
-
Resolution: Unresolved
-
1.12.0
-
None
Description
Currently, there is a validation in StreamingJobGraphGenerator that fails if an operator implements InputSelectable and checkpointing is enabled.
One issue when removing this validation is that with Unaligned checkpoints recovery would deadlock if there are records spanning multiple buffers.
Consider the following case:
- Two {{IntputGate}}s
- Input selection is not ALL (say FIRST initially)
- Unaligned Checkpoints ON
- on recovery, there are "parts" of records in all channels (actually 1 is enough I think)
On recovery,
1. StreamTask initiates recovery and scedule partition request upon it's end
2. All gates and channels will receive buffers from StateReader
3. All channels of a single gate will consume those state buffers - completing that gate's StateConsumedFuture
4. InputProcessor will return NOTHING_AVAILABLE (see StreamTwoInputProcessor.getInputStatus)
5. StreamTask will suspend its default action
6. State of the 2nd gate won't be consumed - so its {{StateConsumedFuture}}s won't be completed - so no partitions will be requested (edited)
A simple solution is to request partitions for each channel independently.
Attachments
Issue Links
- relates to
-
FLINK-17122 Support InputSelectable and BoundedMultiInput operators with checkpointing
- Open