Details
Type: New Feature
Status: Open
Priority: P3
Resolution: Unresolved
Description
KafkaIO could be a useful source for batch applications as well. It could implement a bounded source. The primary question is how the bounds are specified.
One option: the source specifies a time period (say 9am-10am), and KafkaIO fetches the appropriate start and end offsets based on the time index in Kafka. This would suit many batch applications that are launched on a schedule.
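A minimal sketch of the time-window lookup, to make the first option concrete. It does not use the Kafka client: the per-partition time index is modeled as a sorted map from record timestamp (ms) to offset, and the class and method names (TimeWindowOffsets, offsetForTime, boundsFor) are hypothetical, not part of KafkaIO. The lookup rule (earliest offset whose timestamp is at or after the target) mirrors what KafkaConsumer.offsetsForTimes returns in the real client. The sketch assumes timestamps do not decrease as offsets grow, which Kafka does not guarantee for producer-assigned timestamps.

```java
import java.util.Map;
import java.util.NavigableMap;

// Hypothetical helper, not part of KafkaIO. Models one partition's time index
// as a sorted map: record timestamp (ms) -> earliest offset with that timestamp.
final class TimeWindowOffsets {
    private TimeWindowOffsets() {}

    // Earliest offset whose timestamp is >= targetMs.
    // Falls back to logEndOffset when no such record exists yet.
    static long offsetForTime(NavigableMap<Long, Long> timeIndex,
                              long targetMs,
                              long logEndOffset) {
        Map.Entry<Long, Long> entry = timeIndex.ceilingEntry(targetMs);
        return entry == null ? logEndOffset : entry.getValue();
    }

    // Half-open offset range [start, end) covering timestamps in [startMs, endMs).
    static long[] boundsFor(NavigableMap<Long, Long> timeIndex,
                            long startMs,
                            long endMs,
                            long logEndOffset) {
        if (endMs < startMs) {
            throw new IllegalArgumentException("endMs must be >= startMs");
        }
        long start = offsetForTime(timeIndex, startMs, logEndOffset);
        long end = offsetForTime(timeIndex, endMs, logEndOffset);
        return new long[] {start, end};
    }
}
```

A bounded source built this way would read each partition from start (inclusive) to end (exclusive) and then stop. Because the bounds are derived only from the requested time period, rerunning the same scheduled job should yield the same range once the window is fully in the past.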
Another option is to always read to the current end of the topic and commit the offsets to Kafka, so that the next run resumes from the committed position. Handling failures and multiple runs of a task might be complicated.
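A rough sketch of why failures complicate the second option, under stated assumptions. The class ReadToEndRun is hypothetical and keeps the committed offset in memory as a stand-in for the consumer group's committed offset in Kafka. In the real client, the end-offset snapshot would come from something like KafkaConsumer.endOffsets and the commit from commitSync. The sketch commits only after every record in the range has been processed, so a failed run is simply repeated from the old position.

```java
import java.util.function.LongConsumer;

// Hypothetical model of one partition read by repeated batch runs.
// committedOffset stands in for the offset committed to Kafka.
final class ReadToEndRun {
    private long committedOffset;

    ReadToEndRun(long initialCommittedOffset) {
        this.committedOffset = initialCommittedOffset;
    }

    long committedOffset() {
        return committedOffset;
    }

    // Processes offsets [committedOffset, endOffsetSnapshot) and commits the
    // snapshot only if every record succeeds. Returns true on success.
    boolean run(long endOffsetSnapshot, LongConsumer processRecord) {
        long start = committedOffset;
        try {
            for (long offset = start; offset < endOffsetSnapshot; offset++) {
                processRecord.accept(offset);
            }
        } catch (RuntimeException e) {
            // Nothing is committed, so the next run re-reads from 'start'.
            return false;
        }
        committedOffset = Math.max(committedOffset, endOffsetSnapshot);
        return true;
    }
}
```

With this design a retry after a partial failure processes some records twice, so it gives at-least-once behavior rather than exactly-once. Two runs of the same task that overlap in time would also both start from the same committed offset, which is the kind of complication the description mentions.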