
Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 2.0.2, 2.1.0
    • Structured Streaming
    • None

    Description

      Right now you can only start a streaming query from either the earliest or the latest offsets available at the moment the query is started. Sometimes this is a lot of data. It would be nice to be able to do the following:

      • seek to user specified offsets for manually specified topicpartitions

      Currently agreed-on plan:

      Mutually exclusive subscription options (only assign is new to this ticket)

      .option("subscribe","topicFoo,topicBar")
      .option("subscribePattern","topic.*")
      .option("assign","""{"topicFoo": [0, 1],"topicBar": [0, 1]}""")
      

      where assign can only be specified that way, as a JSON map from topic to partition numbers, with no inline offsets
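
      The mutual exclusivity of the three subscription options could be checked along the following lines. This is a minimal sketch, not the Kafka source's actual implementation: the object name `SubscriptionOptions` and the method `validate` are hypothetical, and the requirement that exactly one option be present (rather than at most one) is an assumption.

```scala
// Hypothetical helper for illustration; not the Kafka source's real code.
object SubscriptionOptions {
  // The three subscription options named in the plan.
  val StrategyKeys: Set[String] = Set("subscribe", "subscribePattern", "assign")

  /** Returns the single subscription option in use, or an error message. */
  def validate(options: Map[String, String]): Either[String, String] = {
    val specified = options.keySet.intersect(StrategyKeys).toList.sorted
    specified match {
      case key :: Nil => Right(key)
      case Nil =>
        // Assumption: a query must name exactly one subscription option.
        Left(s"One of ${StrategyKeys.toList.sorted.mkString(", ")} must be specified")
      case many =>
        Left(s"Only one of ${many.mkString(", ")} may be specified")
    }
  }
}
```

      With this sketch, passing both "subscribe" and "assign" yields a Left, while passing only "assign" yields Right("assign").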

      Single starting position option with three mutually exclusive types of value

      .option("startingOffsets", "earliest" | "latest" | """{"topicFoo": {"0": 1234, "1": -2}, "topicBar":{"0": -1}}""")
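
      Put together, a query that assigns specific partitions and seeks each one to a chosen offset might look like the sketch below. It assumes the Kafka source for Structured Streaming is on the classpath; the object name `AssignWithOffsets` and the `bootstrapServers` parameter are placeholders. As documented for the released Kafka source, an offset of -2 refers to earliest and -1 to latest.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Illustrative wrapper; the object and method names are hypothetical.
object AssignWithOffsets {
  def read(spark: SparkSession, bootstrapServers: String): DataFrame =
    spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", bootstrapServers)
      // Manually assigned topicpartitions: topic -> partition numbers.
      .option("assign", """{"topicFoo": [0, 1], "topicBar": [0]}""")
      // One offset per assigned partition: 1234 is explicit,
      // -2 means earliest, -1 means latest.
      .option("startingOffsets",
        """{"topicFoo": {"0": 1234, "1": -2}, "topicBar": {"0": -1}}""")
      .load()
}
```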
      

      startingOffsets given as JSON fails if any topicpartition in the assignments doesn't have an offset.
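
      The completeness rule can be illustrated with a small check over already-parsed options. This is only a sketch under stated assumptions: the JSON is taken to be parsed into plain Scala maps beforehand, and the names `StartingOffsetsCheck` and `missingOffsets` are hypothetical.

```scala
// Hypothetical helper for illustration; not the Kafka source's real code.
object StartingOffsetsCheck {
  // Parsed form of the "assign" JSON: topic -> partition numbers.
  type Assignments = Map[String, Seq[Int]]
  // Parsed form of the "startingOffsets" JSON: topic -> (partition -> offset).
  type Offsets = Map[String, Map[Int, Long]]

  /** Lists every assigned (topic, partition) that has no starting offset. */
  def missingOffsets(assigned: Assignments, offsets: Offsets): Seq[(String, Int)] =
    for {
      (topic, partitions) <- assigned.toSeq
      partition <- partitions
      if !offsets.get(topic).exists(_.contains(partition))
    } yield (topic, partition)
}
```

      Under this sketch, the query would be rejected whenever missingOffsets returns a non-empty list. For the examples in this ticket, assigning partitions 0 and 1 of both topics while giving offsets only for topicFoo 0 and 1 and topicBar 0 leaves (topicBar, 1) without an offset.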

          People

            koeninger Cody Koeninger
            marmbrus Michael Armbrust