IMPALA-9225 (sub-task of IMPALA-9124: Transparently retry queries that fail due to cluster membership changes)

Retryable queries should spool all results before returning any to the client

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.0.0
    • Component/s: None
    • Labels: None

    Description

      If query retries are enabled, a query should not return any results to the client until all results are spooled. Once a query starts returning results, retrying it becomes considerably more complex, and this is not supported in the initial version of IMPALA-9124. Retrying a query after it has begun returning results could produce incorrect results, especially for non-deterministic queries (e.g. when the results are not ordered).

      Because a query can fail at any time while results are being produced, transparent retries are most effective if they can happen at any point during query execution.

      The one edge case is when the full result set does not fit in the allocated result spooling memory (including unpinned memory). In this case, retries for the query should be transparently disabled.

      We should consider making this configurable, in case it leads to performance degradation. That said, I'm inclined to turn the flag on by default (i.e. always spool all results before returning them); otherwise, depending on the query, query retries won't always be helpful.
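To make the proposed behavior concrete, here is a minimal sketch of the gating logic in Python. It is not Impala code: `run_with_retries`, `QueryFailed`, `spool_capacity` (a row count standing in for pinned plus unpinned spooling memory), `spool_all_results` (standing in for the proposed flag), and `flaky` are all hypothetical names chosen for illustration. The idea is that rows are held back while they still fit in the spool, so a failure can be retried transparently; once the spool overflows, or if spooling is turned off, rows are streamed and a later failure has to surface to the client.

```python
from typing import Callable, Iterator, List


class QueryFailed(Exception):
    """Stands in for a failure caused by a cluster membership change."""


def run_with_retries(
    make_rows: Callable[[int], Iterator[str]],
    spool_capacity: int,
    spool_all_results: bool = True,
    max_attempts: int = 2,
) -> Iterator[str]:
    """Yield rows to the client, retrying only while nothing has been returned."""
    attempt = 1
    returned_any = False
    while True:
        spooled: List[str] = []
        # Holding rows back is what keeps a retry safe.
        retryable = spool_all_results
        try:
            for row in make_rows(attempt):
                if retryable and len(spooled) == spool_capacity:
                    # Results no longer fit in the spool: disable retries
                    # for this query and fall back to streaming.
                    retryable = False
                if retryable:
                    spooled.append(row)  # the client sees nothing yet
                else:
                    returned_any = True
                    yield from spooled   # flush whatever was held back
                    spooled.clear()
                    yield row
        except QueryFailed:
            # A retry is only safe if the client has not seen any rows.
            if returned_any or attempt >= max_attempts:
                raise
            attempt += 1
            continue
        # Query finished: release any rows that are still spooled.
        yield from spooled
        return


def flaky(attempt: int) -> Iterator[str]:
    """Hypothetical query: the first attempt fails after producing two rows."""
    for i in range(4):
        if attempt == 1 and i == 2:
            raise QueryFailed("executor left the cluster")
        yield f"a{attempt}-r{i}"


# Everything fits in the spool, so the failure is hidden by a retry.
print(list(run_with_retries(flaky, spool_capacity=10)))
```

With `spool_capacity=10` the client only ever sees rows from the second attempt. With `spool_capacity=1`, or with `spool_all_results=False`, the first attempt's rows reach the client before the failure, so `QueryFailed` propagates instead of triggering a retry. The trade-off is the one raised above: spooling everything delays the first row in exchange for retryability.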

            People

              Assignee: Quanlong Huang (stigahuang)
              Reporter: Sahil Takiar (stakiar)