IMPALA-9163

Log more information about duplicate row errors when inserting into Kudu

Details

    • Type: Wish
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Backend

    Description

      When inserting rows into a Kudu table in which some of the row keys already exist, it can be useful, depending on what is expected of the dataset, to know which write operations were rejected with a duplicate key error. Today, users who insert such rows through Hue or the Impala shell see only a general error:

      Key already present in Kudu table 'default.loadgen_auto_157eac2da1dc4df2824c9a1d51bb3f3f'.

      While this nicely avoids excessive error messages (per IMPALA-3704) when there are many duplicate rows, in cases where only a few duplicates are expected it would be more helpful to know exactly which rows violated the uniqueness constraint. I'm not sure what form this should take, but it seems like it would be a usability win in some cases.
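One possible form, sketched here purely as an illustration and not as anything Impala or Kudu currently does: report the first few offending keys verbatim and summarize the rest as a count, so the message stays short when duplicates are numerous. The names `DuplicateKeyReport`, `max_listed`, and `insert_rows` are hypothetical, and the insert is simulated with an in-memory dict rather than a real Kudu client.

```python
from dataclasses import dataclass, field


@dataclass
class DuplicateKeyReport:
    """Collects duplicate-key rejections, keeping only the first few keys verbatim."""
    table: str
    max_listed: int = 5  # hypothetical cap, not an existing Impala option
    total: int = 0
    listed: list = field(default_factory=list)

    def record(self, key):
        # Count every rejection, but remember at most max_listed keys.
        self.total += 1
        if len(self.listed) < self.max_listed:
            self.listed.append(key)

    def summary(self):
        if self.total == 0:
            return f"No duplicate keys in Kudu table '{self.table}'."
        keys = ", ".join(repr(k) for k in self.listed)
        hidden = self.total - len(self.listed)
        suffix = f" (and {hidden} more)" if hidden else ""
        return f"Key already present in Kudu table '{self.table}': {keys}{suffix}."


def insert_rows(existing_keys, rows, report):
    """Simulated insert: rows whose key already exists are rejected and recorded."""
    for key, value in rows:
        if key in existing_keys:
            report.record(key)
        else:
            existing_keys[key] = value
```

With a small cap the message names the rows a user most likely wants to inspect, while a bulk load with many duplicates still produces a single bounded line, which keeps the intent of IMPALA-3704.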

          People

            Assignee: Unassigned
            Reporter: Andrew Wong (awong)