Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Transactions
    • Labels: None

    Description

      How does this handle tables that are both bucketed and sorted?
      insert into T values(1,2),(5,6); creates something like delta_2_2/bucket_1
      insert into T values(3,4),(7,8); creates delta_3_3/bucket_1

      Any reader would expect to see some contiguous subset of (1,2),(3,4),(5,6),(7,8), in sorted order,

      but that would require a special reader that merges the two deltas, and I don't see one.
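
      To illustrate the gap, here is a minimal Python sketch (not Hive code). The two lists stand in for the sorted bucket files from the inserts above, and the helper names are hypothetical. It contrasts a k-way merge, which is roughly what such a special reader would need to do, with reading the files one after another.

```python
import heapq

# Rows taken from the two inserts in the description; each list stands in
# for one sorted bucket file (delta_2_2/bucket_1 and delta_3_3/bucket_1).
delta_2_2_bucket_1 = [(1, 2), (5, 6)]
delta_3_3_bucket_1 = [(3, 4), (7, 8)]


def merged_bucket_rows(*sorted_files):
    """Hypothetical helper: k-way merge of inputs that are each already sorted."""
    # heapq.merge assumes every input is sorted and yields one sorted stream.
    return list(heapq.merge(*sorted_files))


def concatenated_bucket_rows(*sorted_files):
    """What a reader returns if it simply reads the files one after another."""
    rows = []
    for file_rows in sorted_files:
        rows.extend(file_rows)
    return rows


merged = merged_bucket_rows(delta_2_2_bucket_1, delta_3_3_bucket_1)
concatenated = concatenated_bucket_rows(delta_2_2_bucket_1, delta_3_3_bucket_1)
```

      Here merged comes out globally sorted, while concatenated keeps (5,6) ahead of (3,4), which is the order a sort-dependent consumer cannot tolerate.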

      In particular, it's not clear how an SMB (sort-merge bucket) join can work.

      This looks like a general problem:
      For a plain Hive table, if you do 2 inserts and the 1st one creates 00000_0, then the 2nd one will create 00000_0_copy_1.
      There is nothing to merge these files at query time to produce a single sort order (as the Acid reader does in full acid tables).
      It should at least throw in this case.
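
      A rough Python sketch of the "at least throw" idea follows (not Hive code). The function name is hypothetical, and it assumes that additional files for the same bucket differ from the original only by a _copy_N suffix, as in the example above.

```python
import re
from collections import defaultdict

# Assumption: a later insert into the same bucket appends "_copy_N" to the
# original file name, as in the 00000_0 / 00000_0_copy_1 example.
_COPY_SUFFIX = re.compile(r"_copy_\d+$")


def check_single_file_per_bucket(file_names):
    """Hypothetical check: raise if any bucket is backed by more than one file."""
    by_bucket = defaultdict(list)
    for name in file_names:
        # Group each file under its name with any "_copy_N" suffix removed.
        by_bucket[_COPY_SUFFIX.sub("", name)].append(name)
    for base, names in sorted(by_bucket.items()):
        if len(names) > 1:
            raise ValueError(
                f"bucket file {base} has {len(names)} files {sorted(names)}; "
                "a single sort order across them is not guaranteed"
            )
```

      Calling check_single_file_per_bucket(["00000_0", "00000_0_copy_1"]) raises, while a listing with one file per bucket passes silently.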

      The current "CONCATENATE" command doesn't support bucketed or sorted tables.

            People

              Assignee: Unassigned
              Reporter: Eugene Koifman (ekoifman)
              Votes: 0
              Watchers: 2
