  1. IMPALA
  2. IMPALA-12926

Refactor BINARY type handling in the backend

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: Backend

    Description

      Currently the STRING and BINARY types are not distinguished in most of the backend. Unlike the frontend, the backend does not use PrimitiveType::TYPE_BINARY at all; TYPE_STRING is used instead. This ensures that everything that works for STRING also works for BINARY. So far only file readers and writers have had to handle the two types differently, and they have access to ColumnDescriptors, which contain AuxColumnType fields that differentiate them.

      However, only top-level columns have ColumnDescriptors. Adding support for BINARYs within complex types (see IMPALA-11491 and IMPALA-12651) necessitates adding type information about STRING vs BINARY to embedded fields as well.

      Using PrimitiveType::TYPE_BINARY would probably be the cleanest solution, but it would affect huge parts of the code: TYPE_BINARY would have to be added to hundreds of switch statements, which would be error-prone.

      Instead, we should introduce a new field in ColumnType: 'is_binary', which is true if the type is a BINARY and false otherwise. We keep using TYPE_STRING as the PrimitiveType of the ColumnType for BINARYs. This way full type information is present in ColumnType but code that does not differentiate between STRING and BINARY will continue to work for BINARY.
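
      A minimal C++ sketch of the idea, not the actual Impala code: the enum and struct below are pared-down stand-ins for PrimitiveType and ColumnType, and the helpers MakeString, MakeBinary, MakeArray, IsStringLike and TypeName are hypothetical names introduced only for illustration. It shows that a switch on the PrimitiveType keeps working unchanged for BINARY, while code that needs the distinction can consult 'is_binary', including for a field embedded in a complex type.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the backend's PrimitiveType enum (assumption: the
// real enum has many more values). Note there is no TYPE_BINARY case in use.
enum class PrimitiveType { TYPE_INT, TYPE_STRING, TYPE_ARRAY };

// Hypothetical, pared-down ColumnType carrying the proposed flag.
struct ColumnType {
  PrimitiveType type;
  bool is_binary;                    // true only for BINARY; 'type' stays TYPE_STRING
  std::vector<ColumnType> children;  // element types of complex types

  explicit ColumnType(PrimitiveType t, bool binary = false)
      : type(t), is_binary(binary) {}
};

// Illustrative factory helpers (not part of any real API).
ColumnType MakeString() { return ColumnType(PrimitiveType::TYPE_STRING); }
ColumnType MakeBinary() { return ColumnType(PrimitiveType::TYPE_STRING, true); }
ColumnType MakeArray(ColumnType elem) {
  ColumnType t(PrimitiveType::TYPE_ARRAY);
  t.children.push_back(std::move(elem));
  return t;
}

// Code that does not care about STRING vs BINARY: the switch needs no new
// case, so it treats BINARY exactly like STRING.
bool IsStringLike(const ColumnType& t) {
  switch (t.type) {
    case PrimitiveType::TYPE_STRING:
      return true;
    case PrimitiveType::TYPE_INT:
    case PrimitiveType::TYPE_ARRAY:
      return false;
  }
  return false;
}

// Code that does care: it checks the flag, and the information is available
// for embedded fields too because every child is a full ColumnType.
std::string TypeName(const ColumnType& t) {
  switch (t.type) {
    case PrimitiveType::TYPE_INT:
      return "INT";
    case PrimitiveType::TYPE_STRING:
      return t.is_binary ? "BINARY" : "STRING";
    case PrimitiveType::TYPE_ARRAY:
      assert(t.children.size() == 1);
      return "ARRAY<" + TypeName(t.children[0]) + ">";
  }
  return "INVALID";
}
```

      In this sketch, IsStringLike(MakeBinary()) and IsStringLike(MakeString()) take the same branch, whereas TypeName(MakeArray(MakeBinary())) can still tell the element type apart from a STRING without consulting any per-column descriptor.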

      With this change, AuxColumnType is no longer needed and should be removed.

            People

              daniel.becker Daniel Becker