Hive / HIVE-27726

No lineage for constant literals

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0-beta-1
    • Fix Version/s: None
    • Component/s: HiveServer2
    • Labels: None

    Description

      Consider the following statement where we create a table based on another table and some constant expressions/literals.

      CREATE TABLE tbl1 AS (SELECT 'Bob', 'Alice', key, value || 'some' FROM src)
      

      Observe that columns 0 and 1 of tbl1 (_c0 and _c1) originate from constant literals. Currently (commit e5a7ce2f091da1f8a324da6e489cda59b9e4bfc6), no lineage information is recorded for columns that originate from constants.

      The org.apache.hadoop.hive.ql.hooks.LineageLogger will display the following for the aforementioned DDL statement.

      POSTHOOK: Lineage: tbl1._c0 SIMPLE []
      POSTHOOK: Lineage: tbl1._c1 SIMPLE []
      POSTHOOK: Lineage: tbl1._c3 EXPRESSION [(src)src.FieldSchema(name:value, type:string, comment:default), ]
      POSTHOOK: Lineage: tbl1.key SIMPLE [(src)src.FieldSchema(name:key, type:string, comment:default), ]
      

      This is not really a bug, since nothing is actually broken, but maybe there is a way to reflect that _c0 and _c1 originate from constants, to avoid the misconception that lineage is missing.
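Until the lineage output distinguishes constants, the affected columns can at least be spotted by post-processing the hook output. The following is a minimal sketch, not part of Hive: it assumes the lines have exactly the shape quoted above, and the function name `columns_without_sources` is hypothetical. An empty source list only suggests a constant origin; it does not prove one.

```python
import re

# Matches lines shaped like the POSTHOOK output quoted in this issue:
#   POSTHOOK: Lineage: <table>.<column> <TYPE> [<sources>]
# The pattern is an assumption based on that sample, not a Hive specification.
LINEAGE_RE = re.compile(r"^POSTHOOK: Lineage: (\S+)\.(\S+) (\w+) \[(.*)\]$")


def columns_without_sources(log_text):
    """Return (table, column) pairs whose lineage source list is empty."""
    result = []
    for line in log_text.splitlines():
        match = LINEAGE_RE.match(line.strip())
        if match is None:
            continue  # not a lineage line
        table, column, _dependency_type, sources = match.groups()
        # An empty list "[]" means no source column was recorded.
        if not sources.strip(" ,"):
            result.append((table, column))
    return result


SAMPLE = """\
POSTHOOK: Lineage: tbl1._c0 SIMPLE []
POSTHOOK: Lineage: tbl1._c1 SIMPLE []
POSTHOOK: Lineage: tbl1._c3 EXPRESSION [(src)src.FieldSchema(name:value, type:string, comment:default), ]
POSTHOOK: Lineage: tbl1.key SIMPLE [(src)src.FieldSchema(name:key, type:string, comment:default), ]
"""

print(columns_without_sources(SAMPLE))  # [('tbl1', '_c0'), ('tbl1', '_c1')]
```

This only works around the gap on the consumer side; the improvement requested here is for the lineage information itself to say that the column comes from a constant.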

      Note that constant folding may lead to the same behavior: expressions may be simplified to constants, and constants do not appear in the lineage output.
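To illustrate the constant folding point, here is a toy model, which is neither Hive's optimizer nor its lineage code. The classes `Literal`, `Column`, and `If`, and the expression `IF(true, 'Bob', key)`, are hypothetical and chosen only for illustration. Before folding, the expression references a source column; after folding, it is a plain literal and has no sources left.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Literal:
    value: Any


@dataclass(frozen=True)
class Column:
    table: str
    name: str


@dataclass(frozen=True)
class If:
    condition: Any
    then: Any
    otherwise: Any


def fold(expr):
    """Replace an If whose condition is a literal with the branch it selects."""
    if isinstance(expr, If):
        condition = fold(expr.condition)
        then, otherwise = fold(expr.then), fold(expr.otherwise)
        if isinstance(condition, Literal):
            return then if condition.value else otherwise
        return If(condition, then, otherwise)
    return expr


def sources(expr):
    """Collect the (table, column) pairs that an expression reads from."""
    if isinstance(expr, Column):
        return {(expr.table, expr.name)}
    if isinstance(expr, If):
        return sources(expr.condition) | sources(expr.then) | sources(expr.otherwise)
    return set()  # a literal reads no column


# Toy equivalent of IF(true, 'Bob', key)
expr = If(Literal(True), Literal("Bob"), Column("src", "key"))

print(sources(expr))        # {('src', 'key')}
print(sources(fold(expr)))  # set()
```

In this model, a column written as an expression over src ends up with an empty source set, which matches the `[]` shown for _c0 and _c1 above.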

            People

              Assignee: Unassigned
              Reporter: Stamatis Zampetakis (zabetak)
              Votes: 0
              Watchers: 1
