Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-3938

Semantically incorrect query referencing nested collections is accepted and run (with arbitrary results).

    XMLWordPrintableJSON

Details

    Description

      In some situations, Impala accepts a semantically incorrect query that references nested collections. The returned results are arbitrary.

      Example and repro
      Setup:
      In Hive create and populate this table:

      create table dummy (x int);
      insert into dummy values(10);
      
      create table test_table(id int, locations array<map<string,string>>) stored as parquet;
      
      insert into table test_table select 10001, array( map("Location1.City","Bratislava","Location1.Country","SK","Location1.LAT","41"), map("Location2.City","Kosice","Location2.Country","SK","Location2.LAT","42") ) from dummy limit 1;
      
      insert into table test_table select 10002, array( map("Location1.City","Wien","Location1.Country","AT","Location1.LAT","40"), map("Location2.City","Graz","Location2.Country","AT","Location2.LAT","40") ) from dummy limit 1;
      

      Impala will incorrectly execute this invalid query:

      select t.id, l.pos, l.key, l.value
      from test_table t join t.locations l
      order by id, l.pos;
      

      The above query is semantically not correct because "l.key" and "l.value" are fields in the nested map in "l" but the map itself is not referenced in the FROM clause. The query should fail semantic analysis.

      The correct query references the nested map in the FROM clause:

      select t.id, l.pos, m.key, m.value
      from test_table t join t.locations l join l.item m
      order by id, l.pos;
      +-------+-----+-------------------+------------+
      | id    | pos | key               | value      |
      +-------+-----+-------------------+------------+
      | 10001 | 0   | Location1.Country | SK         |
      | 10001 | 0   | Location1.City    | Bratislava |
      | 10001 | 0   | Location1.LAT     | 41         |
      | 10001 | 1   | Location2.LAT     | 42         |
      | 10001 | 1   | Location2.Country | SK         |
      | 10001 | 1   | Location2.City    | Kosice     |
      | 10002 | 0   | Location1.Country | AT         |
      | 10002 | 0   | Location1.City    | Wien       |
      | 10002 | 0   | Location1.LAT     | 40         |
      | 10002 | 1   | Location2.LAT     | 40         |
      | 10002 | 1   | Location2.Country | AT         |
      | 10002 | 1   | Location2.City    | Graz       |
      +-------+-----+-------------------+------------+
      

      Attachments

        Activity

          People

            cchanning Chris Channing
            alex.behm Alexander Behm
            Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: