IMPALA-2867

LOAD DATA INPATH chokes on Impala staging directories

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.2.4
    • Fix Version/s: Impala 2.5.0
    • Component/s: Frontend

    Description

      In certain workflows, it's useful to insert into a temp table and then use the LOAD statement to move the data from the temp table to a main table. This workflow fails because of the _impala_insert_staging directory that the insert leaves in the temp table's directory. The directory is a pain to delete, and for workflows where this pattern is common, invoking hadoop fs to delete it can consume a large amount of time.

      e.g.

      drwxrwxrwt   - impala hive          0 2016-01-15 16:55 /user/hive/warehouse/test1/_impala_insert_staging
      -rw-r--r--   3 impala hive          3 2016-01-15 16:55 /user/hive/warehouse/test1/b45b4bccab740ab-c119e053e1efd6a2_528952759_data.0
      

      and

      > load data inpath '/user/hive/warehouse/test1/' into table test2;
      Query: load data inpath '/user/hive/warehouse/test1/' into table test2
      ERROR: AnalysisException: INPATH location 'hdfs://nn:8020/user/hive/warehouse/test1' cannot contain subdirectories.
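
On affected versions, the manual workaround described above could be scripted roughly as follows. This is only a sketch, not the fix that went into Impala: `list_subdirs` and `clear_staging_then_load` are hypothetical helper names, and the sketch assumes paths without spaces and that `hadoop` and `impala-shell` are on the PATH.

```shell
# Hypothetical helper: reads `hadoop fs -ls <dir>` output on stdin and prints
# the path of every entry whose permission string starts with 'd' (a directory).
# Assumes paths contain no whitespace, since the path is taken as the last field.
list_subdirs() {
  awk '$1 ~ /^d/ { print $NF }'
}

# Hypothetical wrapper: removes any leftover _impala_insert_staging directory
# under the source directory, then issues the LOAD statement.
clear_staging_then_load() {
  src_dir=$1
  dest_table=$2
  hadoop fs -ls "$src_dir" | list_subdirs | grep '/_impala_insert_staging$' |
    while read -r staging_dir; do
      hadoop fs -rm -r "$staging_dir"
    done
  impala-shell -q "load data inpath '$src_dir' into table $dest_table"
}
```

With the example above, the call would be `clear_staging_then_load /user/hive/warehouse/test1/ test2`. Each `hadoop fs` invocation starts a new JVM, which is the per-call cost the description complains about.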
      

          People

            Assignee: Flume QA (flumeqa)
            Reporter: Flume QA (flumeqa)