Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-12712

HiveInputFormat may fail to column names to read in some cases

Log workAgile BoardRank to TopRank to BottomBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersCreate sub-taskConvert to sub-taskMoveLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 2.0.0, 2.1.0
    • 1.3.0, 2.0.0
    • None
    • None

    Description

      The primary issue is when plan is generated pathToAliases map is populated with directory paths to table aliases. pathToAliases.put() uses path.toString() as map key. During probing, path.toUri().toString() is used. This can cause probe misses when path contains spaces in them. path.toUri() will escape the spaces in the path whereas path.toString() does not escape the spaces. As a result, HiveInputFormat can trigger a different code path which can fail to set list of columns to read from the source table. This was causing unexpected NPE in OrcInputFormat (after refactoring HIVE-11705) which removed null check for column names. The resulting exception is

      Caused by: java.lang.RuntimeException: ORC split generation failed with exception: java.lang.NullPointerException
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1288)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1354)
              at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:367)
              at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:457)
              at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:152)
              at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:246)
              at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:240)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
              at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:240)
              at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:227)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              ... 3 more
      Caused by: java.util.concurrent.ExecutionException: java.lang.NullPointerException
              at java.util.concurrent.FutureTask.report(FutureTask.java:122)
              at java.util.concurrent.FutureTask.get(FutureTask.java:192)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1282)
              ... 15 more
      Caused by: java.lang.NullPointerException
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.extractNeededColNames(OrcInputFormat.java:422)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.extractNeededColNames(OrcInputFormat.java:417)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.access$2000(OrcInputFormat.java:134)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:1072)
              at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:919)
              ... 4 more
      
      

      Attachments

        1. HIVE-12712-branch-1.patch
          4 kB
          Prasanth Jayachandran
        2. HIVE-12712.2.patch
          6 kB
          Prasanth Jayachandran
        3. HIVE-12712.1.patch
          5 kB
          Prasanth Jayachandran

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            prasanth_j Prasanth Jayachandran Assign to me
            taksaito Takahiko Saito
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment