  1. Hive
  2. HIVE-16352

Ability to skip or repair out of sync blocks with HIVE at runtime

Details

    Description

      When an Avro file is corrupted, reading it through Hive fails with java.io.IOException: Invalid sync!
      Could Hive provide a way to skip or repair such blocks at runtime, so that Avro reads are more resilient to data corruption?
      Error: java.io.IOException: java.io.IOException: java.io.IOException: While processing file s3n://<bucket>/navdeepp/warehouse/avro_test/354dc34474404f4bbc0d8013fc8e6e4b_000042. java.io.IOException: Invalid sync!
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
      at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
      at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:334)
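
For context on what "skipping" a block could mean: an Avro container file ends every data block with the 16-byte sync marker declared in the file header, and "Invalid sync!" is reported when the bytes found at that position do not match. The sketch below illustrates the general resynchronization idea on an in-memory byte array, using only the JDK. It is not Hive or Avro code; the class `SyncScan` and the method `resyncAfter` are hypothetical names introduced for illustration, and a real reader would also have to validate the block that follows the marker.

```java
/** Hypothetical helper for illustration only; not part of Hive or Avro. */
final class SyncScan {
    /** Length of an Avro container-file sync marker in bytes. */
    static final int SYNC_SIZE = 16;

    /** Returns true if marker occurs in data starting exactly at offset. */
    static boolean matchesAt(byte[] data, byte[] marker, int offset) {
        for (int j = 0; j < marker.length; j++) {
            if (data[offset + j] != marker[j]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Scans forward from 'from' for the next occurrence of the sync marker.
     * Returns the offset just past that marker (where the next block would
     * begin), or -1 if no marker is found before the end of the data.
     */
    static int resyncAfter(byte[] data, byte[] marker, int from) {
        if (marker.length != SYNC_SIZE) {
            throw new IllegalArgumentException("sync marker must be 16 bytes");
        }
        for (int i = Math.max(from, 0); i + SYNC_SIZE <= data.length; i++) {
            if (matchesAt(data, marker, i)) {
                return i + SYNC_SIZE;
            }
        }
        return -1;
    }
}
```

A caller that hits a bad block at some offset could invoke `resyncAfter(data, marker, offset)` and resume decoding at the returned position, dropping the records in between. Avro's own `DataFileReader.sync(long)` performs a comparable seek to the next sync point on a real file; whether and how Hive's record reader could use it on this error path is exactly the open question in this issue.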

          People

            gabry.wu gabrywu
            navdeepniku Navdeep Poonia
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 2h 20m