IMPALA-8257: Parquet writer sometimes hits DCHECK when handling empty string

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.13.0, Impala 3.1.0, Impala 3.2.0
    • Fix Version/s: Impala 3.2.0
    • Component/s: Backend

    Description

      Encountered while doing a large insert into Parquet.

      create table customer like tpcds_300_text.customer stored as parquetfile
      insert overwrite table customer select * from tpcds_300_text.customer
      
      F0227 01:34:53.052708 131295 parquet-column-stats.inline.h:213] 794c051ae3f3913c:71f00bc400000001] Check failed: static_cast<void*>(prev_page_min_value_.ptr) != static_cast<void*>(cs->min_value_.ptr) (0 vs. 0) 
      *** Check failure stack trace: ***
          @          0x47ec7ec  google::LogMessage::Fail()
          @          0x47ee091  google::LogMessage::SendToLog()
          @          0x47ec1c6  google::LogMessage::Flush()
          @          0x47ef78d  google::LogMessageFatal::~LogMessageFatal()
          @          0x27e973c  impala::ColumnStats<>::Merge()
          @          0x27e3c74  impala::HdfsParquetTableWriter::BaseColumnWriter::FinalizeCurrentPage()
          @          0x27ee65f  impala::HdfsParquetTableWriter::BaseColumnWriter::AppendRow()
          @          0x27e653b  impala::HdfsParquetTableWriter::AppendRows()
          @          0x23177fc  impala::HdfsTableSink::WriteRowsToPartition()
          @          0x231aeeb  impala::HdfsTableSink::Send()
          @          0x1f53888  impala::FragmentInstanceState::ExecInternal()
          @          0x1f4fefa  impala::FragmentInstanceState::Exec()
          @          0x1f63333  impala::QueryState::ExecFInstance()
          @          0x1f61615  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
          @          0x1f64774  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
          @          0x1d76b9f  boost::function0<>::operator()()
          @          0x22245ee  impala::Thread::SuperviseThread()
          @          0x222c972  boost::_bi::list5<>::operator()<>()
          @          0x222c896  boost::_bi::bind_t<>::operator()()
          @          0x222c859  boost::detail::thread_data<>::run()
          @          0x3716329  thread_proxy
          @     0x7fba207e8dd4  start_thread
          @     0x7fba20511eac  __clone
      

      The same DCHECK fired on multiple machines at almost exactly the same time:

      Running on machine: vc1328.halxg.cloudera.com
      Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
      F0227 01:34:53.025667 133025 parquet-column-stats.inline.h:213] 794c051ae3f3913c:71f00bc400000005] Check failed: static_cast<void*>(prev_page_min_value_.ptr) != static_cast<void*>(cs->min_value_.ptr) (0 vs. 0) 
      ...
      F0227 01:34:53.025352 131082 parquet-column-stats.inline.h:213] 794c051ae3f3913c:71f00bc400000007] Check failed: static_cast<void*>(prev_page_min_value_.ptr) != static_cast<void*>(cs->min_value_.ptr) (0 vs. 0) 
      

      The coordinator log indicates it failed very fast: the DCHECK fired less than a second after execution started on the backends.

      I0227 01:34:48.157472 147928 impala-server.cc:1063] 794c051ae3f3913c:71f00bc400000000] Registered query query_id=794c051ae3f3913c:71f00bc400000000 session_id=3345ef7013ba6bb2:55d8d105d3690a8e
      I0227 01:34:48.157711 147928 Frontend.java:1251] 794c051ae3f3913c:71f00bc400000000] Analyzing query: insert overwrite table customer select * from tpcds_300_text.customer db: tpcds_300_decimal_parquet
      I0227 01:34:48.158025 147928 FeSupport.java:285] 794c051ae3f3913c:71f00bc400000000] Requesting prioritized load of table(s): tpcds_300_decimal_parquet.customer
      I0227 01:34:52.049566 147928 Frontend.java:1292] 794c051ae3f3913c:71f00bc400000000] Analysis finished.
      I0227 01:34:52.067458 147991 admission-controller.cc:627] 794c051ae3f3913c:71f00bc400000000] Schedule for id=794c051ae3f3913c:71f00bc400000000 in pool_name=root.systest per_host_mem_estimate=1.62 GB PoolConfig: max_requests=-1 max_queued=200 max_mem=-1.00 B
      I0227 01:34:52.067562 147991 admission-controller.cc:632] 794c051ae3f3913c:71f00bc400000000] Stats: agg_num_running=0, agg_num_queued=0, agg_mem_reserved=0,  local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0, backend_mem_reserved=0)
      I0227 01:34:52.067620 147991 admission-controller.cc:664] 794c051ae3f3913c:71f00bc400000000] Admitted query id=794c051ae3f3913c:71f00bc400000000
      I0227 01:34:52.067771 147991 coordinator.cc:93] 794c051ae3f3913c:71f00bc400000000] Exec() query_id=794c051ae3f3913c:71f00bc400000000 stmt=insert overwrite table customer select * from tpcds_300_text.customer
      I0227 01:34:52.068926 147991 coordinator.cc:359] 794c051ae3f3913c:71f00bc400000000] starting execution on 9 backends for query_id=794c051ae3f3913c:71f00bc400000000
      I0227 01:34:52.070919 47659 impala-internal-service.cc:50] 794c051ae3f3913c:71f00bc400000000] ExecQueryFInstances(): query_id=794c051ae3f3913c:71f00bc400000000 coord=vc1326.halxg.cloudera.com:22000 #instances=1
      I0227 01:34:52.071800 147994 query-state.cc:624] 794c051ae3f3913c:71f00bc400000003] Executing instance. instance_id=794c051ae3f3913c:71f00bc400000003 fragment_idx=0 per_fragment_instance_idx=3 coord_state_idx=0 #in-flight=1
      I0227 01:34:52.072952 147991 coordinator.cc:373] 794c051ae3f3913c:71f00bc400000000] started execution on 9 backends for query_id=794c051ae3f3913c:71f00bc400000000
      I0227 01:34:52.074553 147992 coordinator.cc:611] Coordinator waiting for backends to finish, 9 remaining. query_id=794c051ae3f3913c:71f00bc400000000
      F0227 01:34:52.949759 147994 parquet-column-stats.inline.h:213] 794c051ae3f3913c:71f00bc400000003] Check failed: static_cast<void*>(prev_page_min_value_.ptr) != static_cast<void*>(cs->min_value_.ptr) (0 vs. 0) 
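
For illustration only, here is a minimal sketch of how a pointer-inequality check like the one in the log can fire when both values are empty strings. The "(0 vs. 0)" in the failure message shows that both pointers were null. `StrView`, `NaiveDistinctBuffers`, and `RelaxedDistinctBuffers` are hypothetical names made up for this sketch; they are not Impala's types, and the relaxed check is an assumption about one possible guard, not the fix that was committed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a (pointer, length) string value.
// This is NOT Impala's StringValue; it only models the two fields involved.
struct StrView {
  const char* ptr;
  std::size_t len;
};

// Mirrors the condition in the failed check: the previous page's min value
// and the current min value are expected to live in different buffers.
bool NaiveDistinctBuffers(const StrView& prev_page_min, const StrView& cur_min) {
  return static_cast<const void*>(prev_page_min.ptr) !=
         static_cast<const void*>(cur_min.ptr);
}

// Hypothetical relaxed variant (an assumption, not the actual fix): an empty
// value owns no buffer, so two empty values may both carry a null pointer
// without aliasing anything. Only compare pointers for non-empty values.
bool RelaxedDistinctBuffers(const StrView& prev_page_min, const StrView& cur_min) {
  if (prev_page_min.len == 0 || cur_min.len == 0) return true;
  return NaiveDistinctBuffers(prev_page_min, cur_min);
}
```

With two empty values whose pointers are both null, the naive condition is false (which corresponds to the check failing), while the relaxed variant accepts them and still rejects two non-empty values that share a buffer.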
      

            People

              boroknagyz Zoltán Borók-Nagy
              tarmstrong Tim Armstrong