  Apache Drill / DRILL-4679

CONVERT_FROM() json format fails if 0 rows are received from upstream operator

Details

    Description

      CONVERT_FROM() with the 'json' format fails as shown below if the underlying Filter produces 0 rows:

      0: jdbc:drill:zk=local> select convert_from('{"abc":"xyz"}', 'json') as x from cp.`tpch/region.parquet` where r_regionkey = 9999;
      Error: SYSTEM ERROR: IllegalStateException: next() returned NONE without first returning OK_NEW_SCHEMA [#16, ProjectRecordBatch]
      
      Fragment 0:0
      

      If the conversion uses the 'utf8' format instead, the same query succeeds:

      0: jdbc:drill:zk=local> select convert_from('{"abc":"xyz"}', 'utf8') as x from cp.`tpch/region.parquet` where r_regionkey = 9999;
      +----+
      | x  |
      +----+
      +----+
      No rows selected (0.241 seconds)
      

      The cause is the special handling of JSON in ProjectRecordBatch. For a JSON conversion the output schema is not known until run time: the ComplexWriter in the Project relies on seeing the input data to determine the output schema, which could be a MapVector, a ListVector, etc. With 0 input rows the writer never sees any data, so the Project returns NONE without ever having returned OK_NEW_SCHEMA.
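
      The failure mode described above can be illustrated with a small toy model. This is only a sketch: the class, the enum, and the method names are hypothetical stand-ins, not Drill's actual classes, and the only thing taken from the report is the error message text. It assumes the schema is announced on the first row and that a downstream check rejects end-of-stream when no schema was announced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only: these types are hypothetical and are not Drill's classes. */
final class SchemaProtocolSketch {

    /** Simplified stand-in for the outcomes an operator can return from next(). */
    enum Outcome { OK_NEW_SCHEMA, OK, NONE }

    /**
     * Models a projection whose output type is discovered from the first input row.
     * With no rows there is nothing to inspect, so no schema is ever announced.
     */
    static List<Outcome> dataDrivenProject(List<String> jsonRows) {
        List<Outcome> outcomes = new ArrayList<>();
        for (int i = 0; i < jsonRows.size(); i++) {
            outcomes.add(i == 0 ? Outcome.OK_NEW_SCHEMA : Outcome.OK);
        }
        outcomes.add(Outcome.NONE); // end of stream
        return outcomes;
    }

    /** Models the downstream check: a schema must be seen before end of stream. */
    static void validate(List<Outcome> outcomes) {
        boolean sawSchema = false;
        for (Outcome o : outcomes) {
            if (o == Outcome.OK_NEW_SCHEMA) {
                sawSchema = true;
            } else if (o == Outcome.NONE && !sawSchema) {
                throw new IllegalStateException(
                    "next() returned NONE without first returning OK_NEW_SCHEMA");
            }
        }
    }

    public static void main(String[] args) {
        // One row: the schema is announced before the stream ends, so this passes.
        validate(dataDrivenProject(Arrays.asList("{\"abc\":\"xyz\"}")));
        try {
            // Zero rows, as when the filter r_regionkey = 9999 matches nothing.
            validate(dataDrivenProject(Collections.<String>emptyList()));
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

      In this model the 'utf8' case would correspond to a projection whose output type is fixed at planning time, so it can announce a schema even when no rows arrive.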

      If the input data has 0 rows due to a filter condition, we should at least produce a default output schema, e.g. an empty MapVector. We still need to decide on a good default. Note that CONVERT_FROM(x, 'json') could occur on both branches of a UNION ALL, and if one input is empty while the other is not, the default schema on the empty side may still be incompatible with the schema derived from data on the other side.
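
      The trade-off in choosing a default can be sketched as follows. This is a toy model under stated assumptions, not a proposed patch: the names are hypothetical, the default is assumed to be an empty map, and the "kind" of a JSON value is guessed from its first character only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only: the names here are hypothetical and are not Drill's API. */
final class DefaultSchemaSketch {

    /** The two container kinds the description mentions for JSON output. */
    enum Kind { MAP, LIST }

    /** Assumed default when the input is empty: an empty map. */
    static final Kind DEFAULT_KIND = Kind.MAP;

    /** Infers the output kind from the first JSON value, or falls back to the default. */
    static Kind outputKind(List<String> jsonRows) {
        if (jsonRows.isEmpty()) {
            return DEFAULT_KIND;
        }
        return jsonRows.get(0).trim().startsWith("[") ? Kind.LIST : Kind.MAP;
    }

    /** In this model, two UNION ALL branches are compatible only if their kinds match. */
    static boolean unionCompatible(List<String> left, List<String> right) {
        return outputKind(left) == outputKind(right);
    }

    public static void main(String[] args) {
        List<String> empty = Collections.emptyList();
        List<String> maps = Arrays.asList("{\"abc\":\"xyz\"}");
        List<String> lists = Arrays.asList("[1, 2, 3]");

        // The default matches a map-producing branch...
        System.out.println(unionCompatible(empty, maps));
        // ...but conflicts with a list-producing branch.
        System.out.println(unionCompatible(empty, lists));
    }
}
```

      Any fixed default resolves the zero-row failure for a single branch, but as the second call shows, it cannot by itself guarantee agreement with a sibling branch whose schema comes from actual data.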

            People

              amansinha100 Aman Sinha
              amansinha100 Aman Sinha
              Chun Chang Chun Chang