Spark / SPARK-23316

AnalysisException after max iteration reached for IN query

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0, 2.4.0
    • Fix Version/s: 2.3.0
    • Component/s: SQL
    • Labels: None

    Description

      Query to reproduce:

      spark.range(10).where("(id,id) in (select id, null from range(3))").show
      
      18/02/02 11:32:31 WARN BaseSessionStateBuilder$$anon$1: Max iterations (100) reached for batch Resolution
      org.apache.spark.sql.AnalysisException: cannot resolve '(named_struct('id', `id`, 'id', `id`) IN (listquery()))' due to data type mismatch:
      The data type of one or more elements in the left hand side of an IN subquery
      is not compatible with the data type of the output of the subquery
      Mismatched columns:
      []
      Left side:
      [bigint, bigint].
      Right side:
      [bigint, bigint].;;
      

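      The error message includes the last plan, which contains ~100 useless Project nodes.
      Note that the reported "Mismatched columns" list is empty and both sides print as [bigint, bigint].
      This does not happen in branch-2.2.
      It appears to be related to TypeCoercion, which makes a futile attempt to change nullability.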
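
To make the suspected mechanism concrete, here is a minimal toy model in plain Scala. It is only a sketch of how a coercion rule that compares types including nullability could keep stacking Projects until the iteration limit is hit. Every name in it (`Col`, `Plan`, `Project`, `InSubquery`, `coerce`, `runToFixedPoint`) is hypothetical and is not Spark's actual Catalyst code; the assumption that a cast cannot turn a nullable column into a non-nullable one is part of the sketch, not something confirmed by this report.

```scala
// Toy model only. Every name here is hypothetical; none of this is Spark's code.
object FixedPointDemo {
  // A column is reduced to its type name and a nullability flag.
  final case class Col(dataType: String, nullable: Boolean)

  sealed trait Plan { def output: Seq[Col] }
  final case class Relation(output: Seq[Col]) extends Plan

  // Casts each child column to a target type. Assumption of this sketch:
  // the cast changes the type name but keeps the child's nullability.
  final case class Project(targetTypes: Seq[String], child: Plan) extends Plan {
    def output: Seq[Col] =
      child.output.zip(targetTypes).map { case (c, t) => c.copy(dataType = t) }
  }

  final case class InSubquery(left: Seq[Col], subquery: Plan)

  // Hypothetical coercion rule: if the two sides are not exactly equal
  // (nullability included), wrap the subquery in another casting Project.
  def coerce(in: InSubquery): InSubquery =
    if (in.left == in.subquery.output) in
    else in.copy(subquery = Project(in.left.map(_.dataType), in.subquery))

  // Applies the rule until nothing changes or the iteration limit is reached.
  // Returns the final plan, the number of rule applications, and whether it converged.
  @annotation.tailrec
  def runToFixedPoint(
      plan: InSubquery,
      maxIterations: Int,
      iteration: Int = 0): (InSubquery, Int, Boolean) =
    if (iteration >= maxIterations) (plan, iteration, false)
    else {
      val next = coerce(plan)
      if (next == plan) (plan, iteration, true)
      else runToFixedPoint(next, maxIterations, iteration + 1)
    }

  // Counts the Project nodes stacked on top of the subquery.
  def projectDepth(plan: Plan): Int = plan match {
    case Project(_, child) => 1 + projectDepth(child)
    case _                 => 0
  }

  def main(args: Array[String]): Unit = {
    // Left side models (id, id): two non-nullable bigint columns.
    val left = Seq(Col("bigint", nullable = false), Col("bigint", nullable = false))
    // Right side models "select id, null": the second column is a nullable null literal.
    val subquery = Relation(Seq(Col("bigint", nullable = false), Col("null", nullable = true)))

    val (plan, iterations, converged) = runToFixedPoint(InSubquery(left, subquery), 100)
    println(s"converged=$converged iterations=$iterations projects=${projectDepth(plan.subquery)}")
  }
}
```

In this model the type names agree after the first Project is added, but the second column stays nullable, so the rule never sees the two sides as equal. That would be consistent with an empty "Mismatched columns" list alongside a plan full of redundant Projects, though the real cause should be confirmed against the TypeCoercion source.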

          People

            bograd Bogdan Raducanu