
SPARK-47707: Special handling of JSON type for MySQL Connector/J 5.x


Details

    • Type: Sub-task (of SPARK-47361 Improve JDBC data sources)
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version: 3.4.0
    • Fix Version: 4.0.0
    • Component: SQL
    • Environment: mysql-connector-java-5.1.49.jar, spark-3.5.0

    Description

      The MySQL JDBC driver `mysql-connector-java-5.1.49.jar` reports the JSON type as `Types.CHAR` with a precision of `Int.MaxValue`.
      When Spark receives a CHAR column with `Int.MaxValue` precision, the executor throws `java.lang.OutOfMemoryError: Requested array size exceeds VM limit`.
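
      For example (a hypothetical reproduction for spark-shell; the URL, table, and credentials below are placeholders, and `events` is assumed to contain a JSON column):

      // Run in spark-shell with mysql-connector-java-5.1.49.jar on the classpath.
      val df = spark.read
        .format("jdbc")
        .option("url", "jdbc:mysql://localhost:3306/test")
        .option("driver", "com.mysql.jdbc.Driver") // Connector/J 5.x driver class
        .option("dbtable", "events")
        .option("user", "root")
        .option("password", "")
        .load()

      // Materializing the rows fails on the executor with:
      // java.lang.OutOfMemoryError: Requested array size exceeds VM limit
      df.show()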

      For mysql-connector-java-5.1.49.jar, the JSON column's sqlType is CHAR and the precision is `Int.MaxValue`.
      For mysql-connector-java-8.0.16.jar, the sqlType is LONGVARCHAR and the precision is `Int.MaxValue`. A plain-JDBC check of this is sketched below.
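
      The difference can be verified with plain JDBC metadata (connection details are placeholders); run once with each connector jar on the classpath:

      import java.sql.DriverManager

      val conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/test", "root", "")
      try {
        val md = conn
          .createStatement()
          .executeQuery("SELECT json_col FROM events LIMIT 1")
          .getMetaData
        // 5.1.49: sqlType = 1  (java.sql.Types.CHAR),        precision = 2147483647
        // 8.0.16: sqlType = -1 (java.sql.Types.LONGVARCHAR), precision = 2147483647
        println(s"sqlType=${md.getColumnType(1)}, typeName=${md.getColumnTypeName(1)}, precision=${md.getPrecision(1)}")
      } finally {
        conn.close()
      }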

      Spark already handles mysql-connector-java-8.0.16.jar correctly, because `getCatalystType` maps LONGVARCHAR to StringType:

      private def getCatalystType(
          sqlType: Int,
          typeName: String,
          precision: Int,
          scale: Int,
          signed: Boolean,
          isTimestampNTZ: Boolean): DataType = sqlType match {
        ...
        case java.sql.Types.LONGVARCHAR => StringType
        ...
      }

      If compatibility with 5.1.49 is not required, the current code is sufficient.
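
      If it is required, one option is to special-case the 5.x behavior in a dialect. The sketch below is only an illustration built on the assumptions in this report (JSON surfacing as CHAR with `Int.MaxValue` precision); it is not the merged fix, and a real change would belong in MySQLDialect rather than a separately registered dialect:

      import java.sql.Types

      import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
      import org.apache.spark.sql.types.{DataType, MetadataBuilder, StringType}

      // Sketch only: map the JSON type as reported by Connector/J 5.x
      // (CHAR with Int.MaxValue precision) to StringType instead of CharType.
      object MySqlJsonCompatDialect extends JdbcDialect {
        override def canHandle(url: String): Boolean = url.startsWith("jdbc:mysql")

        override def getCatalystType(
            sqlType: Int,
            typeName: String,
            size: Int,
            md: MetadataBuilder): Option[DataType] = {
          if (sqlType == Types.CHAR && size == Int.MaxValue) {
            Some(StringType) // avoid the CHAR(Int.MaxValue) path that OOMs the executor
          } else {
            None // fall back to the default mapping
          }
        }
      }

      // Registered dialects take precedence over the built-in one:
      // JdbcDialects.registerDialect(MySqlJsonCompatDialect)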

            People

              Assignee: kwafor 王俊博
              Reporter: kwafor 王俊博
