Spark / SPARK-27712

createDataFrame() reorders row

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: PySpark
    • Environment: emr-5.20.0, PySpark 2.4.0, Python 2.7.15

    Description

      Executing the following:

      # Assumes an active SparkSession named `spark`, as in the pyspark shell.
      import pyspark.sql.types

      my_schema = pyspark.sql.types.StructType([
          pyspark.sql.types.StructField("B", pyspark.sql.types.StringType(), True),
          pyspark.sql.types.StructField("A", pyspark.sql.types.StringType(), True)
      ])
      
      spark.createDataFrame(spark.sparkContext.parallelize([pyspark.sql.Row(A="1", B="2")]), my_schema).collect()
      

      should produce this:

      [Row(A="1", B="2")]
      

      or this:

      [Row(B='2', A='1')]
      

      but produces this instead:

      [Row(B=u'1', A=u'2')]
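
      The output above is consistent with the values being matched to the schema by position rather than by name: in PySpark 2.4, Row(A="1", B="2") stores its keyword fields sorted by name (A, then B), while my_schema lists B first. A possible workaround, sketched below in plain Python so it runs without a cluster, is to reorder each record's values to follow the schema before calling createDataFrame(). The helper name values_in_schema_order is hypothetical and not part of PySpark, and the dict and list are stand-ins for row.asDict() and my_schema.names.

```python
def values_in_schema_order(record, field_names):
    """Return the record's values as a tuple ordered like field_names.

    record: dict mapping field name to value (for a PySpark Row: row.asDict()).
    field_names: field names in schema order (for a StructType: schema.names).
    """
    missing = [name for name in field_names if name not in record]
    if missing:
        raise KeyError("record is missing fields: %s" % ", ".join(missing))
    return tuple(record[name] for name in field_names)


# Stand-ins for the report's data: the schema order is B, A,
# and the record is keyed by field name.
schema_field_names = ["B", "A"]
record = {"A": "1", "B": "2"}

reordered = values_in_schema_order(record, schema_field_names)
print(reordered)  # B's value first, then A's
```

      With the objects from the report, this could be applied as rdd.map(lambda row: values_in_schema_order(row.asDict(), my_schema.names)) before passing the RDD and my_schema to spark.createDataFrame(). This has not been tested on emr-5.20.0.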
      

            People

              Assignee: Unassigned
              Reporter: Tim Ludwinski (tludwinski)
              Votes: 0
              Watchers: 2
