  1. Spark
  2. SPARK-44101 Support pandas 2
  3. SPARK-43194

PySpark 3.4.0 cannot convert timestamp-typed objects to pandas with pandas 2.0

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.0
    • Component/s: PySpark
    • Labels: None

    Description

      In [1]: from pyspark.sql import SparkSession
      
      In [2]: session = SparkSession.builder.appName("test").getOrCreate()
      23/04/19 09:21:42 WARN Utils: Your hostname, albatross resolves to a loopback address: 127.0.0.2; using 192.168.1.170 instead (on interface enp5s0)
      23/04/19 09:21:42 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
      Setting default log level to "WARN".
      To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
      23/04/19 09:21:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      
      In [3]: session.sql("select now()").toPandas()
      

      Results in:

      ...
      TypeError: Casting to unit-less dtype 'datetime64' is not supported. Pass e.g. 'datetime64[ns]' instead.
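
      The error message itself comes from pandas, so the failure can probably be reproduced without Spark. The sketch below assumes that the conversion ends up casting a datetime64 column with the unit-less dtype string 'datetime64', which pandas 2.0 rejects and pandas 1.x accepted. The sample timestamp and the helper names cast_unitless and cast_with_unit are hypothetical, chosen for illustration only, and this is not taken from the PySpark source.

```python
from datetime import datetime

import pandas as pd

# A small datetime64 Series standing in for the timestamp column that
# toPandas() would build (illustrative data, not taken from the report).
timestamps = pd.Series([datetime(2023, 4, 19, 9, 21, 42)])


def cast_unitless(series):
    """Try the unit-less cast; return the error text if pandas rejects it."""
    try:
        series.astype("datetime64")
        return None  # accepted, as on pandas 1.x
    except TypeError as exc:
        return str(exc)


def cast_with_unit(series):
    """Cast with an explicit unit, which pandas 1.x and 2.x both accept."""
    return series.astype("datetime64[ns]")


print("pandas", pd.__version__)
print("unit-less cast error:", cast_unitless(timestamps))
print("explicit cast dtype:", cast_with_unit(timestamps).dtype)
```

      If the assumption holds, the first cast reports the same TypeError under pandas 2.0, while the cast with an explicit unit succeeds on both major versions.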
      

          People

            Assignee: Unassigned
            Reporter: Phillip Cloud (cpcloud)
            Votes: 1
            Watchers: 3
