  1. Spark
  2. SPARK-27211

Cast error when selecting a column from a Row


Details

    • Type: Question
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.3.0, 2.3.1
    • Fix Version/s: None
    • Component/s: Java API

    Description

      1. First, I have an object RawLogPayload which has a field: long timestamp

      2. Then I join two Dataset<RawLogPayload> instances and select some of the columns

      The following is the code snippet:

      extractedRawTc.printSchema();   // output1

      Dataset<RawLogPayload> extractedRawW3cFilled = extractedRawW3c.alias("extractedRawW3c")
          .join(extractedRawTc.alias("extractedRawTc"),
              functions.col("extractedRawW3c.rawsessionid").equalTo(functions.col("extractedRawTc.rawsessionid")),
              "inner")
          .select(
              functions.col("extractedRawW3c.df_logdatetime"),
              functions.col("extractedRawW3c.rawsessionid"),
              functions.col("extractedRawTc.uid"),
              functions.col("extractedRawW3c.time"),
              functions.col("extractedRawW3c.T"),
              functions.col("extractedRawW3c.url"),
              functions.col("extractedRawW3c.wid"),
              functions.col("extractedRawW3c.tid"),
              functions.col("extractedRawW3c.fid"),
              functions.col("extractedRawW3c.string1"),
              functions.col("extractedRawW3c.curWindow"),
              functions.col("extractedRawW3c.timestamp"))
          .as(Encoders.bean(RawLogPayload.class));

      extractedRawW3cFilled.printSchema();  // output2

       

      3. Running this throws the following exception:

      2019-03-20 15:28:31 ERROR CodeGenerator:91 ## failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.microsoft.datamining.spartan.api.core.RawLogPayload.setTimestamp(long)"

      org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 103, Column 32: No applicable constructor/method found for actual parameters "org.apache.spark.unsafe.types.UTF8String"; candidates are: "public void com.xxxx.xxxx.spartan.api.core.RawLogPayload.setTimestamp(long)"

      at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:11821)

      at org.codehaus.janino.UnitCompiler.findMostSpecificIInvocable(UnitCompiler.java:8910)

       

      Output1: extractedRawTc schema

      root
       |-- curWindow: string (nullable = true)
       |-- df_logdatetime: string (nullable = true)
       |-- fid: string (nullable = true)
       |-- rawsessionid: string (nullable = true)
       |-- string1: string (nullable = true)
       |-- t: string (nullable = true)
       |-- tid: string (nullable = true)
       |-- time: string (nullable = true)
       |-- timestamp: long (nullable = true)
       |-- uid: string (nullable = true)
       |-- url: string (nullable = true)
       |-- wid: string (nullable = true)

       

      Output2: extractedRawW3cFilled schema

      root
       |-- df_logdatetime: string (nullable = true)
       |-- rawsessionid: string (nullable = true)
       |-- uid: string (nullable = true)
       |-- time: string (nullable = true)
       |-- T: string (nullable = true)
       |-- url: string (nullable = true)
       |-- wid: string (nullable = true)
       |-- tid: string (nullable = true)
       |-- fid: string (nullable = true)
       |-- string1: string (nullable = true)
       |-- curWindow: string (nullable = true)
       |-- timestamp: long (nullable = true)

       

      My question: the schema shows the timestamp column as long, but judging from the exception log, the data type of timestamp seems to become UTF8String after the select. Why would this happen? Is it a bug? If not, could you point out how to use it correctly?

      Thanks
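
One way to narrow this down might be to compare the selected column names and types against the bean's setters before calling .as(Encoders.bean(...)). The sketch below does that with plain Java reflection and no Spark dependency. PayloadSketch, javaTypeFor, and findMismatches are hypothetical names introduced for illustration; they are not part of Spark or of the code above. The check is deliberately stricter than Spark, which by default resolves column names case-insensitively, so it only flags differences such as the T column in output2 versus t in output1. It does not establish the cause of the exception.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BeanSchemaCheck {

    // Hypothetical stand-in for RawLogPayload, reduced to three properties.
    public static class PayloadSketch {
        private String t;
        private String time;
        private long timestamp;
        public String getT() { return t; }
        public void setT(String t) { this.t = t; }
        public String getTime() { return time; }
        public void setTime(String time) { this.time = time; }
        public long getTimestamp() { return timestamp; }
        public void setTimestamp(long timestamp) { this.timestamp = timestamp; }
    }

    // Assumed mapping from type names printed by printSchema() to Java setter types.
    public static Class<?> javaTypeFor(String schemaType) {
        switch (schemaType) {
            case "string": return String.class;
            case "long": return long.class;
            case "integer": return int.class;
            case "double": return double.class;
            case "boolean": return boolean.class;
            default: return Object.class;
        }
    }

    // Returns one message per column that has no exactly-named writable bean
    // property, or whose type differs from the type the setter accepts.
    public static List<String> findMismatches(Class<?> beanClass, Map<String, String> schema) {
        Map<String, Class<?>> props = new LinkedHashMap<>();
        try {
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors()) {
                if (pd.getWriteMethod() != null) {
                    props.put(pd.getName(), pd.getPropertyType());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Cannot introspect " + beanClass, e);
        }
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> col : schema.entrySet()) {
            Class<?> propType = props.get(col.getKey());
            if (propType == null) {
                problems.add(col.getKey() + ": no bean property with exactly this name");
            } else if (!propType.equals(javaTypeFor(col.getValue()))) {
                problems.add(col.getKey() + ": column is " + col.getValue()
                        + " but setter takes " + propType.getSimpleName());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Column names and types copied from three rows of output2.
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("time", "string");
        schema.put("T", "string");
        schema.put("timestamp", "long");
        for (String problem : findMismatches(PayloadSketch.class, schema)) {
            System.out.println(problem);
        }
    }
}
```

If a check like this reports nothing for the real RawLogPayload class, the next thing to look at would probably be how the encoder binds bean properties to the selected columns, for example by giving every selected column an explicit alias that matches the bean property name exactly.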

          People

            Assignee: Unassigned
            Reporter: Guiju Zhang (zhanzhan18)