SPARK-11522: input_file_name() returns "" for external tables

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.1
    • Fix Version/s: 1.6.0
    • Component/s: SQL

    Description

      Given an external table whose data consists of many CSV files, input_file_name() returns an empty string for every row.

      Table definition:

      CREATE EXTERNAL TABLE external_test(page_id INT, impressions INT) 
      ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
      WITH SERDEPROPERTIES (
         "separatorChar" = ",",
         "quoteChar"     = "\"",
         "escapeChar"    = "\\"
      )  
      LOCATION 'file:///Users/sim/spark/test/external_test'
      

      Query:

      sql("SELECT input_file_name() as file FROM external_test").show
      

      Output:

      +----+
      |file|
      +----+
      |    |
      |    |
      ...
      |    |
      +----+
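
      To make the symptom easier to check than by scanning the table output, the empty results can be counted. The following is a minimal sketch, assuming it is run in a spark-shell with Hive support (so that `sqlContext` is available) and that the external_test table above already exists; the value names `files`, `total`, and `empty` are illustrative only.

```scala
// Sketch for spark-shell; assumes `sqlContext` has Hive support and that
// the external_test table from this report has already been created.
val files = sqlContext.sql("SELECT input_file_name() AS file FROM external_test")

// Count all rows, and the rows whose reported file name is the empty string.
val total = files.count()
val empty = files.filter(files("file") === "").count()

println(s"$empty of $total rows have an empty input_file_name()")
```

      On a build that shows the behavior reported here, the two counts would be equal; on a build with the fix, the empty count would be expected to be 0.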
      

          People

            Assignee: Xin Wu (xwu0226)
            Reporter: Simeon Simeonov (simeons)