Spark / SPARK-25660

Impossible to use the backward slash as the CSV fields delimiter


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.0
    • Component/s: SQL
    • Labels: None

    Description

      If fields in CSV input are delimited by '\', for example:

      123\4\5\1\Q\\P\P\2321213\1\\\P\\F
      

      reading it with the following code:

      df = spark.read.format('csv').option("header","false").options(delimiter='\\').load("file:///file.csv")
      

      throws the following exception:

      String index out of range: 1
      java.lang.StringIndexOutOfBoundsException: String index out of range: 1
      	at java.lang.String.charAt(String.java:658)
      	at org.apache.spark.sql.execution.datasources.csv.CSVUtils$.toChar(CSVUtils.scala:101)
      	at org.apache.spark.sql.execution.datasources.csv.CSVOptions.<init>(CSVOptions.scala:86)
      	at org.apache.spark.sql.execution.datasources.csv.CSVOptions.<init>(CSVOptions.scala:41)
      	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:488)
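
      The trace shows String.charAt being called with index 1 from CSVUtils.toChar. That is consistent with a delimiter parser that treats a leading backslash as the start of an escape sequence and reads the next character without first checking the string length. The Python sketch below models that behaviour under this assumption. It is not Spark's source code, and the names to_char_unguarded, to_char_guarded and ESCAPES are hypothetical.

```python
# Simplified, hypothetical model of a delimiter parser. Not Spark's code.
# Assumption: a leading backslash is interpreted as the start of an escape
# sequence such as "\t", so the parser looks at the second character.
ESCAPES = {"t": "\t", "r": "\r", "b": "\b", "f": "\f", '"': '"', "'": "'"}


def to_char_unguarded(s: str) -> str:
    """Reads s[1] whenever s starts with a backslash, without a length check."""
    if s[0] == "\\":
        nxt = s[1]  # raises IndexError when s is a lone backslash
        if nxt in ESCAPES:
            return ESCAPES[nxt]
        raise ValueError(f"Unsupported special character for delimiter: {s!r}")
    if len(s) == 1:
        return s
    raise ValueError(f"Delimiter cannot be more than one character: {s!r}")


def to_char_guarded(s: str) -> str:
    """Returns any single-character string before attempting escape lookup."""
    if len(s) == 1:
        return s
    if len(s) == 2 and s[0] == "\\" and s[1] in ESCAPES:
        return ESCAPES[s[1]]
    raise ValueError(f"Unsupported delimiter: {s!r}")


# The Python literal '\\' from the report is a single backslash character.
delimiter = "\\"
assert len(delimiter) == 1

try:
    to_char_unguarded(delimiter)
    failed = False
except IndexError:
    # Python's analogue of the StringIndexOutOfBoundsException in the trace.
    failed = True

# With the guarded variant, the sample line from the report splits cleanly.
sample = r"123\4\5\1\Q\\P\P\2321213\1\\\P\\F"
fields = sample.split(to_char_guarded(delimiter))
print(failed, len(fields), fields)
```

      In this model the lone backslash fails only because the escape lookup runs before the length check. Returning single-character strings first lets the sample line split into 15 fields, several of them empty.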
      

          People

            Assignee: Max Gekk (maxgekk)
            Reporter: Max Gekk (maxgekk)