Apache Arrow / ARROW-17970

[Ruby] Parsing escaped inner quotations incorrectly.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 9.0.0
    • Fix Version/s: None
    • Component/s: Ruby
    • Labels: None
    • Environment: M1 MacBook running Ruby 3.0.3

    Description

      When the CSVReader reads a quoted value whose inner quotation marks are escaped with backslashes, the value is parsed incorrectly.

      When I read the file with:

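      require "arrow"

      # `file` is the CSV file on disk; `options` holds the CSV read options listed further down.
      table = Arrow::MemoryMappedInputStream.open(file.path) do |input|
        Arrow::CSVReader.new(input, options).read
      end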

      on a row such as:

      "Some value", "Another \"value\" quotations", "Last value"

      it outputs:

      Some value
      Another value" quotations"
      Last value

      but my expected output is:

      Some value
      Another "value" quotations
      Last value
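
      The expected behaviour can be sketched in plain Ruby, without Arrow. The helper below, parse_backslash_escaped_row, is hypothetical and exists only for illustration: it is not Arrow's parser, it handles a single line, and it assumes that a backslash inside a quoted value means "take the next character literally" and that padding spaces before an opening quote are ignored.

```ruby
# Hypothetical helper, not part of Arrow: splits one CSV line into fields,
# treating a backslash inside a quoted field as an escape for the next character.
def parse_backslash_escaped_row(line, delimiter: ",", quote: '"', escape: "\\")
  fields = []
  field = +""
  in_quotes = false
  chars = line.each_char.to_a
  i = 0
  while i < chars.length
    char = chars[i]
    if in_quotes
      if char == escape && i + 1 < chars.length
        field << chars[i + 1] # keep the escaped character as-is
        i += 1                # and skip over it
      elsif char == quote
        in_quotes = false     # closing quote ends the quoted part
      else
        field << char
      end
    elsif char == quote
      in_quotes = true
    elsif char == delimiter
      fields << field
      field = +""
    elsif char == " " && field.empty?
      # ignore padding between a delimiter and the opening quote
    else
      field << char
    end
    i += 1
  end
  fields << field
  fields
end

# Single quotes keep the backslashes in the Ruby string literal.
row = '"Some value", "Another \"value\" quotations", "Last value"'
fields = parse_backslash_escaped_row(row)
puts fields
```

      Run against the row above, this sketch prints the three values listed as the expected output. It only pins down the intended semantics and says nothing about how Arrow implements escaping internally.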

      I've tried many combinations of options, with and without each one and with different values. These are the three I mainly experimented with:

      options.quoted = true
      options.double_quoted = false  # also tried true
      options.escape_character = 92  # 92 is the character code of a backslash ("\\".ord)

      I assume this is a bug.

    People

      Assignee: Unassigned
      Reporter: Daniel Salomons (danielsalomons)