  1. Spark
  2. SPARK-18352

Parse normal, multi-line JSON files (not just JSON Lines)

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.2.0
    • Component/s: SQL

    Description

      Spark can currently parse only JSON files in the JSON Lines format, i.e. each record occupies exactly one line and records are separated by newlines. In practice, many users want to use Spark to parse ordinary JSON files whose records span multiple lines, and are surprised to learn that it cannot.
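
The limitation described above can be illustrated outside Spark. The following is a minimal sketch using only Python's standard `json` module, not Spark's actual reader; the sample data and the helper `parse_line_by_line` are hypothetical and exist only for illustration. It shows that a parser which treats every line as a complete record handles JSON Lines input but fails on the same records written as a multi-line JSON document.

```python
import json

# Hypothetical sample data: the same two records in both layouts.
JSON_LINES = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'
MULTI_LINE = '[\n  {"id": 1, "name": "a"},\n  {"id": 2, "name": "b"}\n]\n'


def parse_line_by_line(text):
    """Parse each non-empty line as an independent JSON record."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# JSON Lines input: every line is a complete record, so this succeeds.
records = parse_line_by_line(JSON_LINES)

# Multi-line input: the first line is just "[", which is not valid JSON
# on its own, so per-line parsing raises an error.
try:
    parse_line_by_line(MULTI_LINE)
    multi_line_ok = True
except json.JSONDecodeError:
    multi_line_ok = False
```

Because no single line of the multi-line document is a complete JSON value, any approach that splits the input on newlines before parsing cannot recover the records.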

      We can introduce a new mode (wholeJsonFile?) in which we do not split files, and instead stream through each file in full to parse it.
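
One way to picture the proposed mode is a parser that walks through a file's full contents and ignores line boundaries. The sketch below is an assumption-laden illustration in plain Python, not the implementation that was merged into Spark: `parse_whole_document` is a hypothetical helper, it holds the text in memory rather than truly streaming, and treating a top-level array as a container of records is an assumption made here for convenience.

```python
import json


def parse_whole_document(text):
    """Parse `text` as a sequence of JSON values, ignoring line boundaries."""
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(text):
        # Skip whitespace (including newlines) between top-level values,
        # because raw_decode does not skip it on its own.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        # Assumption: a top-level array is a container of records.
        if isinstance(value, list):
            records.extend(value)
        else:
            records.append(value)
    return records


# Records that span several lines parse correctly, whether they are
# concatenated objects or elements of one top-level array.
concatenated = parse_whole_document('{\n  "id": 1\n}\n{\n  "id": 2\n}\n')
from_array = parse_whole_document('[\n  {"id": 1},\n  {"id": 2}\n]\n')
```

The trade-off is that a file parsed this way cannot be divided at arbitrary newlines, so each file has to be processed as a single unit.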

            People

              Assignee: Nathan Howell (NathanHowell)
              Reporter: Reynold Xin (rxin)
              Votes: 0
              Watchers: 15

              Dates

                Created:
                Updated:
                Resolved: