  1. Spark
  2. SPARK-18352

Parse normal, multi-line JSON files (not just JSON Lines)

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.2.0
    • Component/s: SQL

    Description

      Spark can currently only parse JSON files in the JSON Lines format, i.e. each record occupies exactly one line and records are separated by newlines. In practice, many users want to use Spark to parse ordinary JSON files, where a single record may span several lines, and are surprised to learn that it cannot.
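
To make the limitation concrete, here is a minimal sketch in plain Python (standard library only, not Spark code). The helper name `parse_json_lines` and the sample strings are hypothetical, introduced only for illustration. It parses one record per line, which works for JSON Lines input and fails as soon as a record is pretty-printed across several lines.

```python
import json

def parse_json_lines(text):
    """Parse text as JSON Lines: one complete JSON value per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Two records, one per line: valid JSON Lines.
json_lines_text = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'

# One record spread over several lines: valid JSON, but not JSON Lines.
multi_line_text = '{\n  "id": 1,\n  "name": "a"\n}\n'

records = parse_json_lines(json_lines_text)

try:
    parse_json_lines(multi_line_text)
    multi_line_ok = True
except json.JSONDecodeError:
    # The first line is just "{", which is not a complete JSON value.
    multi_line_ok = False
```

The second input is a perfectly valid JSON document, yet no individual line of it is, which is the situation the ticket describes.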

      We can introduce a new mode (wholeJsonFile?) in which we don't split the files by line, and instead stream through each file as a whole to parse the JSON it contains.
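
The proposed mode could look roughly like the following sketch, again in plain Python rather than Spark. The function name `parse_whole_document` is hypothetical, and treating a top-level array as a sequence of records is an assumption made here for illustration, not something this ticket specifies. The input is never split on newlines; the decoder walks through the text and yields each top-level JSON value in turn.

```python
import json

def parse_whole_document(text):
    """Yield each record found in text, ignoring line boundaries."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            return
        # Decode one complete JSON value starting at pos; the returned
        # index is where that value ends.
        value, pos = decoder.raw_decode(text, pos)
        if isinstance(value, list):
            # Assumption: a top-level array holds one record per element.
            yield from value
        else:
            yield value

document = '{\n  "id": 1\n}\n[\n  {"id": 2},\n  {"id": 3}\n]\n'
parsed = list(parse_whole_document(document))
```

Because a record can start in one place and end many lines later, a file read this way cannot be cut at arbitrary newline boundaries, which is why the description talks about not splitting the files.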

            People

              Assignee: Nathan Howell (NathanHowell)
              Reporter: Reynold Xin (rxin)
              Votes: 0
              Watchers: 15