
SPARK-27473: Support filter push down for status fields in binary file data source


Details

    • Type: Documentation
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: SQL
    • Labels: None

    Description

      As a user, I can run `spark.read.format("binaryFile").load(path).filter($"status.length" < 100000000L)` to load only files smaller than 1e8 bytes. With filter push down on the status fields, Spark shouldn't even read files bigger than 1e8 bytes in this case.
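
      For reference, a minimal sketch of the intended usage, assuming a local Spark session and a hypothetical input directory `/tmp/files` (the object name and path are illustrative; whether oversized files are actually skipped depends on the push-down support this ticket adds):

```scala
import org.apache.spark.sql.SparkSession

object BinaryFileStatusFilter {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("binaryFile status filter pushdown")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Load files through the binaryFile data source and keep only those whose
    // file status reports fewer than 1e8 bytes. With the filter pushed down to
    // the source, larger files should be skipped without being read.
    val smallFiles = spark.read.format("binaryFile")
      .load("/tmp/files")                      // hypothetical directory
      .filter($"status.length" < 100000000L)   // status field from the description

    println(s"files under 1e8 bytes: ${smallFiles.count()}")
    spark.stop()
  }
}
```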

            People

              Assignee: Weichen Xu (weichenxu123)
              Reporter: Xiangrui Meng (mengxr)
              Votes: 0
              Watchers: 1
