Hive / HIVE-20200

Huge performance gap when processing ORC files created by Spark

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: Hive, ORC
    • Labels: None

    Description

      I'm seeing a huge performance difference when running a simple filter query on ORC files created by Spark. The same query performs better when the files are written by Hive, i.e., after running "create table x as select * from y".

       

          People

            Assignee: Unassigned
            Reporter: Vinoth Sathappan (vinoths)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated: