Apache Hudi / HUDI-2387

Too many HEAD requests from Hudi to S3

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Information Provided
    • Affects Version/s: 0.8.0
    • None
    • Usability
    • Environment: AWS Glue with PySpark

    Description

      We are using Apache Hudi from AWS Glue (with the PySpark runtime) to store data in an S3 bucket. We are observing a very high number of S3 HEAD requests that we believe originate from Hudi.

      Because of this high request rate, S3 often responds with "Status Code: 503; Error Code: SlowDown", which causes data loss.

      Is there any out-of-the-box feature to debug this further and confirm which Hudi feature is causing these requests?
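
      One way to narrow this down outside of Hudi, sketched here under assumptions: if S3 server access logging is enabled on the bucket, the HEAD requests can be grouped by key prefix to see whether they cluster under the table's `.hoodie` metadata folder or under the data partitions. The record layout in the regular expression reflects the leading fields of the S3 server access log format as we understand it; the helper name `count_head_requests`, the `depth` parameter, and all sample values (bucket, table name, requester) are hypothetical.

```python
import re
from collections import Counter

# Assumed leading fields of an S3 server access log record:
# owner bucket [time] remote_ip requester request_id operation key "request-uri" status ...
LOG_LINE = re.compile(
    r'^\S+ \S+ \[[^\]]+\] \S+ \S+ \S+ (?P<operation>\S+) (?P<key>\S+) '
    r'"[^"]*" (?P<status>\S+)'
)


def count_head_requests(lines, depth=2):
    """Count REST.HEAD.* requests per key prefix (first `depth` path segments)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip records that do not fit the assumed layout
        if not match.group("operation").startswith("REST.HEAD."):
            continue  # only HEAD operations are of interest here
        prefix = "/".join(match.group("key").split("/")[:depth])
        counts[prefix] += 1
    return counts


# Hypothetical sample records; real logs would be read from the logging bucket.
SAMPLE = [
    'OWNER my-bucket [06/Sep/2021:10:00:00 +0000] 10.0.0.1 REQUESTER REQ1 '
    'REST.HEAD.OBJECT my_table/.hoodie/hoodie.properties '
    '"HEAD /my-bucket/my_table/.hoodie/hoodie.properties HTTP/1.1" 200 - 0 512 12 - "-" "aws-sdk-java" -',
    'OWNER my-bucket [06/Sep/2021:10:00:01 +0000] 10.0.0.1 REQUESTER REQ2 '
    'REST.HEAD.OBJECT my_table/.hoodie/001.commit '
    '"HEAD /my-bucket/my_table/.hoodie/001.commit HTTP/1.1" 503 SlowDown 0 - 3 - "-" "aws-sdk-java" -',
    'OWNER my-bucket [06/Sep/2021:10:00:02 +0000] 10.0.0.1 REQUESTER REQ3 '
    'REST.HEAD.OBJECT my_table/2021/09/06/part-0001.parquet '
    '"HEAD /my-bucket/my_table/2021/09/06/part-0001.parquet HTTP/1.1" 200 - 0 2048 9 - "-" "aws-sdk-java" -',
    'OWNER my-bucket [06/Sep/2021:10:00:03 +0000] 10.0.0.1 REQUESTER REQ4 '
    'REST.GET.OBJECT my_table/.hoodie/hoodie.properties '
    '"GET /my-bucket/my_table/.hoodie/hoodie.properties HTTP/1.1" 200 - 512 512 15 10 "-" "aws-sdk-java" -',
]

if __name__ == "__main__":
    for prefix, count in count_head_requests(SAMPLE).most_common():
        print(prefix, count)
```

      This only shows where the HEAD requests land, not which Hudi code path issued them, so it is a starting point rather than a confirmation.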

          People

            Assignee: Unassigned
            Reporter: Sourav T (souravtri)