DRILL-5871: Large files fail to write to S3 datastore using HDFS s3a


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.11.0
    • Fix Version/s: None
    • Environment: CentOS 7.4, Oracle Java SE 1.8.0_131-b11, x86_64, VMware. ZooKeeper cluster, two Drillbits, three ZooKeepers.

    Description

      When storing CSV files to S3 via the s3a storage driver using a CTAS, if the files are large enough to trigger the multi-part upload functionality, the CTAS fails with the following stack trace (we can write smaller CSVs and Parquet files with no problem):
      Error: SYSTEM ERROR: UnsupportedOperationException

      Fragment 0:0

      [Error Id: dbb018ea-29eb-4e1a-bf97-4c2c9cfbdf3c on den-certdrill-1.ci.neoninternal.org:31010]

      (java.lang.UnsupportedOperationException) null
      java.util.Collections$UnmodifiableList.sort():1331
      java.util.Collections.sort():175
      com.amazonaws.services.s3.model.transform.RequestXmlFactory.convertToXmlByteArray():42
      com.amazonaws.services.s3.AmazonS3Client.completeMultipartUpload():2513
      org.apache.hadoop.fs.s3a.S3AFastOutputStream$MultiPartUpload.complete():384
      org.apache.hadoop.fs.s3a.S3AFastOutputStream.close():253
      org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close():72
      org.apache.hadoop.fs.FSDataOutputStream.close():106
      java.io.PrintStream.close():360
      org.apache.drill.exec.store.text.DrillTextRecordWriter.cleanup():170
      org.apache.drill.exec.physical.impl.WriterRecordBatch.closeWriter():184
      org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():128
      org.apache.drill.exec.record.AbstractRecordBatch.next():162
      org.apache.drill.exec.record.AbstractRecordBatch.next():119
      org.apache.drill.exec.record.AbstractRecordBatch.next():109
      org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
      org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
      org.apache.drill.exec.record.AbstractRecordBatch.next():162
      org.apache.drill.exec.physical.impl.BaseRootExec.next():105
      org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
      org.apache.drill.exec.physical.impl.BaseRootExec.next():95
      org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
      org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
      java.security.AccessController.doPrivileged():-2
      javax.security.auth.Subject.doAs():422
      org.apache.hadoop.security.UserGroupInformation.doAs():1657
      org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
      org.apache.drill.common.SelfCleaningRunnable.run():38
      java.util.concurrent.ThreadPoolExecutor.runWorker():1142
      java.util.concurrent.ThreadPoolExecutor$Worker.run():617
      java.lang.Thread.run():748 (state=,code=0)

      This looks suspiciously like:
      https://issues.apache.org/jira/browse/HADOOP-14204

      So the fix may be as 'simple' as syncing to the upstream version when Hadoop 2.8.2 releases later this month, although I am ignorant of the implications of upgrading hadoop-hdfs to that version.
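
      For context, the top of the trace shows the AWS SDK sorting the list of completed upload parts inside RequestXmlFactory.convertToXmlByteArray() while completing the multi-part upload, and that list is an unmodifiable view, which is exactly what HADOOP-14204 describes. A minimal sketch of the underlying JDK behavior (the list contents and class names here are illustrative, not the actual S3A/SDK types):

      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.Collections;
      import java.util.Comparator;
      import java.util.List;

      public class UnmodifiableSortDemo {
          public static void main(String[] args) {
              // Stand-in for the list of completed parts handed to the SDK
              // when the multi-part upload is finished (illustrative only).
              List<Integer> parts = new ArrayList<>(Arrays.asList(3, 1, 2));
              List<Integer> readOnlyView = Collections.unmodifiableList(parts);

              // The SDK sorts the part list in place; calling sort() on an
              // unmodifiable view throws UnsupportedOperationException, matching
              // Collections$UnmodifiableList.sort() at the top of the trace above.
              readOnlyView.sort(Comparator.naturalOrder());
          }
      }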

      We are able to store smaller files just fine.
      Things I've tried:
      • Setting fs.s3a.multipart.threshold to a ridiculously large value like 10T (these files are just over 1 GB). Does not work.
      • Setting fs.s3a.fast.upload to false. Also does not change the behavior.

      The s3a driver does not appear to have an option to disable multi-part uploads altogether.

      For completeness' sake, here are my current s3a options for the driver:
      "fs.s3a.endpoint": "******",
      "fs.s3a.access.key": "*",
      "fs.s3a.secret.key": "*",
      "fs.s3a.connection.maximum": "200",
      "fs.s3a.paging.maximum": "1000",
      "fs.s3a.fast.upload": "true",
      "fs.s3a.multipart.purge": "true",
      "fs.s3a.fast.upload.buffer": "bytebuffer",
      "fs.s3a.fast.upload.active.blocks": "8",
      "fs.s3a.buffer.dir": "/opt/apache-airflow/buffer",
      "fs.s3a.multipart.size": "134217728",
      "fs.s3a.multipart.threshold": "671088640",
      "fs.s3a.experimental.input.fadvise": "sequential",
      "fs.s3a.acl.default": "PublicRead",
      "fs.s3a.multiobjectdelete.enable": "true"

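      To take Drill out of the picture, here is a minimal standalone sketch that exercises the same S3AFastOutputStream close/complete path with these settings (the bucket name, output path, and masked endpoint/credentials are placeholders):

      import java.net.URI;
      import java.util.Arrays;

      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FSDataOutputStream;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      public class S3aMultipartRepro {
          public static void main(String[] args) throws Exception {
              // Mirror the storage plugin settings above (secrets kept masked).
              Configuration conf = new Configuration();
              conf.set("fs.s3a.endpoint", "******");
              conf.set("fs.s3a.access.key", "*");
              conf.set("fs.s3a.secret.key", "*");
              conf.set("fs.s3a.fast.upload", "true");
              conf.set("fs.s3a.fast.upload.buffer", "bytebuffer");
              conf.set("fs.s3a.multipart.size", "134217728");      // 128 MB parts
              conf.set("fs.s3a.multipart.threshold", "671088640"); // 640 MB threshold

              FileSystem fs = FileSystem.get(URI.create("s3a://my-bucket/"), conf);

              // Write well past one part size so close() has to complete a
              // multi-part upload, which is where the CTAS fails.
              byte[] chunk = new byte[1024 * 1024];
              Arrays.fill(chunk, (byte) 'x');
              try (FSDataOutputStream out = fs.create(new Path("s3a://my-bucket/tmp/big.csv"))) {
                  for (int i = 0; i < 1500; i++) {   // ~1.5 GB total
                      out.write(chunk);
                  }
              } // close() -> completeMultipartUpload() -> UnsupportedOperationException
          }
      }

      If this fails the same way outside Drill, it would confirm the problem lies in the hadoop-aws / AWS SDK combination bundled with Drill rather than in the Drill writer itself.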


          People

            Assignee: Unassigned
            Reporter: Steve Jacobs (steveatbat)
            Votes: 0
            Watchers: 4
