Apache Arrow / ARROW-18225

[Python] write_metadata does not fully use **kwargs


Details

    Description

      When calling write_metadata, **kwargs can be used to pass a filesystem to ParquetWriter. However, those kwargs are not forwarded to the subsequent read_metadata call, even though read_metadata accepts a filesystem argument.

      As a result, writing metadata to a non-local filesystem (an S3FileSystem, for example) raises an error.

      def write_metadata(schema, where, metadata_collector=None, **kwargs):
          writer = ParquetWriter(where, schema, **kwargs)
          writer.close()
      
          if metadata_collector is not None:
              metadata = read_metadata(where) # kwargs should be passed here
              for m in metadata_collector:
                  metadata.append_row_groups(m)
              metadata.write_metadata_file(where) # kwargs should be passed here
      
      def read_metadata(where, memory_map=False, decryption_properties=None,
                        filesystem=None):
          ...

            People

              milesgranger Miles Granger
              fchareyron François Chareyron

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 3h 20m