  1. Apache Arrow
  2. ARROW-15552

[Docs][Format] Unclear wording about base64 encoding requirement of metadata values

Details

    Description

      The C Data Interface docs state that the values in key-value metadata should be base64 encoded. This appears in the section about which key-value metadata to use for extension types (https://arrow.apache.org/docs/format/CDataInterface.html#extension-arrays):

      The base64 encoding of metadata values ensures that any possible serialization is representable.

      This might not be fully correct, though, or at least the encoding is not required, contrary to what the current wording implies. A binary blob (like a serialized schema) can be base64 encoded, as we do when putting the Arrow schema in the Parquet metadata, but is this actually required?
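
A minimal sketch of why base64 looks optional here, assuming the length-prefixed metadata layout described in the C Data Interface docs (an int32 pair count, then an int32 byte length followed by the bytes for each key and each value, in native endianness). The helper names `encode_metadata` and `decode_metadata` and the `example.*` strings are hypothetical and not part of any Arrow API; the sketch only illustrates that a value containing NUL or non-UTF-8 bytes survives a round trip without base64, while a base64-encoded value is equally valid:

```python
import base64
import struct


def encode_metadata(pairs):
    """Pack (key, value) byte pairs: int32 count, then int32 length + bytes per key and value."""
    out = [struct.pack("=i", len(pairs))]  # "=i": native byte order, 4-byte int
    for key, value in pairs:
        out.append(struct.pack("=i", len(key)) + key)
        out.append(struct.pack("=i", len(value)) + value)
    return b"".join(out)


def decode_metadata(buf):
    """Inverse of encode_metadata: return a list of (key, value) byte pairs."""
    (count,) = struct.unpack_from("=i", buf, 0)
    offset = 4
    pairs = []
    for _ in range(count):
        fields = []
        for _ in range(2):  # first the key, then the value
            (length,) = struct.unpack_from("=i", buf, offset)
            offset += 4
            fields.append(buf[offset:offset + length])
            offset += length
        pairs.append((fields[0], fields[1]))
    return pairs


# Arbitrary binary payload, including NUL and non-UTF-8 bytes.
raw_blob = bytes([0, 255, 10, 0, 128])

pairs = [
    (b"ARROW:extension:name", b"example.type"),      # hypothetical extension name
    (b"ARROW:extension:metadata", raw_blob),         # raw bytes, no base64
    (b"example:b64", base64.b64encode(raw_blob)),    # same payload, base64 encoded
]

roundtrip = decode_metadata(encode_metadata(pairs))
```

Because every value carries an explicit byte length, nothing in this layout depends on the value being printable text. Base64 is then a choice made by the producer, for example when the same metadata also has to pass through a text-only container.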

      cc apitrou

            People

              apitrou Antoine Pitrou
              jorisvandenbossche Joris Van den Bossche

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 50m