Details
Type: Bug
Status: Resolved
Priority: P4
Resolution: Fixed
2.0.0
None
Description
This is the first issue I've raised on Apache's JIRA; if I have made any mistakes in compiling this ticket then I apologise and would welcome any feedback.
When matching a path spec, GcsFileSystem.toMetadata() will sometimes attempt to build an instance of org.apache.beam.sdk.io.fs.MatchResult.Metadata without first setting sizeBytes[1]. This always results in an error in the AutoValue-generated builder for MatchResult.Metadata, because sizeBytes is a required field[2].
I propose that GcsFileSystem set sizeBytes to 0 when there is no size returned by GCS, which will presumably happen when the path spec refers either to a directory, or to a non-existent file. GcsFileSystem.toMetadata() could be updated as follows:
Before
if (size != null) {
  ret.setSizeBytes(size.longValue());
}
After
if (size != null) {
  ret.setSizeBytes(size.longValue());
} else {
  ret.setSizeBytes(0);
}
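To illustrate the difference outside of Beam, here is a minimal, self-contained sketch. MetadataBuilder is a hypothetical stand-in for the AutoValue-generated builder, not the real class; it only mimics the required-field check on sizeBytes. The BigInteger parameter is an assumption about the type of the size value returned by GCS, based on the size.longValue() call above. The names sizeBefore and sizeAfter are made up for this example.

```java
import java.math.BigInteger;

class SizeBytesSketch {
    // Hypothetical stand-in for the AutoValue-generated MatchResult.Metadata builder.
    // It only models the required-field check on sizeBytes.
    static final class MetadataBuilder {
        private Long sizeBytes; // stays null until setSizeBytes is called

        MetadataBuilder setSizeBytes(long value) {
            this.sizeBytes = value;
            return this;
        }

        // Returns the size rather than a full Metadata object, to keep the sketch short.
        long build() {
            if (sizeBytes == null) {
                throw new IllegalStateException("Missing required properties: sizeBytes");
            }
            return sizeBytes;
        }
    }

    // "Before": sizeBytes is only set when GCS returned a size.
    static long sizeBefore(BigInteger size) {
        MetadataBuilder ret = new MetadataBuilder();
        if (size != null) {
            ret.setSizeBytes(size.longValue());
        }
        return ret.build(); // throws when size was null
    }

    // "After": fall back to 0 when GCS returned no size.
    static long sizeAfter(BigInteger size) {
        MetadataBuilder ret = new MetadataBuilder();
        if (size != null) {
            ret.setSizeBytes(size.longValue());
        } else {
            ret.setSizeBytes(0);
        }
        return ret.build();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfter(null));
        System.out.println(sizeAfter(BigInteger.valueOf(42)));
        try {
            sizeBefore(null);
        } catch (IllegalStateException e) {
            System.out.println("before: " + e.getMessage());
        }
    }
}
```

In this sketch, the "before" path fails only when no size is available, which matches the intermittent failure described above; the "after" path always builds successfully.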
[1] https://github.com/apache/beam/blob/5bfd3e049c0ca0744165b0243a645e8e427032d5/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystem.java#L240-L242
[2] https://gist.github.com/joshdifabio/fe543b97e02e7ddac8edb73be38deb06#file-autovalue_matchresult_metadata-java-L102-L110