Details
- Type: Improvement
- Status: Resolved
- Priority: P3
- Resolution: Fixed
Description
The following tests failed when I tried to upgrade google-http-client from 1.28.0 to 1.34.0:
- org.apache.beam.sdk.io.gcp.bigquery.BigQueryIOReadTest.testEstimatedSizeWithoutStreamingBuffer
- org.apache.beam.sdk.io.gcp.bigquery.BigQueryIOReadTest.testEstimatedSizeWithStreamingBuffer
- org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtilTest.testInsertAll
https://builds.apache.org/job/beam_PreCommit_Java_Commit/9288/#showFailuresLink
Reason for the test failures
org.apache.beam.sdk.io.gcp.testing.TableContainer and org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl rely on TableRow.toString().length() to estimate row size. Example:
dataSize += row.toString().length();
if (dataSize >= maxRowBatchSize
    || rows.size() >= maxRowsPerBatch
    || i == rowsToPublish.size() - 1) {
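The batching pattern around that snippet can be sketched as follows (a simplified stand-alone version, not the actual Beam code: the Row record is a hypothetical stand-in for TableRow, and the threshold names follow the snippet above):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
    // Hypothetical stand-in for TableRow; only its toString() matters here.
    record Row(String payload) {
        @Override
        public String toString() { return payload; }
    }

    // Flush a batch when the accumulated toString() length or the row count
    // crosses a threshold, or when the last row is reached.
    static List<List<Row>> batch(List<Row> rowsToPublish,
                                 long maxRowBatchSize, int maxRowsPerBatch) {
        List<List<Row>> batches = new ArrayList<>();
        List<Row> rows = new ArrayList<>();
        long dataSize = 0;
        for (int i = 0; i < rowsToPublish.size(); i++) {
            Row row = rowsToPublish.get(i);
            rows.add(row);
            dataSize += row.toString().length();
            if (dataSize >= maxRowBatchSize
                || rows.size() >= maxRowsPerBatch
                || i == rowsToPublish.size() - 1) {
                batches.add(rows);
                rows = new ArrayList<>();
                dataSize = 0;
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
            new Row("{v=foo}"), new Row("{v=1234}"), new Row("{v=bar}"));
        // With a 10-character size threshold, the first two rows fill one
        // batch and the last row flushes on its own.
        System.out.println(batch(rows, 10, 100).size());
    }
}
```

Because the threshold comparison uses toString().length(), any change in the toString() format shifts where batches get flushed, which is exactly what the tests below caught.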
However, google-http-client's PR#589 changed the GenericData.toString() output starting in v1.29.0.
In the old google-http-client 1.28.0, an example row's toString() returned:
{f=[{v=foo}, {v=1234}]}
In google-http-client 1.29.0 and higher, the same row's toString() returns:
GenericData{classInfo=[f], {f=[GenericData{classInfo=[v], {v=foo}}, GenericData{classInfo=[v], {v=1234}}]}}
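As an illustration of the impact on the size estimate, comparing the lengths of the two outputs above (string literals copied verbatim) shows the counted size roughly quadrupling for the same row:

```java
public class LengthDiff {
    public static void main(String[] args) {
        // toString() output from google-http-client 1.28.0
        String before = "{f=[{v=foo}, {v=1234}]}";
        // toString() output from google-http-client 1.29.0 and higher
        String after = "GenericData{classInfo=[f], {f=[GenericData{classInfo=[v], "
            + "{v=foo}}, GenericData{classInfo=[v], {v=1234}}]}}";
        // The batching code counts these lengths as the row's "size".
        System.out.println(before.length() + " -> " + after.length());
    }
}
```

So the same row now contributes a much larger "size" to dataSize, which makes batches flush earlier and breaks the size assertions in the tests.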
Question:
Is it right to rely on toString().length() in the BigQuery classes?