Each branch of this multi-insert create a row in partition p1/bucket0 with ROW__ID=(1,0,0).
The same can happen when running SQL Merge (
HIVE-10924) statement that has both Insert and Update clauses when target table has 'transactional'='true','transactional_properties'='default' (see HIVE-14035). This is so because Merge is internally run as a multi-insert statement.
The solution relies on statement ID introduced in
HIVE-11030. Each Insert clause of a multi-insert is gets a unique ID.
The ROW__ID.bucketId now becomes a bit packed triplet (format version, bucketId, statementId).
(Since ORC stores field names in the data file we can't rename ROW__ID.bucketId).
This ensures that there are no collisions and retains desired sort properties of ROW__ID.
In particular SortedDynPartitionOptimizer works w/o any changes even in cases where there fewer reducers than buckets.