Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
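`Arrow::Table#join` returns a table in which the join key column appears twice, once from each input table. Duplicate column names are allowed in Arrow, but a single `KEY` column is preferable.
Also, with `type: :full_outer`, the key values from both sides should be merged into that single column.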
table1
=>
#<Arrow::Table:0x7f9706109380 ptr=0x55a91a4cac10>
KEY X
0 A 1
1 B 2
2 C 3
table2
=>
#<Arrow::Table:0x7f970415d2c0 ptr=0x55a91a348ce0>
KEY X
0 A 4
1 B 5
2 D 6
The result should omit `:KEY` from the right table:
table1.join(table2, :KEY)
=>
#<Arrow::Table:0x7f96fd152548 ptr=0x55a91af21110>
KEY X KEY X
0 A 1 A 4
1 B 2 B 5
With a full outer join, the two `:KEY` columns should be merged into one:
table1.join(table2, :KEY, type: :full_outer)
=>
#<Arrow::Table:0x7f96fd0e1550 ptr=0x55a91a1a6410>
KEY X KEY X
0 A 1 A 4
1 B 2 B 5
2 C 3 (null) (null)
3 (null) (null) D 6
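To make the requested result concrete, here is a minimal plain-Ruby sketch of the expected output for both calls. It is not Red Arrow code and not a proposed implementation: `SimpleTable` and `expected_join` are hypothetical names used only for illustration, tables are modeled as a list of column names plus an array of rows, and null keys are not handled.

```ruby
# Plain-Ruby model of a table: column names plus an array of rows.
# SimpleTable and expected_join are hypothetical; they are not part of Red Arrow.
SimpleTable = Struct.new(:names, :rows)

def expected_join(left, right, key, type: :inner)
  l_idx = left.names.index(key)
  r_idx = right.names.index(key)
  # Positions of the non-key columns on each side.
  l_rest = (0...left.names.size).to_a - [l_idx]
  r_rest = (0...right.names.size).to_a - [r_idx]

  # A single key column, followed by the remaining left and right columns.
  names = [key] + left.names.values_at(*l_rest) + right.names.values_at(*r_rest)

  right_by_key = right.rows.group_by { |row| row[r_idx] }
  matched = {}
  rows = []

  left.rows.each do |l_row|
    k = l_row[l_idx]
    matches = right_by_key[k]
    if matches
      matched[k] = true
      matches.each do |r_row|
        rows << [k] + l_row.values_at(*l_rest) + r_row.values_at(*r_rest)
      end
    elsif type == :full_outer
      # Left-only row: right-side columns are null.
      rows << [k] + l_row.values_at(*l_rest) + [nil] * r_rest.size
    end
  end

  if type == :full_outer
    right.rows.each do |r_row|
      k = r_row[r_idx]
      next if matched[k]
      # Right-only row: its key goes into the merged key column.
      rows << [k] + [nil] * l_rest.size + r_row.values_at(*r_rest)
    end
  end

  SimpleTable.new(names, rows)
end

table1 = SimpleTable.new([:KEY, :X], [["A", 1], ["B", 2], ["C", 3]])
table2 = SimpleTable.new([:KEY, :X], [["A", 4], ["B", 5], ["D", 6]])

inner = expected_join(table1, table2, :KEY)
# inner.names => [:KEY, :X, :X]
# inner.rows  => [["A", 1, 4], ["B", 2, 5]]

full = expected_join(table1, table2, :KEY, type: :full_outer)
# full.rows => [["A", 1, 4], ["B", 2, 5], ["C", 3, nil], ["D", nil, 6]]
```

In the full outer case, the `D` row takes its key from the right table instead of showing `(null)` in a left `KEY` column, which is the merge behavior requested above.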