Details
- New Feature
- Status: Open
- Major
- Resolution: Unresolved
- 0.9.2
- None
- None
Description
Right now, the Pig HBaseStorage writes Puts directly into HBase. This is slow for bulk operations, which are exactly what Pig does. Puts and Deletes are meant more for realtime operations, so it would be nice if Pig had an automatic mechanism to prepare bulk-loadable HFiles for the target table and bulk-load them at the end of the job.
For compatibility reasons, this can be optional and turned off by default until it is agreed that it should become the default, at which point an option to turn it off can still be provided.
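To make the "prepare bulk-loadable HFiles" step more concrete, here is a minimal sketch of the ordering work it implies: rows have to be grouped by the target table's regions and written in sorted row-key order within each group. This is only an illustration under stated assumptions, not the proposed implementation. The class name, method names, and sample keys are hypothetical, region boundaries are passed in as plain strings, and no HBase or Pig APIs are used.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: groups row keys by region and sorts each group,
// which is the ordering a bulk-loadable file per region would need.
class BulkLoadPartitionSketch {

    // Unsigned lexicographic byte comparison (the order HBase uses for row keys).
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Index of the region containing rowKey. Assumes regionStartKeys is sorted
    // and begins with the empty key, so every row falls into some region.
    static int regionFor(String rowKey, List<String> regionStartKeys) {
        int region = 0;
        for (int i = 0; i < regionStartKeys.size(); i++) {
            if (compare(bytes(regionStartKeys.get(i)), bytes(rowKey)) <= 0) {
                region = i;
            } else {
                break;
            }
        }
        return region;
    }

    // Returns one sorted list of row keys per region.
    static List<List<String>> partitionAndSort(List<String> rowKeys,
                                               List<String> regionStartKeys) {
        List<List<String>> perRegion = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.size(); i++) {
            perRegion.add(new ArrayList<>());
        }
        for (String key : rowKeys) {
            perRegion.get(regionFor(key, regionStartKeys)).add(key);
        }
        for (List<String> keys : perRegion) {
            keys.sort((x, y) -> compare(bytes(x), bytes(y)));
        }
        return perRegion;
    }

    public static void main(String[] args) {
        // Assumed sample data: three regions starting at "", "g", and "p".
        List<String> starts = Arrays.asList("", "g", "p");
        List<String> rows = Arrays.asList("zebra", "apple", "mango", "grape", "pear", "banana");
        System.out.println(partitionAndSort(rows, starts));
    }
}
```

In a real job this grouping and sorting would be handled by the MapReduce shuffle with a partitioner built from the table's actual region boundaries, rather than in memory as shown here.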
Issue Links
- is related to PIG-3067 HBaseStorage should be split up to become more manageable (Open)