Details
Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Affects Version: 3.1.0
None
None
Environment: Test Environment.
Description
Hi Team,
We are using Apache Hive (version 3.1.0.3.1.0.0-78) and trying to use Bloom filters with it. I have created a managed table with the following table properties:
'orc.bloom.filter.columns'='*******', 'orc.bloom.filter.fpp'='0.05', 'orc.stripe.size'='268435456',
and stored it as an ORC file. When I check the explain plan (by running: explain select count(1) from the_table where <condition>) in the current Hive version, I do not see anything like "Bloom_Filter" in the plan produced by the CBO. The table I am querying contains records.
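For reference, the setup described above looks roughly like the sketch below. The column names and types are hypothetical placeholders, the masked Bloom filter column list is kept exactly as masked, and <condition> stands for the actual predicate, so the statements are illustrative rather than a copy of the real DDL.

```sql
-- Sketch only: columns are hypothetical placeholders, not the real schema.
CREATE TABLE the_table (
  id   BIGINT,
  name STRING
)
STORED AS ORC
TBLPROPERTIES (
  'orc.bloom.filter.columns'='*******',  -- masked column list, as in the report
  'orc.bloom.filter.fpp'='0.05',         -- target false-positive probability
  'orc.stripe.size'='268435456'          -- 256 MB stripes
);

-- The plan was inspected with (replace <condition> with the real predicate):
EXPLAIN SELECT count(1) FROM the_table WHERE <condition>;
```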
I have a few questions:
- Is Hive 3.1 not using the Bloom filter? I ask because the same query with the same condition takes longer on a normal table than on a table that has a Bloom filter defined on the column used in the condition.
- Is there any parameter that needs to be set for the Bloom filter to take effect on the table?
- I have come across three parameters; please let me know what they signify:
hive.tez.max.bloom.filter.entries, hive.tez.min.bloom.filter.entries, hive.tez.bloom.filter.factor
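The values these parameters currently have in a session can be printed by issuing SET with the parameter name and no value (for example from Beeline). This only shows what is configured; it does not explain what the parameters do.

```sql
-- Print the current value of each parameter in this session.
SET hive.tez.max.bloom.filter.entries;
SET hive.tez.min.bloom.filter.entries;
SET hive.tez.bloom.filter.factor;
```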
If anyone has used Bloom filters in Hive, please let me know what the process is.