Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
- Affects Version/s: 2.4.4, 2.4.5
- Fix Version/s: None
- Environment:
1. Linux environment: CentOS Linux release 7.3.1611 or CentOS Linux release 7.5.1804
2. Spark client environment: Spark-2.4.4-bin-hadoop2.6 or Spark-2.4.5-bin-hadoop2.6
3. Hadoop environment: hadoop-2.6.0-cdh5.8.4
4. Hive environment: hive-1.1.0-cdh5.8.4
5. Java environment: jdk1.8.0_181
6. Python environment: python 2.7.5
Description
The problem reproduces as follows:
- create table test_1(id int, name string) partitioned by (profile string)
- insert into test_1 values (1, null)
- select * from test_1 where profile is null
After these steps, the query returns nothing. But if the condition profile='__HIVE_DEFAULT_PARTITION__' is added, the row is returned.
The temporary workaround:
select * from test_1 where profile is null or profile='__HIVE_DEFAULT_PARTITION__'
returns the correct result.
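The behavior can be illustrated with a minimal Python sketch (not Spark's actual code): Hive stores a null partition value under the sentinel name __HIVE_DEFAULT_PARTITION__, so a filter that compares raw partition values never sees a real null unless the sentinel is decoded first. The partition values below are hypothetical sample data.

```python
# Hive's sentinel name for a null partition value.
DEFAULT_PARTITION_NAME = "__HIVE_DEFAULT_PARTITION__"

# Partition values as the metastore would list them (hypothetical data):
# one real value and one null partition stored under the sentinel.
partitions = ["2020-01-01", DEFAULT_PARTITION_NAME]

# Naive "is null" filter over the raw values: the null partition is
# missed because it is stored as a non-null string.
naive_is_null = [p for p in partitions if p is None]

def decode(p):
    # Map the sentinel back to null before evaluating the predicate.
    return None if p == DEFAULT_PARTITION_NAME else p

# Decoding first makes the "is null" filter find the null partition.
correct_is_null = [decode(p) for p in partitions if decode(p) is None]

print(naive_is_null)    # []
print(correct_is_null)  # [None]
```

This mirrors the report: filtering on profile is null finds nothing, while matching the sentinel string directly does.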
Special notes:
1. The phenomenon above occurs only when the partition field type is string.
2. The same operations work correctly in Hive.
Problem localization:
As far as I can tell, the problem is in org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils and org.apache.spark.sql.catalyst.catalog.CatalogTablePartition, especially the toRow function in CatalogTablePartition.
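A hypothetical Python re-implementation (not Spark's source) of the suspected toRow-style conversion shows why only string partition columns are affected: casting the sentinel to a numeric type fails and accidentally yields null, while for a string column the sentinel survives the cast unchanged. The function names and types here are illustrative assumptions.

```python
DEFAULT_PARTITION_NAME = "__HIVE_DEFAULT_PARTITION__"

def cast_partition_value_buggy(raw, dtype):
    # Suspected behavior: cast the raw metastore string to the column
    # type without decoding the sentinel first.
    if dtype == "int":
        try:
            return int(raw)
        except ValueError:
            # The failed cast happens to turn the sentinel into null,
            # so non-string partition columns look correct by accident.
            return None
    # String column: the sentinel leaks through as a plain string.
    return raw

def cast_partition_value_fixed(raw, dtype):
    # Decode the sentinel to null first, regardless of the column type.
    if raw == DEFAULT_PARTITION_NAME:
        return None
    return int(raw) if dtype == "int" else raw

print(cast_partition_value_buggy(DEFAULT_PARTITION_NAME, "int"))     # None
print(cast_partition_value_buggy(DEFAULT_PARTITION_NAME, "string"))  # __HIVE_DEFAULT_PARTITION__
print(cast_partition_value_fixed(DEFAULT_PARTITION_NAME, "string"))  # None
```

Under this reading, the "fixed" variant matches special note 1: once the sentinel is decoded for every type, a string partition column behaves the same as any other.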