Spark / SPARK-9762

ALTER TABLE cannot find column


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 1.4.1
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None
    • Environment: Ubuntu on AWS

    Description

      ALTER TABLE tbl CHANGE cannot find a column that DESCRIBE lists.

      In the case of a table generated with HiveContext.read.json(), the output of DESCRIBE dimension_components is:

      comp_config	struct<adText:string,adTextLeft:string,background:string,brand:string,button_color:string,cta_side:string,cta_type:string,depth:string,fixed_under:string,light:string,mid_text:string,oneline:string,overhang:string,shine:string,style:string,style_secondary:string,style_small:string,type:string>
      comp_criteria	string
      comp_data_model	string
      comp_dimensions	struct<data:string,integrations:array<string>,template:string,variation:bigint>
      comp_disabled	boolean
      comp_id	bigint
      comp_path	string
      comp_placementData	struct<mod:string>
      comp_slot_types	array<string>
      

      However, alter table dimension_components change comp_dimensions comp_dimensions struct<data:string,integrations:array<string>,template:string,variation:bigint,z:string>; fails with:

      15/08/08 23:13:07 ERROR exec.DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: Invalid column reference comp_dimensions
      	at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3584)
      	at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:312)
      	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
      	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
      	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
      	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
      	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
      	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
      	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
      	at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:345)
      	at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:326)
      	at org.apache.spark.sql.hive.client.ClientWrapper.withHiveState(ClientWrapper.scala:155)
      	at org.apache.spark.sql.hive.client.ClientWrapper.runHive(ClientWrapper.scala:326)
      	at org.apache.spark.sql.hive.client.ClientWrapper.runSqlHive(ClientWrapper.scala:316)
      	at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:473)
      ...
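      For reference, the requested change appends exactly one field, z:string, to the existing struct type. A minimal plain-Python sketch (illustration only, not Spark or Hive API code; the helper name is made up) of deriving the new Hive type string from the one DESCRIBE reports:

```python
def add_struct_field(struct_type: str, name: str, hive_type: str) -> str:
    """Append one field to a Hive struct<...> type string (illustration only)."""
    if not (struct_type.startswith("struct<") and struct_type.endswith(">")):
        raise ValueError("not a struct type string")
    # Drop the trailing '>' and splice in the new name:type pair.
    return f"{struct_type[:-1]},{name}:{hive_type}>"

old = ("struct<data:string,integrations:array<string>,"
       "template:string,variation:bigint>")
# Produces the exact type used in the failing ALTER TABLE statement above.
print(add_struct_field(old, "z", "string"))
```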
      

      Meanwhile, SHOW COLUMNS in dimension_components lists two columns: col (which does not exist in the table) and z, which was just added.

      This suggests that DDL operations in Spark SQL use table metadata inconsistently. (A possible explanation, consistent with the "Not A Problem" resolution: tables created through the DataFrame reader are registered in the Hive metastore as datasource tables whose real schema is stored in table properties, with only a placeholder column visible to Hive-side DDL, so Hive's ALTER TABLE and SHOW COLUMNS never see the JSON-inferred columns.)

      The full spark-sql output is attached to the issue.


            People

              Assignee: Unassigned
              Reporter: Simeon Simeonov (simeons)
              Votes: 6
              Watchers: 5
