[SPARK-29409] spark drop partition always throws Exception - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Incomplete
Affects Version/s: 2.4.0
Fix Version/s: None
Component/s: SQL
Labels:
- bulk-closed
Environment:

spark 2.4.0 on yarn 2.7.3

spark-sql client mode

run hive version: 2.1.1

hive builtin version 1.2.1

Description

The table is:

CREATE TABLE `test_spark.test_drop_partition`(
 `platform` string,
 `product` string,
 `cnt` bigint)
PARTITIONED BY (dt string)
stored as orc;

hive 2.1.1:

spark-sql -e "alter table test_spark.test_drop_partition drop if exists partition(dt='2019-10-08')"

hive builtin:

spark-sql --conf spark.sql.hive.metastore.version=1.2.1 --conf spark.sql.hive.metastore.jars=builtin -e "alter table test_spark.test_drop_partition drop if exists partition(dt='2019-10-08')"

both would log Exception:

19/10/09 18:21:27 INFO metastore: Opened a connection to metastore, current connections: 1
19/10/09 18:21:27 INFO metastore: Connected to metastore.
19/10/09 18:21:27 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
 at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
 at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:178)
 at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:106)
 at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:70)
 at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_partitions_ps_with_auth(ThriftHiveMetastore.java:2433)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_ps_with_auth(ThriftHiveMetastore.java:2420)
 at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsWithAuthInfo(HiveMetaStoreClient.java:1199)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
 at com.sun.proxy.$Proxy30.listPartitionsWithAuthInfo(Unknown Source)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2265)
 at com.sun.proxy.$Proxy30.listPartitionsWithAuthInfo(Unknown Source)
 at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:2333)
 at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:2359)
 at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1$$anonfun$16.apply(HiveClientImpl.scala:560)
 at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1$$anonfun$16.apply(HiveClientImpl.scala:555)
 at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
 at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
 at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
 at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
 at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
 at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1.apply$mcV$sp(HiveClientImpl.scala:555)
 at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1.apply(HiveClientImpl.scala:550)
 at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1.apply(HiveClientImpl.scala:550)
 at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:275)
 at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:213)
 at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:212)
 at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:258)
 at org.apache.spark.sql.hive.client.HiveClientImpl.dropPartitions(HiveClientImpl.scala:550)
 at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$dropPartitions$1.apply$mcV$sp(HiveExternalCatalog.scala:972)
 at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$dropPartitions$1.apply(HiveExternalCatalog.scala:970)
 at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$dropPartitions$1.apply(HiveExternalCatalog.scala:970)
 at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
 at org.apache.spark.sql.hive.HiveExternalCatalog.dropPartitions(HiveExternalCatalog.scala:970)
 at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.dropPartitions(ExternalCatalogWithListener.scala:203)
 at org.apache.spark.sql.catalyst.catalog.SessionCatalog.dropPartitions(SessionCatalog.scala:846)
 at org.apache.spark.sql.execution.command.AlterTableDropPartitionCommand.run(ddl.scala:546)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195)
 at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3365)
 at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
 at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
 at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3364)
 at org.apache.spark.sql.Dataset.<init>(Dataset.scala:195)
 at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80)
 at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
 at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
 at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62)
 at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:371)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
 at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:274)
 at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
 at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
 at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
 at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
 at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
 at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
 at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
 at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
19/10/09 18:21:28 INFO metastore: Trying to connect to metastore with URI thrift://127.0.0.1:9083
19/10/09 18:21:28 INFO metastore: Opened a connection to metastore, current connections: 2
19/10/09 18:21:28 INFO metastore: Connected to metastore.
Time taken: 3.715 seconds

But this sql is normal

spark-sql -e "select * from test_spark.test_drop_partition where dt='2019-10-08' limit 3; alter table test_spark.test_drop_partition drop if exists partition(dt='2019-10-08');"

Attachments

Issue Links

relates to

HIVE-24404 Hive getUserName close db makes client operations lost metaStoreClient connection

Closed

spark drop partition always throws Exception

Details

Description

Attachments

Issue Links

Activity

People

Dates