Details
Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 1.17.1
Description
Summary:
Reading a Flink Parquet filesystem table registered in the Hive Metastore fails.
The problem:
When I try to read a Flink Parquet filesystem table registered in the Hive Metastore, I get the following exception:
java.lang.RuntimeException: One or more fetchers have encountered exception
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.checkErrors(SplitFetcherManager.java:261) ~[flink-connector-files-1.17.1.jar:1.17.1]
    at org.apache.flink.connector.base.source.reader.SourceReaderBase.getNextFetch(SourceReaderBase.java:169) ~[flink-connector-files-1.17.1.jar:1.17.1]
    at org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:131) ~[flink-connector-files-1.17.1.jar:1.17.1]
    at org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:417) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:550) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745) ~[flink-dist-1.17.1.jar:1.17.1]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) ~[flink-dist-1.17.1.jar:1.17.1]
    at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_345]
Caused by: java.lang.NoSuchMethodError: shaded.parquet.org.apache.thrift.TBaseHelper.hashCode(J)I
    at org.apache.parquet.format.ColumnChunk.hashCode(ColumnChunk.java:812) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at java.util.AbstractList.hashCode(AbstractList.java:541) ~[?:1.8.0_345]
    at org.apache.parquet.format.RowGroup.hashCode(RowGroup.java:704) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at java.util.HashMap.hash(HashMap.java:340) ~[?:1.8.0_345]
    at java.util.HashMap.put(HashMap.java:613) ~[?:1.8.0_345]
    at org.apache.parquet.format.converter.ParquetMetadataConverter.generateRowGroupOffsets(ParquetMetadataConverter.java:1411) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.parquet.format.converter.ParquetMetadataConverter.access$600(ParquetMetadataConverter.java:144) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.parquet.format.converter.ParquetMetadataConverter$3.visit(ParquetMetadataConverter.java:1461) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.parquet.format.converter.ParquetMetadataConverter$3.visit(ParquetMetadataConverter.java:1437) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.parquet.format.converter.ParquetMetadataConverter$RangeMetadataFilter.accept(ParquetMetadataConverter.java:1207) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:1437) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:583) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:777) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.parquet.hadoop.ParquetFileReader.open(ParquetFileReader.java:658) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.flink.formats.parquet.ParquetVectorizedInputFormat.createReader(ParquetVectorizedInputFormat.java:127) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.flink.formats.parquet.ParquetVectorizedInputFormat.createReader(ParquetVectorizedInputFormat.java:75) ~[flink-sql-parquet-1.17.1.jar:1.17.1]
    at org.apache.flink.connector.file.table.FileInfoExtractorBulkFormat.createReader(FileInfoExtractorBulkFormat.java:109) ~[flink-connector-files-1.17.1.jar:1.17.1]
    at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.checkSplitOrStartNext(FileSourceSplitReader.java:112) ~[flink-connector-files-1.17.1.jar:1.17.1]
    at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:65) ~[flink-connector-files-1.17.1.jar:1.17.1]
    at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58) ~[flink-connector-files-1.17.1.jar:1.17.1]
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:162) ~[flink-connector-files-1.17.1.jar:1.17.1]
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:114) ~[flink-connector-files-1.17.1.jar:1.17.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_345]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_345]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_345]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_345]
    ... 1 more
Possible reason:
When I start the cluster with the "-verbose:class" JVM option, I get the class-loading messages shown below.
# how I start the cluster
FLINK_ENV_JAVA_OPTS='-verbose:class' bin/start-cluster.sh

[Loaded shaded.parquet.org.apache.thrift.TBaseHelper from file:/Users/guozhenyang/Tools/flink-1.17.1/lib/flink-sql-connector-hive-3.1.3_2.12-1.17.1.jar]
[Loaded org.apache.parquet.format.ColumnChunk from file:/Users/guozhenyang/Tools/flink-1.17.1/lib/flink-sql-parquet-1.17.1.jar]
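The same information can also be checked programmatically. The sketch below is only an illustration, not part of Flink: the class name ClassOriginCheck and its helper methods are hypothetical. It reports where a class was loaded from and whether it exposes a public method matching the descriptor in the error (hashCode(J)I, that is, hashCode taking a long and returning an int). It assumes the method is public and that the program runs with the same classpath, in the same order, as the failing TaskManager; otherwise the result may differ.

```java
import java.security.CodeSource;

public class ClassOriginCheck {

    /** Returns the location a class was loaded from, or a placeholder when the JVM exposes none. */
    public static String originOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // JDK classes loaded by the bootstrap loader have no code source.
        return (src == null || src.getLocation() == null)
                ? "(bootstrap or unknown)"
                : src.getLocation().toString();
    }

    /** True if the class has a public method (declared or inherited) with this name and parameter types. */
    public static boolean hasMethod(Class<?> cls, String name, Class<?>... paramTypes) {
        try {
            cls.getMethod(name, paramTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** Builds a one-line report for the given class and method signature. */
    public static String report(String className, String methodName, Class<?>... paramTypes) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class<?> cls = Class.forName(className, false, ClassOriginCheck.class.getClassLoader());
            return className + " loaded from " + originOf(cls)
                    + "; " + methodName + " present: " + hasMethod(cls, methodName, paramTypes);
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " not found on this classpath";
        }
    }

    public static void main(String[] args) {
        // The class and method named in the NoSuchMethodError above.
        System.out.println(report("shaded.parquet.org.apache.thrift.TBaseHelper", "hashCode", long.class));
    }
}
```

If the report shows the class coming from the Hive connector jar with "present: false", that would match the NoSuchMethodError in the stack trace.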
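I assume there may be a conflict between the libthrift classes bundled in flink-sql-connector-hive-3.1.3_2.12-1.17.1.jar and those bundled in flink-sql-parquet-1.17.1.jar: TBaseHelper is loaded from the Hive connector jar, while ColumnChunk, which calls it, is loaded from the Parquet jar.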
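One way to test this assumption is to list every jar in the lib directory that bundles the class. The following is a minimal sketch using only the JDK; DuplicateClassFinder and jarsContaining are hypothetical names, and the default "lib" path is an assumption about where the program is run from. If more than one jar is printed, the class is duplicated and the copy that wins depends on classpath order.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarFile;

public class DuplicateClassFinder {

    /** Returns the sorted file names of all *.jar files in libDir that contain the given class. */
    public static List<String> jarsContaining(Path libDir, String className) throws IOException {
        // A class a.b.C is stored in a jar as the entry a/b/C.class.
        String entry = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jf = new JarFile(jar.toFile())) {
                    if (jf.getEntry(entry) != null) {
                        hits.add(jar.getFileName().toString());
                    }
                }
            }
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Defaults are assumptions: run from the Flink home directory, look for the class from the error.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        String cls = args.length > 1 ? args[1] : "shaded.parquet.org.apache.thrift.TBaseHelper";
        for (String jar : jarsContaining(libDir, cls)) {
            System.out.println(jar);
        }
    }
}
```

This only shows which jars contain the class, not which versions of libthrift they bundle; comparing the two copies would still require inspecting the class files themselves.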