Description
Hello Technical Support team,
This is a critical production issue we are facing on Spark version 1.5.1. A query fails with a Java runtime exception: class "org.apache.hadoop.fs.s3a.S3AFileSystem" not found. The same query works perfectly on Spark version 1.3.1. Is this a known issue in Spark 1.5.1? I have opened a case with Cloudera (CDH), but they do not fully support this version yet. We use spark-shell (Scala) a lot these days, so end users would prefer this environment for executing their HQL, and most of our datasets reside in S3 buckets. Note that there is no complaint when the dataset is read from HDFS (Hadoop FS), so the problem seems to be related to my Spark configuration or something similar. Please help identify the root cause and a solution. More technical information follows for review:
scala> val rdf1 = sqlContext.sql("Select * from ntcom.nc_currency_dim").collect()
rdf1: Array[org.apache.spark.sql.Row] = Array([-1,UNK,UNKNOWN,UNKNOWN,0.74,1.35,1.0,1.0,DBUDAL,11-JUN-2014 20:36:41,JHOSLE,2008-03-26 00:00:00.0,105.0,6.1,2014-06-11 20:36:41,2015-07-08 22:10:02,N], [-1,UNK,UNKNOWN,UNKNOWN,1.0,1.0,1.0,1.0,PDHAVA,08-JUL-2015 22:10:03,JHOSLE,2008-03-26 00:00:00.0,null,null,2015-07-08 22:10:03,3000-01-01 00:00:00,Y], [1,DKK,Danish Krone,Danish Krone,0.13,7.46,0.180965147453,5.53,DBUDAL,11-JUN-2014 20:36:41,NCBATCH,2007-01-16 00:00:00.0,19.0,1.1,2014-06-11 20:36:41,2015-07-08 22:10:02,N], [1,DKK,Danish Krone,Danish Krone,0.134048257372654,7.46,0.134048257372654,7.46,PDHAVA,08-JUL-2015 22:10:03,NCBATCH,2007-01-16 00:00:00.0,null,null,2015-07-08 22:10:03,3000-01-01 00:00:00,Y], [2,EUR,Euro,EMU currency (Euro),1.0,1.0,1.35,0.74,DBUDAL,11-JUN-2014 20:36:41,NCBA...
scala> val rdf1 = sqlContext.sql("Select * from dev_ntcom.nc_currency_dim").collect()
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2578)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$getTableOption$1$$anonfun$2.apply(ClientWrapper.scala:303)
at scala.Option.map(Option.scala:145)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1980)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2072)
... 120 more
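
Since the trace above is a plain ClassNotFoundException, one way to narrow it down from the same spark-shell session is to check whether the class is visible to the driver at all. The snippet below is only a minimal sketch: `classAvailable` is a hypothetical helper (not part of Spark or Hadoop), and it only inspects the driver JVM, not the executors.

```scala
import scala.util.Try

// Hypothetical helper (not from Spark or Hadoop): returns true when the named
// class can be loaded through the current thread's context class loader.
// initialize = false means the class is located and loaded, but its static
// initializers are not run.
def classAvailable(name: String): Boolean =
  Try(Class.forName(name, false, Thread.currentThread().getContextClassLoader)).isSuccess

// The class named in the exception above.
val s3aClass = "org.apache.hadoop.fs.s3a.S3AFileSystem"
println(s"$s3aClass visible on driver: ${classAvailable(s3aClass)}")
```

If this prints false, the likely explanation is that the hadoop-aws jar (the Hadoop module that contains S3AFileSystem) and its AWS SDK dependency are not on the Spark 1.5.1 classpath, although they may be on the 1.3.1 one. In that case they could be supplied with `--jars` or through `spark.driver.extraClassPath` and `spark.executor.extraClassPath`; the exact jar locations depend on the CDH installation and are not known from this report.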
15/11/05 20:31:01 ERROR log: error in initSerDe: org.apache.hadoop.hive.serde2.SerDeException Encountered exception determining schema. Returning signal schema to indicate problem: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
org.apache.hadoop.hive.serde2.SerDeException: Encountered exception determining schema. Returning signal schema to indicate problem: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs
at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:524)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:391)