[PHOENIX-5222] java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 4.14.0
    • Fix Version/s: None
    • Component/s: None

    Description

      I am running my Spark code to read data from Phoenix, in an environment with Spark 2.3.0 installed. Running it in IntelliJ does not work; it throws this error:
      java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame

      My Spark code:

      package com.ahct.hbase

      import org.apache.spark.sql._

      object Test1 {
        def main(args: Array[String]): Unit = {

          // ZooKeeper quorum for the HBase cluster that Phoenix runs on.
          val zkUrl = "192.168.240.101:2181"
          val spark = SparkSession.builder()
            .appName("SparkPhoenixTest1")
            .master("local[2]")
            .getOrCreate()

          // Load the Phoenix table via the phoenix-spark data source; the table
          // name is double-quoted because it is lower-case in Phoenix.
          val df = spark.read.format("org.apache.phoenix.spark")
            .option("zkUrl", zkUrl)
            .option("table", "\"bigdata\".\"tbs1\"")
            .load()

          df.show()
        }
      }
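
      From what I can tell, the class itself no longer exists in Spark 2.x: org.apache.spark.sql.DataFrame is just a type alias for Dataset[Row] (declared in the org.apache.spark.sql package object), so no DataFrame class file ships on a Spark 2.x classpath. That suggests something on my classpath, most likely the phoenix-spark jar, was compiled against Spark 1.x, where DataFrame was a real class. A minimal check, assuming only spark-sql 2.x on the classpath (the object name is mine):

      package com.ahct.hbase

      import org.apache.spark.sql.{DataFrame, Dataset, Row}

      object DataFrameAliasCheck {
        def main(args: Array[String]): Unit = {
          // Compiles only because DataFrame and Dataset[Row] are the same type in Spark 2.x.
          implicitly[DataFrame =:= Dataset[Row]]
          // Loading the class by name fails, which is exactly what the closure
          // serializer is doing in the stack trace below.
          try Class.forName("org.apache.spark.sql.DataFrame")
          catch { case e: ClassNotFoundException => println(s"Not a class in Spark 2.x: $e") }
        }
      }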
      

      My pom.xml, which correctly specifies the Spark version as 2.3.0:

      
          <properties>
              <maven.compiler.source>1.8</maven.compiler.source>
              <maven.compiler.target>1.8</maven.compiler.target>
              <encoding>UTF-8</encoding>
              <scala.version>2.11.8</scala.version>
              <spark.version>2.3.0</spark.version>
              <hadoop.version>2.6.0-cdh5.14.2</hadoop.version>
              <hive.version>1.1.0-cdh5.14.2</hive.version>
          </properties>
      
          <dependencies>
              <dependency>
                  <groupId>org.apache.hadoop</groupId>
                  <artifactId>hadoop-client</artifactId>
                  <version>${hadoop.version}</version>
              </dependency>
      
              <dependency>
                  <groupId>org.apache.spark</groupId>
                  <artifactId>spark-sql_2.11</artifactId>
                  <version>${spark.version}</version>
              </dependency>
      
              <dependency>
                  <groupId>org.scala-lang</groupId>
                  <artifactId>scala-library</artifactId>
                  <version>${scala.version}</version>
              </dependency>
      
              <dependency>
                  <groupId>org.apache.hive</groupId>
                  <artifactId>hive-exec</artifactId>
                  <version>${hive.version}</version>
              </dependency>
      
              <dependency>
                  <groupId>org.apache.hive</groupId>
                  <artifactId>hive-jdbc</artifactId>
                  <version>${hive.version}</version>
              </dependency>
      
              <dependency>
                  <groupId>org.postgresql</groupId>
                  <artifactId>postgresql</artifactId>
                  <version>42.2.5</version>
              </dependency>
      
              <dependency>
                  <groupId>org.apache.phoenix</groupId>
                  <artifactId>phoenix-spark</artifactId>
                  <version>4.14.0-cdh5.14.2</version>
              </dependency>
      
              <dependency>
                  <groupId>org.apache.twill</groupId>
                  <artifactId>twill-api</artifactId>
                  <version>0.8.0</version>
              </dependency>
      
              <dependency>
                  <groupId>joda-time</groupId>
                  <artifactId>joda-time</artifactId>
                  <version>2.9.9</version>
              </dependency>
      
              <!-- Test -->
              <dependency>
                 <groupId>junit</groupId>
                 <artifactId>junit</artifactId>
                 <version>4.8.1</version>
                 <scope>test</scope>
              </dependency>
      
          </dependencies>
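
      To double-check which jars actually end up on the runtime classpath, here is a small probe I can run with the same run configuration (the object name is mine, purely for illustration):

      package com.ahct.hbase

      object JarProbe {
        def main(args: Array[String]): Unit = {
          // Print the jar each relevant class is loaded from, to rule out a
          // stale Spark 1.x artifact being pulled in transitively.
          for (name <- Seq("org.apache.spark.sql.Dataset",
                           "org.apache.phoenix.spark.PhoenixRelation")) {
            val location = Class.forName(name).getProtectionDomain.getCodeSource.getLocation
            println(s"$name -> $location")
          }
        }
      }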
      

      Here is the full console output from IntelliJ showing the error:

      "C:\Program Files\Java\jdk1.8.0_111\bin\java" "-javaagent:D:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2017.3.5\lib\idea_rt.jar=61050:D:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2017.3.5\bin" -Dfile.encoding=UTF-8 -classpath C:\Users\ZX~1\AppData\Local\Temp\classpath.jar com.ahct.hbase.Test3
      Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
      19/03/31 19:59:49 INFO SparkContext: Running Spark version 2.3.0
      19/03/31 19:59:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      19/03/31 19:59:50 INFO SparkContext: Submitted application: SparkPhoenixTest3
      19/03/31 19:59:50 INFO SecurityManager: Changing view acls to: ZX
      19/03/31 19:59:51 INFO SecurityManager: Changing modify acls to: ZX
      19/03/31 19:59:51 INFO SecurityManager: Changing view acls groups to: 
      19/03/31 19:59:51 INFO SecurityManager: Changing modify acls groups to: 
      19/03/31 19:59:51 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(ZX); groups with view permissions: Set(); users  with modify permissions: Set(ZX); groups with modify permissions: Set()
      19/03/31 19:59:53 INFO Utils: Successfully started service 'sparkDriver' on port 61072.
      19/03/31 19:59:53 INFO SparkEnv: Registering MapOutputTracker
      19/03/31 19:59:53 INFO SparkEnv: Registering BlockManagerMaster
      19/03/31 19:59:53 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
      19/03/31 19:59:53 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
      19/03/31 19:59:53 INFO DiskBlockManager: Created local directory at C:\Users\ZX\AppData\Local\Temp\blockmgr-7386bf6d-b0f4-40b0-b015-ed0191990e1c
      19/03/31 19:59:53 INFO MemoryStore: MemoryStore started with capacity 899.7 MB
      19/03/31 19:59:53 INFO SparkEnv: Registering OutputCommitCoordinator
      19/03/31 19:59:54 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
      19/03/31 19:59:54 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
      19/03/31 19:59:54 INFO Utils: Successfully started service 'SparkUI' on port 4042.
      19/03/31 19:59:54 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://DESKTOP-7M1BH3H:4042
      19/03/31 19:59:54 INFO Executor: Starting executor ID driver on host localhost
      19/03/31 19:59:54 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 61085.
      19/03/31 19:59:54 INFO NettyBlockTransferService: Server created on DESKTOP-7M1BH3H:61085
      19/03/31 19:59:54 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
      19/03/31 19:59:54 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None)
      19/03/31 19:59:54 INFO BlockManagerMasterEndpoint: Registering block manager DESKTOP-7M1BH3H:61085 with 899.7 MB RAM, BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None)
      19/03/31 19:59:54 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None)
      19/03/31 19:59:54 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, DESKTOP-7M1BH3H, 61085, None)
      19/03/31 19:59:55 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/D:/ahty/AHCT/code/scala-test/spark-warehouse/').
      19/03/31 19:59:55 INFO SharedState: Warehouse path is 'file:/D:/ahty/AHCT/code/scala-test/spark-warehouse/'.
      19/03/31 19:59:56 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
      19/03/31 19:59:58 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 419.0 KB, free 899.3 MB)
      19/03/31 19:59:59 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 29.4 KB, free 899.3 MB)
      19/03/31 19:59:59 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on DESKTOP-7M1BH3H:61085 (size: 29.4 KB, free: 899.7 MB)
      19/03/31 19:59:59 INFO SparkContext: Created broadcast 0 from newAPIHadoopRDD at PhoenixRDD.scala:49
      19/03/31 19:59:59 INFO deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
      19/03/31 19:59:59 INFO deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
      19/03/31 19:59:59 INFO QueryLoggerDisruptor: Starting  QueryLoggerDisruptor for with ringbufferSize=8192, waitStrategy=BlockingWaitStrategy, exceptionHandler=org.apache.phoenix.log.QueryLoggerDefaultExceptionHandler@7b5cc918...
      19/03/31 19:59:59 INFO ConnectionQueryServicesImpl: An instance of ConnectionQueryServices was created.
      19/03/31 20:00:00 INFO RecoverableZooKeeper: Process identifier=hconnection-0x1cbc5693 connecting to ZooKeeper ensemble=192.168.240.101:2181
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:zookeeper.version=3.4.5-cdh5.14.2--1, built on 03/27/2018 20:39 GMT
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:host.name=DESKTOP-7M1BH3H
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.version=1.8.0_111
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.vendor=Oracle Corporation
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.home=C:\Program Files\Java\jdk1.8.0_111\jre
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.class.path=C:\Users\ZX~1\AppData\Local\Temp\classpath.jar;D:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2017.3.5\lib\idea_rt.jar
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.library.path=C:\Program Files\Java\jdk1.8.0_111\bin;C:\WINDOWS\Sun\Java\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\Python27\;C:\Python27\Scripts;C:\Program Files (x86)\Intel\iCLS Client\;C:\ProgramData\Oracle\Java\javapath;D:\app\ZX\product\11.2.0\client_1\bin;C:\Program Files (x86)\RSA SecurID Token Common;C:\Program Files\Intel\iCLS Client\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;c:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\;c:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;c:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;D:\Program Files\Git\cmd;C:\Program Files\Mercurial\;D:\Go\bin;C:\TDM-GCC-64\bin;D:\Program Files (x86)\scala\bin;D:\python;D:\python\Scripts;C:\WINDOWS\System32\OpenSSH\;c:\program files\Mozilla Firefox;D:\Program Files\wkhtmltox\bin;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;D:\Program Files\nodejs\;C:\ProgramData\chocolatey\bin;D:\code\ahswww\play5;D:\code\mysql-5.7.24\bin;C:\VisualSVN Server\bin;C:\Users\ZX\AppData\Local\Microsoft\WindowsApps;C:\Program Files (x86)\SSH Communications Security\SSH Secure Shell;C:\Users\ZX\AppData\Local\GitHubDesktop\bin;C:\Users\ZX\AppData\Local\Microsoft\WindowsApps;;D:\Program Files\Microsoft VS Code\bin;C:\Users\ZX\AppData\Roaming\npm;C:\Program Files\JetBrains\PyCharm 2018.3.1\bin;;.
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.io.tmpdir=C:\Users\ZX~1\AppData\Local\Temp\
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:java.compiler=<NA>
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:os.name=Windows 10
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:os.arch=amd64
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:os.version=10.0
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:user.name=ZX
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:user.home=C:\Users\ZX
      19/03/31 20:00:00 INFO ZooKeeper: Client environment:user.dir=D:\ahty\AHCT\code\scala-test
      19/03/31 20:00:00 INFO ZooKeeper: Initiating client connection, connectString=192.168.240.101:2181 sessionTimeout=90000 watcher=hconnection-0x1cbc56930x0, quorum=192.168.240.101:2181, baseZNode=/hbase
      19/03/31 20:00:00 INFO ClientCnxn: Opening socket connection to server hadoop001.local/192.168.240.101:2181. Will not attempt to authenticate using SASL (unknown error)
      19/03/31 20:00:00 INFO ClientCnxn: Socket connection established, initiating session, client: /192.168.240.101:61089, server: hadoop001.local/192.168.240.101:2181
      19/03/31 20:00:00 INFO ClientCnxn: Session establishment complete on server hadoop001.local/192.168.240.101:2181, sessionid = 0x169cc35e45e0013, negotiated timeout = 40000
      19/03/31 20:00:01 INFO ConnectionQueryServicesImpl: HConnection established. Stacktrace for informational purposes: hconnection-0x1cbc5693 java.lang.Thread.getStackTrace(Thread.java:1556)
      org.apache.phoenix.util.LogUtil.getCallerStackTrace(LogUtil.java:55)
      org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:427)
      org.apache.phoenix.query.ConnectionQueryServicesImpl.access$400(ConnectionQueryServicesImpl.java:267)
      org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2515)
      org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2491)
      org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
      org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2491)
      org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
      org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
      org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
      java.sql.DriverManager.getConnection(DriverManager.java:664)
      java.sql.DriverManager.getConnection(DriverManager.java:208)
      org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:113)
      org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(ConnectionUtil.java:58)
      org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil.getSelectColumnMetadataList(PhoenixConfigurationUtil.java:354)
      org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:118)
      org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:60)
      org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:431)
      org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
      org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
      org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
      com.ahct.hbase.Test3$.main(Test3.scala:25)
      com.ahct.hbase.Test3.main(Test3.scala)
      
      19/03/31 20:00:02 INFO deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
      Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
      	at java.lang.Class.getDeclaredMethods0(Native Method)
      	at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
      	at java.lang.Class.getDeclaredMethod(Class.java:2128)
      	at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
      	at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
      	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
      	at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
      	at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
      	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
      	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
      	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
      	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
      	at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
      	at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
      	at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
      	at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:342)
      	at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:335)
      	at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
      	at org.apache.spark.SparkContext.clean(SparkContext.scala:2292)
      	at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:371)
      	at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:370)
      	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
      	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
      	at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
      	at org.apache.spark.rdd.RDD.map(RDD.scala:370)
      	at org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:131)
      	at org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:60)
      	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:431)
      	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
      	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
      	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
      	at com.ahct.hbase.Test3$.main(Test3.scala:25)
      	at com.ahct.hbase.Test3.main(Test3.scala)
      Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.DataFrame
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
      	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      	... 36 more
      19/03/31 20:00:08 INFO SparkContext: Invoking stop() from shutdown hook
      19/03/31 20:00:08 INFO SparkUI: Stopped Spark web UI at http://DESKTOP-7M1BH3H:4042
      19/03/31 20:00:08 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
      19/03/31 20:00:08 INFO MemoryStore: MemoryStore cleared
      19/03/31 20:00:08 INFO BlockManager: BlockManager stopped
      19/03/31 20:00:08 INFO BlockManagerMaster: BlockManagerMaster stopped
      19/03/31 20:00:08 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
      19/03/31 20:00:08 INFO SparkContext: Successfully stopped SparkContext
      19/03/31 20:00:08 INFO ShutdownHookManager: Shutdown hook called
      
      Process finished with exit code 1
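
      As a possible workaround, I am considering reading the table through Spark's generic JDBC source with the Phoenix thick driver instead of phoenix-spark, since that path never touches the missing class. A sketch, untested here, reusing the SparkSession from above:

      // Phoenix JDBC URLs take the form jdbc:phoenix:<zookeeper quorum>.
      val jdbcDf = spark.read
        .format("jdbc")
        .option("url", "jdbc:phoenix:192.168.240.101:2181")
        .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
        .option("dbtable", "\"bigdata\".\"tbs1\"")
        .load()
      jdbcDf.show()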
      
      

          People

            Assignee: Unassigned
            Reporter: unknowspeople
            Votes: 0
            Watchers: 4


              Time Tracking

                Original Estimate: 2,160h
                Remaining Estimate: 2,160h
                Time Spent: Not Specified