SPARK-31933: Found another deadlock in Spark Driver

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.6, 3.1.0
    • Fix Version/s: None
    • Component/s: Spark Submit
    • Labels: None

    Description

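      This is another Java-level deadlock we found in the Spark driver. It looks similar to https://issues.apache.org/jira/browse/SPARK-26961, but I think the two are actually different issues.

      The deadlock appears to be caused by FsUrlStreamHandlerFactory. In the thread dump below, both threads pass through FsUrlStreamHandlerFactory.createURLStreamHandler with the same Configuration object: "Driver" holds the Configuration monitor and waits for the system package HashMap, while "SparkUI-60" holds that HashMap and waits for the Configuration monitor.

      One straightforward fix is to change FsUrlStreamHandlerFactory.java#L74 from FileSystem.getFileSystemClass(protocol, conf); to FileSystem.getFileSystemClass(protocol, new Configuration(conf));

      However, I am not sure whether this change is acceptable on the Hadoop side, or whether there is a better way to solve this on the Spark side.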

       

      "SparkUI-60":  waiting to lock monitor 0x00007f511ca22728 (object 0x000000068c6e9060, a org.apache.hadoop.conf.Configuration),  which is held by "Driver"
      
      "Driver":  waiting to lock monitor 0x00007f511c4fe448 (object 0x0000000400079600, a java.util.HashMap),  which is held by "SparkUI-60"
      

       

      "SparkUI-60":
      at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328) 
      - waiting to lock <0x000000068c6e9060> (a org.apache.hadoop.conf.Configuration) 
      at org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684) 
      at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088) 
      at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145) 
      at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363) 
      at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840) 
      at org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74) 
      at java.net.URL.getURLStreamHandler(URL.java:1165) 
      at java.net.URL.<init>(URL.java:422) 
      at java.net.URL.<init>(URL.java:312) 
      at java.net.URL.<init>(URL.java:335) 
      at sun.net.www.ParseUtil.fileToEncodedURL(ParseUtil.java:272) 
      at java.lang.Package$1.run(Package.java:579) 
      at java.lang.Package$1.run(Package.java:570) 
      at java.security.AccessController.doPrivileged(Native Method) 
      at java.lang.Package.defineSystemPackage(Package.java:570) 
      at java.lang.Package.getSystemPackage(Package.java:546) 
      - locked <0x0000000400079600> (a java.util.HashMap) 
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1630) 
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1628) 
      at java.net.URLClassLoader.getAndVerifyPackage(URLClassLoader.java:394) 
      at java.net.URLClassLoader.definePackageInternal(URLClassLoader.java:420) 
      at java.net.URLClassLoader.defineClass(URLClassLoader.java:452) 
      at java.net.URLClassLoader.access$100(URLClassLoader.java:74) 
      at java.net.URLClassLoader$1.run(URLClassLoader.java:369) 
      at java.net.URLClassLoader$1.run(URLClassLoader.java:363) 
      at java.security.AccessController.doPrivileged(Native Method) 
      at java.net.URLClassLoader.findClass(URLClassLoader.java:362) 
      at java.lang.ClassLoader.loadClass(ClassLoader.java:419) 
      - locked <0x000000068dd20910> (a java.lang.Object) 
      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) 
      at java.lang.ClassLoader.loadClass(ClassLoader.java:352) 
      at org.glassfish.jersey.model.internal.ComponentBag.modelFor(ComponentBag.java:483) 
      at org.glassfish.jersey.model.internal.ComponentBag.access$100(ComponentBag.java:89) 
      at org.glassfish.jersey.model.internal.ComponentBag$5.call(ComponentBag.java:408) 
      at org.glassfish.jersey.model.internal.ComponentBag$5.call(ComponentBag.java:398) 
      at org.glassfish.jersey.internal.Errors.process(Errors.java:315) 
      at org.glassfish.jersey.internal.Errors.process(Errors.java:297) 
      at org.glassfish.jersey.internal.Errors.process(Errors.java:228) 
      at org.glassfish.jersey.model.internal.ComponentBag.registerModel(ComponentBag.java:398) 
      at org.glassfish.jersey.model.internal.ComponentBag.register(ComponentBag.java:235) 
      at org.glassfish.jersey.model.internal.CommonConfig.register(CommonConfig.java:420) 
      at org.glassfish.jersey.server.ResourceConfig.register(ResourceConfig.java:425) 
      at org.glassfish.jersey.server.ResourceConfig.registerClasses(ResourceConfig.java:501) 
      at org.glassfish.jersey.server.ResourceConfig$RuntimeConfig.<init>(ResourceConfig.java:1212) 
      at org.glassfish.jersey.server.ResourceConfig$RuntimeConfig.<init>(ResourceConfig.java:1178) 
      at org.glassfish.jersey.server.ResourceConfig.createRuntimeConfig(ResourceConfig.java:1174) 
      at org.glassfish.jersey.server.ApplicationHandler.<init>(ApplicationHandler.java:345) 
      at org.glassfish.jersey.servlet.WebComponent.<init>(WebComponent.java:392) 
      at org.glassfish.jersey.servlet.ServletContainer.init(ServletContainer.java:177) 
      at org.glassfish.jersey.servlet.ServletContainer.init(ServletContainer.java:369) 
      at javax.servlet.GenericServlet.init(GenericServlet.java:244) 
      at org.spark_project.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:643) 
      at org.spark_project.jetty.servlet.ServletHolder.getServlet(ServletHolder.java:499) 
      - locked <0x0000000407a5a840> (a org.spark_project.jetty.servlet.ServletHolder) 
      at org.spark_project.jetty.servlet.ServletHolder.ensureInstance(ServletHolder.java:791) 
      - locked <0x0000000407a5a840> (a org.spark_project.jetty.servlet.ServletHolder) 
      at org.spark_project.jetty.servlet.ServletHolder.prepare(ServletHolder.java:776) 
      at org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:579) 
      at org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) 
      at org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) 
      at org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) 
      at org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) 
      at org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493) 
      at org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) 
      at org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) 
      at org.spark_project.jetty.server.Server.handle(Server.java:539) 
      at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:333) 
      at org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) 
      at org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283) 
      at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108) 
      at org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) 
      at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) 
      at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) 
      at org.spark_project.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) 
      at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) 
      at org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) 
      at java.lang.Thread.run(Thread.java:748)
      
      
      "Driver": 
      at java.lang.Package.getSystemPackage(Package.java:540) 
      - waiting to lock <0x0000000400079600> (a java.util.HashMap) 
      at java.lang.ClassLoader.getPackage(ClassLoader.java:1630) 
      at java.net.URLClassLoader.getAndVerifyPackage(URLClassLoader.java:394) 
      at java.net.URLClassLoader.definePackageInternal(URLClassLoader.java:420) 
      at java.net.URLClassLoader.defineClass(URLClassLoader.java:452) 
      at java.net.URLClassLoader.access$100(URLClassLoader.java:74) 
      at java.net.URLClassLoader$1.run(URLClassLoader.java:369) 
      at java.net.URLClassLoader$1.run(URLClassLoader.java:363) 
      at java.security.AccessController.doPrivileged(Native Method) 
      at java.net.URLClassLoader.findClass(URLClassLoader.java:362) 
      at java.lang.ClassLoader.loadClass(ClassLoader.java:419) 
      - locked <0x000000068d792da0> (a org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1) 
      at org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1.doLoadClass(IsolatedClientLoader.scala:226) 
      at org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1.loadClass(IsolatedClientLoader.scala:215) 
      at java.lang.ClassLoader.loadClass(ClassLoader.java:406) 
      - locked <0x000000068e0ac798> (a java.lang.Object) 
      at java.lang.ClassLoader.loadClass(ClassLoader.java:352) 
      at java.lang.Class.forName0(Native Method) 
      at java.lang.Class.forName(Class.java:348) 
      at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370) 
      at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) 
      at java.util.ServiceLoader$1.next(ServiceLoader.java:480) 
      at javax.xml.parsers.FactoryFinder$1.run(FactoryFinder.java:294) 
      at java.security.AccessController.doPrivileged(Native Method) 
      at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:289) 
      at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:267) 
      at javax.xml.parsers.DocumentBuilderFactory.newInstance(DocumentBuilderFactory.java:120) 
      at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2720) 
      at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2696) 
      at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2579) 
      - locked <0x000000068c6e9060> (a org.apache.hadoop.conf.Configuration) 
      at org.apache.hadoop.conf.Configuration.get(Configuration.java:1091) 
      at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145) 
      at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363) 
      at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840) 
      at org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74) 
      at java.net.URL.getURLStreamHandler(URL.java:1165) 
      at java.net.URL.<init>(URL.java:617) 
      at java.net.URL.<init>(URL.java:508) 
      at java.net.URL.<init>(URL.java:457) 
      at java.net.JarURLConnection.parseSpecs(JarURLConnection.java:175) 
      at java.net.JarURLConnection.<init>(JarURLConnection.java:158) 
      at sun.net.www.protocol.jar.JarURLConnection.<init>(JarURLConnection.java:81) 
      at sun.net.www.protocol.jar.Handler.openConnection(Handler.java:41) 
      at java.net.URL.openConnection(URL.java:1002) 
      at java.net.URLClassLoader.getResourceAsStream(URLClassLoader.java:238) 
      at org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1.doLoadClass(IsolatedClientLoader.scala:221) 
      at org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1.loadClass(IsolatedClientLoader.scala:215) 
      at java.lang.ClassLoader.loadClass(ClassLoader.java:406) 
      - locked <0x000000068d7f5ff8> (a java.lang.Object) 
      at java.lang.ClassLoader.loadClass(ClassLoader.java:352) 
      at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:269) 
      - locked <0x000000068d76eb58> (a org.apache.spark.sql.hive.client.IsolatedClientLoader) 
      at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:384) 
      at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:286) 
      at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66) 
      - locked <0x000000068d3f43a8> (a org.apache.spark.sql.hive.HiveExternalCatalog) 
      at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65) 
      at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:215) 
      at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) 
      at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:215) 
      at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97) 
      - locked <0x000000068d3f43a8> (a org.apache.spark.sql.hive.HiveExternalCatalog) 
      at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214) 
      at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114) 
      - locked <0x00000004058b3d10> (a org.apache.spark.sql.internal.SharedState) 
      at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102) 
      at org.apache.spark.sql.internal.SharedState.globalTempViewManager$lzycompute(SharedState.scala:141) 
      - locked <0x00000004058b3d10> (a org.apache.spark.sql.internal.SharedState) 
      at org.apache.spark.sql.internal.SharedState.globalTempViewManager(SharedState.scala:136) 
      at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anonfun$2.apply(HiveSessionStateBuilder.scala:55) 
      at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anonfun$2.apply(HiveSessionStateBuilder.scala:55) 
      at org.apache.spark.sql.catalyst.catalog.SessionCatalog.globalTempViewManager$lzycompute(SessionCatalog.scala:91) 
      - locked <0x000000068baa3c70> (a org.apache.spark.sql.hive.HiveSessionCatalog) 
      at org.apache.spark.sql.catalyst.catalog.SessionCatalog.globalTempViewManager(SessionCatalog.scala:91) 
      at org.apache.spark.sql.catalyst.catalog.SessionCatalog.isTemporaryTable(SessionCatalog.scala:736) 
      - locked <0x000000068baa3c70> (a org.apache.spark.sql.hive.HiveSessionCatalog) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.isRunningDirectlyOnFiles(Analyzer.scala:763) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.resolveRelation(Analyzer.scala:697) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:729) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:722) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1$$anonfun$apply$1.apply(AnalysisHelper.scala:90) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1$$anonfun$apply$1.apply(AnalysisHelper.scala:90) 
      at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1.apply(AnalysisHelper.scala:89) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1.apply(AnalysisHelper.scala:86) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsUp(AnalysisHelper.scala:86) 
      at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:29) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1$$anonfun$1.apply(AnalysisHelper.scala:87) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1$$anonfun$1.apply(AnalysisHelper.scala:87) 
      at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:329) 
      at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:187) 
      at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:327) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1.apply(AnalysisHelper.scala:87) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsUp$1.apply(AnalysisHelper.scala:86) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsUp(AnalysisHelper.scala:86) 
      at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:29) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:722) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:668) 
      at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:90) 
      at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:87) 
      at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124) 
      at scala.collection.immutable.List.foldLeft(List.scala:84) 
      at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:87) 
      at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:79) 
      at scala.collection.immutable.List.foreach(List.scala:392) 
      at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:79) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:143) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$execute$1.apply(Analyzer.scala:135) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$execute$1.apply(Analyzer.scala:135) 
      at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withLocalMetrics(Analyzer.scala:96) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:134) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:118) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:117) 
      at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201) 
      at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:117) 
      at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57) 
      - locked <0x000000068b9e84a0> (a org.apache.spark.sql.execution.QueryExecution) 
      at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55) 
      at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47) 
      at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:78) 
      at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643) 
      at com.nielsen.mdl.ingestor.merge.MergeExecutor.executor(MergeExecutor.scala:43) 
      at com.nielsen.mdl.ingestor.merge.MergeProcessor$.main(MergeProcessor.scala:42) 
      at com.nielsen.mdl.ingestor.merge.MergeProcessor.main(MergeProcessor.scala) 
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
      at java.lang.reflect.Method.invoke(Method.java:498) 
      at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:684)
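
      The two traces above form a lock-order inversion: each thread takes one lock and then waits for the lock the other thread already holds. For readers who want to see that pattern in isolation, here is a minimal sketch. It does not use Hadoop or Spark. The two ReentrantLock objects are hypothetical stand-ins for the Configuration monitor and the java.util.HashMap monitor, the thread names and the class name are made up for illustration, and ThreadMXBean.findDeadlockedThreads stands in for the JVM deadlock detection that produced the report. Interruptible locks are used only so the demo can break the cycle and exit cleanly; the real monitors in the dump cannot be interrupted this way.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderInversionDemo {
    // Hypothetical stand-ins for the two monitors named in the dump.
    static final ReentrantLock CONF_LOCK = new ReentrantLock();        // plays the Configuration
    static final ReentrantLock PACKAGE_MAP_LOCK = new ReentrantLock(); // plays the HashMap

    // Takes `first`, waits until the other worker also holds its first lock,
    // then tries to take `second`, which the other worker never releases.
    static Thread worker(String name, ReentrantLock first, ReentrantLock second,
                         CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            try {
                first.lockInterruptibly();
                try {
                    bothHoldFirst.countDown();
                    bothHoldFirst.await();
                    second.lockInterruptibly(); // blocks until interrupted
                    second.unlock();
                } finally {
                    first.unlock();
                }
            } catch (InterruptedException e) {
                // Interrupted by the demo to break the cycle; just exit.
            }
        }, name);
        t.setDaemon(true);
        return t;
    }

    // Creates the cycle, asks the JVM which threads are deadlocked,
    // then interrupts both workers and returns the sorted thread names.
    static List<String> findDeadlockedDemoThreads() {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread driver = worker("Driver-demo", CONF_LOCK, PACKAGE_MAP_LOCK, bothHoldFirst);
        Thread ui = worker("SparkUI-demo", PACKAGE_MAP_LOCK, CONF_LOCK, bothHoldFirst);
        driver.start();
        ui.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<>();
        try {
            long deadline = System.nanoTime() + 5_000_000_000L; // give up after 5 s
            while (names.size() < 2 && System.nanoTime() < deadline) {
                Thread.sleep(20);
                names.clear();
                long[] ids = mx.findDeadlockedThreads();
                if (ids == null) {
                    continue; // no cycle detected yet
                }
                for (ThreadInfo info : mx.getThreadInfo(ids)) {
                    if (info != null && info.getThreadName().endsWith("-demo")) {
                        names.add(info.getThreadName());
                    }
                }
            }
            driver.interrupt();
            ui.interrupt();
            driver.join();
            ui.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + findDeadlockedDemoThreads());
    }
}
```

      Running main should report both demo threads as deadlocked. The sketch only reproduces the shape of the cycle; it says nothing about whether the proposed new Configuration(conf) change removes it in the real code path.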
      

       

    People

      Assignee: Unassigned
      Reporter: Wenning Ding (wenningd)