SOLR-5692

StackOverflowError during SolrCloud leader election process

Details

    Description

      I have a SolrCloud cluster with 7 nodes, each with a few thousand cores. I got this StackOverflowError several times when starting one of the nodes. Below is only part of the stack trace; the rest repeats, so the leader election process apparently got stuck in an infinite repetition of the same steps:

      2014-02-04 15:18:01,947 [localhost-startStop-1-EventThread] ERROR org.apache.zookeeper.ClientCnxn- Error while calling watcher
      java.lang.StackOverflowError
      at java.security.AccessController.doPrivileged(Native Method)
      at java.io.PrintWriter.<init>(PrintWriter.java:116)
      at java.io.PrintWriter.<init>(PrintWriter.java:100)
      at org.apache.solr.common.SolrException.toStr(SolrException.java:138)
      at org.apache.solr.common.SolrException.log(SolrException.java:113)
      at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:377)
      at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
      at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
      at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
      at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
      at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
      at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
      at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
      at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
      at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
      at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
      at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
      at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
      at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
      at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
      at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
      at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
      at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
      at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
      at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
      at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
      at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
      at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
      at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
      at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
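
      The repeating frames above form a cycle: rejoinLeaderElection calls joinElection, which reaches runLeaderProcess, which calls rejoinLeaderElection again. The following is a minimal sketch of why that shape can exhaust the stack when every attempt fails, and how a loop-based retry keeps the stack depth constant. All class and method names here are hypothetical and are not Solr's actual code; the real cause in Solr may differ.

```java
// Hypothetical sketch; names are illustrative and are NOT Solr's classes.
public class ElectionRetrySketch {

    /** Stand-in for the leader process; returns true once this node may lead. */
    interface LeaderAttempt {
        boolean tryBecomeLeader(int attempt);
    }

    // Recursive shape, similar to the cycle in the trace: a failed attempt
    // rejoins the election from inside the previous attempt's call frame.
    static int joinElectionRecursive(LeaderAttempt attempt, int n) {
        if (attempt.tryBecomeLeader(n)) {
            return n;
        }
        return joinElectionRecursive(attempt, n + 1); // one more frame per retry
    }

    // Iterative shape: every retry starts from the same stack depth.
    static int joinElectionIterative(LeaderAttempt attempt, int maxAttempts) {
        for (int n = 0; n < maxAttempts; n++) {
            if (attempt.tryBecomeLeader(n)) {
                return n;
            }
        }
        return -1; // gave up after maxAttempts
    }

    public static void main(String[] args) {
        LeaderAttempt neverSucceeds = n -> false;
        try {
            joinElectionRecursive(neverSucceeds, 0);
        } catch (StackOverflowError e) {
            System.out.println("recursive retry overflowed the stack");
        }
        System.out.println(joinElectionIterative(neverSucceeds, 1_000_000));
    }
}
```

      In the sketch, the recursive variant fails with StackOverflowError when no attempt ever succeeds, while the iterative variant gives up after a bounded number of attempts.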

      Attachments

        1. recovery-stackoverflow.txt
          42 kB
          Jessica Cheng Mallet

            People

              Assignee: Unassigned
              Reporter: Bojan Smid (bosmid)
              Votes: 3
              Watchers: 4