Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Cannot Reproduce
-
Impala 2.5.0
-
None
Description
When running high concurrency workload I noticed that query planning is taking unproportionally longer than for single user.
Upon profiling the Planner I noticed that lots of threads are contending on a lock in Log4j.
By default Log4j synchronizes writes which via a lock which causes this contention.
With default Impala configuration the planner and other components write lots of info/warn messages to the log which causes heavy contention under concurrency.
By changing "Verbose Log Level" to 0 and "Logging Threshold" to fatal concurrent queries used in the test got a 50% speedup.
Planner timeline with default settings
Planner Timeline: 51s007ms
- Analysis finished: 6s020ms (6s020ms)
- Equivalence classes computed: 6s224ms (203.960ms)
- Single node plan created: 39s137ms (32s912ms)
- Runtime filters computed: 39s138ms (1.798ms)
- Distributed plan created: 50s659ms (11s520ms)
- Lineage info computed: 50s660ms (1.157ms)
- Planning finished: 51s007ms (346.339ms)
Planner timeline with logging disable
Planner Timeline: 161.667ms
- Analysis finished: 26.854ms (26.854ms)
- Equivalence classes computed: 27.760ms (906.310us)
- Single node plan created: 145.224ms (117.463ms)
- Runtime filters computed: 146.017ms (792.772us)
- Distributed plan created: 148.244ms (2.227ms)
- Lineage info computed: 149.043ms (798.857us)
- Planning finished: 161.667ms (12.623ms)
Stack Trace | Count | Duration |
---|---|---|
org.apache.log4j.Category.callAppenders(LoggingEvent) | 91,612 | 32,628,691,909,737 |
org.apache.log4j.Category.forcedLog(String, Priority, Object, Throwable) | 91,612 | 32,628,691,909,737 |
org.apache.log4j.Category.log(String, Priority, Object, Throwable) | 91,612 | 32,628,691,909,737 |
org.slf4j.impl.Log4jLoggerAdapter.debug(String) | 66,573 | 23,520,171,003,106 |
com.cloudera.impala.planner.HdfsScanNode.computeStats(Analyzer) | 16,545 | 5,762,818,131,400 |
com.cloudera.impala.planner.DistributedPlanner.createHashJoinFragment(HashJoinNode, PlanFragment, PlanFragment, long, ArrayList) | 15,689 | 5,615,955,980,208 |
com.cloudera.impala.planner.JoinNode.computeStats(Analyzer) | 12,864 | 4,716,775,262,309 |
com.cloudera.impala.planner.HdfsScanNode.computeNumNodes(Analyzer, long) | 5,528 | 1,888,005,994,277 |
com.cloudera.impala.planner.SingleNodePlanner.createJoinPlan(Analyzer, TableRef, List, List) | 5,212 | 1,818,139,839,257 |
com.cloudera.impala.analysis.SelectStmt.analyze(Analyzer) | 2,754 | 953,068,475,018 |
com.cloudera.impala.analysis.SelectStmt.analyzeAggregation(Analyzer) | 2,755 | 950,232,280,428 |
com.cloudera.impala.analysis.AggregateInfo.create(ArrayList, ArrayList, TupleDescriptor, Analyzer) | 2,745 | 945,034,393,921 |
com.cloudera.impala.service.Frontend.createExecRequest(TQueryCtx, StringBuilder) | 651 | 224,074,644,208 |
com.cloudera.impala.planner.Planner.createPlan() | 449 | 155,274,962,910 |
com.cloudera.impala.planner.Planner.computeResourceReqs(List, boolean, TQueryExecRequest) | 419 | 154,737,940,541 |
com.cloudera.impala.planner.DistributedPlanner.createPlanFragments(PlanNode) | 408 | 146,583,876,650 |
com.cloudera.impala.service.Frontend.analyzeStmt(TQueryCtx) | 312 | 101,646,742,787 |
com.cloudera.impala.service.JniFrontend.createExecRequest(byte[]) | 211 | 76,250,810,597 |
com.cloudera.impala.planner.UnionNode.computeStats(Analyzer) | 31 | 11,571,668,595 |
org.slf4j.impl.Log4jLoggerAdapter.warn(String) | 23,955 | 8,755,149,728,646 |
com.cloudera.impala.planner.PlanNode.multiplyCardinalities(long, long) | 23,955 | 8,755,149,728,646 |
org.slf4j.impl.Log4jLoggerAdapter.info(String, Object[]) | 1,084 | 353,371,177,985 |
Possible workaround
https://logging.apache.org/log4j/2.x/manual/async.html
JMC profile attached.
Attachments
Attachments
Issue Links
- is related to
-
IMPALA-6446 Enable asynchronous logging
- Open