Uploaded image for project: 'TinkerPop'
  1. TinkerPop
  2. TINKERPOP-1801

OLAP profile() step return incorrect timing

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 3.3.0, 3.2.6
    • 3.2.7, 3.3.1
    • hadoop
    • None

    Description

      Graph ProfileStep calculates time of next()/hasNext() calls, expecting recursion.
      But Message passing/RDD joins is used by GraphComputer.
      So next() does not recursively call next steps, but message is generated. And most of the time is taken by message passing (RDD join).
      Thus on graph computer the time between ProfileStep should be measured, not inside it.

      The other approach is to get Spark statistics with SparkListener and add spark stages timings into profiler metrics. that will work only for spark but will give better representation of step costs.
      The simple fix is measuring time between OLAP iterations and add it to the profiler step.
      This will not take into account computer setup time, but will be precise enough for long running queries.

      To reproduce:
      tinkerPop 3.2.6 gremlin:

      plugin activated: tinkerpop.server
      plugin activated: tinkerpop.utilities
      plugin activated: tinkerpop.spark
      plugin activated: tinkerpop.tinkergraph
      gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-grateful-gryo.properties')
      gremlin> g = graph.traversal().withComputer(SparkGraphComputer)
      ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
      gremlin> g.V().out().out().count().profile()
      ==>Traversal Metrics
      Step                                                               Count  Traversers       Time (ms)    % Dur
      =============================================================================================================
      GraphStep(vertex,[])                                                 808         808           2.025    18.35
      VertexStep(OUT,vertex)                                              8049         562           4.430    40.14
      VertexStep(OUT,edge)                                              327370        7551           4.581    41.50
      CountGlobalStep                                                        1           1           0.001     0.01
                                                  >TOTAL                     -           -          11.038        -
      gremlin> clock(1){g.V().out().out().count().next() }
      ==>3421.92758
      gremlin>
      

      Attachments

        Activity

          People

            okram Marko A. Rodriguez
            artem.aliev Artem Aliev
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: