[TINKERPOP-1616] Strengthen semantics around lazy iteration and graph modifications - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Later
Affects Version/s: 3.2.3
Fix Version/s: None
Component/s: structure
Labels:
None

Description

The investigation started with the a bothE query where Sqlg returned different results to TinkerGraph

@Test
public void testLazy1() {
    final TinkerGraph graph = TinkerGraph.open();
    final Vertex a1 = graph.addVertex(T.label, "A");
    final Vertex b1 = graph.addVertex(T.label, "B");
    final Vertex c1 = graph.addVertex(T.label, "C");
    a1.addEdge("ab", b1);
    a1.addEdge("ac", c1);

    AtomicInteger count = new AtomicInteger(0);
    graph.traversal().V(a1).bothE().forEachRemaining(edge -> {
        a1.addEdge("ab", b1);
        c1.addEdge("ac", a1);
        count.getAndIncrement();
    });
    Assert.assertEquals(2, count.get());
}

For this query TinkerGraph returns 2 and passes.
Sqlg however returns 3. The reason being that it lazily iterates the out() first and then the in().

The following gremlin is the same but using a union(out(), in()) instead of bothE()

@Test
public void testLazy2() {
    final TinkerGraph graph = TinkerGraph.open();
    final Vertex a1 = graph.addVertex(T.label, "A");
    final Vertex b1 = graph.addVertex(T.label, "B");
    final Vertex c1 = graph.addVertex(T.label, "C");
    a1.addEdge("ab", b1);
    a1.addEdge("ac", c1);

    AtomicInteger count = new AtomicInteger(0);
    graph.traversal().V(a1).union(__.outE(), __.inE()).forEachRemaining(edge -> {
        a1.addEdge("ab", b1);
        c1.addEdge("ac", a1);
        count.getAndIncrement();
    });
    Assert.assertEquals(2, count.get());
}

In this case TinkerGraph returns 4 and Sqlg 6

TinkerGraph returns 4 as it first walks the 2 out edges and adds 2 in edges which it sees when traversing the in().

Sqlg return 6 as it lazier than TinkerGraph.
It first walks the "ac" out edge and adds in the 2 edges.
Then walks "ab" and gets 2 edges. The original and the one added previously.
It then walks "ac" in and gets 3 edges as 3 has been added so far.
All and all 6.

I am not sure whats the expected semantics. Sqlg is lazier than TinkerGraph but not completely lazy either as it depends more on the meta data and number of queries it needs to execute to walk a particular gremlin query.

I am somewhat of the opinion that without enforcing a eager iteration when modifying a graph the semantics will be different for different implementors.

For Sqlg at least it will be hard for clients to predict the behavior.

Attachments

Issue Links

is related to

TINKERPOP-1685 Introduce optional feature to allow for upserts without read-before-write

Closed

Activity

People

Assignee:: Stephen Mallette

Reporter:: pieter martin

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 26/Jan/17 17:46

Updated:: 09/Jul/19 11:48

Resolved:: 09/Jul/19 11:48