MESOS-9543

Consider improving benchmark result output.


Details

    • Type: Improvement
    • Status: Accepted
    • Priority: Minor
    • Resolution: Unresolved
    • Component/s: test

    Description

      We should consider improving how benchmarks report their results.

      As an example, consider SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1. It logs lines like

      [==========] Running 10 tests from 1 test case.
      [----------] Global test environment set-up.
      [----------] 10 tests from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
      [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/0
      Using 1000 agents and 1 frameworks
      Added 1 frameworks in 526091ns
      Added 1000 agents in 61.116343ms
      round 0 allocate() took 14.70722ms to make 0 offers after filtering 1000 offers
      round 1 allocate() took 15.055396ms to make 0 offers after filtering 1000 offers
      [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/0 (135 ms)
      [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1
      

      I believe there are a number of usability issues with this output format:

      • lines with benchmark data need to be grepped out of the test log, each according to some test-specific format
      • test parameters need to be manually inferred from the test name
      • no consistent time unit is used throughout; instead, Duration values are simply pretty-printed (note the mix of 526091ns and 61.116343ms above)

      This makes it hard to consume these results in a generic way (e.g., for plotting or comparison), since one likely needs to implement a custom log parser for each test.
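
      For illustration, extracting just the allocate() lines above already requires a throwaway parser along these lines (a sketch in Python; the regular expression and the hard-coded ms unit are specific to this one benchmark):

      import re

      # Matches lines like "round 0 allocate() took 14.70722ms to make
      # 0 offers after filtering 1000 offers". Durations pretty-printed
      # in units other than "ms" would need extra handling.
      LINE = re.compile(
          r"round (\d+) allocate\(\) took ([\d.]+)ms "
          r"to make (\d+) offers after filtering (\d+) offers")

      def parse_allocate_rounds(log_path):
          rounds = []
          with open(log_path) as log:
              for line in log:
                  match = LINE.search(line)
                  if match:
                      rounds.append({
                          "round": int(match.group(1)),
                          "time_ms": float(match.group(2)),
                          "offers": int(match.group(3)),
                          "filtering": int(match.group(4)),
                      })
          return rounds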

      We should consider introducing a generic way to log results from tests that requires minimal intervention.

      One possible output format could be JSON, as it allows combining heterogeneous data as in the above example (which might be harder to do in CSV). A number of standard tools exist for filtering JSON data, and it can also be read by many data analysis tools (e.g., pandas). An example for the above data:

      {
          "case": "SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test",
          "test": "DeclineOffers/0",
          "parameters": [1000, 1],
          "benchmarks": {
              "add_agents": [61.116343],
              "add_frameworks": [0.0526091],
              "allocate": [
                  {"round": 0, "time": 14.70722, "offers": 0, "filtering": 1000},
                  {"round": 1, "time": 15.055396, "offers": 0, "filtering": 1000}
              ]
          }
      }
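
      A record like this is straightforward to consume with off-the-shelf tooling. For example, a sketch using pandas to flatten the per-round allocate() measurements (result.json is a hypothetical file holding the record above):

      import json

      import pandas as pd

      # Load the benchmark record shown above from a (hypothetical)
      # file on disk.
      with open("result.json") as f:
          record = json.load(f)

      # One row per allocate() round, with columns "round", "time",
      # "offers", and "filtering".
      allocate = pd.json_normalize(record["benchmarks"]["allocate"])
      allocate["test"] = record["test"]
      print(allocate)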
      

      Such data could be logged at the end of the test execution with a clear prefix, so that results from many benchmark runs can be aggregated from a single log file with tools like grep. We could provide this in addition to what is already logged (which might be generated by the same tool).
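
      For example, a minimal aggregator sketch (the BENCHMARK_RESULT prefix and the benchmarks.log file name are hypothetical; the actual marker would be whatever the logging helper settles on):

      import json

      # Hypothetical marker prepended to every result line; anything
      # unambiguous and grep-friendly would do.
      PREFIX = "BENCHMARK_RESULT: "

      def parse_results(log_path):
          """Yield one parsed JSON record per prefixed log line."""
          with open(log_path) as log:
              for line in log:
                  if line.startswith(PREFIX):
                      yield json.loads(line[len(PREFIX):])

      # Aggregate results from a combined test log (hypothetical name).
      for result in parse_results("benchmarks.log"):
          print(result["case"], result["test"], result["parameters"])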


            People

              Assignee: Unassigned
              Reporter: Benjamin Bannier (bbannier)
