The BenchExec wall-clock time (aka elapsed-seconds, aka secs) is recorded outside the program being measured, so it is independent of that program.
Instead, timestamps can be recorded in-process, inside the program being measured. Each program must be edited to include measurement code, which may differ from one programming language to another.
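As a minimal sketch of what "edited to include measurement code" might look like in Java: the calculation is bracketed by two System.nanoTime() calls, and the elapsed time is written to stderr so the program's normal output is unchanged. The class name InProcessTimer, the calculate workload and the default argument are hypothetical stand-ins, not code from any measured program.

```java
final class InProcessTimer {
    // Hypothetical stand-in for the real calculation (e.g. an n-body simulation).
    static double calculate(int n) {
        double sum = 0.0;
        for (int i = 1; i <= n; i++) {
            sum += 1.0 / ((double) i * i);
        }
        return sum;
    }

    // Runs one calculation and returns the elapsed seconds, measured in-process.
    static double timeOnce(int n) {
        long start = System.nanoTime();
        double result = calculate(n);
        long elapsedNanos = System.nanoTime() - start;
        // Use the result so the calculation cannot be discarded as dead code.
        if (Double.isNaN(result)) {
            throw new IllegalStateException("unexpected NaN");
        }
        return elapsedNanos / 1e9;
    }

    public static void main(String[] args) {
        // Assumed default workload size; pass a different one on the command line.
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 50_000_000;
        // Report on stderr so the program's stdout stays as it was.
        System.err.printf("in-process secs: %.3f%n", timeOnce(n));
    }
}
```

The time reported this way leaves out whatever happens before main starts and after it returns, which is why it can be compared against the outside measurement.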
Let's ascribe differences between the independent measurements and in-process measurements to startup and shutdown costs. For example, some n-body programs —
Once programs have been edited to include measurement code, successive calculations can be measured. (In practice we found that the fifth iteration … for default workload … exhibit well-warmed up behavior …)
Let's ascribe differences between the 1st calculation and the 6th calculation to warmup. For example, some n-body programs —
The in-process measurements for those particular tiny, tiny Java programs are tenths of a second faster than the BenchExec wall-clock measurements. The calculations take seconds and tens of seconds.
Sometimes fast startup does matter.
… a major release of the DaCapo benchmark suite for Java that took fourteen years to develop…
Remember, the benchmarks game only shows a tiny number of tiny, tiny programs. So with humility: Let's perform 24 successive calculations and record System.nanoTime() before and after each calculation (excluding startup and shutdown costs, and warmup).
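One way this might be sketched: run the same calculation 24 times in a single process, reading System.nanoTime() immediately before and after each run. The names SuccessiveTimings, measure and workload are hypothetical, and the workload is a placeholder rather than any of the measured programs.

```java
import java.util.function.DoubleSupplier;

final class SuccessiveTimings {
    // Times `count` successive runs of `calculation` within one process.
    // Returns the elapsed nanoseconds of each run, in order.
    static long[] measure(DoubleSupplier calculation, int count) {
        long[] nanos = new long[count];
        double sink = 0.0;
        for (int i = 0; i < count; i++) {
            long before = System.nanoTime();
            sink += calculation.getAsDouble();
            long after = System.nanoTime();
            nanos[i] = after - before;
        }
        // Use the accumulated results so the runs cannot be optimized away.
        if (Double.isNaN(sink)) {
            System.err.println("unexpected NaN");
        }
        return nanos;
    }

    // Hypothetical placeholder workload; substitute the real calculation.
    static double workload() {
        double sum = 0.0;
        for (int i = 1; i <= 20_000_000; i++) {
            sum += Math.sqrt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] nanos = measure(SuccessiveTimings::workload, 24);
        for (int i = 0; i < nanos.length; i++) {
            System.err.printf("%2d  %.3f secs%n", i + 1, nanos[i] / 1e9);
        }
        // Difference between the 1st and the 6th calculation.
        System.err.printf("1st - 6th: %.3f secs%n", (nanos[0] - nanos[5]) / 1e9);
    }
}
```

Because the timestamps sit inside the process, JVM startup and shutdown fall outside every interval. Which early iterations to set aside as warmup is a judgment call that this sketch leaves to the reader.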
The way the tiny programs were written does seem to matter:
fannkuch-redux
Successive calculations with the fannkuch-redux #8 program became slower; calculations with the fannkuch-redux #3 program were stable; and after the first calculation, the fannkuch-redux #2 program became faster.
Some experimental studies show 10.9% of process executions don't reach a steady state of peak performance; and 43.5% of process executions were inconsistent; and sometimes they are slower than what came before.
spectral-norm
Written for one core:
And written for multicore:
How much difference does it make for these tiny programs? On the one hand, for measurements of tenths of a second, a difference of tenths of a second is huge. On the other hand, for measurements of seconds and tens of seconds, JVM startup, JIT, OSR… are quickly effective.