How programs are measured
Each program is first run and measured at the smallest input value, with program output redirected to a file and compared to the expected output. As long as the output matches the expected output, the program is then run and measured at the next larger input value, until measurements have been made at every input value.
If the program gives the expected output within an arbitrary cutoff time (120 seconds) the program is measured again (5 more times) with output redirected to
If the program doesn't give the expected output within an arbitrary timeout (usually one hour) the program is forced to quit. If measurements at a smaller input value have been successful within an arbitrary cutoff time (120 seconds), the program is measured again (5 more times) at that smaller input value, with output redirected to
The measurements shown on the website are either:
For sure, programs taking 4 and 5 hours were only measured once!
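The procedure described above can be sketched roughly as follows. This is one reading of the text, not the site's actual script: the names `run_once` and `measure` are hypothetical, the repeat runs are assumed to happen only at the largest input value that succeeded, and the repeat output is simply captured and ignored because the text does not say where it goes.

```python
import subprocess
import sys
import time

CUTOFF_SECS = 120     # cutoff time from the text
TIMEOUT_SECS = 3600   # "usually one hour"
REPEATS = 5           # "5 more times"


def run_once(cmd, n, timeout):
    """Run cmd with input value n. Return (secs, stdout), or None on timeout."""
    start = time.perf_counter()
    try:
        done = subprocess.run(cmd + [str(n)], capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None  # the program was forced to quit
    return time.perf_counter() - start, done.stdout


def measure(cmd, expected, cutoff=CUTOFF_SECS, timeout=TIMEOUT_SECS, repeats=REPEATS):
    """expected maps each input value to the expected output bytes.

    Returns {input value: [secs, ...]} for the input values that succeeded.
    """
    results = {}
    for n in sorted(expected):
        first = run_once(cmd, n, timeout)
        if first is None or first[1] != expected[n]:
            break  # timeout or wrong output: stop moving to larger inputs
        results[n] = [first[0]]

    # Assumption: repeat only at the largest successful input value,
    # and only if the first measurement there was within the cutoff.
    if results:
        n = max(results)
        if results[n][0] <= cutoff:
            for _ in range(repeats):
                again = run_once(cmd, n, timeout)
                if again is not None:
                    results[n].append(again[0])
    return results
```

With this sketch, a program that fails at the third input value ends up with one measurement at the first value and six at the second.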
How programs are timed
Each program is run as a child-process of a Python script using
secs - The time is taken before forking the child-process and after the child-process exits, using
cpu - The script child-process usr+sys rusage time is taken using os.wait3. Rarely (for example OCaml), that may not measure all processes forked from the script child-process.
busy - The GTop cpu idle and GTop cpu total are taken before forking the child-process and after the child-process exits. The busy value is the sum of GTop cpu not-idle for each core, scaled by secs.
(Note: Those measurements include startup time).
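A minimal sketch of the secs and cpu measurements, assuming a Unix system and a single child process. The function name `time_child` is hypothetical, and `time.perf_counter` is an assumed choice of clock; only `os.wait3` is named in the text above.

```python
import os
import subprocess
import sys
import time


def time_child(cmd):
    """Return (secs, cpu) for one run of cmd as a child process."""
    start = time.perf_counter()           # taken before forking the child
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
    _pid, status, usage = os.wait3(0)     # blocks until a child exits
    secs = time.perf_counter() - start    # taken after the child exits

    # os.wait3 already reaped the child, so record the exit code on the
    # Popen object to stop it from waiting again.
    if os.WIFEXITED(status):
        proc.returncode = os.WEXITSTATUS(status)
    else:
        proc.returncode = -os.WTERMSIG(status)

    cpu = usage.ru_utime + usage.ru_stime  # usr+sys rusage time
    return secs, cpu


if __name__ == "__main__":
    secs, cpu = time_child([sys.executable, "-c", "sum(range(10**6))"])
    print(f"secs={secs:.3f} cpu={cpu:.3f}")
```

Both values include interpreter or runtime startup, which matches the note about startup time.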
How program memory use is measured
Memory use is measured by sampling GLIBTOP_PROC_MEM_RESIDENT for the program and its child processes every 0.2 seconds. Those measurements are unlikely to be reliable for programs that run for less than 0.2 seconds.
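The sampling approach can be illustrated with a short sketch. It does not use GTop: as a stand-in, it reads `VmRSS` from `/proc` on Linux, it samples only the program itself (not its child processes), and it assumes the reported figure is the largest sample seen. The names `read_rss_kib` and `peak_resident` are hypothetical.

```python
import subprocess
import sys
import time


def read_rss_kib(pid):
    """Linux-only stand-in for GLIBTOP_PROC_MEM_RESIDENT: VmRSS in KiB."""
    try:
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass  # process already gone, or /proc not available
    return 0


def peak_resident(cmd, interval=0.2, read_rss=read_rss_kib):
    """Run cmd and return the largest resident-memory sample seen."""
    proc = subprocess.Popen(cmd)
    peak = 0
    while proc.poll() is None:            # sample until the program exits
        peak = max(peak, read_rss(proc.pid))
        time.sleep(interval)
    return peak
```

A program that finishes before the second sample is taken contributes at most one reading, which is why very short runs give unreliable figures.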
How source code size is measured
We start with the source-code markup you can see, remove comments, remove duplicate whitespace characters, and then apply minimum GZip compression. The measurement is the size in bytes of that GZip compressed source-code file.
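As a rough illustration of those steps, here is a sketch under two loud assumptions: comments are taken to be `#` to end of line (real comment removal is language-specific, and this naive pattern would also strip a `#` inside a string literal), and "minimum GZip compression" is taken to mean compression level 1. The function name `source_size` is hypothetical.

```python
import gzip
import re


def source_size(source: str) -> int:
    """Return the size in bytes of the gzip-compressed, cleaned-up source."""
    # Assumption: '#' starts a comment that runs to the end of the line.
    no_comments = re.sub(r"#[^\n]*", "", source)
    # Collapse each run of one repeated whitespace character to a single one.
    squeezed = re.sub(r"(\s)\1+", r"\1", no_comments)
    # Assumption: "minimum" compression means gzip level 1.
    return len(gzip.compress(squeezed.encode("utf-8"), compresslevel=1))
```

Under this sketch, `source_size("x  =  1 # c\n")` and `source_size("x = 1 \n")` give the same result, because the comment and the doubled spaces are removed before compression.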
Thanks to Brian Hurt for the idea of using size of compressed source code instead of lines of code.
Table: median source code gzip (July 2018)
(Note: There is some evidence that complexity metrics don't provide any more information than SLoC or LoC.)
How CPU load is measured
The GTop cpu idle and GTop cpu total are taken before forking the child-process and after the child-process exits. The percentages represent the proportion of cpu not-idle to cpu total for each core.
IdleTime are taken before forking the child-process and after the child-process exits. The percentage represents the proportion of
UserTime + IdleTime (because that's like the percentage you'll see in Task Manager).
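The per-core percentage for the GTop case is simple arithmetic on the before and after counters. The sketch below assumes the counters have already been read into per-core `(idle, total)` pairs; it does not call GTop itself, and the name `cpu_load_percent` is hypothetical.

```python
def cpu_load_percent(before, after):
    """before/after: one (idle, total) counter pair per core.

    Returns the not-idle share of total, as a percentage, for each core.
    """
    loads = []
    for (idle0, total0), (idle1, total1) in zip(before, after):
        total = total1 - total0
        idle = idle1 - idle0
        loads.append(100.0 * (total - idle) / total if total else 0.0)
    return loads


before = [(100, 200), (50, 200)]
after = [(150, 400), (250, 400)]
print(cpu_load_percent(before, after))  # [75.0, 0.0]
```

In the example, the first core was idle for 50 of 200 ticks (75% load) and the second core was idle for all 200 ticks (0% load).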