Time Remaining…
on 09.10.04, 01:52pm in software • comments (1)
I’m sitting here watching a fairly large file copy (2-3GB) over the network to my laptop, and wondering how the ‘Minutes Remaining’ time is actually calculated. Every time I look over, it has some crazy new amount of time left - 235 minutes, then 6 minutes, then 133 minutes, now 82 minutes.
I also often find the ‘Percentage Remaining’ in setup programs even worse - it’ll jump to 100% pretty quickly and then sit there for 2-3 minutes.
Oops, 34 minutes left.
Update: Raymond Chen provides some explanation. Thanks to ‘bilbo’ for pointing that out.
Update 2: Copy finally completed. Dropped from 106 minutes to 45 seconds, then done.




Michael Brundage (September 13, 2004 @ 12:19 am)
Actually, there’s more to it than Raymond suggests. There’s a whole theory out there (queueing theory) that tries to predict when something will happen based on past observations. Unfortunately, software developers don’t use it.
Most time-remaining estimators out there just use the naive algorithm of (work done so far) / (time required so far) = (average rate so far), and then use that to calculate the time remaining = ((total work required) - (work done so far)) / (average rate so far). This is what Windows does, and the discussion thread on Raymond’s site focuses on the problems with the part that estimates the (total work required).
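To make the naive calculation concrete, here is a minimal Python sketch of it. The function name, parameter names, and units are my own for illustration; this is not Windows' actual code, just the arithmetic described above.

```python
def naive_time_remaining(work_done, total_work, elapsed_seconds):
    """Estimate seconds remaining from the overall average rate so far."""
    if work_done <= 0 or elapsed_seconds <= 0:
        return None  # no progress yet, so there is no rate to extrapolate from
    average_rate = work_done / elapsed_seconds  # e.g. megabytes per second
    return (total_work - work_done) / average_rate


# 500 MB of a 2000 MB copy done after 100 s: the average rate is 5 MB/s,
# so the remaining 1500 MB is estimated at 300 s.
print(naive_time_remaining(500, 2000, 100))  # 300.0
```

Note that the result is only as good as total_work, which is the part the thread on Raymond's site is mostly about.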
However, even if they were to improve that part, the time remaining estimate would still suck. To provide a better estimate, Windows would need to use a more sophisticated model (for example, one based on a Poisson distribution) and keep track of recent samples (rather than just an overall average). Such an algorithm not only provides a more reliable estimate of time remaining, but also handles brief bursts of fast or slow activity without screwing up the overall estimate.
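As a rough illustration of "keeping track of recent samples", here is a sketch that smooths the measured rate with an exponential moving average. To be clear, this is a simpler stand-in technique, not a Poisson or queueing-theory model, and the class name, the alpha parameter, and its value are all assumptions made up for the example.

```python
class SmoothedEstimator:
    """Time-remaining estimate from an exponentially smoothed rate (illustrative)."""

    def __init__(self, total_work, alpha=0.25):
        self.total_work = total_work
        self.alpha = alpha      # weight given to the newest rate sample (assumed value)
        self.rate = None        # smoothed rate, in work units per second
        self.last_time = None
        self.last_done = 0.0

    def update(self, now, work_done):
        """Record a progress reading taken at time `now` (in seconds)."""
        if self.last_time is not None and now > self.last_time:
            # Rate observed over the most recent interval only.
            sample = (work_done - self.last_done) / (now - self.last_time)
            if self.rate is None:
                self.rate = sample
            else:
                # Blend the new sample into the running estimate.
                self.rate = self.alpha * sample + (1 - self.alpha) * self.rate
        self.last_time = now
        self.last_done = work_done

    def time_remaining(self):
        if self.rate is None or self.rate <= 0:
            return None  # not enough information yet
        return (self.total_work - self.last_done) / self.rate


est = SmoothedEstimator(total_work=1000)
est.update(0, 0)    # first reading only sets the baseline
est.update(1, 10)
est.update(2, 20)
print(est.time_remaining())  # 98.0
est.update(3, 20)   # one stalled second
print(est.time_remaining())  # larger, but still finite
```

With these numbers, a one-second stall only pulls the smoothed rate from 10 down to 7.5 units per second, whereas an estimator that looked only at the latest interval would see a rate of zero. A smaller alpha smooths more heavily but reacts more slowly to a genuine change in speed.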
You can easily tell whether software uses the naive approach or something smarter by watching the effect of bursts on the time remaining for a long-running task. If a brief slowdown or speedup causes the estimate to fluctuate wildly (even though the task has been running for a while), then you know the developer used a naive estimator.