Better dev estimates for projects with fixed budget/time/scope
Free calculator for improving your project estimates, based on the historical estimates and actual hours. Useful for one-off projects in digital bureaus.
Projects with fixed budget and time are not a frequent occurrence these days anymore, but they are still quite a thing in digital bureaus who sometimes do fixed scope projects. While this sheet could also be used for ongoing projects, in my opinion, this has the highest value for the fixed-scope projects. In this article I am sharing a DIY Google Sheet calculation inspired by Joel Spolky’s article on Evidence-based scheduling (yep, it’s an oldie :).
Hofstadter’s Law: It always takes longer than you expect, even when you take Hofstadter’s Law into account.
Agile methodologies resolve the problem of running over the budget, but not every project can be and should be agile.
Running over budget is associated with huge bloated products and is frowned upon now that SCRUM has become mainstream.
While it’s easy to frown upon working in a waterfall way, it still has time and place these days. Having worked at an agency I encountered a number of clear-cut, ‘waterfall-ish’ projects for which you have a fixed time+budget+scope. This happens when a client has a relatively simple requirements set and needs a website/tool to ‘be there and work well’, without world-changing ambitions that would require heavy continuous development. Much can be said about pros and cons of such an approach, but this is not the goal of this post.
What is Evidence-based scheduling?
To get a proper idea of this concept, read the original blog post from 2007 (yeah, an oldie). In short:
- all developers have a bias when estimating tasks (some are too optimistic, others are too pessimistic).
- Statistically speaking, with enough estimates compared to the actual hours it is possible to see a pattern. It is not about how good or bad the estimations are of any individual, but how consistent we are with over- or underestimating the work.
- When you compare the past estimates and plot that against the estimated future work, you should be able to get a probability graph of when you should be able to complete the project.
Using the sheet
Make a copy and add it to your Google Drive. Now you can edit it.
How does this sheet work?
Based on the ideas described above, and my basic comprehension of statistics, I made a Google Sheet calculator for this. This is in no way near an actual algorithm that FogBugz seems to have developed, but it does give a clue on how the project might play out. Here’s how it works:
Tab 1: ‘Historical’
- Fill in the previous estimates and the actual time it took to complete the tickets. If the estimates are done by the whole team, skip the column 1 (‘Developer’), if they aren’t, make sure to include proportional amount of example data for every developer.
- We calculate the ‘velocity’, in other words, the relationship between the estimate and the actual hours. We will use this velocity for the simulation on the next tab. There are a couple of calculations on the right side, they too are necessary for the simulation.
- The scatter plot show the relation between the estimate and the actual time it took to complete the ticket. The perfect one will have a trend line sloped at 45 degrees.
Tab 2: ‘Simulation’
- On this page we let the Google Sheets produce 1000 simulations of the time it would take to complete this work based on the average velocity and the overall variability of the velocities in the historical data. If there are outliers in it (say, a task that was estimated at 10 hours/points but took 25), then some of the simulations will be quite extreme and skew the calculation. The more historical data you add, the less influence such outlier would have on the resulting estimate. This simulation is called ‘Monte Carlo simulation’.
- Fill in the amount of developer hours/points per day, amount of days in the sprint and the amount of hours/points that are estimated to be done in the project of interest.
- When you click, type, the simulations will re-calculate, but the overall pattern should stay relatively the same.
How to interpret the result
In the green section you should see the summary of the simulation. ‘95% confidence’ means that in 95% of the simulations the estimate will not be higher than the result given here.
It is entirely possible that the actual hours will be anywhere on the curve, but we are looking for a realistic maximum here so that we can give safe estimates.
Help improve this
My knowledge of statistics and Excel/Google Sheets is limited, and I would love to improve this sheet to have more functions and statistical validity. Let me know in the comments if you have any tips or suggestions!