My latest blog post on the development of Heather Wurtele’s swim times has generated some interest. Rather than explaining how I calculated the Swim Ratings in comments or private emails, I’ve decided to write a longer, technical blog post explaining the algorithm I’m using. Here are the steps to process new results:
- Calculate the race adjustment
- Adjust the individual times
- Calculate the new rating
Each of these steps is explained in more detail in the following sections. As an example, I’m using Heather’s results from IM Lake Placid 2011 in order not to overwhelm you with lots and lots of results. (Lake Placid had 24 Pros; her later races had many more.) The description also applies to my calculations for any of the legs in a triathlon or the total time, but as the question was specific to swim results, I’ll use the swim times as an example.
Calculate the race adjustment
The goal of the race adjustment is to figure out if the race was slow or fast, taking into account things like how accurately the course was measured or how conducive the conditions on race day were to fast times. In order to calculate this number, let’s have a look at the actual results (Pros, both men and women) first:
The next step is adding in the existing swim rating. (Some athletes haven’t got a swim rating yet, so they can’t be used for the adjustment calculation.) Then I can calculate the difference between the rating and the actual swim time and express that difference as a percentage of the rating. Here’s the data after re-sorting the table by that percentage:
Only two athletes were able to beat their swim rating (compared to 16 who took longer), so you can already see that the swim was “slow”. There are a few statistical tricks to come up with a “fair” overall adjustment, such as using the average (-4.23%) or the median (-4.33%). What I’ve found works best is to use a percentage of athletes closest to the median. This way, larger variations than the ones in this data set (e.g. “explosions” on the run) do not play such a big role. Here I end up with a swim adjustment of -4.51%.
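As a rough illustration, this step could be sketched in Python as follows. The function names, the `keep_fraction` parameter and the sample numbers are assumptions made up for the example. The actual percentage of athletes kept around the median is not given here, and averaging the kept values is only one plausible way to combine them.

```python
import math
from statistics import mean, median


def percent_difference(rating_s, actual_s):
    """Difference between rating and actual time, in percent of the rating.

    Positive: the athlete beat the rating. Negative: the athlete was slower.
    """
    return (rating_s - actual_s) / rating_s * 100.0


def race_adjustment(diffs_pct, keep_fraction=0.5):
    """Mean of the values closest to the median.

    keep_fraction is a hypothetical parameter (assumption): the share of
    athletes, counted outward from the median, that enter the average.
    """
    if not diffs_pct:
        raise ValueError("need at least one athlete with an existing rating")
    mid = median(diffs_pct)
    keep = max(1, math.ceil(len(diffs_pct) * keep_fraction))
    closest = sorted(diffs_pct, key=lambda d: abs(d - mid))[:keep]
    return mean(closest)


# Made-up (rating, actual) swim times in seconds, NOT the Lake Placid data.
# The last athlete had a very bad day and is far away from the others.
sample = [
    (3000, 3120), (3300, 3465), (3600, 3744),
    (3200, 3360), (3400, 3553), (3100, 4030),
]
diffs = [percent_difference(rating, actual) for rating, actual in sample]
adjustment = race_adjustment(diffs, keep_fraction=0.5)

print(f"plain average: {mean(diffs):.2f}%")
print(f"median:        {median(diffs):.2f}%")
print(f"adjustment:    {adjustment:.2f}%")
```

In this toy data set the single outlier drags the plain average far below the median, while the value built from the athletes nearest the median stays close to it.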
Adjust the individual times
Once we have calculated the race adjustment, we can apply this adjustment to the individual times:
Basically we have removed all course and condition factors from the time and have arrived at a “neutral” swim time that is comparable between races held on different courses and in different years.
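A minimal sketch of applying the adjustment might look like this. It assumes the adjustment is a percentage of the rating, with negative values meaning a slow race, so the raw time is divided by `1 - adjustment / 100`. That formula is an assumption derived from how the percentage is defined; multiplying by `1 + adjustment / 100` would give a very similar number. The 58:00 input is an invented time, not one from the results.

```python
def adjust_time(actual_s, adjustment_pct):
    """Convert a raw time into a 'neutral' time.

    Assumption: adjustment_pct is (rating - actual) / rating in percent for
    a typical athlete, so a slow race (negative value) shortens the time.
    """
    return actual_s / (1.0 - adjustment_pct / 100.0)


def format_time(seconds):
    """Format a number of seconds as mm:ss, or h:mm:ss beyond one hour."""
    total = round(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


# Invented raw swim time of 58:00 in a race with a -4.51% adjustment.
neutral = adjust_time(58 * 60, -4.51)
print(format_time(58 * 60), "->", format_time(neutral))
```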
Calculate the new rating
Now that we have calculated an adjusted swim time for each athlete’s results, we can pull all of these individual results into a swim rating. To continue with the example, here are Heather’s swim results and adjusted swim times up to Lake Placid 2011:
(We can also see that the result Heather based her original assessment on, St. George 2010, was by far her best swim result.)
The simplest solution is to just take the average of all the results. But then an old result has the same influence as a new result – which doesn’t help much in assessing the current capabilities of a developing athlete or an athlete way past his prime. Therefore, I’m assigning each result a weight based on how old the result is – the older the result, the lower the weight is. I’ve found a value of 0.75 per year works well at reflecting current capabilities without making the ratings change too much. For Heather’s results, the difference between an average and my method is small (54:21 vs. 54:22) but there are examples where the difference is meaningful.
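The weighting could be sketched like this. It reads "0.75 per year" as a weight of `0.75 ** age_in_years`, which is an assumption about the exact formula. The function name and the sample history are invented for the example and are not Heather's results.

```python
def weighted_rating(results, decay_per_year=0.75):
    """Age-weighted average of adjusted times.

    results: list of (age_in_years, adjusted_time_in_seconds).
    Assumption: each result gets the weight decay_per_year ** age, so a
    two-year-old result counts 0.75 * 0.75 = 0.5625 times a fresh one.
    """
    if not results:
        raise ValueError("need at least one result")
    weights = [decay_per_year ** age for age, _ in results]
    weighted_sum = sum(w * t for w, (_, t) in zip(weights, results))
    return weighted_sum / sum(weights)


# Made-up history of an improving athlete: the newest result is the fastest.
history = [(0.0, 3200.0), (1.0, 3300.0), (2.0, 3400.0)]

rating = weighted_rating(history)
plain_average = sum(t for _, t in history) / len(history)

print(f"plain average:   {plain_average:.1f} s")
print(f"weighted rating: {rating:.1f} s")
```

For an improving athlete like the one in this toy history, the weighted rating comes out faster than the plain average because the recent results count more.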
I hope that I was able to explain in detail how I came up with the swim numbers that form the basis of my blog post comparing the different swim results. The calculation itself is pretty complicated and takes a lot of factors and situations into account. This has the disadvantage of making it almost impossible to calculate the numbers by hand, but so far I have not seen a better system. I accept that these numbers might not be “true” and cannot reflect the assessment of an athlete by a trainer who sees the athlete much more often than the few times per year an athlete can race in an Ironman. But a race is where “the rubber meets the road” and where an athlete has to show what all the hard training has been worth. The numbers just indicate whether there was an improvement or not and cannot judge the reasons behind it. Also, I can’t assess the future improvements of an athlete or the quality of a training program. I certainly wish Heather some improvements in her swim time (and overall results), and I’m sure she is busy planning with Paulo on how to improve.