Hashed wheel timers – A short intro.
The Hashed wheel timer is a very interesting data structure used widely especially in network servers. Their low over memory over head and reasonable efficiency guarantees are a good match for servers handling millions of connections with a timer per connection. We will not spend too much time describing how they work, instead we will look at a few implementations and try to evaulate their relative trade-offs. More information about hashed-wheel-timers can be found here.
Let’s quickly recollect the basic data structures of a hashed wheel timer:
The basic data structure looks like a hashmap with separate chaining with lists. Each bucket represents a fixed number of ticks. When a timer is added to the wheel we calculate which bucket it should fall into. Though any hashing scheme would do, we usually just travel forward from the current time pointer one bucket at a time, till we have consumed enough ticks for our timeout to have just triggered. We then insert the timer into the list for that bucket. Given that we have a finite number of buckets, we might have to circle around the timer wheel a few times before we can insert our timer. We will need to store this information with the timer. This is typically stored as a remainingRounds
variable.
The first draft of an interface for out fictional timer is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
|
Usually a timer is implemented with a more generic interface with some kind of a Runnable
task per timer that is executed upon timer expiry. We have made things a lot more specific and require that each timer has an unique id. We also leave the execution of code corresponding to each expired timer up to the user of the library. We could use some other identifier instead of an id. For eg: A pointer to a per connection structure if the timers are for closing stale connections.
So the absolute minimum data we need to have in our timers is:
1 2 3 4 5 6 |
|
Given this basic scheme let’s look at some of the ways we could implement the interface we showed earlier
1. Linked list of timers per bucket.
Each bucket simply points to an intrusive doubly linked list of timers. Our data structures look like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
|
Our algorithms look like
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
|
So our algorithm characteristics in brief are:
- To add a timer, just prepend it to the appropriate linked list –
O(1)
- To cancel a timer – delete it from the doubly linked list –
O(1)
- To clean up all expired timers – Travel list of timers and delete expired ones –
O(expired timers)
. The actual time depends on how many timers hashed onto a bucket. We can put some bounds on this number with general hash table math.
Comments on implementation.
This is our base line. We have significant memory over head in terms of previous/next pointers on our timers. It’s pretty terrible that we have to malloc()
and free()
on each add_timer()
and cancel_timer()
call. Of course the O(1)
calls also hide the fact that we are traversing a linked list during timer expiry which is pretty terrible.
2. Vector of timer pointers per bucket.
Instead of an intrusive linked list we can just store a vector of pointers to timers. We can grow this vector as needed. If memory is not a concern we could allocate a large enough array so that growing is never needed (though still accounted for). Our timer struct now looks like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
|
Our algorithms look like
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
|
So our algorithm characteristics in brief are:
- To add a timer, just append it to the appropriate vector –
O(1)
. - To cancel a timer, delete it from the middle of the vector and move a pointer to occupy the hole –
O(1)
. - To clean up all expired timers – Traverse the vector of timers and delete expired ones –
O(expired timers)
. Again actual time depends on how many timers hashed onto a bucket.
Comments on implementation.
This is a slight improvement on our base line. By using a vector of pointers to timers we have removed the need for the next pointers – a slight improvement in space usage. We still have the terrible pattern of having to use malloc()
and free()
on each add_timer()
and cancel_timer()
call. During expiry of timers we now traverse a vector of pointers to timers instead of a linked list. This is only marginally better than the first implementation since the timers themselves are allocated randomly.
3. Vector of inlined timers.
Let’s see if we can improve effiency by making our interface even more specialized. So far we have returned a Timer*
on an add_timer()
call. Our intented use for the Timer*
is to cancel the timer using this pointer. Thus we will end up stashing this pointer somewhere. Maybe on a per connection structure or a hash map or something. What if we could store a pointer to this timer holder structure (that points to our timer) in our timer? We could then safely move/invalidate the pointer to a timer, updating the only place where this timer was actually referenced. Here we assume that the pointer to the timer was stashed in only one place. This is somewhat like our own little garbage collector updating references to timer objects that were moved around. Let’s look at our new data structures:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
|
Our modified interface now is:
1 2 3 4 5 6 7 8 9 |
|
Our algorithms look like
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
|
So our algorithm characteristics in brief are:
- To add a timer, just scribble some data to the end of the appropriate vector –
O(1)
. - To cancel a timer, copy the last timer in the bucket to the position of the deleted one and adjust references to the moved timer –
O(1)
. - To clean up all expired timers – Traverse the vector of timers and delete expired ones using the
cancel_timer()
call –O(expired timers)
. Again actual time depends on how many timers hashed onto a bucket.
Comments on implementation.
We no longer call malloc()
and free()
on every timer call. Instead we manage large buckets and copy small amounts of data around to manage deletions. We ended up storing a pointer to the holder of the timer object but overall we are not storing any more data that the other implementations. When we got rid of the malloc()
and free()
calls we also got rid of any fragmentation arising from them. Further during expiry processing we iterate through a chunk of contiguous memory instead of pointer chasing.
Conclusion
By analyzing the context of our timer algorithm we were able to come up with a better implementation albeit a more tightly coupled one. Wholistic analysis of a particular piece of software often opens up such opportunities for efficiency gains. Though many a time there is a efficiency
VS other-non-tangibles
trade-off, I’ve been surprised by the number of times such gains have been made without tigher coupling.