On Thu, 2007-07-26 at 09:31 -0600, Alex Rousskov wrote:
> Folks,
>
> There are at least two known difficult-to-reproduce bugs that may be
> explained by asynchronous calls being called out of order. I do not know
> whether out-of-order execution is indeed their cause, but they prompted
> me to investigate event scheduling further.
>
> Currently, asynchronous calls are implemented using addEvent with the
> 'when' parameter set to zero. This means that the event time is set to
> current_dtime in EventScheduler::schedule. However, current_dtime may
> _decrease_ when the system clock is adjusted. If such a decrease happens
> between two asynchronous call submissions, the later call gets an
> earlier timestamp and will be fired first.
>
> I see two ways of fixing this:
>
> 1) Stop using addEvent for asynchronous calls. Add a special queue for
> them and drain the queue every select loop. Pros: straightforward design
> that is probably a little faster than addEvent because we will always
> append the new call instead of searching for the right place in the
> queue. This design will also make it easier to treat asynchronous calls
> specially in the future (e.g., debugging and exception trapping). Cons:
> lots of work and many changes to current code.
>
> 2) Treat when=0 events specially in addEvent. Always place them at the
> beginning of the queue, but after other special events. To mark an event
> as special, we can set its absolute timestamp to zero, for example.
> Pros: much easier to implement. Cons: it is a hack.
>
> I am leaning towards (2) for now because it minimizes the modifications
> and risk. The attached patch implements that option.
>
> Any comments or better ideas?

I think 1) would be good, but your hack for 2) looks eminently sane.
-Rob
-- GPG key available at: <http://www.robertcollins.net/keys.txt>.
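
For readers following the thread, here is a minimal C++ sketch of the ordering problem and of both proposed fixes. It is not the attached patch and not Squid's actual code: ToyScheduler, ToyAsyncQueue, and the Event struct are hypothetical stand-ins, and the 'now' argument plays the role of current_dtime. The sketch also assumes that events with equal timestamps keep their submission order, which the thread does not state.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scheduled event record.
struct Event {
    double when;       // absolute fire time; 0 marks a "special" immediate event
    std::string name;
};

// Toy time-ordered queue. With zeroStampFix == false it mimics the behaviour
// described in the thread (when=0 events are stamped with the current clock).
// With zeroStampFix == true it sketches option (2): when=0 events get an
// absolute timestamp of zero.
class ToyScheduler {
public:
    explicit ToyScheduler(bool zeroStampFix) : zeroStampFix_(zeroStampFix) {}

    // delay == 0 requests an asynchronous call; 'now' stands in for current_dtime.
    void schedule(const std::string &name, double delay, double now) {
        assert(delay >= 0.0);
        const bool immediate = (delay == 0.0);
        const double when = (immediate && zeroStampFix_) ? 0.0 : now + delay;
        // Insert after every event whose timestamp is <= when, so that
        // events with equal timestamps keep their submission order.
        auto pos = queue_.begin();
        while (pos != queue_.end() && pos->when <= when)
            ++pos;
        queue_.insert(pos, Event{when, name});
    }

    // Names of queued events in the order they would fire.
    std::vector<std::string> order() const {
        std::vector<std::string> names;
        for (const Event &e : queue_)
            names.push_back(e.name);
        return names;
    }

private:
    bool zeroStampFix_;
    std::list<Event> queue_;
};

// Sketch of option (1): a dedicated FIFO for asynchronous calls that the
// main loop drains once per iteration. No timestamps are involved, so
// clock adjustments cannot reorder the calls.
class ToyAsyncQueue {
public:
    void schedule(std::function<void()> call) {
        calls_.push_back(std::move(call));
    }

    // Run queued calls in submission order, including any calls that are
    // scheduled while draining.
    void drain() {
        while (!calls_.empty()) {
            std::function<void()> call = std::move(calls_.front());
            calls_.pop_front();
            call();
        }
    }

private:
    std::deque<std::function<void()>> calls_;
};
```

Scheduling "first" with the clock at 100.0 and then "second" with the clock stepped back to 99.0, both with zero delay, makes ToyScheduler(false) fire "second" before "first". ToyScheduler(true) keeps them in submission order and still ahead of any timed event. ToyAsyncQueue sidesteps the issue entirely, at the cost of changing every caller that currently goes through addEvent.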