Re: scalable event scheduling from Rainer Weikusat on 2013-01-30 (squid-dev)

From: Rainer Weikusat <rweikusat_at_mobileactivedefense.com>
Date: Wed, 30 Jan 2013 13:37:22 +0000

Amos Jeffries <squid3_at_treenet.co.nz> writes:
> On 30/01/2013 10:31 a.m., Rainer Weikusat wrote:
>> Rainer Weikusat <rweikusat_at_mobileactivedefense.com> writes:
>>
>> [...]
>>
>>> There are probably all kinds of 'weird, stylistic issues' and
>>> possible, yet unknown bugs,
>> Attached is a slightly updated version of this. Mostly, it fixes an
>> accidentally inverted comparison in down_heap, uses the squid x*
>> memory allocation wrapper routines I found meanwhile and is more
>> careful to avoid possibly invoking external code while the heap is not
>> in a consistent state. It has still only been used on development
>> computers.
>
> Couple of things stand out...
>
> * if tag is always a class ev_tag, why is it being cast to/from void*
> all over the place? there is almost no need for void* to exist in
> C++. Just pre-define class ev_tag; and use "ev_tag &" parameter types
> (as const wherever possible).

The idea behind this is that nothing outside of the event scheduler
code itself should be concerned with what a tag precisely happens to
be right now: It is supposed to be 'some data value' produced as side
effect of scheduling an event and its sole property (except the need
to release it) is that it can be used to cancel a scheduled event. In
other implementations, I've usually used the address of the 'event
object' itself, masked by presenting it to the caller as void *. This
is not possible in squid because the event scheduler doesn't run
'event routines' synchronously but just schedules them for execution
via call queue at some unspecified later time.

The only way to achieve this 'with classes in C++' (I'm aware of)
would require using "an idiom I won't name" (because my sense of
language asthetics revolts against the 'well-known' acronym), ie,
using something like the incomplete example below:

class the_secret_ingredient;

class ev_tag {
private:
the_secret_ingredient *something;
}

or using pointers to a 'class ev_tag'. But that's really just a lot
more boilerplate to express the same thing and it still 'exports'
information to the caller which should really be no business of it.

As far as my present understanding goes, the ev_entry object could
alternatively used the 'callback allocator' and CbcPointers could be
used instead of the tag objects. But I'm not sufficiently familiar
with this yet and I would like to minimise 'collateral changes'
(first, make it work within the existing environment, then, possibly,
make the environment nicer ...).

[...]

> * I know the code you are replacing uses C-style functions for each
> action. But please make things like eventCancel() members of the event
> object being cancelled. Particularly since you are already testing
> whether that object exists before cancelling, there is no extra cost
> of correcting the function scope.

I admit that I don't quite understand this. For this specific example,
the cancel routine can't be part of the event object (another 'inside
implementation detail', actually) because it might not exist anymore
by the time someone tries to cancel the event. It could be a member of
the tag class except that the existing code cancels events based on
the function and argument pointers and knows nothing about tags
(presenly, the sole user of these being an 'internal' version of squid
where the ConnOpener code has to get rid of scheduled connection
timeouts). Also, the actual cancel routine is a private method of the
EventScheduler object because it is really the scheduler which
performs the cancel operation[*].

[*] A more object-oriented implementation could do away with the
function-call based interface altogether: Code desiring to schedule an
event would create an 'event object' with certain
properties. Destroying this object before the event was scheduled
would cancel the event. Anything else would happen 'by magic'. That's
something I'm presently using in evironments where reference counting
comes 'for free' (aka perl). OTOH, some people have called this 'an
anti-pattern' and it would require extensive changes to the existing
code.

> * Also, can you get any performance numbers about the impact of the
> change please?

How should this be done? Considering that there are presently no
'heavy users' of scheduled events precisely because of performance
concerns, the actual effect can be predicted as 'negligible'/
'unmeasurably small'. Inserting an event at the back of the queue is
an O(1) operation (one assignment + one comparison). Inserting an
element at the front of the queue will require a comparison and an
assignment for each 'tree layer' above the current one. Since the heap
is actually a fully-balanced binary tree, there will be log n of these
layers, n being the largest power-of-two <= the number of events in
the queue (including the newly inserted one). Inserting somehing
'somewhere in the middle' will fall in between. Another
micro-benchmark could be written to demonstrate this but this wouldn't
be representative for the actual event scheduling patterns in the
proxy.

As another theoretic example, assuming an event with delay '4.5' is
supposed to be inserted and the queue presently consists of events with
delays 1, 2, 3, 4, 5, 6 and 7. Doing this with a linked list will
require 5 pointer assignments and comparisons in order to locate the
correct position in the list (between 4 and 5) and another two pointer
assignments to link the new event in there. Doing the same with a
heap, original form,

1 2 3 4 5 6 7

requires one assignment,

1 2 3 4 5 6 7 4.5

and a comparison with the element at position 4 (insert position(8) /
2). This element is 4, consequently, nothing more needs to be done
here.

> ... still reading the changes, so there may be more. But that seems
> big enough for now.

The clean method, in its revised form, is seriously buggy (won't free
anything if the heap is presently empty).
Received on Wed Jan 30 2013 - 13:37:36 MST

This archive was generated by hypermail 2.2.0 : Wed Jan 30 2013 - 12:00:08 MST