On Mon, Mar 17, 2008, Robert Collins wrote:
> This reminds me, one of the things I was thinking heavily about a few
> years ago was locality of reference in N-CPU situations. That is, making
> sure we don't cause thrashing unnecessarily. For instance - given
> chunking we can't really avoid seeing all the bytes for a MISS, so does
> it matter whether we process all of the request on one CPU, or part on
> one CPU and part on another? Given NUMA it clearly does matter, but how
> many folk run squid/want to run squid on a NUMA machine?
Everything you buy now is effectively NUMA-like.
> Or, should we make acl lookups come back to the same cpu, but do all the
> acl lookups on one cpu, trading potential locking (a non-read-blocking
> cache can allow result lookups cheaply) for running the same acl code
> over extended sets of acls. (Not quite SIMD, but think about the problem
> from that angle for a bit).
That's what I'd like to benchmark. More importantly, which will work out better:
run-to-completion, like Squid-2 and Squid-3 do now, or running parts of jobs as batch processes?
(e.g., run all pending ACL lookups -now- and queue the results; run all
network IO -now- and queue the results; run all header parsing -now- and
queue the results.)
Sufficiently well-written code can be run both ways - it's just a question of
queuing and scheduling.
Adrian
--
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -

Received on Sun Mar 16 2008 - 20:51:37 MDT
This archive was generated by hypermail pre-2.1.9 : Tue Apr 01 2008 - 13:00:10 MDT