Andres Kroonmaa wrote:
> I expect FS meta hitrate be low - it is increasingly easy to build boxes
> that have disk/ram ratio of 200 and more (200G disk, 1G ram).
> There's been lots of talk about reiserfs, which by all means sounds good,
> but I've not seen any mention of it being supported on any other OS but
> Linux or BSD. Even if we can implement it within squid, I see several
> cases that make me worry about dependence on disk access for lookups.
Agree on most aspects, except the first.
Yes, a general purpose FS with general purpose tuning will have a way too
low hit rate, and this is partly experienced in the reiserfs-raw
approach (all-misses being a worse load to handle than all-hits), but
still I think they have proven the concept.
> ICP - shouldn't ever touch disks to return hit/miss. 2 boxes with equal
> load and average hitrate of 30% would see 3 times more ICP traffic than
> actual fetches from peers. We'd rather not allow this to touch disks.
Well, usually I do not consider ICP a usable option, and take the
freedom not to consider its requirements when thinking about store
implementation. It should be possible to implement ICP with reasonable
performance if the normal cache operations can be implemented with good
performance.
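
The "3 times" figure quoted above can be checked with a quick
calculation. This is only a sketch: it assumes every local miss sends
exactly one ICP query to the peer, and that the peer answers hit for
the same fraction (30%) of those queries. Both are simplifying
assumptions, and the function name is made up for illustration.

```python
def icp_queries_per_peer_fetch(hit_rate):
    """Ratio of ICP queries sent to objects actually fetched from the peer.

    Assumptions (not from the original post): one ICP query per local
    miss, and the peer hits on `hit_rate` of the queries it receives.
    """
    queries = 1.0 - hit_rate        # fraction of requests that miss locally
    fetches = queries * hit_rate    # fraction the peer can actually serve
    return queries / fetches        # simplifies to 1 / hit_rate


if __name__ == "__main__":
    # With a 30% hit rate this comes out a little above 3.
    print(icp_queries_per_peer_fetch(0.30))
```

Under these assumptions the ratio is simply 1/hit_rate, so the lower
the hit rate, the more ICP lookups are spent per useful peer fetch.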
> OK, use digests for ICP. But digest generation? We do it currently quite
> rarely, because it has quite some cpu overhead? ICP is wanted when instant
> knowledge is wanted between peers that may be part of loadsharing setup.
> Delay between digest generations is not acceptable there. Can we add
> objects to digest as they arrive? Where we keep it between reloads?
I also consider inter-cache digest exchanges as a dead end.. the
technology is not mature, and scales badly when cache sizes
increase. However, I see value in intra-cache digest exchanges, i.e.
inside a cluster of boxes making up a bigger cache, and in such an
environment other criteria for the digest quality can be applied.
> Based on what do we generate a digest, at startup? First it sounds like we
> need to access/stat every object on disks to get the MD5 key for digest.
> ICP shouldn't return hits for stale objects, so object timestamps are
> needed during ICP request. refcounts and timestamps are also needed for
> the replacement policies to work.
With a design like the ReiserFS-raw/butterfly we cannot generate a
digest in a practical manner simply due to the fact that doing so would
require reading in each and every object in the cache.. What is possible
is to have the filesystem generate a digest-like hint map to speed up
availability lookups.
> long time ago Squid moved from diskbased (ala apache) lookups to rambased
> lookups, now it seems we are moving towards diskbased again,
Hmm.. Squid has never had a diskbased lookup, and I don't think Harvest
cached did either..
> although with much more efficient disk access, and to reduce ram usage.
> Still, disk performance hasn't changed much, and if we make squid
> performance dependent on disks too much, then I'm really worried.
Hopefully it is possible to combine the best of both worlds. Use ram for
the things really needing high speed answers, and disk for most else.
In a normal Squid index only a very small fraction is ever needed. Of
the entries accessed, only very few fields are actually needed to be
available in a speedy manner.
> I'm worried, probably because I don't see how you solve them.
> But I'm curious to know.
My idea (which Alex seems to share) is that the in-core requirement of
metadata can most likely be compressed into a very compact hint map (a la
cache digests) which tells us with reasonably high probability whether
the object is in the cache or not. Sort of like (but vastly different
from) an inode bitmap.
Please note that as it is a cache we are dealing with, it is non-critical
if we sometimes happen to forget that we do have an object in the cache
as long as the space eventually gets reclaimed. The worst thing that can
happen is that the object has to be fetched from the source again. I'll
happily accept this if it allows for a more efficient design.
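
To make the hint map idea a bit more concrete, here is a minimal
sketch of a Bloom-filter style bit map keyed on the MD5 of the URL.
It is not Squid code: the class name, the bit map size, and the choice
of four hash positions taken from the MD5 digest are all assumptions
made for illustration.

```python
import hashlib


class HintMap:
    """Compact in-RAM hint map: answers "maybe cached" or "not cached"."""

    def __init__(self, bits=1 << 20, hashes=4):
        # hashes is limited to 4 here because each position consumes
        # 4 bytes of the 16-byte MD5 digest (an illustrative choice).
        if not 1 <= hashes <= 4:
            raise ValueError("hashes must be between 1 and 4")
        self.bits = bits
        self.hashes = hashes
        self.map = bytearray((bits + 7) // 8)

    def _positions(self, url):
        digest = hashlib.md5(url.encode("utf-8")).digest()
        for i in range(self.hashes):
            chunk = digest[4 * i:4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.bits

    def add(self, url):
        """Record an object as it arrives in the cache."""
        for pos in self._positions(url):
            self.map[pos >> 3] |= 1 << (pos & 7)

    def maybe_cached(self, url):
        """False means definitely not recorded; True means go check disk."""
        return all(self.map[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(url))


if __name__ == "__main__":
    hints = HintMap()
    hints.add("http://example.com/index.html")
    print(hints.maybe_cached("http://example.com/index.html"))
    print(len(hints.map), "bytes of RAM for the whole map")
```

A plain bit map like this cannot remove single entries, so the sketch
assumes the map gets rebuilt now and then. Its errors are false
positives, which only cost a wasted disk lookup; forgetting an object
costs a refetch from the source, which as said above is acceptable
for a cache.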
/Henrik
Received on Fri Nov 03 2000 - 15:19:00 MST