On 18/08/11 22:50, Chen Bangzhong wrote:
> Thank you, Amos and Drunkard.
>
> My website hosts novels; that is, users can read novels there.
>
> The pages are not truly static content, so I can only cache them for
> 10 minutes.
>
> My Squids serve both non-cacheable requests (working like nginx) and
> cacheable requests (10-minute cache), so a 60% cache miss rate is
> reasonable. It is not a good design, but we can't do more right now.
Oh well. Good luck wishes on that side of the problem.
>
> Another point is, only hot novels are read by users. Crawlers/robots
> will push many objects to cache. These objects are rarely read by user
> and will expire after 10 minutes.
>
> If the HTTP response header indicates it is not cacheable (e.g.
> max-age=0), will Squid save the response in RAM or on disk? My guess
> is Squid will discard the response.
Correct. It will discard the response AND anything it has already cached
for that URL.
For non-hot objects this will not be a major problem, but it may raise
disk I/O a little as the existing old stored content gets evicted. That
might actually be a good thing (freeing cache space early), or it might
be wasted I/O; it is not clear exactly which.
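For illustration, here are two hypothetical responses for the same URL showing the difference:

```
HTTP/1.1 200 OK
Cache-Control: max-age=0
(Squid discards this response, and any copy it already holds for the URL)

HTTP/1.1 200 OK
Cache-Control: max-age=600
(Squid may store this response and serve it for up to 10 minutes)
```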
>
> If the HTTP response header indicates it is cacheable (e.g.
> max-age=600), Squid will save it in cache_mem. If the object is larger
> than maximum_object_size_in_memory, it will be written to disk.
Yes.
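The relevant squid.conf directives look like this (the values below are purely illustrative, not recommendations for your workload):

```
# Memory cache sizing (example values only)
cache_mem 256 MB                       # RAM for in-transit and hot objects
maximum_object_size_in_memory 64 KB    # larger cacheable objects go to disk instead
maximum_object_size 4 MB               # objects above this are not cached at all
```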
>
> Can you tell me when Squid will save an object to disk? When will
> Squid delete stale objects?
Stale objects are deleted at the point they are detected as stale and no
longer usable (i.e. a request has been made for the object and an updated
replacement has arrived from the web server), or when they are the oldest
objects stored and more cache space is needed for newer objects.
Other than tuning your existing setup, there are two things I think you
may be interested in.
The first is a Measurement Factory project which involves altering Squid
to completely bypass cache storage when an object cannot be cached or
re-used by other clients. That makes such requests faster to process and
avoids dropping cached objects to make room. Combining this with a
"cache deny" rule identifying those annoying robots as non-cacheable
would let you store only what the real users' traffic needs.
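Something along these lines would do it (the User-Agent patterns below are just placeholders; substitute whatever robots show up in your logs):

```
# Do not cache responses fetched on behalf of known crawlers
acl robots browser -i googlebot baiduspider bingbot
cache deny robots
```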
This is a slightly longer-term project; AFAIK it is not yet ready for
production use (I might be wrong about that). At minimum, TMF may need
sponsorship assistance to progress it faster. Contact Alex Rousskov
about possibilities there: http://www.measurement-factory.com/contact.html
The second thing is an alternative Squid configuration which would
emulate that behaviour immediately using two Squid instances.
Basically: configure a new second instance as a non-caching gateway
which all requests go to first. It would pass the robots and other
easily detected non-cacheable requests straight to the web servers,
while passing the other, potentially cacheable requests to your current
Squid instance, where storage and cache HITs happen more often without
the robots.
The gateway squid would have a much smaller footprint since it needs
no memory for caching or indexing, and no disk usage at all.
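A rough sketch of the gateway instance's squid.conf (the port numbers, addresses, and robot User-Agent patterns are all assumptions for illustration, not values from your setup):

```
# Non-caching gateway instance: receives all traffic first
http_port 3128 accel defaultsite=www.example.com

cache deny all                      # this instance never stores anything

# Known crawlers (hypothetical patterns; adjust to your logs)
acl robots browser -i googlebot baiduspider bingbot

# Parents: the existing caching Squid, and the real web server
cache_peer 127.0.0.1 parent 3129 0 no-query no-digest name=cachingsquid
cache_peer 10.0.0.10 parent 80 0 no-query no-digest originserver name=webserver

cache_peer_access webserver allow robots      # robots bypass the cache layer
cache_peer_access cachingsquid deny robots    # everyone else goes via the cache
never_direct allow all
```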
Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.14
  Beta testers wanted for 3.2.0.10

Received on Thu Aug 18 2011 - 13:02:41 MDT