The Problem:
With positive_dns_ttl set to zero, squid exhibits behavior which is
confusing for multi-server sites with inconsistent failure modes.
The behavior with positive_dns_ttl set to a non-zero value is much
worse, so I will not even go into that here (see the P.S. below).
My Exploration:
While looking for an example of this problem I stumbled on this site:
www.geocities.com
Consider the following four addresses and their PTR records:
www.geocities.com. 299 A 192.216.191.91
91.191.216.192.in-addr.arpa. 43200 PTR www2.geocities.com.
www.geocities.com. 299 A 192.216.191.92
92.191.216.192.in-addr.arpa. 43200 PTR www3.geocities.com.
www.geocities.com. 299 A 192.216.191.90
90.191.216.192.in-addr.arpa. 43200 PTR www4.geocities.com.
www.geocities.com. 299 A 192.216.191.81
81.191.216.192.in-addr.arpa. 43200 PTR www.geocities.com.
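For reference, a small stand-alone C program along the following lines
(not squid code; the program and its defaults are my own throwaway
sketch) reproduces the forward and reverse lookups shown above:

/* lookup.c: print every A record for a name and the PTR record
 * behind each address.  Purely illustrative, not part of squid. */
#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(int argc, char **argv)
{
    const char *name = (argc > 1) ? argv[1] : "www.geocities.com";
    struct addrinfo hints, *res, *ai;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;          /* A records only */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(name, NULL, &hints, &res) != 0) {
        fprintf(stderr, "lookup failed for %s\n", name);
        return 1;
    }
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        char ip[INET_ADDRSTRLEN], ptr[NI_MAXHOST];
        struct sockaddr_in *sin = (struct sockaddr_in *) ai->ai_addr;

        inet_ntop(AF_INET, &sin->sin_addr, ip, sizeof(ip));
        if (getnameinfo(ai->ai_addr, ai->ai_addrlen,
                        ptr, sizeof(ptr), NULL, 0, NI_NAMEREQD) != 0)
            strcpy(ptr, "(no PTR)");
        printf("%-15s  PTR  %s\n", ip, ptr);
    }
    freeaddrinfo(res);
    return 0;
}

Run against www.geocities.com it should list all four addresses above,
in whatever order the name server hands them back.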
Here is the history of manually telnetting to each IP on port 80 at
Tue Aug 20 23:51:21 MDT 1996, with responses confirmed by issuing "GET /":
host www2 refuses the connection
host www3 hangs on connect (the SYN packet times out on retries)
host www4 connects and responds
host www  connects and responds
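If it helps anyone reproduce this, here is a rough C sketch (again my
own throwaway code, with a timeout value I picked arbitrarily) that
classifies each address the same way the manual telnet test does:

/* probe.c: try a TCP connect to port 80 with a short timeout so a
 * "black hole" host can be told apart from one that refuses outright.
 * Illustrative only; error handling is minimal. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Returns "connects", "refuses", or "hangs" for one address. */
static const char *probe(const char *ip, int port, int timeout_sec)
{
    struct sockaddr_in sa;
    struct timeval tv = { timeout_sec, 0 };
    fd_set wfds;
    int fd, err = 0;
    socklen_t len = sizeof(err);

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    inet_pton(AF_INET, ip, &sa.sin_addr);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    fcntl(fd, F_SETFL, O_NONBLOCK);

    if (connect(fd, (struct sockaddr *) &sa, sizeof(sa)) == 0) {
        close(fd);
        return "connects";
    }
    if (errno != EINPROGRESS) {
        close(fd);
        return "refuses";
    }
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    if (select(fd + 1, NULL, &wfds, NULL, &tv) <= 0) {
        close(fd);
        return "hangs";                 /* SYN went unanswered */
    }
    getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
    close(fd);
    return err == 0 ? "connects" : "refuses";
}

int main(void)
{
    const char *addrs[] = { "192.216.191.91", "192.216.191.92",
                            "192.216.191.90", "192.216.191.81" };
    size_t i;

    for (i = 0; i < sizeof(addrs) / sizeof(addrs[0]); i++)
        printf("%-15s %s\n", addrs[i], probe(addrs[i], 80, 10));
    return 0;
}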
Connecting from a browser through the squid proxy, the behavior is erratic
but predictable. Since positive_dns_ttl = 0 there is no IP caching in the
server, so normal DNS round-robin rules are followed.
On the first attempt DNS returns www2 and the site fails immediately.
On the second attempt DNS returns www3 and the site times out after 94
seconds (a black hole).
On the third attempt DNS returns www4. The base HTML is fetched and squid
starts GETting the img references. These each follow a timeline like
the above: the first GET succeeds (www), the next fails with a broken
image (www2), the third GET hangs, the fourth succeeds, the fifth
succeeds, and so on.
My Recommendation:
While the current behavior is to be expected given the ipcache
implementation, I suspect that a better behavior would be easy to
implement. I'd like to see squid keep information about the outcome
of its connection attempts. The short discussion below details my
thinking along these lines.
1) Forget about caching DNS results in the server. named
does a fine job of it, and the async technique used by the
dnsservers works well for accessing a local (or remote) IP
address cache. Further, the current IP cache implementation
is broken.
2) Cache _failed_ IP addresses for each HTTP attempt. Set
a TTL for these cache entries at something like the TTL
of the DNS record itself.
3) Check the failed-IP cache before making a connection attempt.
4) Try ALL THE ADDRESSES returned by DNS before returning a failure
to the client.
(A rough sketch of points 2 through 4 follows below.)
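To make points 2 through 4 concrete, here is a minimal sketch of the
kind of thing I have in mind. None of this is existing squid code; the
names, the fixed-size table, and the stubbed-out connect routine are
all my own assumptions:

/* failcache.c: remember addresses that recently failed, skip them while
 * their TTL lasts, and walk the whole A-record list before giving up.
 * A sketch only; a real implementation would live inside squid. */
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define BAD_MAX 64

struct bad_entry {
    struct in_addr addr;
    time_t expires;                     /* roughly the DNS record's TTL */
};

static struct bad_entry bad_cache[BAD_MAX];

/* Point 3: consult the failed-IP cache before trying an address. */
static int addr_is_bad(struct in_addr a)
{
    time_t now = time(NULL);
    int i;
    for (i = 0; i < BAD_MAX; i++)
        if (bad_cache[i].expires > now &&
            bad_cache[i].addr.s_addr == a.s_addr)
            return 1;
    return 0;
}

/* Point 2: cache a failed address for about the DNS TTL. */
static void mark_addr_bad(struct in_addr a, int ttl)
{
    int i;
    for (i = 0; i < BAD_MAX; i++)
        if (bad_cache[i].expires <= time(NULL)) {
            bad_cache[i].addr = a;
            bad_cache[i].expires = time(NULL) + ttl;
            return;
        }
}

/* Stand-in for squid's real connect path; always fails here so the
 * sketch compiles and runs on its own. */
static int try_connect(struct in_addr a, int port)
{
    printf("trying %s:%d\n", inet_ntoa(a), port);
    return -1;
}

/* Point 4: only report failure after every address has been tried. */
static int connect_any(struct in_addr *addrs, int n, int port, int dns_ttl)
{
    int i;
    for (i = 0; i < n; i++) {
        if (addr_is_bad(addrs[i]))
            continue;
        if (try_connect(addrs[i], port) == 0)
            return 0;
        mark_addr_bad(addrs[i], dns_ttl);
    }
    return -1;
}

int main(void)
{
    struct in_addr addrs[4];

    inet_pton(AF_INET, "192.216.191.91", &addrs[0]);
    inet_pton(AF_INET, "192.216.191.92", &addrs[1]);
    inet_pton(AF_INET, "192.216.191.90", &addrs[2]);
    inet_pton(AF_INET, "192.216.191.81", &addrs[3]);

    if (connect_any(addrs, 4, 80, 299) != 0)
        printf("all addresses failed\n");
    return 0;
}

The 299-second TTL in main() is just the value from the A records
above; the point is that a failed address stays out of the rotation
for about as long as the DNS record itself would live.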
Conclusion:
I suspect that these problems are a historical part of the way
that squid's IP cache system has worked since the early Harvest
days. The brokenness was not exposed back when most sites were
single-hosted, and that may well still be the case for the vast
majority of sites. Now, with very high-volume sites using
multi-hosted servers, the problem I discuss above has turned out
to be a show stopper. It is my opinion that something like the
above recommendation will provide a better overall product and
will in fact be a simpler implementation.
Best Regards
chris
--
P.S. Setting positive_dns_ttl to a non-zero value causes squid to
"fixate" on a single address for any given site. To see this for
yourself, run tcpdump or another packet analyzer and trace accesses
to the site listed above. Note that squid never uses any address
other than the first returned by the initial DNS query. You may
draw your own conclusions from this.
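(For example, something like "tcpdump -n dst port 80" run on the proxy
host, adjusted for your own setup, should show squid's outbound
connections going to only one of the four addresses for as long as the
cached entry lives.)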