Sorry to top-post. Any more ideas with this?
- Grant
On Fri, Jul 5, 2013 at 12:33 AM, Grant <emailgrant_at_gmail.com> wrote:
>>> I updated to 3.3.6 and the system doesn't become totally unresponsive
>>> any more, although SSH latency is still pretty high when the client is
>>> trying to load a page. The browser still hangs after loading a few
>>> page elements on some websites (www.google.com/nexus/) but now if I
>>> let the page load for long enough, it does eventually load, but it can
>>> take 5 minutes or longer. Restarting squid sometimes makes it load a
>>> lot faster. It's possible that it hangs more often when loading
>>> elements from a different domain or subdomain (services.google.com,
>>> doubleclick.net) but that could be a coincidence. The client's and
>>> server's internet connections are strong.
>>
>> If that were related it might be DNS or TCP congestion (ECT, Window Scaling,
>> MTU) issues.
>
> I set the following on the squid server and client with no noticeable change:
>
> echo 0 > /proc/sys/net/ipv4/tcp_ecn
> echo 1 > /proc/sys/net/ipv4/ip_no_pmtu_disc
> echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
>
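A minimal sketch for confirming that the three settings above actually took effect on both machines, assuming a Linux /proc layout. show_tcp_knobs is a made-up helper name, and its optional directory argument is only there so it can be pointed at a test directory. Values written with echo do not survive a reboot, so it is worth re-checking after one.

```shell
#!/bin/sh
# Print the current value of each TCP knob changed above.
# The optional first argument (a hypothetical override, handy for testing)
# replaces the default /proc/sys/net/ipv4 directory.
show_tcp_knobs() {
    base="${1:-/proc/sys/net/ipv4}"
    for knob in tcp_ecn ip_no_pmtu_disc tcp_window_scaling; do
        if [ -r "$base/$knob" ]; then
            printf '%s = %s\n' "$knob" "$(cat "$base/$knob")"
        else
            printf '%s = (unreadable)\n' "$knob"
        fi
    done
}
```

Running show_tcp_knobs with no argument on the squid server and on the client prints one "name = value" line per setting.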
>>>> The usual cause of this type of issue is forwarding loops, although
>>>> your low socket usage indicates that is probably not the problem.
>>>
>>> Yes, I'm the only user.
>>
>> It might be related to the 10ms select-loop delays in Squid. If you load the
>> proxy with a bunch more requests (say 20 in parallel constantly) does it
>> still happen?
>
> I opened 20 tabs in firefox and the 3 tabs which started loading first
> loaded slightly more content than usual.
>
>>> I have this on the squid system while the browser seems to hang so I
>>> think there is plenty of available physical RAM:
>>>
>>> # free
>>>              total       used       free     shared    buffers     cached
>>> Mem:       1985944    1638368     347576          0     838340     219332
>>> -/+ buffers/cache:     580696    1405248
>>> Swap:      1048572          0    1048572
>>>
>>> 2013/07/04 08:51:04.143 kid1| event.cc(250) checkEvents: checkEvents
>>> 2013/07/04 08:51:04.143 kid1| AsyncCall.cc(18) AsyncCall: The
>>> AsyncCall MaintainSwapSpace constructed, this=0xbb3310 [call546]
>>> 2013/07/04 08:51:04.143 kid1| AsyncCall.cc(85) ScheduleCall:
>>> event.cc(259) will call MaintainSwapSpace() [call546]
>>
>> Are these happening at regular but widely separated intervals, or are
>> there lots of them across the slowdown period?
>
> They happen about once per second during the slowdown.
>
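A rough sketch of how the "about once per second" figure could be checked rather than eyeballed, assuming the debug log keeps each entry on a single line (the entries quoted above are wrapped by the mail client) and starts with the date and time fields shown there. count_gc_per_second is a hypothetical helper name, and the log path passed to it depends on the installation.

```shell
#!/bin/sh
# Count "will call MaintainSwapSpace" entries per wall-clock second.
# $1: path to a Squid debug log with one entry per line (assumption).
count_gc_per_second() {
    grep -F 'will call MaintainSwapSpace' "$1" \
        | awk '{
              split($2, t, ".")        # drop the millisecond part of the time
              n[$1 " " t[1]]++         # key: "date HH:MM:SS"
          }
          END { for (k in n) print k, n[k] }' \
        | sort
}
```

Each output line is a date, a time truncated to the second, and the number of MaintainSwapSpace calls scheduled in that second, so a steady count of 1 matches the once-per-second pattern.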
>> This is the main cache garbage collection operation, so it should run a
>> check and purge some things every so often. If it happens unusually
>> frequently during the slow-down period, it means the cache is overflowing
>> and the CPU is busy purging contents until enough space is available for
>> the new traffic.
>
> Squid CPU usage is very low during the slowdown, at 0.5% to 2.5%, with
> most of the CPU idle. I get an 18M cache size every time I check:
>
> # du -sh /var/cache/squid
> 18M /var/cache/squid
>
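To put the 18M figure next to the 100 MB limit from the cache_dir lines, here is a small sketch that reports on-disk usage as a percentage of the configured size. cache_fill_pct is a hypothetical helper name. It relies on du -sk reporting kilobytes and on the cache_dir size argument being in megabytes, and it uses integer arithmetic, so the result is rounded down.

```shell
#!/bin/sh
# Report how full a cache directory is relative to its configured limit.
# $1: cache directory, $2: configured cache_dir size in MB.
cache_fill_pct() {
    dir="$1"
    limit_mb="$2"
    used_kb=$(du -sk "$dir" | awk '{ print $1 }')   # on-disk usage in KB
    echo $(( used_kb * 100 / (limit_mb * 1024) ))   # integer percentage
}
```

With the values above, cache_fill_pct /var/cache/squid 100 should come out at roughly 18, which is well short of a full cache.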
> I've tried each of these with no noticeable change:
>
> cache_dir ufs /var/cache/squid 100 16 256
> cache_dir aufs /var/cache/squid 100 16 256
> cache_dir diskd /var/cache/squid 100 16 256
>
> - Grant
Received on Fri Jul 12 2013 - 06:24:56 MDT