Re: FW UP: Squid-1.1.2 core dumping on Linux from Doug Urner on 1997-01-05 (squid-users)

From: Doug Urner <dlu@dont-contact.us>
Date: Sun, 05 Jan 1997 04:52:52 -0800

>I thought that it might be helpful if I followed up my original posting with
>more information;

I don't remember your original posting, so forgive me if some of this
has already been covered.

I'd encourage you to learn just enough about Unix programming to use a
debugger. Nothing fancy, just enough to get a backtrace. I don't
know linux, but I'd venture a guess that something like this would do
it:

        # gdb /usr/local/bin/squid /var/squid/cache/squid.core
        (gdb) bt
         .
         .
         .
         backtrace here
         .
         .
         .
        (gdb) quit

You could then send the backtrace to the list (and squid-bugs) and
someone more intimate with linux and programming might find a useful
clue.
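
If you end up with more than one core file, the same backtrace can be
captured without typing into gdb each time. The following is only a
sketch: it assumes a gdb recent enough to accept the -batch and -ex
options, the function name save_backtrace and the output file are made
up for illustration, and the squid paths are the same guesses used in
the session above.

```shell
#!/bin/sh
# Sketch: save a gdb backtrace from a core file to a text file.
# Assumes gdb supports -batch and -ex; paths are guesses, adjust them.

# save_backtrace BINARY CORE OUTFILE  (hypothetical helper name)
save_backtrace() {
    binary=$1
    core=$2
    out=$3
    # Refuse to run if either input is missing or unreadable.
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "save_backtrace: cannot read $binary or $core" >&2
        return 1
    fi
    # -batch makes gdb exit when done; -ex bt runs the "bt" command.
    gdb -batch -ex bt "$binary" "$core" > "$out" 2>&1
}

# Usage (output file name is hypothetical):
#   save_backtrace /usr/local/bin/squid /var/squid/cache/squid.core /tmp/squid-bt.txt
```

The resulting text file can be pasted straight into a message to the
list.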

>Firstly - I am running a Slackware 3.1 distribution of Linux with kernel
>2.0.27. This distribution has gcc 2.7.2, and libc-5.3.12. This proxy
>server services relatively light load, our squid.conf was pretty much stock
>except acl's and other local stuff. The proxy server ran without a hiccup
>for 6 days and then started coredumping and dying about twice a day. Cache
>size was approx 85MB (of 100MB) at this time. I saw the note about gcc 2.7.2
>& Solaris having problems with compiler optimisations, so just in case, I
>removed the -O's from the configure program and recompiled, but this didn't
>help. I would like to point out that I am not a Unix programmer, so I don't
>know where to go next.

- Are there any relevant messages in cache.log? It may be worth
increasing the debug level (see squid.conf).

- Have you tried clearing the cache? It could be related to the size
of the on disk store.

- Are there linux configuration knobs that limit the resources
available to a process? You may be running up against some of
these.

- Are you doing anything "interesting" in your squid.conf? You could
  be exercising a little used code path. Read the ChangeLog in the
  top directory of the squid source and see if anything stands out
  between 1.1.1 and 1.1.2.

>Of the many responses I got, many people were running the same Hardware/OS
>configuration as me with no problem (go figure!?). Many people reported
>having no problems with Squid-1.1.1, so I am using this currently - fingers
>crossed that it will continue to work as well as it is now.

It could be hardware (and just because 1.1.1 works doesn't mean that
it's not hardware). This will be a hard one to track down without
access to a good programmer and/or spare hardware, so it is probably a
last resort. But keep it in mind that you may be having hardware
problems. It is amazing the number of times we run into this.

One easy thing you could do if you want to pursue hardware is to
enable ECC if your machine (and linux) support it.

>If anybody else has any suggestions (apart from hacking the code!) then I
>would like to hear it. It is interesting that nobody else who replied to my
>posting reported having similar experiences, so now I am curious...

This makes it sound like either hardware or configuration. The more
you can do to isolate and characterize the problem the more other
folks will be able to help. Things to do and questions to answer:

1) If squid is giving you core dumps (the core file will be in the top
   level cache directory), get a backtrace from each one. If it is
   always crashing in the same place this tends to point towards
   software (squid or gcc or your kernel or ???). If every crash is
   in a different spot it would tend to implicate hardware.

2) Check your cache.log for any relevant messages. You may want to
   increase the debug level so that you can see the transaction that
   is currently in process when squid dies (I assume that this is
   possible, but I don't know off hand what setting will get you this
   information).

3) Try to characterize the failure more concretely. How long does
   squid stay up? Is there any consistency to this? Take a look at
   the access.log. Is there any pattern to the last couple of log
   entries? If you rename your cache_dir and restart squid
   (effectively clearing your cache) does squid become stable again?
   Is 1.1.1 really stable? How big is your squid process? It would
   be most useful to know this just before it dies.
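
For item 1, once a few backtraces have been saved to text files, the
"same place or different place" question can be answered mechanically
by counting the function named in frame #0 of each one. This is a
sketch only: the helper name top_frames is made up, and it assumes the
usual gdb layout where frame lines start with "#0" and read either
"#0  0xADDR in FUNC (...)" or "#0  FUNC (...)".

```shell
#!/bin/sh
# Sketch: count which function each saved backtrace crashed in.
# Assumes gdb-style frame lines; top_frames is a hypothetical name.

# top_frames FILE...: print a count for each frame #0 function,
# most frequent first.
top_frames() {
    for f in "$@"; do
        # Frame lines look like "#0  0xADDR in FUNC (...)" or
        # "#0  FUNC (...)"; pick the function name in either case.
        awk '$1 == "#0" { print ($3 == "in" ? $4 : $2); exit }' "$f"
    done | sort | uniq -c | sort -rn
}

# Usage (file names are hypothetical):
#   top_frames /tmp/squid-bt-1.txt /tmp/squid-bt-2.txt /tmp/squid-bt-3.txt
```

One function dominating the counts leans towards software; a
different function every time leans towards hardware, as described in
item 1.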

Good luck!

Doug

--
Douglas L. Urner, dlu@bsdi.com, +1.503.231.4881

Received on Sun Jan 05 1997 - 05:03:40 MST
