Hi people
I have been having problems with the existing log analysis scripts.
Each of them does at least one of the following:
- takes too long to run
- doesn't work properly with siblings
- fails on large log files (the awk ones especially, which give out-of-memory errors)
So I wrote some basic ones of my own last night.
We have 3 cache machines, all set up as siblings. One of them has a lot
of disk space, and the other two have only a small amount.
Basically, my objective is a one-page report that includes:
- total (MB) downloaded from the cache
- total (MB) that the cache itself downloaded
- total hits to the cache
- total requests that the cache missed on
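
To make the four totals above concrete, here is a rough sketch of how they could be tallied. It is written in Python, not taken from the scripts in this post, and it assumes Squid's native access.log layout (result code in the fourth whitespace-separated field, byte count in the fifth). The names tally, format_report, HIT_CODES and MISS_CODES are made up for illustration, and reading "downloaded from the cache" as "bytes of hit requests" is my interpretation.

```python
# Sketch only. Assumed field layout (Squid native access.log):
#   time elapsed client code/status bytes method URL ident hierarchy/peer type
MEGABYTE = 1024 * 1024

# Hypothetical sets: extend them if other result codes should be counted.
HIT_CODES = {"TCP_HIT"}
MISS_CODES = {"TCP_MISS"}


def tally(lines):
    """Return request counts and byte totals for hits and misses."""
    totals = {"hits": 0, "misses": 0, "hit_bytes": 0, "miss_bytes": 0}
    for line in lines:
        fields = line.split()
        if len(fields) < 5 or not fields[4].isdigit():
            continue  # skip blank or malformed lines
        code = fields[3].split("/")[0]  # "TCP_HIT" from "TCP_HIT/200"
        size = int(fields[4])
        if code in HIT_CODES:
            totals["hits"] += 1
            totals["hit_bytes"] += size
        elif code in MISS_CODES:
            totals["misses"] += 1
            totals["miss_bytes"] += size
    return totals


def format_report(title, totals):
    """Build the one-page report as plain text."""
    return "\n".join([
        "Cache report: " + title,
        "MB downloaded from the cache: %.2f" % (totals["hit_bytes"] / MEGABYTE),
        "MB downloaded by the cache:   %.2f" % (totals["miss_bytes"] / MEGABYTE),
        "Hits:                         %d" % totals["hits"],
        "Misses:                       %d" % totals["misses"],
    ])
```

Because the lines are read one at a time and only four counters are kept, memory use should not grow with the size of the log, which is the problem the awk scripts ran into.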
Another thing I wanted to do is download the logs every day, so they don't
build up into huge files, and analyse them straight away. I would then write
the results to a log file, so once a week I could print a report, yet only
ever have 1 day's logs on disk.
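
The daily-summary idea could look something like the sketch below. It is only an illustration in Python: the one-line-per-day file format, the function names append_daily and weekly_totals, and the four counter names are all assumptions of mine, not what the scripts in this post write.

```python
# Sketch only: one summary line per day, in a made-up format
#   YYYY-MM-DD hits misses hit_bytes miss_bytes
KEYS = ("hits", "misses", "hit_bytes", "miss_bytes")


def append_daily(summary_path, day, totals):
    """Append one day's totals to the summary file."""
    with open(summary_path, "a") as fh:
        values = " ".join(str(totals[key]) for key in KEYS)
        fh.write("%s %s\n" % (day, values))


def weekly_totals(summary_path, days=7):
    """Sum the most recent `days` lines of the summary file."""
    with open(summary_path) as fh:
        rows = [line.split() for line in fh if line.strip()]
    result = dict.fromkeys(KEYS, 0)
    for row in rows[-days:]:
        for key, value in zip(KEYS, row[1:5]):
            result[key] += int(value)
    return result
```

Once a day's line has been appended, that day's raw access log can be deleted, since the weekly report only needs the summary file.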
The scripts are at:
ftp://ftp.is.co.za/private/oskar/log-ana-oskar.tar
I am NOT SURE THAT THEY ANALYSE LOGS CORRECTLY! Could someone check this?
Basically, my understanding is this:
I have cache1, cache2 and cache3.
If I count all TCP_HIT entries for each of the caches, and ignore SIBLING_HIT
entries, I get the total that the caches served from disk.
If I count all TCP_MISS entries, I get the stuff that had to be retrieved
from the original site. (Remember, if it's a SIBLING_HIT, it gets counted
as a hit when I analyse the next log, the one from the other cache machine?)
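
The counting rule above can be sketched as follows. This is Python, not the posted scripts, and it rests on one assumption that should be checked against real logs: that a request answered by a sibling shows up in the requesting cache's log as TCP_MISS in the result field with SIBLING_HIT in the hierarchy field (the ninth field of Squid's native format). If that holds, counting every TCP_MISS as an origin fetch would count sibling-served requests twice over the three logs, so the sketch keeps them in their own bucket. The names classify and count_categories are hypothetical.

```python
from collections import Counter


def classify(line):
    """Sort one access.log line into a category for the report."""
    fields = line.split()
    if len(fields) < 9:
        return "other"  # blank or malformed line
    code = fields[3].split("/")[0]       # e.g. "TCP_MISS"
    hierarchy = fields[8].split("/")[0]  # e.g. "SIBLING_HIT"
    if code == "TCP_HIT":
        return "local_hit"      # served from this cache's own disk
    if code == "TCP_MISS":
        if hierarchy == "SIBLING_HIT":
            return "sibling_hit"  # counted as a hit in the sibling's own log
        # Simplification: every other TCP_MISS is treated as an origin fetch.
        return "origin_fetch"
    return "other"


def count_categories(lines):
    """Count how many lines fall into each category."""
    return Counter(classify(line) for line in lines)
```

Run over each of the three logs in turn, the local_hit counts add up to what the caches served from disk, and the origin_fetch counts add up to what had to come from the original sites.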
The best example is the source, of course.
To run it:
./simple.pl access.log.file title_for_this_cache > log
./report.pl < log
Currently the "title_for_this_cache" argument isn't used, but we could modify
the report script so that you can ask it "give me a report on cache1 only, please".
Can someone check the logic in my script? It would be much appreciated.
Oskar
Received on Tue Jul 29 2003 - 13:15:41 MDT