You could write a little robot that loads URLs as they appear in the Squid
access log file and follows the links from each page. If the robot uses
Squid as its proxy, it only pulls objects that are not yet cached from the
Internet. Of course, you need to make sure the robot does not follow its
own trace in the Squid access log file, or you would get a pretty busy
proxy :-). If you use Squid's native log format with MIME type
information, your robot only needs to inspect text/html objects.
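
A minimal sketch of such a robot in Python might look like the following.
It is only an illustration under several assumptions: that each native log
line has ten whitespace-separated fields with the client address in the
third, the method in the sixth, the URL in the seventh and the content
type in the tenth; that the robot connects from 127.0.0.1; and that Squid
listens on localhost port 3128. All function names here are made up for
the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
import urllib.request

ROBOT_IP = "127.0.0.1"           # assumption: address the robot connects from
PROXY = "http://localhost:3128"  # assumption: where Squid is listening


def parse_log_line(line):
    """Return (client, method, url, mime) for one log line, or None."""
    fields = line.split()
    if len(fields) < 10:
        return None  # not a complete native-format line
    return fields[2], fields[5], fields[6], fields[9]


def pages_to_follow(lines, robot_ip=ROBOT_IP):
    """Yield URLs of text/html pages that were not requested by the robot."""
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is None:
            continue
        client, method, url, mime = parsed
        if client == robot_ip:
            continue  # skip the robot's own trace in the log
        if method == "GET" and mime.lower().startswith("text/html"):
            yield url


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute http:// links found in html, without duplicates."""
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        absolute = urldefrag(urljoin(base_url, href)).url
        if absolute.startswith("http://") and absolute not in links:
            links.append(absolute)
    return links


def fetch_via_proxy(url, proxy=PROXY, timeout=30):
    """Fetch url through the proxy so that Squid caches the reply."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy}))
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("latin-1")


def prefetch_one_level(lines):
    """For each logged page, fetch the pages it links to (one level only)."""
    for page in pages_to_follow(lines):
        try:
            html = fetch_via_proxy(page)
            for link in extract_links(page, html):
                fetch_via_proxy(link)
        except OSError:
            continue  # network error: give up on this page, try the next
```

Calling prefetch_one_level() with the lines of the access log (for
example an open file object) would request each linked page through the
proxy. The sketch does not throttle itself or honour robots.txt, which a
real robot should do.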
I suspect that this idea could get you into trouble if your load is high,
since every page a user requests triggers extra fetches. It would be
interesting to find out, though.
Carlos
On Thu, 20 Feb 1997, Kenny Elliott wrote:
>
> I have something I'd like to do with Squid and I thought someone might be
> able to tell me if it's possible and how to get started. What I would like
> to do is, once someone has requested a page through the cache, have Squid
> follow the links on that page by itself (I think going one level would be
> fine). I think that this would give me a pretty good chance of already
> having in the cache whatever the user requests next. Suggestions?
>
> Thanks in advance.
>
> \/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/
> Kenny Elliott kenny@wild.net
> System Administrator http://www.wild.net/~kenny
> Wild.Net L.L.C. 504-875-9453
> \/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/*\/
Received on Thu Feb 20 1997 - 11:06:02 MST