Hi
I've been running a Perl redirector script which, with the help of a
careful analysis of the Squid access.log file, filters out URLs that
are nothing more than banner GIF files and redirects each request to a
locally stored and served GIF image. (I use several different
replacement URLs, though I should probably settle on a single, easily
changed one for all ad sites.) This is a valuable upstream bandwidth
saver for us, and it also keeps these files out of the cache, so disk
and memory requirements are smaller and we get a better hit ratio.
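For anyone wanting to repeat the log analysis, here is a minimal sketch
of one way to find candidate banner URLs. It assumes Squid's native
access.log format, where the URL is the seventh whitespace-separated
field. The sub name count_gif_urls and the sample log lines are made up
for illustration; they are not part of redir.pl.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Count requests per URL for GIF images, given lines in Squid's native
# access.log format (the URL is assumed to be the seventh field).
sub count_gif_urls {
    my @lines = @_;
    my %hits;
    foreach my $line (@lines) {
        my @field = split ' ', $line;
        next unless @field >= 7;            # skip short or damaged lines
        my $url = $field[6];
        # Keep URLs ending in .gif, with or without a query string
        next unless $url =~ /\.gif(?:\?.*)?\z/i;
        $hits{$url}++;
    }
    return %hits;
}

# Made-up sample lines, for illustration only
my @sample = (
    "889742895.123 210 10.0.0.1 TCP_MISS/200 9120 GET http://ads.example.com/banner1.gif - DIRECT/ads.example.com image/gif",
    "889742896.456 180 10.0.0.2 TCP_MISS/200 9120 GET http://ads.example.com/banner1.gif - DIRECT/ads.example.com image/gif",
    "889742897.789 320 10.0.0.1 TCP_MISS/200 4512 GET http://www.example.com/index.html - DIRECT/www.example.com text/html",
);

# Print the most requested GIF URLs first
my %hits = count_gif_urls(@sample);
foreach my $url (sort { $hits{$b} <=> $hits{$a} || $a cmp $b } keys %hits) {
    printf "%6d %s\n", $hits{$url}, $url;
}
```

The most frequently requested GIFs from third-party hosts are the ones
worth checking by hand before adding a pattern to the redirector.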
For a small ISP, this could be a good cost saver, and a chance to put
your favourite motd (Message Of The Day) or similar onto some well hit
sites, as long as the viewing public don't then try to follow the
underlying URLs in the HTML (those could also be redirected I guess, I
just never bothered with it).
So, I've had a look on the net for a definitive list of ad sites to
block. I found http://internet.junkbuster.com, but a search for their
recommended phrases on AltaVista revealed only one site,
http://www.teclata.es/junkbuster/english/blocklist.html (better than a
kick in the head I guess). The only other site which helped was
http://www.markwelch.com/bannerad/, whose author appears to be a banner
ad consultant (I have yet to go through his HTML pages and extract all
the URLs for inclusion in the ad breaker).
Sure, the advertisers will keep changing their host names and ad
directories, but with a bit of collective effort, we can keep it
up-to-date. Maybe someone with an external Internet site can post an ad
banner URL regexp list that we can add to using a form (as long as the
advertisers can't remove them!)?
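A shared regexp list would be easier to swap if the patterns lived in a
plain text file rather than in the code. The sketch below assumes a
made-up file format of one rule per line, "<regexp> <gif name>", with
"#" starting a comment. The sub names parse_rules and redirect are
hypothetical, and the rules are passed in as strings here so the
example runs on its own.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

my $base = "http://alpha.my.net/ico/";   # same base URL as redir.pl

# Parse rule lines of the (assumed) form "<regexp> <gif name>".
# Blank lines, comments and patterns that do not compile are skipped.
sub parse_rules {
    my @rules;
    foreach my $line (@_) {
        next if $line =~ /^\s*(?:#|\z)/;
        my ($pattern, $gif) = split ' ', $line;
        next unless defined $gif;
        my $re = eval { qr/$pattern/ };
        next unless defined $re;
        push @rules, [ $re, $gif ];
    }
    return @rules;
}

# Return the replacement URL for the first matching rule, or the
# request line unchanged if no rule matches.
sub redirect {
    my ($request, @rules) = @_;
    foreach my $rule (@rules) {
        my ($re, $gif) = @$rule;
        return $base . $gif . "\n" if $request =~ $re;
    }
    return $request;
}

my @rules = parse_rules(
    '# pattern                    replacement',
    'ads\.lycos\.com/ads          no_lycos_ads.gif',
    'focalink\.com/SmartBanner    no_focalink_ads.gif',
);

# Request lines as a redirector would see them (made-up examples)
print redirect("http://ads.lycos.com/ads/x.gif 10.0.0.1/- - GET\n", @rules);
print redirect("http://www.example.com/ 10.0.0.1/- - GET\n", @rules);
```

In a real redirector the rule lines would be read from the shared file
at startup and redirect would be called inside the while (<>) loop.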
Anyone want to swap/share URLs or redirector scripts on this one? Here
is my redirector code, in case anyone wants it. It's pretty primitive,
just a whole lot of pattern matching and IFs. Heck, I didn't even quote
the "." characters in the regexps!
Regards
Jason
------------------8<---- snip ----8<-----------------------------
local# more redir.pl
#!/usr/local/bin/perl
$|=1;
$base = "http://alpha.my.net/ico/";
$no_ads = $base . "no_ads.gif\n";
$no_adds_dn = $base . "no_adds_dn.gif\n";
$no_adds_dblclick = $base . "no_adds_dblclick.gif\n";
$no_av_left = $base . "no_av_left.gif\n";
$no_av_right = $base . "no_av_right.gif\n";
$no_lycos_ads = $base . "no_lycos_ads.gif\n";
$no_focalink_ads = $base . "no_focalink_ads.gif\n";
$no_infoseek_ads = $base . "no_infoseek_ads.gif\n";
$no_aol_ads = $base . "no_aol_ads.gif\n";
$no_aol_mini_ads = $base . "no_aol_mini_ads.gif\n";
$no_infospace_adman = $base . "no_infospace_adman.gif\n";
$no_infospace_ads = $no_ads;
$no_yimg_ads = $no_ads;
$no_yahoo_ads = $no_ads;
$no_yahoo_promo = $base . "no_yahoo_promo.gif\n";
while (<>) {
    # Start of ad removal process
    m@dejanews.com/ads@ && do { $_ = $no_adds_dn; };
    m@http://ad.doubleclick.net/@o && do {
        # AltaVista ads
        # There is a left and right badge for the AltaVista title line.
        # DoubleClick requests that do not mention altavista pass
        # through unchanged.
        if (m@altavista@o) {
            if (m@left@o) { $_ = $no_av_left; }
            elsif (m@right@o) { $_ = $no_av_right; }
            else { $_ = $no_adds_dblclick; }
        }
    };
    # Lycos has its own ads server (for the moment anyhow)
    if (m@http://ads.lycos.com/ads@o) { $_ = $no_lycos_ads; }
    # Focalink use ads on a number of servers, all with the same
    # SmartBanner CGI program
    if (m@focalink.com/SmartBanner@o) { $_ = $no_focalink_ads; }
    # InfoSeek has its own ad directory too
    if (m@http://www.infoseek.com/ads@o) { $_ = $no_infoseek_ads; }
    # AOL Netfind has an ad redirector of sorts
    if (m@http://ads.web.aol.com/@o) {
        if (m@/image/@o) {
            if (m@\?@o) { $_ = $no_aol_ads; }
            else { $_ = $no_aol_mini_ads; }
        }
        elsif (m@/content/@o) { $_ = $no_aol_ads; }
        else { $_ = $no_aol_ads; }
    }
    if (m@http://ads.infospace.com/adman@o) { $_ = $no_infospace_adman; }
    if (m@http://ads.infospace.com/adredir@o) { $_ = $no_infospace_adman; }
    # Infospace use two servers called pic1 and pic2, but sometimes
    # their IP address appears instead
    if ((m@http://199.242.24@o) || (m@http://pic\d.infospace.com@o)) {
        if (m@/ads/@o) { $_ = $no_infospace_ads; }
    }
    # Yimg ads have the format us.yimg.com/a or /adv
    if (m@http://us.yimg.com/a@o) { $_ = $no_yimg_ads; }
    # Yahoo ads have yahoo.com/a or yahoo.com/adv/ URLs
    # or with appropriate country suffix in domain name
    if (m@\.yahoo\.(com|com\.au|ca|fr|de|no|se|co\.uk|com\.sg)/(a|adv)/@o) {
        $_ = $no_yahoo_ads;
    }
    if (m@\.yahoo\.(com|com\.au|ca|fr|de|no|se|co\.uk|com\.sg)/promotions/@o) {
        $_ = $no_yahoo_promo;
    }
    # End of ad removal process
    print;
}
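On the unquoted "." characters: an unescaped dot matches any character,
so a pattern like the Lycos one can also match unrelated host names.
The short sketch below compares the pattern as written in redir.pl with
a version quoted using \Q...\E and anchored at the start of the line.
The sub names matches_loose and matches_strict and the test URLs are
made up for illustration.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# The Lycos pattern as written in redir.pl: each "." matches any character
sub matches_loose {
    my ($url) = @_;
    return $url =~ m@http://ads.lycos.com/ads@ ? 1 : 0;
}

# Same prefix, but \Q...\E quotes the dots and ^ anchors the match
sub matches_strict {
    my ($url) = @_;
    my $prefix = "http://ads.lycos.com/ads";
    return $url =~ m@^\Q$prefix\E@ ? 1 : 0;
}

foreach my $url ("http://ads.lycos.com/ads/a.gif",
                 "http://adsxlycos.com/ads/a.gif") {
    printf "%-32s loose=%d strict=%d\n",
        $url, matches_loose($url), matches_strict($url);
}
```

Both versions match the real ad server, but only the loose pattern
matches the second, unrelated host name.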
Received on Thu Mar 12 1998 - 14:48:15 MST