On Friday 05 September 2003 10:21 pm, Christoph Haas wrote:
> On Sat, Sep 06, 2003 at 02:22:00AM +1200, mdew wrote:
> > Using regex "/etc/squid.adservers" I'm attempting to block any URL's
> > with "penis" AND "large" in the url. Basically *penis*large* and
> > *large*penis* ..I was looking at doing like so..
> >
> > (/large/ && /penis/)
> > (/penis/ && /large/)
>
> See "man 7 regex". I would suggest something like:
> (large.*penis|penis.*large)

Beware of attempting this sort of thing without word boundaries. For
example, there is a town in the north of England called Penistone, and it's
not hard to find several URLs (e.g. via Google) which include the 5 letters
"penis" without being the sort of thing you're trying to block:

http://www.penistonereinforcements.com

I didn't bother to look for a URL which had "large" somewhere in it as well,
but it's not hard to imagine such a false positive existing.

Maybe you're happy to block a few false-positive web pages in exchange for a
higher number of true positives, but it's a choice you should be aware you're
making.
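
To make the trade-off concrete, here is a rough sketch using Python's re
module. It is only an illustration: squid uses its own regex library, whose
word-boundary syntax may differ, so treat the lookaround syntax below as an
assumption rather than something to paste into /etc/squid.adservers. The
path "/large-mesh.html" and the example.com URLs are made up for the demo;
only the Penistone hostname is real. The "bounded" pattern is one
hypothetical way to require that neither word is glued to other letters.

```python
import re

# Pattern suggested earlier in the thread: both words, in either order.
naive = re.compile(r"(large.*penis|penis.*large)", re.IGNORECASE)

# Hypothetical stricter variant: each word must not touch another letter.
# Lookarounds on letters are used instead of \b because \b treats digits
# and underscores as part of a word.
LARGE = r"(?<![a-z])large(?![a-z])"
PENIS = r"(?<![a-z])penis(?![a-z])"
bounded = re.compile(rf"({LARGE}.*{PENIS}|{PENIS}.*{LARGE})", re.IGNORECASE)


def is_blocked(pattern, url):
    """Return True if the pattern matches anywhere in the URL."""
    return pattern.search(url) is not None


if __name__ == "__main__":
    urls = [
        # Real hostname from above, invented path: an innocent page.
        "http://www.penistonereinforcements.com/large-mesh.html",
        # Invented URLs of the kind the filter is aimed at.
        "http://example.com/large-penis-pills",
        "http://example.com/largepenis",
    ]
    for url in urls:
        print(url, is_blocked(naive, url), is_blocked(bounded, url))
```

The first URL is blocked by the naive pattern but not by the bounded one,
which is the false positive described above. The last URL shows the price:
with the words run together, the bounded pattern no longer catches it, so
you swap some false positives for some false negatives.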

Antony.

--
The only problem with the Universe as a platform, though, is that it is
currently running someone else's program.
 - Ken Karakotsios, author of SimLife

Received on Fri Sep 05 2003 - 15:31:53 MDT