> Compression in itself could be too slow to be worth the effort.
>
> Tokenization compression of HTML could, however, yield good results while
> maintaining high throughput. E.g. a token table of all the HTML tags
> (up to HTML 3.2 currently), including deprecated ones, would turn tags
> that can be anything up to say 20 bytes long into 2 or 3 bytes of
> information. Some type of 'word/text' compression could also yield good
> results while keeping up throughput.
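(For concreteness, the token-table scheme being described above would amount
to something like the sketch below; the tag list and the 0xFE marker byte are
details I've invented for illustration, not anything Squid or the poster
specified.)

    /* Hypothetical sketch of the tag-token idea: known HTML tags become
     * a two-byte code (a marker byte plus a table index).  A real scheme
     * would also have to escape the marker byte when it occurs in the
     * document text itself. */
    #include <stdio.h>
    #include <string.h>

    static const char *tag_table[] = {
        "<html>", "</html>", "<head>", "</head>", "<title>", "</title>",
        "<body>", "</body>", "<blockquote>", "</blockquote>",
    };
    #define NTAGS (sizeof(tag_table) / sizeof(tag_table[0]))
    #define TAG_MARKER 0xFE

    /* Copy src to out, substituting two-byte codes for known tags. */
    static void tokenize(const char *src, FILE *out)
    {
        size_t i, len;

        while (*src) {
            for (i = 0; i < NTAGS; i++) {
                len = strlen(tag_table[i]);
                if (strncmp(src, tag_table[i], len) == 0) {
                    fputc(TAG_MARKER, out);
                    fputc((int) i, out);
                    src += len;
                    break;
                }
            }
            if (i == NTAGS)
                fputc(*src++, out);
        }
    }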
I beg to differ. It is faster just to pump everything through a standard
LZ filter than to go to the bother of parsing tokens for an ad-hoc
compression scheme.
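By contrast, pumping a buffered page through zlib's one-shot compress() is a
couple of lines. A rough sketch, assuming the whole page is already in memory
and the caller has sized the output buffer (lz_filter is my own name for it):

    /* Sketch: compress a whole in-memory page with zlib.  The caller must
     * set *out_len to the size of the out buffer before calling (input
     * size + 0.1% + 12 bytes is the classic safe bound); on success it is
     * updated to the compressed length. */
    #include <zlib.h>

    static int lz_filter(const unsigned char *page, unsigned long page_len,
                         unsigned char *out, unsigned long *out_len)
    {
        return compress(out, out_len, page, page_len) == Z_OK ? 0 : -1;
    }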
> Of course, this would only be worthwhile for large sites because, as
> mentioned, HTML is usually a small portion of the data.
>
> If you had a hardware compression card in the machine, then throughput
> would not be an issue and you could practically run everything through
> it.
In my experience, decompression adds no significant overhead in any case
where the output of the data is bottlenecked (e.g. feeding it down a comms
line). And in cases like reading data from a floppy, where the medium it
is stored on is slower than the system that uses it, it is actually faster
to read compressed data than uncompressed.
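(Rough numbers to illustrate: a floppy delivers maybe 30-60 KB/s, so a 1 MB
file takes on the order of 20-30 seconds to read raw. Stored 2:1 compressed
it is 500 KB and reads in half the time, while LZ decompression on even a
modest CPU runs at a few MB/s and adds well under a second. The exact figures
vary, but as long as the medium is the bottleneck the compressed read wins.)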
Which gives me an idea... when one Squid sends a cached page to another
Squid, it could send it compressed and cut down on bandwidth. Even if the
comms link already has hardware compression (e.g. a modem line) it won't
make things any worse, and may make them better if Squid uses a more
efficient compressor; and in the case of most T1s, where compression is
still rarely used, it will be a clear win.
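(Illustrative only -- Squid's peer transfer doesn't work like this today,
and send_compressed() is a name I've made up -- but the whole trick would
amount to something like:)

    /* Deflate a cached object before writing it to a peer's socket; the
     * receiving cache would inflate it on arrival.  A real protocol would
     * also need framing so the peer knows the original and compressed
     * lengths. */
    #include <stdlib.h>
    #include <unistd.h>
    #include <zlib.h>

    static int send_compressed(int peer_fd, const unsigned char *obj,
                               unsigned long obj_len)
    {
        unsigned long zlen = obj_len + obj_len / 1000 + 64;  /* safe bound */
        unsigned char *zbuf = malloc(zlen);
        ssize_t n;

        if (zbuf == NULL)
            return -1;
        if (compress(zbuf, &zlen, obj, obj_len) != Z_OK) {
            free(zbuf);
            return -1;
        }
        n = write(peer_fd, zbuf, zlen);
        free(zbuf);
        return (n >= 0 && (unsigned long) n == zlen) ? 0 : -1;
    }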
Of course, if the compressible data is only 6% of the total, the point
is moot. And I don't for a minute propose unpacking the graphics
files and recompressing them more efficiently. Yech.
G