Marc van Selm wrote:
> IE probably changes this into %20 (or something). I'm not sure if
> that is legal though.
A browser is free to encode any character it likes using %nn syntax,
except "reserved" characters (/;?@&=+). A browser is also allowed
to unescape escaped characters if it wants to (again with the exception
of reserved characters). It is recommended, however, that a browser
not alter the URL.
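As a rough illustration of the rule above, here is a minimal Python sketch. The helper name escape_unsafe and the sample paths are made up for the example, and the reserved set is simply the one listed in this message; it is not meant to show what any particular browser does.

```python
from urllib.parse import quote, unquote

# Reserved set exactly as listed in the message above.
RESERVED = "/;?@&=+"

def escape_unsafe(raw: str) -> str:
    """Percent-encode unsafe characters such as spaces, leaving reserved ones alone.

    Assumes `raw` contains no %nn escapes yet; a literal "%" would become "%25".
    """
    return quote(raw, safe=RESERVED)

print(escape_unsafe("/my docs/a file.html?x=1&y=2"))

# Unescaping an unreserved character does not change the meaning of the URL...
print(unquote("/%7Euser/") == "/~user/")

# ...but unescaping a reserved one does: one path segment becomes two.
print("/a%2Fb".split("/"), unquote("/a%2Fb").split("/"))
```

The last line is the reason reserved characters are excluded from the "free to escape or unescape" rule: %2F and / are not interchangeable.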
# is a special case and is allowed at most once in a URL to delimit
which fragment of the resource the URL refers to. All other uses of #
have to be encoded.
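A small Python sketch of the # rule, assuming the standard urllib.parse module; url_for_file is a hypothetical helper and example.com a placeholder host.

```python
from urllib.parse import quote, urldefrag

def split_fragment(url: str):
    """Return (url_without_fragment, fragment) for a URL."""
    base, fragment = urldefrag(url)
    return base, fragment

def url_for_file(base: str, filename: str) -> str:
    """Hypothetical helper: append a filename to `base`, escaping every
    character outside the always-safe set, including a literal "#"."""
    return base + quote(filename, safe="")

# The single unencoded "#" delimits the fragment.
print(split_fragment("http://example.com/doc.html#section2"))

# A "#" that is part of the name must travel as %23, so no fragment is seen.
encoded = url_for_file("http://example.com/", "notes#1.txt")
print(encoded, split_fragment(encoded))
```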
RFC 2068 forbids an HTTP application from sending URLs with spaces or
other unsafe characters in them using HTTP. Reserved characters have to
be encoded except when used for their special purpose in the URL scheme
(e.g. / does not need to be encoded when it delimits directories).
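To make the distinction concrete, here is a sketch that builds a path from individual segments. The function name build_path and the sample segments are assumptions for the example only.

```python
from urllib.parse import quote

def build_path(segments):
    """Join path segments with "/", escaping each segment on its own.

    The "/" characters added by the join delimit directories and stay as-is;
    a "/" inside a segment is data and is escaped as %2F.
    """
    return "/" + "/".join(quote(segment, safe="") for segment in segments)

print(build_path(["docs", "a/b report.txt"]))
```

Escaping per segment, rather than escaping the finished path, is what keeps the two roles of / apart.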
RFC 1738 says:
    URLs are written only with the graphic printable characters of the
    US-ASCII coded character set. The octets 80-FF hexadecimal are not
    used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent
    control characters; these must be encoded.
Which I read as saying that a URL with unencoded space characters is
not a URL, so any use of unencoded spaces in URLs results in undefined
behaviour (or should it be an error?).
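A check along the lines of that reading could look like the sketch below. It assumes that "graphic printable" means the octets 21-7E hexadecimal, so that space (20) is excluded; the function name is made up for the example.

```python
def uses_only_graphic_ascii(url: str) -> bool:
    """True if every character is in the range 0x21-0x7E.

    Assumption: "graphic printable" excludes space (0x20), control
    characters (0x00-0x1F, 0x7F) and everything from 0x80 upwards.
    """
    return all(0x21 <= ord(ch) <= 0x7E for ch in url)

print(uses_only_graphic_ascii("http://example.com/a%20file.html"))
print(uses_only_graphic_ascii("http://example.com/a file.html"))
```

Whether a failing string should be rejected or repaired is exactly the open question; the sketch only detects the condition.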
RFC 2068 says:
    In requests that they forward, proxies MUST NOT rewrite the
    "abs_path" part of a Request-URI in any way except as noted above to
    replace a null abs_path with "*", no matter what the proxy does in
    its internal implementation.

    Note: The "no rewrite" rule prevents the proxy from changing the
    meaning of the request when the origin server is improperly using a
    non-reserved URL character for a reserved purpose. Implementers
    should be aware that some pre-HTTP/1.1 proxies have been known to
    rewrite the Request-URI.
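The following sketch shows why a seemingly harmless unescape/re-escape round trip counts as a rewrite. It is not how Squid or any other proxy is implemented; abs_path_of is a hypothetical helper, and handling of a null abs_path is left out.

```python
from urllib.parse import quote, unquote

def abs_path_of(request_uri: str) -> str:
    """Cut the abs_path out of an absolute Request-URI without decoding it.

    Returns "" when there is no path; that case is not handled in this sketch.
    """
    scheme_end = request_uri.find("://")
    if scheme_end == -1:
        return request_uri  # already in abs_path form
    path_start = request_uri.find("/", scheme_end + 3)
    return "" if path_start == -1 else request_uri[path_start:]

# What the proxy should forward: the bytes it received, untouched.
original = abs_path_of("http://example.com/a%2Fb/%7Euser")
print(original)

# What a "normalising" proxy would send instead: %2F has turned into a
# real "/", so the origin server sees a different path.
rewritten = quote(unquote(original))
print(rewritten, rewritten != original)
```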
---
Henrik Nordstrom
Spare time Squid hacker

Received on Thu Nov 19 1998 - 15:32:11 MST