I've been digging deeper into the file corruption bug that I and others (see bug #451) have been experiencing. After liberal use of 'squid -k debug', I came up with the following scenario for the corruption.
1) Client A makes a request; Squid retrieves the object and starts sending data to the client (a large object makes this easier to test)
2) Client B makes request:
2002/10/10 11:30:37| comm_poll: FD 66 ready for reading
2002/10/10 11:30:37| clientReadRequest: FD 66: reading request...
2002/10/10 11:30:37| commSetSelect: FD 66 type 1
2002/10/10 11:30:37| parseHttpRequest: Method is 'GET'
2002/10/10 11:30:37| parseHttpRequest: URI is '/images/ddm3h.gif'
Squid retrieves client B's request from disk. Meanwhile, writes for client A continue:
2002/10/10 11:30:37| comm_poll: FD 273 ready for writing
2002/10/10 11:30:37| commHandleWrite: FD 273: off 0, sz 4096.
2002/10/10 11:30:37| commHandleWrite: write() returns 4096
2002/10/10 11:30:37| clientWriteComplete: FD 273, sz 4096, err 0, off 3469312, len -1
3) comm_poll runs again:
2002/10/10 11:30:37| comm_poll: 2+0 FDs ready
2002/10/10 11:30:37| comm_poll: FD 66 ready for reading
Odd...why is FD 66 ready for reading? Should be writing the data...
2002/10/10 11:30:37| clientReadRequest: FD 66: reading request...
2002/10/10 11:30:37| commSetSelect: FD 66 type 1
2002/10/10 11:30:37| clientReadRequest: FD 66: (104) Connection reset by peer
Ahh...received ECONNRESET from client! In that case, close up shop on this FD
comm_close(fd);
Of course, that comm_poll showed 2 FDs ready.
2002/10/10 11:30:37| comm_poll: FD 66 ready for writing
Hmm...that could be a problem - we just closed that FD. More on that later...
4) Now, go about business as usual on the request from Client A
2002/10/10 11:30:37| comm_poll: FD 273 ready for writing
2002/10/10 11:30:37| commHandleWrite: FD 273: off 0, sz 4096.
2002/10/10 11:30:37| commHandleWrite: write() returns 4096
2002/10/10 11:30:37| cbdataValid: 0x23570fe0
2002/10/10 11:30:37| clientWriteComplete: FD 273, sz 4096, err 0, off 3473408, len -1
However, the aborted request from Client B somehow shows up in the Client A download at approximately this offset.
This is where I wish I understood the code well enough to figure out where the memory is getting corrupted. After spending a day or two on it, the best I can come up with is the attached patch, which I originally thought would solve the problem.
As illustrated above, comm_poll shows the aborted FD as ready for read _and_ write, so I figured that attempting the write on an already-closed FD was causing the corruption. Unfortunately, while I still suspect my patch is the right thing to do, it does not solve the corruption.
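To make the idea behind the patch concrete, here is a minimal standalone sketch in C. It is not Squid code: fd_entry, fd_table, sim_close, dispatch and run_scenario are hypothetical names, and the model deliberately leaves the write handler registered after the close, which may or may not match what comm_close really does. It only shows how one poll result carrying both read and write events can reach the write handler of an FD that the read handler has just closed, and how re-checking an open flag prevents that.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

#define MAX_FD 8

typedef void handler_fn(int fd);

/* Hypothetical, simplified stand-in for a per-FD table entry. */
struct fd_entry {
    int open;                   /* nonzero while the descriptor is in use */
    handler_fn *read_handler;
    handler_fn *write_handler;
};

static struct fd_entry fd_table[MAX_FD];
static int writes_dispatched;   /* counts write-handler invocations */

/* Simulated close: marks the entry closed. Assumption: the write
 * handler is NOT cleared here, so the hazard stays visible. */
static void sim_close(int fd)
{
    fd_table[fd].open = 0;
}

/* Read handler that behaves as if read() failed with ECONNRESET. */
static void read_gets_reset(int fd)
{
    sim_close(fd);
}

static void write_some_data(int fd)
{
    (void) fd;
    writes_dispatched++;
}

/* Dispatch one FD's revents, read side first and write side second.
 * If recheck_open is nonzero, the write side is skipped for an FD
 * that the read handler closed a moment ago. */
static void dispatch(int fd, short revents, int recheck_open)
{
    struct fd_entry *F;
    handler_fn *hdl;

    assert(fd >= 0 && fd < MAX_FD);
    F = &fd_table[fd];
    if (revents & (POLLIN | POLLHUP | POLLERR)) {
        if ((hdl = F->read_handler) != NULL) {
            F->read_handler = NULL;
            hdl(fd);
        }
    }
    if ((!recheck_open || F->open) &&
        (revents & (POLLOUT | POLLHUP | POLLERR))) {
        if ((hdl = F->write_handler) != NULL) {
            F->write_handler = NULL;
            hdl(fd);
        }
    }
}

/* Replays the scenario: one FD reported readable and writable at once.
 * Returns how many times the write handler ran. */
int run_scenario(int recheck_open)
{
    const int fd = 3;           /* arbitrary slot in the simulated table */

    writes_dispatched = 0;
    fd_table[fd].open = 1;
    fd_table[fd].read_handler = read_gets_reset;
    fd_table[fd].write_handler = write_some_data;
    dispatch(fd, POLLIN | POLLOUT, recheck_open);
    return writes_dispatched;
}
```

In this model, run_scenario(0) corresponds to the unpatched check and runs the write handler once on the closed FD, while run_scenario(1) corresponds to the patched check and skips it. As noted above, the real patch did not make the corruption go away, so this shows only the hazard the patch targets, not the root cause.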
So at this point, I submit this to you Squid programming gurus out there...please help!!!
-Phil Oester
--- squid-2.5.STABLE1-orig/src/comm_select.c Sat Apr 27 01:48:42 2002
+++ squid-2.5.STABLE1/src/comm_select.c Wed Oct 9 16:10:29 2002
@@ -452,7 +452,7 @@
                 comm_poll_http_incoming();
             }
         }
-        if (revents & (POLLWRNORM | POLLOUT | POLLHUP | POLLERR)) {
+        if (F->flags.open && (revents & (POLLWRNORM | POLLOUT | POLLHUP | POLLERR))) {
             debug(5, 5) ("comm_poll: FD %d ready for writing\n", fd);
             if ((hdl = F->write_handler)) {
                 F->write_handler = NULL;
Received on Fri Oct 11 2002 - 14:30:22 MDT