Re: [PATCH] Bug 3420: Request body consumption races and !theConsumer

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 18 Nov 2011 16:15:41 +1300

On 18/11/2011 11:51 a.m., Alex Rousskov wrote:
> Hello,
>
> Closed bug 3190, open bug 3420, open bug 3417, and several
> unreported problems deal with stuck requests, early responses, aborted
> requests/responses and combination of those. The attached patch attempts
> to solve many of these mushrooming problems, but it is a significant
> change that may introduce other bugs. Please review.
>
> The patch is against v3.1 because most sufferers are running that
> version. I will port changes to trunk if approved. Technical details are
> copied below from the patch preamble.

Fine. The change itself appears very simple. How much testing has it
been through?
When this goes in it will be enough to push out a new 3.1 release, so we
have a week to get some solid production use feedback.

>
>
> Before these changes, the client side used a single "closing" state to
> handle two different error conditions:
>
> 1. We stopped receiving request body because of some error.
> 2. We stopped sending response because of some error.
>
> When a "directional" error occurred, we try to keep the transaction
> going in the other direction (e.g., to give ICAP the entire request or
> to give HTTP client the entire response). However, because there was
> just one "closing" state, the code failed to correctly detect or process
> many corner cases, resulting in stuck transactions and !theConsumer
> assertions/exceptions due to races between enableAutoConsumption() and
> expectNoConsumption() calls.
>
> This patch replaces the "closing" state with two direction-specific "we
> stopped sending/receiving" flags.
>
> Now, when the response sending code is done, it now checks whether the
> receiving code stopped and closes the connection as needed. This is done
> both when we encounter a sending error
> (ClientSocketContext::initiateClose) and when we successfully sent the
> entire response to the client (ClientSocketContext::keepaliveNextRequest).
>
> Similarly, when the request body reading code is done, it now checks
> whether the receiving code stopped and closes the connection as needed.
> This is done both when we encounter a receiving error
> (ConnStateData::noteBodyConsumerAborted) and when we successfully
> receive the entire request body from the client
> (ClientSocketContext::writeComplete).
>
> TODO: This patch focuses on various error cases. We might still have
> problems when there is an early HTTP response and no errors of any kind.
> I marked the corresponding old code with an XXX.

I'm not sure what you are thinking here. Early response with a
keep-alive should not be hitting the close / stopSending() code path
unless its an error. And early response with a Connection:close leaving
the clients side (because server-side pconn is irrelevant) can be closed
fully just fine after the write.
   Do we need any clean and purge logics to empty the pipeline without
errors being set on that early close case?

The edge case is likely to be
ClientSocketContext::keepaliveNextRequest() where pinning closes the
conn. That is one error case and could potentially lead to problems. I
suspect that it should be doing stopSending("pinned server gone") before
it does comm_close(). Do you agree?

For the commit. Please rename ClientSocketContext::initiateClose to
closeOnSendError() or something clearer. Or remove entirely. It is only
used a few times in client_side.cc

Amos
Received on Fri Nov 18 2011 - 03:15:58 MST

This archive was generated by hypermail 2.2.0 : Fri Nov 18 2011 - 12:00:07 MST