Re: ICAP connections under heavy loads

From: Alexander Komyagin <komyagin_at_altell.ru>
Date: Fri, 31 Aug 2012 19:01:08 +0400

On Thu, 2012-08-30 at 08:29 -0600, Alex Rousskov wrote:
> On 08/30/2012 02:47 AM, Alexander Komyagin wrote:
>
> > What makes me wonder is that HttpStateData async job is created in
> > forward.cc using httpStart(fwd) function:
> >
> > void
> > httpStart(FwdState *fwd)
> > {
> >     debugs(11, 3, "httpStart: \"" << RequestMethodStr(fwd->request->method) <<
> >            " " << fwd->entry->url() << "\"");
> >     AsyncJob::Start(new HttpStateData(fwd));
> > }
> >
> > And I do not see any way for the request forwarder to control that job!
>
> Yes, the relationship among client-side, FwdState, and server-side is
> weird and wrong on many counts. With some minor exceptions, there is no
> direct control of the server-side from the client-side and vice versa.
>
> Client-side-driven termination of server-side jobs is usually triggered
> via StoreEntry aborts, but HttpState does not receive StoreEntry abort
> notifications (a design bug) -- it checks for them when something else
> happens. This is something we should improve.
>
>
> > For example, let's assume that our ICAP service successfully checked the
> > request but we are unable to connect to ICAP in order to check the reply -
> > just make the Xaction openConnection() method a no-op if it was initiated
> > by HttpStateData.
> >
> > This way, after the client timeout occurs, Squid properly detects it, but
> > nothing is done to stop the HttpStateData job.
>
> Which may be the desired outcome in some cases (e.g., quick_abort caching
> and an "I want to see everything" ICAP service).
>
>
> > In the real world, Xaction will use noteCommTimeout to handle this case
> > and destroy the job; but it still looks strange to me that Squid doesn't
> > abort the HttpStateData job when the client is gone.
>
> It is complicated because Squid has to balance the needs of each single
> client with the needs of ICAP services and other clients (including
> future ones if caching is enabled).
>
>
> >>> and meaninglessly switches icap status in minutes after the test.
> >>
> >> Why do you describe a "down" status of an overloaded ICAP service as
> >> "meaningless"? The status sounds appropriate to me! Again, I do not know
> >> much about httperf internals, but in a real world (or a corresponding
> >> Polygraph test), the ICAP service may be always overwhelmed if there is
> >> too much traffic so a "down" state would be warranted in many such cases
> >> (Squid, of course, cannot predict whether the service overload is
> >> temporary).
>
> > I think it's better to detect the ICAP "down" state as early as we can,
> > and certainly not minutes after the load peak. The thing is that within
> > those 6 minutes the ICAP service becomes responsive again, and then Squid
> > eventually catches dozens of exceptions and turns ICAP down. And that is
> > a mistake, because ICAP is OK now!
>
> Some of the up/down decision logic is configurable, and if you have
> suggestions on how to improve it, let's discuss them! The trick here is
> to maximize the number of common ICAP service failures that the
> algorithm can handle in a reasonable fashion, without adding 10 more
> configuration options.
>
>
> > Correct me if I'm wrong, but a proper way to fix the issue would be to
> > deny Xaction activity until the connection is fully established, since
> > the problem is not in the read/write operations but in the connection
> > opening.
>
> What do you mean by "deny Xaction activity"? I do not think current ICAP
> transactions do much while they wait for the connection to be
> established. They just accumulate virgin data and wait. What should they
> do instead?
>
>
> > This way the Xaction object will detect the connection failure after the
> > connection timeout, as it's supposed to.
>
> If ICAP transactions do not detect connection timeout now, it is a bug
> that we should fix.
>

Alex, I finally figured it out! The bug is in the comm_connect_addr()
function (I suppose it is kernel-dependent, though).

Consider the following call trace:
1) Xaction starts ConnOpener in order to create a new connection to the
ICAP service
2) ConnOpener calls comm_connect_addr()
3) comm_connect_addr() initiates the connection through connect() and
returns COMM_INPROGRESS
...since our ICAP service is too busy, the connection will eventually
time out (connectTimeout_), so...
4) Comm::ConnOpener::timeout() is called
5) Comm::ConnOpener::timeout() calls Comm::ConnOpener::connect()
6) Comm::ConnOpener::connect() calls comm_connect_addr()
7) comm_connect_addr() calls getsockopt(SOL_SOCKET, SO_ERROR) to check
the current connection status
8) getsockopt(..) reports SO_ERROR == 0 <--- BUG
9) comm_connect_addr() returns COMM_OK, so ConnOpener thinks the
connection was successfully established and passes it back to the
Xaction object; then we get noteCommRead and noteCommWrote exceptions,
and so on...
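
To make steps 7-9 concrete, here is a minimal standalone reproduction
(plain POSIX code, not Squid code; the address and port are just
placeholders for an unresponsive ICAP host):

#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>
#include <cstdio>

int main()
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);

    sockaddr_in sin = {};
    sin.sin_family = AF_INET;
    sin.sin_port = htons(1344);                     /* typical ICAP port */
    inet_pton(AF_INET, "192.0.2.1", &sin.sin_addr); /* unresponsive test host */

    if (connect(fd, (sockaddr *) &sin, sizeof(sin)) < 0 && errno == EINPROGRESS) {
        /* Probe the socket immediately, without selecting for
         * writability -- effectively what the timeout path does. */
        int err = -1;
        socklen_t len = sizeof(err);
        getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
        /* err is 0 here: no error is pending yet, but the connection is
         * not established either. Treating this as COMM_OK is the bug. */
        printf("SO_ERROR right after connect(): %d\n", err);
    }

    close(fd);
    return 0;
}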

According to `man connect`, when connect() fails with errno EINPROGRESS:
EINPROGRESS
        The socket is nonblocking and the connection cannot be completed
        immediately. It is possible to select(2) or poll(2) for
        completion by selecting the socket for writing. After select(2)
        indicates writability, use getsockopt(2) to read the SO_ERROR
        option at level SOL_SOCKET to determine whether connect()
        completed successfully (SO_ERROR is zero) or unsuccessfully
        (SO_ERROR is one of the usual error codes listed here,
        explaining the reason for the failure).
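
For comparison, the sequence the man page prescribes looks roughly like
this (a sketch with a hypothetical finishConnect() helper; the real
ConnOpener is event-driven via SetSelect rather than blocking in poll()):

#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper: finish a nonblocking connect() that returned
 * EINPROGRESS. Returns 0 on success, -1 on timeout or failure. */
int finishConnect(int fd, int timeoutMs)
{
    pollfd pfd = { fd, POLLOUT, 0 };

    /* Step 1: wait until the socket becomes writable (or we time out).
     * SO_ERROR is meaningless before this point. */
    if (poll(&pfd, 1, timeoutMs) <= 0)
        return -1; /* timeout or poll error: connect() did NOT finish */

    /* Step 2: only now does SO_ERROR tell the truth. */
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0)
        return -1; /* connect() completed unsuccessfully */

    return 0; /* connect() completed successfully */
}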

ConnOpener does use SetSelect(..) to wait for writability before calling
comm_connect_addr() (and hence connect()) again, but the timeout path
breaks that rule by calling getsockopt() right after connect(), before
the socket has become writable.
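
So the gist is that the timeout handler must treat the expired timer as
a definitive failure instead of re-probing the socket. Roughly like this
(a simplified sketch, not the attached patch verbatim; doneConnecting()
stands in for whatever completion path ConnOpener really uses):

void
Comm::ConnOpener::timeout(const CommTimeoutCbParams &)
{
    /* Do NOT call connect() -> comm_connect_addr() here: that would
     * read SO_ERROR before select() has reported writability and may
     * see a bogus 0. The timer firing already means the attempt failed. */
    doneConnecting(COMM_TIMEOUT, 0); /* assumed completion helper */
}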

In my environment we use uClibc and an rsbac kernel, but I suppose the
problem may exist in other environments too.

I've attached two small patches. The first avoids losing the connection
when the client aborts the request before the connection to the ICAP
service is established (remember that "BUG: Orphaned..." stuff I asked
you about before?). The second actually fixes the connection timeout
problem (which is not specific to ICAP). Comments will be appreciated.

HTH, Alexander.
>
> Thank you,
>
> Alex.
>

-- 
Best wishes,
Alexander Komyagin

