Re: false hit recovery?

From: Alex Rousskov <rousskov@dont-contact.us>
Date: Tue, 24 Nov 1998 11:35:37 -0700 (MST)

On Tue, 24 Nov 1998, Henrik Nordstrom wrote:

> Is there anyone working on false hit recovery?

Yes.
 
> forward.c should then cycle through the list of addresses, and fail
> when the complete list has been tried two or three times.
>
> Perhaps there should be a limit on the number of HIT siblings to try.
> ...
> 2-3 is
> probably a reasonable limit on the number of HIT siblings to probe. Note
> that this restriction only makes sense on siblings and not parents
> and/or origin site.

I do not think this is the best algorithm, for several reasons:

        - in many cases it does not make sense to try several times
          (if the object is not there, it is unlikely to appear)
        - users do not care about #attempts, they care about time
        - parents and origin servers are special

I am not working on this algorithm myself, but here is what I would suggest
as a tentative plan.

        - form a list of servers to try
        - note start time
        - success = false
        - while (!success && there-are-servers-left) {
                ~ if (time-passed >= opt.retry_timeout)
                        select next "guaranteed" server
                  else
                        select next server
                ~ success = try selected server
          }
        - if !success report ERR and maybe list servers tried (in the
          headers?)

A "guaranteed" server is a server that must handle misses from us, OR the
only server left if the server list contains only one member. Origin servers
and parents are examples of "guaranteed" servers.
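
For illustration only, the plan above could be sketched roughly as follows.
This is not Squid code: the Server type, fetch_with_recovery(), the
try_server callback and the injectable clock are all hypothetical names, and
retry_timeout stands in for opt.retry_timeout. One assumption goes beyond the
plan: if time has run out and no guaranteed server is left, the sketch simply
gives up.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Server:
    name: str
    guaranteed: bool  # True for servers that must handle our misses (parents, origin)


def is_guaranteed(server: Server, remaining: List[Server]) -> bool:
    # The only server left counts as guaranteed, whatever its type.
    return server.guaranteed or len(remaining) == 1


def fetch_with_recovery(
    servers: List[Server],
    try_server: Callable[[Server], bool],
    retry_timeout: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[bool, List[str]]:
    """Return (success, names of servers tried, in order)."""
    remaining = list(servers)   # form a list of servers to try
    tried: List[str] = []
    start = clock()             # note start time
    while remaining:
        if clock() - start >= retry_timeout:
            # Out of time: only "guaranteed" servers are worth trying now.
            candidates = [s for s in remaining if is_guaranteed(s, remaining)]
            if not candidates:
                break           # assumption: give up rather than probe more siblings
            selected = candidates[0]
        else:
            selected = remaining[0]
        remaining.remove(selected)
        tried.append(selected.name)
        if try_server(selected):
            return True, tried
    # Caller reports ERR and may list `tried` (in the headers?).
    return False, tried
```

The returned list of tried servers is what a caller could put into the error
report mentioned in the last step of the plan.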

I am not sure whether we should STOP trying when there are guaranteed servers
left but we have run out of time and/or have already tried at least one
guaranteed server. Doing so may result in more ERR messages propagated to the
user.

Note that a time limit almost solves the problem that arises when several
caches in a chain use this algorithm (whereas with a max number of retries a
user may end up waiting for several caches each trying 2-3 peers, resulting
in huge total delays).

It would be useful to forward an X-Already-Tried: header to peers so they do
not re-try some caches we have already checked out. "Some" because they may
still try caches that we had as siblings and they have as parents. Thus,
X-Already-Tried-Hitonly and X-Already-Tried-Missallowed may be in order. Ick.
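
A rough sketch of how the two headers might be built and used by a receiving
cache. Only the header names come from the text above; the comma-separated
value format and the function names are assumptions made for the example. The
idea: a cache already tried with misses allowed is skipped entirely, while a
cache only probed for hits is skipped as a sibling but may still be tried as a
parent.

```python
from typing import Dict, List, Set, Tuple

HIT_ONLY = "X-Already-Tried-Hitonly"
MISS_ALLOWED = "X-Already-Tried-Missallowed"


def build_tried_headers(tried_siblings: List[str], tried_parents: List[str]) -> Dict[str, str]:
    """Headers to forward, listing peers we already tried (assumed format: comma list)."""
    headers: Dict[str, str] = {}
    if tried_siblings:
        headers[HIT_ONLY] = ", ".join(tried_siblings)
    if tried_parents:
        headers[MISS_ALLOWED] = ", ".join(tried_parents)
    return headers


def parse_list(value: str) -> Set[str]:
    # Split a comma-separated header value into a set of peer names.
    return {item.strip() for item in value.split(",") if item.strip()}


def peers_worth_trying(
    my_siblings: List[str], my_parents: List[str], headers: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Drop peers that an upstream cache has already tried in an equivalent way."""
    hit_only = parse_list(headers.get(HIT_ONLY, ""))
    miss_allowed = parse_list(headers.get(MISS_ALLOWED, ""))
    # Our siblings only serve hits, so any earlier attempt makes a retry pointless.
    siblings = [p for p in my_siblings if p not in hit_only and p not in miss_allowed]
    # Our parents handle misses, so only an earlier miss-allowed attempt rules them out.
    parents = [p for p in my_parents if p not in miss_allowed]
    return siblings, parents
```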

Clearly the scheme can be extended/improved to handle errors other than false
hits. For example, access denied errors and connection errors.
 
Alex.
Received on Tue Jul 29 2003 - 13:15:54 MDT
