The GreedyDual family of replacement policies uses a cost function to
decide which objects to evict from the cache. In your message below you
suggest using bandwidth (throughput) to the origin server as part of that
cost function. If you can get a good estimate of this metric (one that is
stable over time), it can be used in your cache's replacement policy. As
you noted, a key challenge is that past performance is not always a good
indicator of future performance.
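To make the idea concrete, here is a minimal sketch in Python (not Squid
code) of a GreedyDual-Size style cache whose cost is the estimated time to
re-fetch an object from its origin. The class names, the smoothing factor
`alpha`, and the default throughput are all assumptions chosen for
illustration; the moving average is just one simple way to get a more
stable estimate.

```python
import heapq


class ThroughputEstimator:
    """Exponentially weighted moving average of bytes/sec per origin."""

    def __init__(self, alpha=0.2, default_bps=100_000.0):
        self.alpha = alpha              # weight of each new sample (assumed value)
        self.default_bps = default_bps  # used before any sample exists (assumed value)
        self._avg = {}

    def observe(self, origin, nbytes, seconds):
        sample = nbytes / max(seconds, 1e-6)
        old = self._avg.get(origin)
        if old is None:
            self._avg[origin] = sample
        else:
            self._avg[origin] = self.alpha * sample + (1 - self.alpha) * old

    def estimate(self, origin):
        return self._avg.get(origin, self.default_bps)


class GreedyDualSizeCache:
    """GreedyDual-Size: key H = L + cost / size; evict the smallest H."""

    def __init__(self, capacity_bytes, estimator):
        self.capacity = capacity_bytes
        self.estimator = estimator
        self.used = 0
        self.inflation = 0.0  # the "L" value, raised on every eviction
        self.entries = {}     # url -> (H, size, origin)
        self.heap = []        # (H, url); may hold stale items

    def _key(self, origin, size):
        # Assumed cost: estimated seconds to re-fetch from the origin.
        cost = size / self.estimator.estimate(origin)
        return self.inflation + cost / size

    def access(self, url, origin, size):
        """Return True on a hit. On a miss, admit the object, evicting as needed."""
        hit = url in self.entries
        if not hit:
            if size > self.capacity:
                return False
            while self.used + size > self.capacity:
                self._evict()
            self.used += size
        h = self._key(origin, size)  # refresh the key on both hit and miss
        self.entries[url] = (h, size, origin)
        heapq.heappush(self.heap, (h, url))
        return hit

    def _evict(self):
        while True:
            h, url = heapq.heappop(self.heap)
            entry = self.entries.get(url)
            if entry is not None and entry[0] == h:  # skip stale heap items
                self.inflation = h
                self.used -= entry[1]
                del self.entries[url]
                return
```

Note that with this particular cost the size cancels out of cost / size, so
the key reduces to L + 1/throughput and objects from slow origins are
simply kept longer. A cost with a fixed per-request term would make object
size matter again.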
In your environment you may be able to make more effective use of this type
of policy than in other environments where bandwidth is not known in
advance. I encourage you to experiment with cost functions, particularly
ones that take into account whether an object can be retrieved from a peer
(sibling or parent). (A challenge there is knowing whether the peer will
still have the object the next time it is requested, after you choose not
to cache it...)
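One possible shape for a peer-aware cost, sketched in Python under stated
assumptions: the function name and the `peer_hit_prob` parameter (an
estimate of the chance that the peer still holds the object on the next
request) are hypothetical, and the overhead of querying the peer is
ignored.

```python
def expected_refetch_seconds(size_bytes, origin_bps, peer_bps=None,
                             peer_hit_prob=0.0):
    """Expected time to fetch the object again if it is not cached locally.

    peer_hit_prob is a hypothetical estimate of the probability that a
    sibling or parent still has the object when it is next requested.
    """
    if not 0.0 <= peer_hit_prob <= 1.0:
        raise ValueError("peer_hit_prob must be between 0 and 1")
    origin_time = size_bytes / origin_bps
    if peer_bps is None or peer_hit_prob == 0.0:
        return origin_time
    peer_time = size_bytes / peer_bps
    # Use the peer only when it is faster than the origin; a peer miss
    # still costs a full origin fetch.
    best_on_hit = min(peer_time, origin_time)
    return peer_hit_prob * best_on_hit + (1.0 - peer_hit_prob) * origin_time
```

The less certain you are that the peer will keep the object, the closer
this cost gets to the plain origin cost, which is one way to express the
challenge noted above.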
For further reference you should look at the work of Pei Cao (University of
Wisconsin) and the works she references. Also, the Squid 2.3 release has a
parameterizable (heap-based) replacement policy. Once you come up with the
function you wish to use, it will be easy for you to change the replacement
policy by creating a new heap key generation function.
Best of luck.
Regards,
--jad--
----- Original Message -----
From: Jasper van Beusekom <van@cs.dal.ca>
To: <squid-dev@squid-cache.org>
Sent: Sunday, April 30, 2000 8:49 AM
Subject: ACL for throughput...
>
> I'm using squid in an environment where vast amounts of gigabit ethernet
> are available.
>
> An option that would be of great use to our environment would allow me to
> only cache objects that came in slow. Say an object from domain A or
> sibling B comes in at 300KB/sec and should not be cached so that I can
> use the cache space for the object from domain C or sibling D that only
> averages 1KB/sec. Perhaps such an option should then be combined with a
> replacement algorithm, where the throughput is a factor in deciding if an
> object should be replaced. The throughput level then can float like the
> age in LRU.
>
> One could take that a step further and record the average throughput of
> requests just like the ping response times in the net_db and use it to
> decide on the use of a sibling: if the origin host is faster then do not
> use the sibling. Further, averaging the throughput would prevent you from
> surprises on requests from networks which only sometimes are very
> congested.
>
> I see all kinds of problems with a configuration like this as there is no
> way of making an expectation about the throughput of a file due to the
> status of the network a few seconds from now.
>
> Jasper
>
>
> --
>
> Computer Science Helpdesk: +1 (902) 494 2593
> Fax/VoiceMail: +1 (877) 211 5401
>
>
Received on Sun Apr 30 2000 - 16:19:48 MDT