Re: Inline content modification?

From: Joe Cooper <joe@dont-contact.us>
Date: Tue, 16 Jan 2001 17:12:35 -0600

Ok, now that I've dug in deeper, I see what you're talking about.

I've also realized that someone (Olaf Titz--the link is on the
squid.sourceforge page) has already done just what I need (and a lot
more) for Squid 2.3S2. So I'm forward porting it now and converting it
to the new cbdata stuff, and will start a new branch in CVS once it's
compilable. I'm going to write to Olaf to see whether he intends to
maintain this work, and if not, I'll adopt it. It's much larger than
anything I had planned to do, but since it's already been done, I'd hate
to see it disappear in favor of a lesser implementation.

It has some weirdness in it that confuses me: it 'fakes' object
orientation via preprocessor trickery. Since I haven't the foggiest
notion of object-oriented programming or complex preprocessor fun, I'll
have to wade my way through it to figure out what it's doing and
probably bring it back to plain ol' C. It also does dynamically
loadable modules, which seem overly complicated and not very
cross-platform (judging from comments in the code). But I'll leave it
as is for now, because loadable modules are cool. ;-)

It also operates on the client side rather than the server side, which
has possible performance issues, as Robert pointed out. Then again, now
that I've actually seen a fully fleshed-out design of this, I realize
that the client side allows better flexibility, and so I'll probably
leave it as is for now. The added flexibility is that Squid can decide
which URLs to modify based on the client as well as the source, meaning
clients in the US could get unmodified URLs while Indian clients get
modified URLs. Pretty neat and probably worth the CPU hit in some cases.
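
To make the per-client idea concrete, here is a rough sketch in Python
(for brevity only; it is not Squid code and not Olaf's design). The
networks, hostnames, rule table and function name are all made up for
illustration. The point is just that a rule fires only when both the
client address and the URL match:

```python
import ipaddress
import re

# Hypothetical rule table: (client network, URL pattern, replacement).
# All addresses and hostnames are placeholders, not real Squid config.
RULES = [
    (ipaddress.ip_network("203.0.113.0/24"),
     re.compile(r"^http://ads\.example\.com/(.*)$"),
     r"http://mirror.example.net/\1"),
]

def rewrite_for_client(client_ip, url):
    """Apply the first rule whose network contains the client and whose
    pattern matches the URL; otherwise return the URL unchanged."""
    addr = ipaddress.ip_address(client_ip)
    for network, pattern, replacement in RULES:
        if addr in network and pattern.match(url):
            return pattern.sub(replacement, url)
    return url
```

A client inside 203.0.113.0/24 would get the mirror URL, while any
other client asking for the same object would see it untouched.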

Thanks for the tips, Henrik. I'll be back with more concrete questions
as I go along, I'm sure.

Henrik Nordstrom wrote:

> Joe Cooper wrote:
>
>> Assuming it is possible, can I use the ACL interface to generate the
>> match lists, or do I need to come up with a method to handle the match
>> string and the replacement string? It would be nice to have a named ACL
>> for the match strings, and it seems reasonable that this would work. So
>> can I run /anything/, including whole HTML pages, through a regex or
>> string matching ACL? Anyone have pointers for how to tackle this one?
>
> You need to parse the HTML and only run the links through an ACL list.
> But you are probably better off defining some new kind of rewrite list
> using regex pattern and substitution pairs with back references, much
> like what you do in sed/perl. ACLs cannot rewrite data, only return
> "true/false".
>
> /Henrik
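
For reference, a minimal sketch of the kind of rewrite list Henrik
describes, again in Python rather than Squid's C. The pattern, the
hostnames and the function names are hypothetical, and as a
simplification it locates double-quoted href/src values with a regex
instead of a real HTML parser:

```python
import re

# Hypothetical rewrite list: (pattern, substitution) pairs with
# back references, in the spirit of sed/perl s/// rules.
REWRITES = [
    (re.compile(r"^http://www\.example\.com/(.*)$"),
     r"http://cache.example.net/\1"),
]

# Simplification: match double-quoted href/src attribute values only.
LINK_ATTR = re.compile(r'(\b(?:href|src)=")([^"]*)(")', re.IGNORECASE)

def rewrite_url(url):
    """Return the URL rewritten by the first matching pair, else unchanged."""
    for pattern, substitution in REWRITES:
        new_url, count = pattern.subn(substitution, url)
        if count:
            return new_url
    return url

def rewrite_links(html):
    """Run only link attribute values through the rewrite list."""
    def fix(match):
        return match.group(1) + rewrite_url(match.group(2)) + match.group(3)
    return LINK_ATTR.sub(fix, html)
```

Only the attribute values go through the list, so a URL that merely
appears in the page text is left alone.
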
Received on Tue Jan 16 2001 - 16:04:45 MST
