proxies for filtering (de-animating, de-Java-ing, ...)

Post by Kyler Lai » Fri, 17 Jan 1997 04:00:00



This was brought up here a long time ago, as I recall.
I pursued it (using CERN's HTTPd), but never got
far.  Now that Apache has a proxy module, I
wonder if anyone is using it for content filtering.

Specifically, I'd like to build something that will
get rid of animated GIFs, Java applets, Javascript,
backgrounds, color changes, and, perhaps, frames.
(It sure would be nice to get rid of large blocks of
<Hx>/<strong>, too...)

Most of these should be easy - I've even hacked the
binaries of browsers to disable some.  The animated
GIFs present a bit more of a challenge, though.  I
will need to read them in and interpret them a bit.
I plan to look into the format; hopefully it will
be as simple as truncating the file when the second
frame begins.

So has anyone already started on this?

--kyler

 
 
 


Post by Kyler Lai » Fri, 17 Jan 1997 04:00:00


O.k...I played with it some more.  It turns out to be
fairly easy to modify mod_proxy to get at the body
data via send_fb().  I added an argument to send_fb()
to communicate the content-type since I only want to
tweak image/gif and text/html (for now).

I poked around with the GIF89a specs and discovered
which byte to clear to turn off looping.  Now I just
need to parse the GIF file reliably and I will have
the de-animation working.
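
For readers following along: the post does not say which byte it clears.
One plausible candidate, assumed here, is the loop count stored in the
Netscape 2.0 application extension that animated GIFs commonly carry.
The Python sketch below is not the mod_proxy code; find_loop_count and
its helpers are hypothetical names. It walks the GIF block structure and
reports where that two-byte count sits.

```python
import struct

def skip_sub_blocks(data, pos):
    """Return the offset just past a chain of GIF data sub-blocks."""
    while True:
        size = data[pos]
        pos += 1
        if size == 0:          # a zero-length sub-block ends the chain
            return pos
        pos += size

def color_table_bytes(packed):
    """Size in bytes of the color table described by a packed flags byte."""
    return 3 * (2 << (packed & 0x07)) if packed & 0x80 else 0

def find_loop_count(data):
    """Return (offset, loop_count) of the NETSCAPE2.0 loop field, or None."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF")
    # 6-byte header + 7-byte screen descriptor + optional global color table
    pos = 13 + color_table_bytes(data[10])
    while pos < len(data):
        kind = data[pos]
        if kind == 0x3B:                      # trailer
            return None
        if kind == 0x2C:                      # image descriptor
            pos += 10 + color_table_bytes(data[pos + 9])
            pos = skip_sub_blocks(data, pos + 1)   # +1 skips LZW code size
        elif kind == 0x21:                    # extension
            if (data[pos + 1] == 0xFF
                    and data[pos + 3:pos + 14] == b"NETSCAPE2.0"
                    and data[pos + 14] == 3 and data[pos + 15] == 1):
                offset = pos + 16
                return offset, struct.unpack_from("<H", data, offset)[0]
            pos = skip_sub_blocks(data, pos + 2)
        else:
            raise ValueError("unexpected block 0x%02X" % kind)
    return None
```

A loop count of 0 is conventionally read as "repeat forever". The sketch
does no bounds checking, so a truncated file raises IndexError.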

Now I need a quick, small, 'C' parser for HTML to
embed in mod_proxy so that I can destroy attributes
in various tags.  Suggestions?

--kyler

 
 
 


Post by Kyler Lai » Sun, 19 Jan 1997 04:00:00


I had some requests for code, so I thought I'd release
the perfectly dreadful hack that I came up with for
chopping down animated GIFs.

See
        http://www.ecn.purdue.edu/~laird/test/animated_gifs/deanimate.cgi/new...
for an example of a deanimated animated GIF.  The source
is all there, but don't ask me to help you with it.

BTW, I put in the "LOCK_NETSCAPE" definition just in
case you feel like messing with Netscape users.  I've
only experienced it with Navigator Gold 3.0 under
Solaris 2.5.

Hopefully I can clean up the code and make it a bit more
efficient, and then drop it into Apache's proxy module.

The next task will be yanking Netscapisms and other junk
out of HTML.  That's not going to be nearly as easy to
do well.  Any help will be appreciated.
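
As a rough illustration of the HTML side, here is a Python sketch built
on the standard library's html.parser rather than the small C parser the
thread asks for. The lists of elements and attributes to drop are
assumptions taken from the wish list earlier in the thread, and Defang
and defang are hypothetical names.

```python
from html import escape
from html.parser import HTMLParser

DROP_ELEMENTS = {"applet", "script"}   # removed along with their contents
DROP_ATTRS = {"background", "bgcolor", "text", "link", "vlink", "alink"}

class Defang(HTMLParser):
    """Re-emit HTML while dropping unwanted elements and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0            # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_ELEMENTS:
            self.skip_depth += 1
        elif not self.skip_depth:
            kept = "".join(
                f" {k}" if v is None else f' {k}="{escape(v)}"'
                for k, v in attrs if k not in DROP_ATTRS)
            self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_ELEMENTS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

def defang(html):
    parser = Defang()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

This is deliberately lossy: comments and doctype declarations are
silently dropped, and tag and attribute names come back lowercased.
Frames and large blocks of <Hx>/<strong> would need rules of their own.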

--kyler

 
 
 


Post by Ronald Floren » Sun, 19 Jan 1997 04:00:00


Kyler Laird writes:

   I had some requests for code, so I thought I'd release
   the perfectly dreadful hack that I came up with for
   chopping down animated GIFs.

This is interesting material.  I wonder if anyone has pointers to
useful examples of Apache mod_rewrite directives (without extensive
code hacks) to filter obnoxious material like Java or animated GIFs.

Thanks for pointers or suggestions.
--

Ronald Florence                 Maple Lawn Farm, Stonington, CT

 
 
 


Post by Kyler Lai » Sun, 19 Jan 1997 04:00:00



>   I had some requests for code, so I thought I'd release
>   the perfectly dreadful hack that I came up with for
>   chopping down animated GIFs.
>This is interesting material.  I wonder if anyone has pointers to
>useful examples of Apache mod_rewrite directives (without extensive
>code hacks) to filter obnoxious material like Java or animated GIFs.

Hmmmm...how would mod_rewrite be useful?  By the time Apache
knows what the type of content is, it's out of mod_rewrite's
control.

mod_proxy makes it trivial, though.  I'll try to spend some
time dropping my deanimator into mod_proxy today.

--kyler

 
 
 


Post by Kyler Lai » Sun, 19 Jan 1997 04:00:00


Yea!  I've got it working!

I kludged Apache's (old?) mod_proxy to deanimate GIFs as
they pass through.  It works beautifully.  I started
working on this again recently because MapQuest's dreadful
new interface has at least two animated GIFs in it.  Since
the new interface doesn't work well with XMosaic, I have
to use Netscape (or WebExplorer) when I want to use it,
but the animation is soooooo annoying.

Now it's gone!  *poof*!  There's a bunch of work going on
behind the scenes (and the debugging output shows it),
but all I had to do was point my browser at my new proxy
server and the animations are gone.

BTW, the animations aren't just switched to a single
iteration; they're chopped immediately after the first
frame.  That should help download speeds when I move the
proxy server closer to the 'net (off my desk...).
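
The chopping step might look roughly like the Python sketch below. It is
not the code from the post; it assumes the standard GIF block layout
(header, screen descriptor, optional color tables, extensions, image
data in sub-blocks) and uses hypothetical names. It copies everything up
to the end of the first image and then appends a trailer byte.

```python
def skip_sub_blocks(data, pos):
    """Return the offset just past a chain of GIF data sub-blocks."""
    while True:
        size = data[pos]
        pos += 1
        if size == 0:          # a zero-length sub-block ends the chain
            return pos
        pos += size

def color_table_bytes(packed):
    """Size in bytes of the color table described by a packed flags byte."""
    return 3 * (2 << (packed & 0x07)) if packed & 0x80 else 0

def first_frame_only(data):
    """Return the GIF bytes cut right after the first image, plus a trailer."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF")
    # 6-byte header + 7-byte screen descriptor + optional global color table
    pos = 13 + color_table_bytes(data[10])
    while pos < len(data):
        kind = data[pos]
        if kind == 0x2C:                      # image descriptor: first frame
            pos += 10 + color_table_bytes(data[pos + 9])
            pos = skip_sub_blocks(data, pos + 1)   # +1 skips LZW code size
            return data[:pos] + b"\x3B"       # close the file with a trailer
        if kind == 0x21:                      # extension: keep and move on
            pos = skip_sub_blocks(data, pos + 2)
        elif kind == 0x3B:                    # trailer before any image
            break
        else:
            raise ValueError("unexpected block 0x%02X" % kind)
    return data
```

Because the cut point is known as soon as the first image ends, a
streaming proxy could stop reading the upstream response at that point,
which is where the download saving would come from. The sketch works on
a complete buffer and does no bounds checking.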

For those of you following this, see the code at
 http://www.ecn.purdue.edu/~laird/Apache/modules/mod_proxyfilter/
The only promise I make about the code is that it's ugly.

Now if I can effectively filter HTML, I might just become
a Netscape user...

--kyler

 
 
 


Post by Ralf S. Engelscha » Mon, 20 Jan 1997 04:00:00



> Kyler Laird writes:
>    I had some requests for code, so I thought I'd release
>    the perfectly dreadful hack that I came up with for
>    chopping down animated GIFs.
> This is interesting material.  I wonder if anyone has pointers to
> useful examples of Apache mod_rewrite directives (without extensive
> code hacks) to filter obnoxious material like Java or animated GIFs.

Hmmm... mod_rewrite cannot help you with the problem of content filtering
because it is just a URL rewriting module. It cannot output any content.  So
mod_rewrite is totally unrelated to this situation. OK, you can filter out
some URLs totally, but this is not very useful, I think.

BTW: Besides this situation there _ARE_ some useful examples
     of mod_rewrite rulesets. Have a look at
     http://www.engelschall.com/sw/mod_rewrite/doc/solutions/

Greetings,
                                        Ralf S. Engelschall

                                        http://www.engelschall.com/

 
 
 


Post by Kyler Lai » Tue, 21 Jan 1997 04:00:00



>Hmmm... mod_rewrite cannot help you with the problem of content filtering
>because it is just a URL rewriting module. It cannot output any content.

Agreed, but...it *could* be used to help proxy filtering
based on URLs (in addition to content - handled by the
mod_proxy module).

I think I'd prefer a more specialized module for what I
have in mind, though.  I want something that will do
intelligent things with requests for advertising and
other annoying (counter) graphics.

I'm thinking that I'd have a list of sites (like
ad.doubleclick.net) and URL path names (".*/ie_.*",
".*/netnow.*", ".*count.*", ...) that would just return
tiny blank (local) graphics.
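
A minimal Python sketch of that idea, using the host and path patterns
listed above. The function names, the fetch callback, and the choice of
a commonly circulated 43-byte transparent 1x1 GIF as the "blank"
graphic are all assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

BLOCKED_HOSTS = {"ad.doubleclick.net"}
BLOCKED_PATHS = [re.compile(p) for p in
                 (r".*/ie_.*", r".*/netnow.*", r".*count.*")]

# A 1x1 transparent GIF served in place of blocked graphics.
BLANK_GIF = bytes.fromhex(
    "474946383961 01000100800000 000000ffffff"
    "21f9040100000000 2c000000000100010000 0202440100 3b")

def is_blocked(url):
    """True if the URL's host or path matches one of the block lists."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return host in BLOCKED_HOSTS or any(
        p.fullmatch(parts.path) for p in BLOCKED_PATHS)

def respond(url, fetch):
    """Return (content_type, body); fetch is a hypothetical upstream getter."""
    if is_blocked(url):
        return "image/gif", BLANK_GIF
    return fetch(url)
```

Patterns this broad catch innocent URLs too: ".*count.*" also matches a
path such as /accounts/, so a real list would want tighter expressions
or an exceptions list.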

I suspect that if such a system is widely implemented,
advertisers and annoying people (with "hit" counters)
would eventually work around such countermeasures, but
I'm still predicting that there will be censoring
services springing up to help us.  Such services will
provide "kill files" for us to use.  They'll probably
even offer proxy services (for HTTP, NNTP, POP, ...).

--kyler