Scott Penrose

ModPerl Filters

Scott is an expert software developer with over 30 years' experience, specialising in education, automation and remote data.

Writing mod_perl filters for Apache2

Apache 2 as a Pipeline

  • Apache 2 handles each request in many stages
  • It can be thought of as a pipeline

Handler - Server life cycle

Server startup, stop, and log handling

  • PerlOpenLogsHandler
  • PerlPostConfigHandler
  • PerlChildInitHandler
  • PerlChildExitHandler

Handler - Protocols

Let's say you want to write SMTP... See Apache2::Protocol::EMSTP

  • PerlPreConnectionHandler
  • PerlProcessConnectionHandler

Handler - Filters

Filters... In, Out and all about

  • PerlInputFilterHandler
  • PerlOutputFilterHandler

Handler - HTTP Handlers

This set is specific to HTTP Handlers.

  • PerlPostReadRequestHandler - post headers, e.g. Reload
  • PerlTransHandler - change the URI
  • PerlMapToStorageHandler - Map to file storage
  • PerlHeaderParserHandler - change response handler
  • PerlAccessHandler - Access restrictions not based on user (e.g. Time of day)

Handler - HTTP Handlers

HTTP continued...

  • PerlAuthenHandler - Verify a user's credentials
  • PerlAuthzHandler - Is this user allowed?
  • PerlTypeHandler - Check mime types, or language
  • PerlFixupHandler - Just before response handler e.g. Add ENV variables
  • PerlResponseHandler - Provide the actual response
  • PerlLogHandler - Log results
  • PerlCleanupHandler - Fake handler for Perl only (not part of Apache)

My First Filter

A basic filter to strip CR and LF from HTML. Start with the usual setup.

#!Perl
package MyApache2::FilterObfuscate;
use strict;
use warnings;
use Apache2::Filter ();
use Apache2::RequestRec ();
use APR::Table ();
use Apache2::Const -compile => qw(OK DECLINED);

My First Filter

#!Perl
sub handler {
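	my $f = shift;

	# First call for this request: the body length will change,
	# so drop Content-Length, then remember that we have done so.
	unless ($f->ctx) {
		$f->r->headers_out->unset('Content-Length');
		$f->ctx(1);
	}

	# Read the current block of data, strip CR and LF, pass it on.
	while ($f->read(my $buffer, 1024)) {
		$buffer =~ s/[\r\n]//g;
		$f->print($buffer);
	}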

	return Apache2::Const::OK;
}
1;

My First Filter

Restrict your filter to HTML files only. This lets non-HTML files, e.g. CSS or XML, pass through untouched.

NOTE: Stripping CR and LF could break those files.

<Files *.html>
	PerlOutputFilterHandler MyApache2::FilterObfuscate
</Files>

My First Filter

Another way to restrict...

#!Perl
...
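        my $r     = $f->r;
        my $log   = $r->log;    # needs "use Apache2::Log ();"
        my $ctype = $r->content_type || '';
        $ctype =~ s{;.*}{};     # drop parameters such as "; charset=..."
        unless ($ctype eq "text/html") {
                $log->info( "skipping request to ", $r->uri, " (not text/html)" );
                return Apache2::Const::DECLINED;
        }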
...
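
The content-type check above can be tried outside Apache. This is a minimal sketch in plain Perl; the helper name media_type is hypothetical and not part of mod_perl. It assumes the header may carry parameters, mixed case, or be missing altogether.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a Content-Type header value to its bare
# media type, e.g. "text/html; charset=UTF-8" becomes "text/html".
sub media_type {
    my ($ctype) = @_;
    return '' unless defined $ctype;   # no Content-Type set
    $ctype =~ s{;.*}{}s;               # drop parameters after the first ";"
    $ctype =~ s{^\s+|\s+$}{}g;         # trim surrounding whitespace
    return lc $ctype;                  # media types are case-insensitive
}

for my $header ('text/html; charset=UTF-8', 'Text/HTML', 'text/css') {
    my $verdict = media_type($header) eq 'text/html' ? 'filter' : 'skip';
    print "$header: $verdict\n";
}
```

In a filter, the value passed in would be $f->r->content_type, and a "skip" verdict corresponds to returning DECLINED.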

What do you mean only once...

  • Apache filters are called more than once per request...
  • Each call handles one block of data, delivered as a Bucket Brigade
  • Very useful for single character substitutions (like above) and compression
  • Not so good for things like XML parsing.
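
To see why block-by-block processing suits single character edits but not longer patterns, here is a minimal standalone sketch in plain Perl (no Apache needed). The chunk list and the filter_chunks helper are hypothetical stand-ins for successive calls to a filter; they are not part of the mod_perl API.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a filter being called once per block of data:
# apply $code to each chunk independently and join the results.
sub filter_chunks {
    my ($code, @chunks) = @_;
    return join '', map { $code->($_) } @chunks;
}

# One document, split at an awkward point by the (simulated) server.
my @chunks = ("<p>Welcome to OSD", "C2007\r\n</p>\r\n");

# Single character edits give the same result however the data is split.
my $stripped = filter_chunks(
    sub { (my $s = shift) =~ s/[\r\n]//g; $s }, @chunks);

# A multi-character pattern is missed when it straddles two chunks.
my $replaced = filter_chunks(
    sub { (my $s = shift) =~ s/OSDC2007/OSDC2008/g; $s }, @chunks);

print "stripped: $stripped\n";
print "replaced: $replaced";
```

The second result still contains OSDC2007, because neither chunk holds the whole pattern on its own.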

So how do we buffer

"$f->ctx" is a data storage point: whatever you save in it is handed back on the next call to the filter for the same request.

#!Perl
	# Get what we have buffered so far
        my $ctx = $f->ctx;
        while ($f->read(my $buffer, 4096)) {
                $ctx .= $buffer;
        }

	# Seen the end? Not yet, so save the buffer in ctx and return OK
        unless ($f->seen_eos) {
                $f->ctx( $ctx );
                return Apache2::Const::OK;
        }

So how do we buffer

Finally process what you have, output it and return OK

#!Perl
        if (defined $ctx && length $ctx) {
                $ctx =~ s|OSDC2007|OSDC2008|g;
                $f->r->headers_out->unset( 'Content-Length' );
                $f->print( $ctx );
        }
	return Apache2::Const::OK;

The better way

In this case it is better to buffer only what cannot yet be translated. After each block, output everything except a trailing partial match: at most 7 characters, in case the block ends with O or OS or OSD or OSDC ...
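
A minimal sketch of that idea in plain Perl, runnable without Apache. The translate_chunk helper and the sample chunks are hypothetical; the sketch is written for this particular OSDC2007 to OSDC2008 substitution and is not a general streaming replacer.

```perl
use strict;
use warnings;

my $FROM = 'OSDC2007';
my $TO   = 'OSDC2008';

# Hypothetical helper: process one chunk, given the tail carried over from
# the previous call. Returns (text that is safe to output, new carry).
sub translate_chunk {
    my ($carry, $chunk, $eos) = @_;
    my $data = (defined $carry ? $carry : '') . $chunk;
    $data =~ s/\Q$FROM\E/$TO/g;

    # At end of stream nothing more can arrive, so flush everything.
    return ($data, '') if $eos;

    # Hold back the longest tail that is a proper prefix of $FROM
    # (at most length($FROM) - 1 = 7 characters).
    my $max = length($FROM) - 1;
    $max = length($data) if length($data) < $max;
    for my $n (reverse 1 .. $max) {
        if (substr($data, -$n) eq substr($FROM, 0, $n)) {
            return (substr($data, 0, length($data) - $n), substr($data, -$n));
        }
    }
    return ($data, '');
}

# Simulate three calls, with the pattern split across chunk boundaries.
my @chunks = ("Welcome to OSD", "C20", "07, see you at OSDC2007!");
my ($carry, $out) = ('', '');
for my $i (0 .. $#chunks) {
    (my $safe, $carry) = translate_chunk($carry, $chunks[$i], $i == $#chunks);
    $out .= $safe;
}
print "$out\n";
```

In a real filter the carry would live in $f->ctx between calls, the end-of-stream flag would come from $f->seen_eos, and each safe portion would go straight to $f->print, so memory use stays at a few bytes however large the response is.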

Lack of understanding...

  • Apache filters are the best part of mod_perl2
  • Yet rarely used
  • Only very few modules
  • Little understanding on the mailing lists...
  • Investigate and enjoy them...
  • http://perl.apache.org/
