Tuesday, April 10, 2012

Apache, mod_perl, and an example output filter

Perhaps I should explain what an output filter really is.  Before Apache serves any content to the web browser, it passes that content through a chain of filters, which lets different modules alter it.  Here is an example situation:
You have been given HTML pages that contain information for both authenticated and unauthenticated users.  You need to make sure the information intended for authenticated users isn't served when a visitor hasn't authenticated.
How do you implement the above?  With an output filter.  For this example, you would implement a simple module that checks each response page for a special tag and then checks for authentication.  If the visitor is authenticated, the contents of the tag are returned to the browser.  If not, the tag and its contents are removed.  As an example, I'll use an HTML-style tag:
<AuthenticatedOnly>
You are logged in!
</AuthenticatedOnly>
Please note, we aren't actually implementing the above.  It is merely a hypothetical situation.
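
The filter itself stays hypothetical, but the string transformation at its core might look something like the following sketch in plain Perl.  The function name filter_authenticated_only and the $is_authenticated flag are assumptions made for illustration; how a real filter would decide whether the visitor is logged in is not covered here.

```perl
use strict;
use warnings;

# Hypothetical helper: keep or drop the contents of <AuthenticatedOnly> blocks.
sub filter_authenticated_only {
    my ($content, $is_authenticated) = @_;

    if ($is_authenticated) {
        # Logged in: remove only the marker tags and keep what they wrap.
        $content =~ s{</?AuthenticatedOnly>}{}g;
    }
    else {
        # Not logged in: remove the tags together with everything inside.
        # /s lets "." match newlines; ".*?" stops at the nearest closing tag.
        $content =~ s{<AuthenticatedOnly>.*?</AuthenticatedOnly>}{}gs;
    }
    return $content;
}

my $page = "<p>Welcome</p>\n<AuthenticatedOnly>\nYou are logged in!\n</AuthenticatedOnly>\n";
print filter_authenticated_only($page, 1);   # visitor is logged in
print filter_authenticated_only($page, 0);   # visitor is anonymous
```

A regular expression is enough for a flat marker tag like this one; nested tags or tags with attributes would need a real parser.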

There are two ways to go about creating an output filter.  You can write code that has to be compiled (my mod_template.so is a good example - it's written in C, and it wraps any HTML content in a template so that the whole site looks the same), or you can accomplish the same thing using mod_perl.  We'll take the latter route, as we won't need to install a compiler, just mod_perl.

To start out, we need to modify the Apache configuration.  Once you have it open, there are three things we need to do:

  1. Enable mod_perl, and insert an "internal" configuration that adds our module directory to Perl's search path.
  2. Create the filter module.
  3. Enable the configuration directive for the filter.

To enable mod_perl:
LoadModule perl_module modules/mod_perl.so
PerlSwitches -Mlib=/opt/hacks/perl
With those two lines in the Apache config, mod_perl is loaded and /opt/hacks/perl is on Perl's module search path.  Next we need to create our filter.  We'll save it as MyOutputFilter.pm in /opt/hacks/perl, since that is the directory we told mod_perl to use above:


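package MyOutputFilter;

use strict;
use warnings;

use base qw(Apache2::Filter);
use Apache2::RequestRec ();   # $r->content_type, $r->headers_out
use APR::Table ();            # set() on the headers_out table
use APR::Brigade ();          # flatten()
use Apache2::Const -compile => qw(OK DECLINED);

sub handler : FilterRequestHandler {
  my ($f, $bucketBrigade) = @_;
  my $r = $f->r;

  # content_type may carry a charset suffix, so match only the media type.
  my $contentType = $r->content_type || '';

  # Not XML: decline, so that mod_perl passes the brigade on untouched.
  return Apache2::Const::DECLINED unless $contentType =~ m{^text/xml\b};

  # Copy the data of every bucket in this brigade into one string.
  my $content = '';
  my $contentLength = $bucketBrigade->flatten($content);

  # modify $content here

  $r->headers_out->set('Content-Length', length $content);
  $f->print($content);

  return Apache2::Const::OK;
}

1;  # a Perl module must end with a true value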
This is a very simple filter - it does absolutely nothing, but it should give you an idea of how a filter works. I'll break things down a little bit at a time. The first line matters: the package name is what we use later when we enable the filter in the Apache configuration. After the name, we load a few libraries, and then we declare our handler function, since that is what mod_perl will call. Ours is the line:
sub handler : FilterRequestHandler {
The contents of that function define what we are doing.

Note that I have grabbed a "bucket brigade". A bucket brigade in Apache is a pretty novel idea. Remember the stories about people putting out fires by passing buckets of water between each other? This is the same thing, but the "people" are Apache modules, and the "buckets" hold response data instead of water.

In the above example, I call "flatten" on the bucket brigade - for another project, I needed to have all of the data at the same time, so it needed to be flattened into one "bucket".  That call copies the contents of the brigade into the $content variable, which I can then modify.  If you work with large responses, do NOT flatten them - the whole response ends up in memory at once, and you can cause Apache to use up free memory!
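
For large responses, one alternative to flattening is to work through the data in fixed-size chunks.  The sketch below shows that loop in plain Perl.  The helper name stream_through, the $transform callback and the 8192-byte default are assumptions made for illustration; the only thing it expects from $f is a read($buffer, $length) method and a print($data) method, which is the shape of the streaming interface that mod_perl's Apache2::Filter offers.  Wiring it into a real handler is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: pass data through a filter object chunk by chunk
# instead of collecting the whole response in one string.
sub stream_through {
    my ($f, $transform, $chunk_size) = @_;
    $chunk_size ||= 8192;   # assumed default, not taken from the post

    my $total = 0;
    # read() fills $buffer in place and returns the number of bytes read,
    # so the loop ends once the available data is exhausted.
    while ($f->read(my $buffer, $chunk_size)) {
        my $out = $transform->($buffer);
        $f->print($out);
        $total += length $out;
    }
    return $total;   # number of bytes sent on
}
```

Only one chunk is held in memory at a time, but the callback sees each chunk in isolation, so anything that can straddle a chunk boundary (such as a tag) needs extra buffering.  The final length is also unknown until the last chunk, so a filter built this way would probably have to remove the Content-Length header rather than recompute it.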

After I am done with the data, I pass it on to the next filter by calling $f->print($content).  Note that I am also resetting the Content-Length header - if you change the data but don't change the header, a browser that respects the header might show blank pages (Google Chrome is a good example of a browser that does that).  Congrats!  You now have a simple filter that you can start to enhance to do some pretty amazing things!
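
The third step from the list, enabling the filter in the Apache configuration, could look roughly like the fragment below.  PerlOutputFilterHandler is mod_perl's directive for registering an output filter, and MyOutputFilter is the package name from the module above; the /protected location is only a placeholder for whichever part of the site the filter should apply to.

```apache
# Placeholder location; adjust to the URLs that should be filtered.
<Location /protected>
    PerlOutputFilterHandler MyOutputFilter
</Location>
```

Restart Apache after changing the configuration so that mod_perl loads the module.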
