weak ETags

geoff on 2003-11-03T03:27:01

today, a patch to mod_include I submitted to new-httpd was applied to the 2.1 development branch. the patch fixes a bug that allowed mod_include to send ETag headers in 304 responses, which was improper because the response to the initial request had its ETag header stripped.



while the patch fixes the problem with mod_include, it was really an offshoot of an issue I came across while coding a more advanced and complete version of Apache::Clean. in that module, I have a filter init routine that looks like this:



sub init : FilterInitHandler {
  my $f = shift;
  my $r = $f->r;

  if ($r->content_type =~ m!text/html!i) {
    # forbid the generation of ETags in the response
    # since we're altering content
    $r->notes->set('no-etag' => 1);
  }

  return Apache::OK;
}




this code takes the same approach as the (new) mod_include - tell everyone that we plan to alter the content to the point that the Apache-generated ETag will be wrong. for mod_include, this is the proper approach, since SSI tags can include all sorts of things that could drastically change the content from request to request.



however, what Apache::Clean essentially does is alter the HTML output without changing the meaning of the content - tags are simplified but even the look of the page in the browser remains unchanged. to me, this means that preventing the generation of an ETag header is excessive - RFC 2616 specifically allows for the idea of weak validators, which signify instances when the bits of an entity have changed but the meaning has not.
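
as a rough illustration of the weak validator idea, here is a minimal sketch in plain Perl. the helper names (weaken_etag, opaque_tag, weak_match, strong_match) and the ETag value are made up for the example and are not part of Apache or mod_perl; the comparison rules are my reading of the weak and strong comparison functions in RFC 2616.

```perl
use strict;
use warnings;

# hypothetical helpers for illustration only,
# not part of Apache or mod_perl

# turn a strong ETag into a weak one by adding the W/ prefix
sub weaken_etag {
    my ($etag) = @_;
    return $etag =~ m!^W/! ? $etag : "W/$etag";
}

# strip the W/ prefix, leaving only the quoted opaque tag
sub opaque_tag {
    my ($etag) = @_;
    (my $tag = $etag) =~ s!^W/!!;
    return $tag;
}

# weak comparison: the opaque tags match, weakness is ignored
sub weak_match {
    my ($x, $y) = @_;
    return opaque_tag($x) eq opaque_tag($y) ? 1 : 0;
}

# strong comparison: the tags match and neither one is weak
sub strong_match {
    my ($x, $y) = @_;
    return 0 if $x =~ m!^W/! or $y =~ m!^W/!;
    return $x eq $y ? 1 : 0;
}

my $strong = '"2a-3e1f"';            # made-up ETag value
my $weak   = weaken_etag($strong);   # same tag with a W/ prefix

print "$weak\n";
print weak_match($strong, $weak)   ? "weak match\n"   : "no weak match\n";
print strong_match($strong, $weak) ? "strong match\n" : "no strong match\n";
```

the point of the sketch is that a weakened ETag still satisfies the weak comparison, so a cached copy can be treated as equivalent in meaning even though it is no longer guaranteed to be byte-for-byte identical.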



my original post made mention of this and suggested that there be an API that would allow output filters to specify whether the (later) generated ETag header ought to be weakened rather than suppressed. but like so many things, it was left uncommented on, probably for lack of tuits, understanding, or desire. still, it is an interesting topic to understand, even if there is no API around it.

but at least the mod_include bug is fixed now...