To gain some insight into how people use one of my sites, I've hacked together a script that sorts web log entries into groups of "sessions", where a session is a time-bounded set of accesses from a given IP/host. This scheme works O.K., except for a handful of visitors who come in via AOL (and hence AOL's proxies). I can disambiguate a single user by lopping the proxy name off of the hostname, but that doesn't fly when sessions overlap. I've tried using the User-Agent to keep users sorted out, but that's not 100% reliable.
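For anyone following along, the grouping described above might look roughly like the sketch below. This is not the actual script: the tuple layout, the function name `sessionize`, and the 30-minute idle gap are all assumptions chosen for illustration. It keys sessions on host plus User-Agent, which is exactly the part that is not fully reliable behind a shared proxy.

```python
SESSION_GAP = 30 * 60  # assumed idle timeout in seconds; tune to taste


def sessionize(hits, gap=SESSION_GAP):
    """Group hits into time-bounded sessions.

    `hits` is an iterable of (timestamp, host, user_agent, path) tuples,
    with timestamp in seconds. This layout is hypothetical; adapt it to
    whatever your log parser produces.
    """
    sessions = []        # every session, in order of first hit
    open_sessions = {}   # (host, user_agent) -> most recent session for that key

    for hit in sorted(hits, key=lambda h: h[0]):
        ts, host, agent, _path = hit
        key = (host, agent)
        current = open_sessions.get(key)
        # Start a new session if this key is unseen or has been idle too long.
        if current is None or ts - current[-1][0] > gap:
            current = []
            sessions.append(current)
            open_sessions[key] = current
        current.append(hit)

    return sessions
```

Two visitors behind the same proxy with identical User-Agent strings still collapse into one session here, which is the overlap problem in question.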
Much of the content is static, so I can't rely on tracking via session cookie. (The site is hosted at an ISP, so I can't install mod_usertrack.)
How have other people dealt with this kind of after-the-fact tracking? Any pointers will be appreciated.
I have to surf via a proxy too. But this proxy does provide an extra HTTP header: x-forwarded-for, the value being my own private IP address as assigned to my cable modem. It results in an equivalent environment variable, "HTTP_X_FORWARDED_FOR", that can be read in CGI scripts.
I don't have any idea if AOL proxies do the same as the proxy I use, but it is worth investigating.
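A quick way to investigate is a small CGI script that reports what it sees. Here is a minimal sketch in Python, assuming a standard CGI environment; the function name `client_address` is made up for the example. When several proxies are chained, the header can hold a comma-separated list with the original client first, so the sketch takes the first entry and falls back to REMOTE_ADDR when the header is absent.

```python
import os


def client_address(environ=os.environ):
    """Return the best guess at the client's own address.

    Prefers the first entry of X-Forwarded-For (set by some proxies),
    falling back to REMOTE_ADDR, which is the proxy's address when the
    request came through one.
    """
    forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
    # The header may list several hops: "client, proxy1, proxy2".
    first = forwarded.split(",")[0].strip()
    return first or environ.get("REMOTE_ADDR", "")


if __name__ == "__main__":
    # Minimal CGI response: header block, blank line, then the body.
    print("Content-Type: text/plain")
    print()
    print(client_address())
```

Keep in mind that the header is supplied by the client side and can be missing or forged, and that the value is often a private address, so it is only useful in combination with the proxy's address. It also only reaches the access log if the server's log format includes it, which may not be something you can change on a hosted site.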