Tracking user sessions

dws on 2002-09-20T20:45:16

To gain some insight into how people use one of my sites, I've hacked together a script that sorts web log entries into groups of "sessions", where a session is a time-bounded set of accesses from a given IP/host. This scheme works O.K., except for a handful of visitors who come in via AOL (and hence AOL's proxies). I can disambiguate a single user by lopping the proxy name off of the hostname, but that doesn't fly when sessions overlap. I've tried using the User-Agent to keep users sorted out, but that's not 100% reliable.
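A minimal sketch of the grouping described above, not the actual script. It assumes the log has already been parsed into (host, timestamp, path) tuples with timestamps in seconds, and that a 30-minute idle gap ends a session. The names `sessionize` and `SESSION_GAP`, and the timeout value, are illustrative assumptions.

```python
from collections import defaultdict

# Assumed idle timeout: 30 minutes without a request ends a session.
SESSION_GAP = 30 * 60


def sessionize(entries, gap=SESSION_GAP):
    """Group (host, timestamp, path) tuples into per-host sessions.

    Returns a dict mapping host -> list of sessions, where each session
    is a list of (timestamp, path) pairs in time order.
    """
    by_host = defaultdict(list)
    # Process entries in time order so gaps can be measured per host.
    for host, ts, path in sorted(entries, key=lambda e: e[1]):
        sessions = by_host[host]
        # Start a new session if the host is new or has been idle too long.
        if not sessions or ts - sessions[-1][-1][0] > gap:
            sessions.append([])
        sessions[-1].append((ts, path))
    return dict(by_host)
```

Keying on the host alone is exactly what breaks down behind a shared proxy: two overlapping visitors land in the same session.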

Much of the content is static, so I can't rely on tracking via session cookie. (The site is hosted at an ISP, so I can't install mod_usertrack.)

How have other people dealt with this kind of after-the-fact tracking? Any pointers will be appreciated.


check links

gav on 2002-09-20T21:24:11

One option may be to spider the site to determine which pages link to one another. If a client requests two pages in close succession that aren't linked to each other, then it's pretty unlikely that the requests came from the same person.
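A rough sketch of how that check might look, assuming the spider has already produced a link map from each page to the set of pages it links to. The function names and the shape of `links` are hypothetical. It is only a heuristic: bookmarks, the back button, and typed URLs all produce legitimate jumps between unlinked pages.

```python
def plausibly_same_visitor(prev_path, next_path, links):
    """True if next_path is the same page or one click away from prev_path."""
    return next_path == prev_path or next_path in links.get(prev_path, set())


def split_by_links(requests, links):
    """Split one host's time-ordered requests into per-visitor trails.

    Each request joins the first trail whose most recent page links to it;
    if no trail qualifies, the request starts a new trail.
    """
    trails = []
    for path in requests:
        for trail in trails:
            if plausibly_same_visitor(trail[-1], path, links):
                trail.append(path)
                break
        else:
            trails.append([path])
    return trails
```

For example, with `links = {"/": {"/a", "/b"}, "/a": {"/a1"}, "/b": {"/b1"}}`, the interleaved requests `["/", "/a", "/", "/b", "/a1", "/b1"]` from a single proxy host separate into two trails, one through `/a` and one through `/b`.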

Proxies

bart on 2002-09-21T15:48:38

"This scheme works O.K., except for a handful of visitors who come in via AOL (and hence AOL's proxies)."
I have to surf via a proxy too. But this proxy does provide an extra HTTP header: X-Forwarded-For, the value being my own private IP address as assigned to my cable modem. It results in an equivalent environment variable, "HTTP_X_FORWARDED_FOR", that can be read in CGI scripts.

I don't have any idea if AOL proxies do the same as the proxy I use, but it is worth investigating.
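A small sketch of reading that variable from a CGI environment, falling back to REMOTE_ADDR when the header is absent. The helper name `client_address` is made up for illustration. Two caveats: the header can hold a comma-separated list when a request passes through several proxies, and it is supplied by the client side, so it is a hint rather than proof of identity.

```python
import os


def client_address(environ=os.environ):
    """Best-guess client address from CGI-style environment variables."""
    forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
    # With chained proxies the header is a comma-separated list;
    # the first entry is conventionally the original client.
    first = forwarded.split(",")[0].strip()
    # Fall back to the connecting address when the header is missing.
    return first or environ.get("REMOTE_ADDR", "")
```

For after-the-fact analysis of static pages, this only helps if the server's log format can be made to record the header.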