Today I had my first successful run of git-p4raw, a program I have been working on recently that imports a Perforce repository from its raw back-end data files.
The program is actually quite simple, and its approach was inspired by a conversation between myself and Gurusamy Sarathy. Perforce keeps these "checkpoint" and "journal" files which have inside them all of the metadata that it tracks. The file content itself is stored in RCS files which are otherwise uninteresting. So, the program loads this data into tables, throws a few constraints on the tables to confirm my suspicions of how the schema works, and then sets about mining changesets from the tables. As there is MD5 information in the database (for most files, anyway), integrity can be assured along the way and as a result I am very confident that this will be the cleanest conversion yet.
I still have some work to do:
I look forward to announcing the completion of this work! It's been a long, hard slog! :)