Thinking of tracking build artifacts in git

jdavidb on 2008-05-13T16:03:01

My process and thinking are seriously slowed down by the long build times I'm dealing with. It's particularly frustrating when I realize I'm rebuilding the exact same stuff I built earlier just to check something out before reinserting the same changes with yet another twist.

In my process for using git, I specifically used .gitignore to ignore the outputs of our compilation and build process: classfiles, jars, irrelevant copies of stuff that get produced, extra classfiles Eclipse produces during development, and who knows what other kinds of garbage. I could remove .gitignore completely and suddenly it would see all of these files. I'm not sure if that's a good idea or a bad idea.

On the one hand, it would make life a lot simpler for me if the latest build on each branch were always available for me to revert to. On the other hand, it seems like I'll be storing a monstrous number of redundant generated files. I guess I really don't care about storage space at this point. And maybe they can be vacuumed later. (In fact, they will be by virtue of the fact that all branches eventually get committed to CVS and deleted -- a garbage collection process after that would remove all the builds along the way.)

I guess I'm worried it would complicate my interaction with git. I'm used to just committing source to version control. Do I commit after a source change and then commit again after build, knowing that the source commit will have a previous build sitting there? Do I require my build to complete before I can ever commit? That would slow me down in a different way, I think.

Thinking out loud. :)


Content-addressable

Aristotle on 2008-05-15T22:20:55

Note that git stores files based on their SHA-1 fingerprint. No matter how many copies of a particular file you have, and no matter when and how they were created, all of them together will only take up space once. So unless *all* of your artefacts change *all* the time, it should be fairly cheap to keep them around.
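
To make the point concrete, here is a minimal Python sketch of how git derives the id of a file's contents (a "blob"): it hashes a short header plus the raw bytes with SHA-1. The helper names and the file paths in the demo are made up for illustration. Identical bytes give the same id, so they are stored as a single object however many paths or commits refer to them.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object id git assigns to a blob with these bytes."""
    # git hashes "blob <size>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def unique_blob_count(files: dict) -> int:
    """Count distinct blobs needed to store a {path: bytes} mapping."""
    return len({git_blob_id(data) for data in files.values()})


if __name__ == "__main__":
    # Hypothetical build outputs: two builds produced the same class file.
    files = {
        "build1/App.class": b"identical bytes",
        "build2/App.class": b"identical bytes",
        "build2/Util.class": b"different bytes",
    }
    for path, data in files.items():
        print(git_blob_id(data), path)
    print("distinct blobs:", unique_blob_count(files))
```

The same id can be checked against a real repository with `git hash-object <file>`.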

Re:Content-addressable

jdavidb on 2008-05-16T13:42:18

A lot of the objects are jars which contain timestamps. :(
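
A jar is a zip archive, and each entry records a modification time, so two builds of unchanged classes can still produce different bytes. The sketch below illustrates this with Python's zipfile module. The entry names, contents and timestamps are invented for the demo, and pinning the timestamp is only an assumption about what a build could be made to do, not something the build described here is known to support.

```python
import hashlib
import io
import zipfile


def make_archive(entries: dict, timestamp: tuple) -> bytes:
    """Build an in-memory zip, stamping every entry with `timestamp`.

    `timestamp` is a (year, month, day, hour, minute, second) tuple.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Sort the names so entry order cannot vary between runs.
        for name in sorted(entries):
            info = zipfile.ZipInfo(name, date_time=timestamp)
            zf.writestr(info, entries[name])
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical contents standing in for compiled classes.
    entries = {"com/example/App.class": b"\xca\xfe\xba\xbe demo"}

    first = make_archive(entries, (2008, 5, 13, 16, 3, 0))
    second = make_archive(entries, (2008, 5, 16, 13, 42, 18))
    print("same content, different timestamps:",
          hashlib.sha1(first).hexdigest() == hashlib.sha1(second).hexdigest())

    pinned = (1980, 1, 1, 0, 0, 0)  # earliest date the zip format can hold
    print("same content, pinned timestamp:",
          make_archive(entries, pinned) == make_archive(entries, pinned))
```

With differing timestamps the archives hash differently, so git would store each one as a separate blob; with a pinned timestamp the bytes match and the deduplication described above applies again.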

Workflow

Aristotle on 2008-05-15T22:23:07

I think I would try keeping the artefacts in a different branch, putting a different .gitignore in each branch. When you want to build, you check out the build branch, merge the source branch, build, and commit. To resume developing, you check out the source branch again.

I haven’t thought through all the consequences, but I think that would be a fairly practicable way of keeping binaries around without mixing the build-related history and the source-edit-related history into a giant mishmash. It also makes it easy to ditch all the build-related history: just kill that branch.
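
The steps above can be sketched as a small Python script that spells out the git commands in order. The branch names and the build command are placeholders, not anything taken from the actual project, and the script defaults to a dry run that only prints the plan.

```python
import subprocess


def build_branch_commands(source_branch, build_branch, build_cmd):
    """Return the command sequence for one build on the artefact branch.

    All three arguments are hypothetical; substitute your own branch
    names and whatever command runs your build.
    """
    return [
        ["git", "checkout", build_branch],   # switch to the artefact branch
        ["git", "merge", source_branch],     # bring in the latest source
        list(build_cmd),                     # run the build
        ["git", "add", "-A"],                # stage outputs this branch's .gitignore allows
        ["git", "commit", "-m", f"Build of {source_branch}"],
        ["git", "checkout", source_branch],  # back to development
    ]


def run_workflow(commands, dry_run=True):
    """Print each command; execute it as well unless dry_run is set."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failure, for example when
            # "git commit" finds nothing new to commit.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    plan = build_branch_commands("my-feature", "builds", ["ant"])
    run_workflow(plan, dry_run=True)
```

Ditching the build history later is then a single command, `git branch -D builds` (using the placeholder branch name from the sketch).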

Re:Workflow

jdavidb on 2008-05-16T13:41:50

Thanks for the suggestion! I will add this to my musings over the next few days. :)