Back in February, Microsoft released the Microsoft Office Binary File Formats with much fanfare.
In the case of the Excel file format, which I am interested in, the February specs offered little information over what was already in the public domain (for loose definitions of public).
Then, less than a month ago, to much less or perhaps no fanfare, Microsoft released a second round of file format specifications, the June specs: "Microsoft Takes Additional Steps in Implementing Interoperability Principles". Surprisingly, these were much more detailed. The newer Excel specification, for example, runs to 1100 pages compared with 350 pages in the previous one, and the information is far more complete than the page count alone suggests. It is also cross-linked within itself and with supporting documents, and all in all it feels more like a real specification than the February doc.
Which is great. As Joel Spolsky pointed out, in relation to the first round of docs, having a spec doesn't mean that it is easy to deal with these particular file formats. However, having such a detailed document certainly helps.
Which leads to the question: why didn't Microsoft release the detailed specs in the first place?
Anyway, since this is a Perl blog, I should add by way of technical content that, although Perl is often seen as a good text processing language, it is also very flexible at processing binary data thanks to the pack()/unpack() functions. And binary data processing in Perl is much more portable than C solutions, which suffer from more loosely defined data sizes and differently padded structs.
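As a rough illustration of that last point, here is a minimal sketch of reading a record header with unpack(). The record is made up for the example: I am assuming a 2-byte record id and a 2-byte length, both little-endian, followed by the payload, and the variable names and sample bytes are hypothetical rather than taken from the specs.

```perl
use strict;
use warnings;

# Hypothetical sample record: 2-byte id, 2-byte payload length, then payload.
# The 'v' template is an unsigned 16-bit little-endian integer, so the size
# and byte order are fixed by the template, not by the host platform.
my $record = pack 'v v a*', 0x0809, 4, "\x00\x06\x05\x00";

# Read the fixed-size header first.
my ( $id, $length ) = unpack 'v v', $record;

# Then use the length from the header to slice out the payload.
my $payload = substr $record, 4, $length;

# The payload here is assumed to be two more 16-bit little-endian fields.
my ( $version, $type ) = unpack 'v v', $payload;

printf "id=0x%04X length=%d version=0x%04X type=0x%04X\n",
    $id, $length, $version, $type;
```

The same template gives the same result on big-endian and little-endian machines, whereas the C equivalent would need explicit byte swapping and some care over struct padding.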
John.