Kosmos Distributed File System

acme on 2007-10-02T06:43:53

I probably should have blogged about this before, but I've been very impressed so far with the Kosmos Distributed File System. Read the blurb and it'll ring a bell: it's a pretty close reimplementation of the Google File System (first published in 2003!).

Currently the single point of failure is the one metadata server. The GFS authors argued that a single master is a great feature, although they keep live backup masters ready to go. I'm sure this will be fixed soon in KFS. From the GFS paper:

Having a single master vastly simplifies our design and enables the master to make sophisticated chunk placement and replication decisions using global knowledge. ... The master state is replicated for reliability. Its operation log and checkpoints are replicated on multiple machines.
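
To make that quote a bit more concrete, here is a toy sketch of the idea in Python: a single master records every metadata change in an operation log, copies each record to backup machines before applying it, and a backup can rebuild the state by replaying its copy. This is not KFS or GFS code. The names Master, Replica, place_chunk and recover are all made up for illustration, and real systems add checkpoints, acknowledgements and failure handling that are left out here.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical backup machine holding a copy of the master's operation log."""
    log: list = field(default_factory=list)

    def append(self, record):
        self.log.append(record)


@dataclass
class Master:
    """Hypothetical single master: all chunk metadata lives in one place,
    so placement decisions can use global knowledge."""
    replicas: list
    log: list = field(default_factory=list)
    chunks: dict = field(default_factory=dict)  # chunk id -> chunkserver names

    def place_chunk(self, chunk_id, servers):
        record = ("place", chunk_id, tuple(servers))
        # Log the mutation locally and on every replica before applying it.
        self.log.append(record)
        for replica in self.replicas:
            replica.append(record)
        self.chunks[chunk_id] = list(servers)


def recover(replica):
    """Rebuild master state on a backup by replaying its copy of the log."""
    master = Master(replicas=[])
    for op, chunk_id, servers in replica.log:
        if op == "place":
            master.chunks[chunk_id] = list(servers)
    master.log = list(replica.log)
    return master


backups = [Replica(), Replica()]
master = Master(replicas=backups)
master.place_chunk("chunk-1", ["cs-a", "cs-b", "cs-c"])
master.place_chunk("chunk-2", ["cs-b", "cs-c", "cs-d"])

# Simulate losing the master and promoting the first backup.
new_master = recover(backups[0])
print(new_master.chunks == master.chunks)
```

The point of the sketch is only that the design stays simple: one process makes every decision, and reliability comes from copying its log rather than from distributing the decisions.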

Most complaints so far are along the lines of "It doesn't work with Hadoop", which is silly, because it does.

How does this compare to MogileFS? They solve quite separate problems: MogileFS is for live serving of small files, while KFS is more for offline storage of large logs.

It's only the first release of KFS. Sriram goes into a little detail on its current functionality and semantics.

The announcement had a throwaway mention of Hypertable:

Hypertable: Hypertable is an open source project (being developed at Zvents Inc.) that provides a Big-Table interface. KFS is integrated with Hypertable as the backing store.

I can't find much about this, but Ethan mentions "It will be available under an open-source license within the next 60 days". Sounds very interesting...