Thread::Sociable is progressing well. All the scalar, array, and hash basics work, and performance is about double that of threads::shared for equivalent operations up to about 4 threads (on a uniprocessor Win32 machine). At 8 threads, things start getting interesting, and 16+ threads really expose the threads::shared global lock penalty. The coming weekend's multiprocessor tests should provide even more interesting benchmark results.
Locking tends to equalize performance when there is only a single shared/sociable operation inside the lock scope, but the performance gap widens again rapidly with each additional operation inside the lock.
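To make "additional operations inside the lock" concrete, here is a minimal sketch of the pattern being measured: one lock acquisition covering several shared updates. It uses threads::shared, since that API is well documented; the Thread::Sociable equivalent is not shown. The hash name, thread count, and iteration count are made up for illustration, and this is not the actual benchmark code.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %stats :shared;      # shared hash, guarded by lock(%stats)
my $workers = 4;        # hypothetical thread count
my $rounds  = 1000;     # hypothetical iterations per thread

sub worker {
    for (1 .. $rounds) {
        lock(%stats);   # one acquisition covers all three updates below
        $stats{count}++;
        $stats{total} += 2;
        $stats{last} = threads->tid;
    }                   # lock is released at the end of each loop iteration
}

my @threads = map { threads->create(\&worker) } 1 .. $workers;
$_->join for @threads;

printf "count=%d total=%d\n", $stats{count}, $stats{total};
```

Moving two of the three updates outside the lock scope (or taking a separate lock for each) would give the single-operation case for comparison.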
Once the embedded queueing implementation is complete and Thread::Apartment is updated to use it, apartment-threaded method calls should be much snappier.
The addition of a tie-like capability (required to complete Tk::Threaded) will enable some intriguing applications: when thread A writes to a sociably tied scalar, threads B, C, D, etc. can all fire their own STORE() methods. What a multithreaded FETCH() means remains a bit of a puzzler (some sort of sequencing interface is needed so thread A can decide which of thread B, C, and D's FETCH() results ultimately gets applied).
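As a rough single-threaded analogy for the STORE() fan-out, the sketch below uses Perl's ordinary tie mechanism: one assignment triggers several independent handlers. The class name FanOutScalar and the listener callbacks are hypothetical stand-ins for threads B and C. Nothing here is the Thread::Sociable interface, and no threads are involved.

```perl
use strict;
use warnings;

package FanOutScalar;   # hypothetical class, for illustration only

sub TIESCALAR {
    my ($class, @listeners) = @_;
    return bless { value => undef, listeners => [@listeners] }, $class;
}

sub STORE {
    my ($self, $value) = @_;
    $self->{value} = $value;
    # Every registered listener gets to react to the same write.
    $_->($value) for @{ $self->{listeners} };
}

# Only one stored value exists here, so FETCH is unambiguous; the
# multithreaded case has no such single obvious answer.
sub FETCH { $_[0]{value} }

package main;

my @log;
tie my $shared, 'FanOutScalar',
    sub { push @log, "B saw $_[0]" },   # stand-in for thread B's STORE()
    sub { push @log, "C saw $_[0]" };   # stand-in for thread C's STORE()

$shared = 42;           # a single write, as from thread A
print "$_\n" for @log;
```

The FETCH() side is trivial in this sketch precisely because there is only one value to return. With several threads each supplying their own FETCH(), something would have to pick or order the candidates, which is the sequencing question raised above.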
Some additional observations: