RPC::XML Bloat-be-Gone

rjray on 2003-01-27T22:57:39

I just released 0.50 of RPC::XML and no, you didn't miss any interim releases. I bumped the number up to reflect that the package has some very extensive new features, though almost all are behind-the-scenes changes that aren't obvious (indeed, they should be as inconspicuous as possible).

The main reason for this is that my day job is using this package in an application that sends Very Large messages containing base64-encoded audio. I don't know that I can say what it is at this stage, but it showed me that having the whole XML-RPC message in memory at once wasn't always a good thing. So this version addresses that on two fronts:

Incoming messages:
Messages coming into a class (for the server classes these are the requests from a client, for the client these are the responses from the server) are now sent in chunks to a non-blocking XML::Parser instance (using the parse_start method). In the case of servers, I could have passed the filehandle to the parse method, but I couldn't do this for the client (I don't have access to the filehandle within the confines of the LWP::UserAgent object). More importantly, my support for compression would have defeated that anyway. Using callbacks (for the client) or loops (for the servers) lets me handle both the compression and the arbitrary-sized messages.
Outbound messages:
Messages going out (requests sent by a client, or responses sent by a server) are a different matter. While I extended the data objects in the RPC::XML module to serialize themselves to a filehandle, I again ran into the issue of supporting compression, plus the lack of direct access to a socket-handle in LWP::UserAgent. To accommodate this, I've added options to the client and server classes to define a size threshold above which a message is first written to an anonymous temp file, then spooled from that file. This allows me to also support compression, though that requires writing to a second intermediary file, then copying from it to the primary one, compressing as I go. On the plus side, under Apache this allows me to use the send_fd method, which is extremely fast. And in the other cases, it's no more cumbersome than the callbacks for the incoming messages were. I had considered using tied filehandles to see if I could abstract all of this logic sufficiently, but I can't remember how complete the tied FH support was in 5.005, and I do actually have some users using this package under 5.005.
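
To make the two paths above concrete, here is a rough sketch of the general technique. It is written in Python purely as a neutral illustration and is not the RPC::XML code (which is Perl, built on XML::Parser and LWP::UserAgent). The function names parse_in_chunks, spool_message and compress_copy are made up for this example, and Python's SpooledTemporaryFile only approximates the threshold option described above: it buffers in memory and switches to an anonymous temp file once the size limit is passed.

```python
import tempfile
import zlib
import xml.parsers.expat


def parse_in_chunks(chunks, compressed=False):
    """Incoming side: feed a message to a push parser one chunk at a time.

    Returns the element names seen, in order. A real library would build
    a request or response object in the handlers instead.
    """
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(name)
    # Inflate incrementally as well, so the compressed body is never
    # held in memory as a whole.
    inflater = zlib.decompressobj() if compressed else None
    for chunk in chunks:
        data = inflater.decompress(chunk) if inflater else chunk
        if data:
            parser.Parse(data, False)
    tail = inflater.flush() if inflater else b""
    parser.Parse(tail, True)  # final call signals end of document
    return seen


def spool_message(pieces, threshold):
    """Outbound side: collect serialized pieces, spilling to an anonymous
    temp file once more than `threshold` bytes have been written."""
    spool = tempfile.SpooledTemporaryFile(max_size=threshold)
    for piece in pieces:
        spool.write(piece)
    spool.seek(0)  # rewind so the caller can stream it out
    return spool


def compress_copy(src, dst, bufsize=64 * 1024):
    """Copy src to dst one buffer at a time, compressing as it goes."""
    deflater = zlib.compressobj()
    while True:
        block = src.read(bufsize)
        if not block:
            break
        dst.write(deflater.compress(block))
    dst.write(deflater.flush())
```

In both directions the point is the same: only one buffer's worth of the message is in memory at any moment, whether or not compression is in play.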

On another note, I've realized that I can't really make either Apache::RPC::Server or Apache::RPC::Status work under Apache2/mod_perl2 with the compatibility layer. I will have to (eventually) develop entirely new classes for use in MP2 environments. Sorry if this impacts anyone negatively.

For every complex problem there is a solution which is simple, neat and wrong. —H.L. Mencken

Coolio!

jjohn on 2003-01-27T23:30:16

Chunking the messages and writing them to disk seems like the way to go. I'll be interested to see how this works out for you over the long haul. Can you also chunk large payloads? For instance when you send a 2M MP3 over XML-RPC, the XML message is dwarfed by the MP3 payload. If the library could be smart enough to save 50K chunks and assemble them when the user asks for the payload, I'd say you've got a pretty durn slick library. ;-)

Re:Coolio!

rjray on 2003-01-28T22:24:44

Limitations of the underlying protocol, alas. In other words, this must remain compatible with XML-RPC as a whole.

That doesn't preclude writing an application that uses XML-RPC to send MP3 data in chunks, but the chunking has to be part of the application, not part of the protocol.
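
As a rough illustration of what application-level chunking could look like, here is a small sketch in Python (chosen only as a neutral language; it is not part of RPC::XML). The helper names split_payload and assemble and the field names index, total and data are hypothetical. Each dictionary would travel as the argument of an ordinary XML-RPC call, so the protocol itself is untouched.

```python
import base64


def split_payload(data, chunk_size=50 * 1024):
    """Yield base64-encoded pieces of `data`, each tagged with its position."""
    total = (len(data) + chunk_size - 1) // chunk_size
    for index in range(total):
        part = data[index * chunk_size:(index + 1) * chunk_size]
        yield {
            "index": index,
            "total": total,
            "data": base64.b64encode(part).decode("ascii"),
        }


def assemble(parts):
    """Rebuild the payload from its pieces, whatever order they arrived in."""
    ordered = sorted(parts, key=lambda p: p["index"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("missing chunks")
    return b"".join(base64.b64decode(p["data"]) for p in ordered)
```

The receiving application decides when to call assemble, for instance once the piece with the highest index has arrived. Note that assemble only compares the number of pieces with total, so it would not notice a duplicated index.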

On that note, I will be putting support for chunked transfer-encoding into the library at some point. While this technically flies in the face of the spec, I think that it's both useful enough and important enough to include, regardless of how pissy it makes Dave (Winer).